Regexp named capture groups

Started by Joel Jacobson about 8 years ago · 3 messages · hackers
#1 Joel Jacobson
joel@trustly.com

Hi hackers,

Is anyone working on this feature [1] also for PostgreSQL's regex engine?

I'm thinking it could work something like this:

joel=# \df regexp_match
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+--------------+------------------+---------------------+--------
pg_catalog | regexp_match | jsonb | text, text | normal
pg_catalog | regexp_match | jsonb | text, text, text | normal
(2 rows)

joel=#* SELECT regexp_match_named(
joel(#* '2018-12-31',
joel(#* '(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})'
joel(#* );
regexp_match_named
----------------------------------------------
{"day": "31", "year": "2018", "month": "12"}
(1 row)

I think this feature would be awesome, for the reasons mentioned in [1], quote:

"Referring to capture groups via numbers has several disadvantages:
1. Finding the number of a capture group is a hassle: you have to
count parentheses.
2. You need to see the regular expression if you want to understand
what the groups are for.
3. If you change the order of the capture groups, you also have to
change the matching code."

[1]: http://2ality.com/2017/05/regexp-named-capture-groups.html

Best regards,

Joel Jacobson

#2 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Joel Jacobson (#1)
Re: Regexp named capture groups

2018-02-03 11:19 GMT+01:00 Joel Jacobson <joel@trustly.com>:

Hi hackers,

Is anyone working on this feature[1] also for PostgreSQL's regex engine?

I'm thinking it could work something like this:

joel=# \df regexp_match
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+--------------+------------------+---------------------+--------
pg_catalog | regexp_match | jsonb | text, text | normal
pg_catalog | regexp_match | jsonb | text, text, text | normal
(2 rows)

joel=#* SELECT regexp_match_named(
joel(#* '2018-12-31',
joel(#* '(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})'
joel(#* );
regexp_match_named
----------------------------------------------
{"day": "31", "year": "2018", "month": "12"}
(1 row)

I think this feature would be awesome, for the reasons mentioned in [1],
quote:

"Referring to capture groups via numbers has several disadvantages:
1. Finding the number of a capture group is a hassle: you have to
count parentheses.
2. You need to see the regular expression if you want to understand
what the groups are for.
3. If you change the order of the capture groups, you also have to
change the matching code."

[1] http://2ality.com/2017/05/regexp-named-capture-groups.html

Looks like a nice feature.

Pavel

Best regards,

Joel Jacobson

#3 Michael Paquier
michael@paquier.xyz
In reply to: Pavel Stehule (#2)
Re: Regexp named capture groups

On Sat, Feb 03, 2018 at 01:55:31PM +0100, Pavel Stehule wrote:

2018-02-03 11:19 GMT+01:00 Joel Jacobson <joel@trustly.com>:

Is anyone working on this feature[1] also for PostgreSQL's regex
engine?

Not that I know of.

I think this feature would be awesome, for the reasons mentioned in [1],
quote:

"Referring to capture groups via numbers has several disadvantages:
1. Finding the number of a capture group is a hassle: you have to
count parentheses.
2. You need to see the regular expression if you want to understand
what the groups are for.
3. If you change the order of the capture groups, you also have to
change the matching code."

[1] http://2ality.com/2017/05/regexp-named-capture-groups.html

Looks like a nice feature.

Yes, it looks like this could simplify equivalent queries, which I guess
would currently use a CTE to achieve the same.
--
Michael
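
For comparison, here is a rough sketch of the kind of CTE-based query the proposed function might replace. It only uses the existing regexp_match() with numbered groups and jsonb_build_object(); the CTE name "m" and column name "g" are made up for illustration, and the output column is aliased to regexp_match_named only to mirror the proposal, since no such function exists today.

```sql
-- Workaround with numbered groups: regexp_match() returns text[],
-- so each group is picked out by position and named by hand.
WITH m AS (
    SELECT regexp_match(
        '2018-12-31',
        '([0-9]{4})-([0-9]{2})-([0-9]{2})'
    ) AS g  -- "m" and "g" are illustrative names
)
SELECT jsonb_build_object(
    'year',  g[1],
    'month', g[2],
    'day',   g[3]
) AS regexp_match_named  -- alias only; not an existing function
FROM m;
```

The positions 1, 2 and 3 have to be kept in sync with the pattern by hand, which is exactly the maintenance burden the quoted list of disadvantages describes.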