regexp_replace 'g' flag

Started by Bruce Momjian over 12 years ago, 5 messages, docs
#1 Bruce Momjian
bruce@momjian.us

Why doesn't the 'g' flag appear in this table?

http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

I see text here that says:

http://www.postgresql.org/docs/9.2/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
Other supported flags are described in Table 9-19.

Seems 'g' should appear there too.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-docs mailing list (pgsql-docs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-docs

#2 Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#1)
Re: regexp_replace 'g' flag

On Thu, Sep 5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:

Why doesn't the 'g' flag appear in this table?

http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

I see text here that says:

http://www.postgresql.org/docs/9.2/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
Other supported flags are described in Table 9-19.

Seems 'g' should appear there too.

Is it because the table has generic pattern modifiers and 'g' is
relevant only for regexp_replace? I assume so.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: regexp_replace 'g' flag

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Sep 5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:

Why doesn't the 'g' flag appear in this table?
http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

Is it because the table has generic pattern modifiers and 'g' is
relevant only for regexp_replace? I assume so.

The table is specifically about ARE options, and 'g' is *not* one of
those. Adding 'g' to the table would be wrong.

It does seem to me to be a bit confusing that the text description of
substring() mentions 'i' and 'g' explicitly, when only 'i' is listed in
the table. You could make a case for phrasing along the line of
"substring() supports the 'g' flag that specifies ..., as well as all the
flags listed in Table 9-19". On the other hand, 'i' is the most useful of
the flags listed in the table by several country miles, and it doesn't
seem quite right to make people go off and consult the table to find out
about it.

Not sure whether there's any real improvement that can be made here,
but I suppose it'd be nice if the text descriptions of substring() and
regexp_replace() handled this matter in the same way ...

regards, tom lane

#4 David G. Johnston
david.g.johnston@gmail.com
In reply to: Bruce Momjian (#1)
Re: regexp_replace 'g' flag

Sorry if you get this twice but I use Nabble and didn't subscribe to the list
so my originals got put into the verification queue. I've subscribed now
and am re-posting hoping it will go through clean.

See my self-quote comment and my direct comment at the end.

David Johnston wrote

Tom Lane-2 wrote

Bruce Momjian <bruce@> writes:

On Thu, Sep 5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:

Why doesn't the 'g' flag appear in this table?
http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

Is it because the table has generic pattern modifiers and 'g' is
relevant only for regexp_replace? I assume so.

The table is specifically about ARE options, and 'g' is *not* one of
those. Adding 'g' to the table would be wrong.

It does seem to me to be a bit confusing that the text description of
substring() mentions 'i' and 'g' explicitly, when only 'i' is listed in
the table. You could make a case for phrasing along the line of
"substring() supports the 'g' flag that specifies ..., as well as all the
flags listed in Table 9-19". On the other hand, 'i' is the most useful of
the flags listed in the table by several country miles, and it doesn't
seem quite right to make people go off and consult the table to find out
about it.

Not sure whether there's any real improvement that can be made here,
but I suppose it'd be nice if the text descriptions of substring() and
regexp_replace() handled this matter in the same way ...

regards, tom lane

substring(text from pattern) returns a scalar text which corresponds to
either the entire first match found or the sub-portion of the first match
corresponding to the first (and only first if more than one) matching
group in the expression. It cannot act globally and so cannot accept/use
a "g" flag even if there was some way to provide it.

regexp_replace does handle a "g" flag: it too returns a scalar text, but
it returns the entire source string after modification rather than only a
subset of it, and the modification uses the "g" flag to decide whether to
replace one or ALL occurrences.
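
A minimal sketch of that distinction, assuming a psql session; the sample
strings are illustrative and not from the thread, and the results in the
comments follow the documented behavior of these functions:

```sql
-- substring(text from pattern) takes no flags argument and returns only
-- the first match.
SELECT substring('foobarbaz' from 'b..');             -- bar

-- With parenthesized groups, only the text of the first group is returned.
SELECT substring('foobarbaz' from 'b(.)(.)');         -- a

-- regexp_replace() returns the whole string. Without 'g' only the first
-- match is replaced; with 'g' every match is replaced.
SELECT regexp_replace('foobarbaz', 'b..', 'X');       -- fooXbaz
SELECT regexp_replace('foobarbaz', 'b..', 'X', 'g');  -- fooXX
```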

I cannot find where "the text description of substring() mentions 'i' and
'g' explicitly"; could you maybe copy-paste a direct quote and also note
the exact section of the page you are looking in?

David J.

A little bit rambly but hopefully instructive...

"embedded" is the key word here. In general (though not in PostgreSQL), an
embedded modifier alters the interpretation of the pattern between its
"start" and "end" modifier expressions; PostgreSQL has only a "start", no
end, so an embedded modifier affects the entire pattern. While it is
possible to turn case insensitivity, whether "." matches newline, and some
other options on or off this way, the "g" (global) option can only apply to
the pattern as a whole and conceptually belongs to the executor of the
pattern as opposed to the pattern itself.

The "g" option is relevant to both "regexp_replace" and "regexp_matches".
In the latter case, using the "g" modifier allows more than one row to be
returned from the SRF. In both cases the entire pattern is being applied to
the input text and the "g" modifier tells the matching algorithm not only to
affirm there is at least one match but to identify all sections of the
source text that match the entire pattern.
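
As a sketch of the set-returning case, assuming a psql session (the input
string is the kind of example the PostgreSQL documentation uses and is not
from this thread):

```sql
-- Without 'g': at most one row, describing the first match only.
SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)');
-- {bar,beque}

-- With 'g': one row per match of the entire pattern.
SELECT regexp_matches('foobarbequebazilbarfbonk', '(b[^b]+)(b[^b]+)', 'g');
-- {bar,beque}
-- {bazil,barf}
```

In both calls the whole pattern is matched each time; 'g' only changes how
many matches the function goes on to report.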

PostgreSQL is somewhat more limited in using these embedded options than
other implementations since, IIRC (and my quick scan of the linked documents
just now), you can only begin the pattern with these and so they apply to
the entire pattern too. Basically they provide a way to include flags in
the pattern when dealing with operator-based invocation. In other
implementations it is possible to write something like:

'(?i)this section is case insensitive(?-i)this section is case sensitive'

namely toggling these on/off within a pattern.
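
For comparison, a sketch of what PostgreSQL does accept: an embedded option
at the very start of the pattern, which then applies to the whole pattern.
The strings are illustrative assumptions; the ~* operator and the flags
argument are shown as equivalent spellings of case-insensitive matching.

```sql
-- No option: matching is case sensitive.
SELECT 'FOOBAR' ~ 'foobar';                        -- f

-- (?i) at the start of the pattern makes the entire pattern case insensitive.
SELECT 'FOOBAR' ~ '(?i)foobar';                    -- t

-- The same effect through the case-insensitive operator.
SELECT 'FOOBAR' ~* 'foobar';                       -- t

-- In a function call, 'i' is passed as a flag and can be combined with 'g'.
SELECT regexp_replace('FOOBAR', 'o', 'x', 'gi');   -- FxxBAR
```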

Since the "g"lobal flag only makes sense in function-call invocations, it is
neither needed nor useful to embed it within the expression itself; i.e.,
operator-based invocations only yield a true/false result, which is a
one-or-more versus none evaluation.
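
A small sketch of that last point, assuming a psql session with illustrative
strings: the operator reports only whether a match exists, while counting
the matches requires a function call with 'g'.

```sql
-- The operator yields a boolean, however many matches there are.
SELECT 'foobarbaz' ~ 'b..';                                 -- t

-- Counting every match needs the set-returning function and the 'g' flag.
SELECT count(*) FROM regexp_matches('foobarbaz', 'b..', 'g');  -- 2
```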

David J.

#5 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#3)
Re: regexp_replace 'g' flag

On Thu, Sep 5, 2013 at 09:59:13PM -0400, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Sep 5, 2013 at 08:37:44PM -0400, Bruce Momjian wrote:

Why doesn't the 'g' flag appear in this table?
http://www.postgresql.org/docs/9.2/static/functions-matching.html#POSIX-EMBEDDED-OPTIONS-TABLE

Is it because the table has generic pattern modifiers and 'g' is
relevant only for regexp_replace? I assume so.

The table is specifically about ARE options, and 'g' is *not* one of
those. Adding 'g' to the table would be wrong.

It does seem to me to be a bit confusing that the text description of
substring() mentions 'i' and 'g' explicitly, when only 'i' is listed in
the table. You could make a case for phrasing along the line of
"substring() supports the 'g' flag that specifies ..., as well as all the
flags listed in Table 9-19". On the other hand, 'i' is the most useful of
the flags listed in the table by several country miles, and it doesn't
seem quite right to make people go off and consult the table to find out
about it.

Not sure whether there's any real improvement that can be made here,
but I suppose it'd be nice if the text descriptions of substring() and
regexp_replace() handled this matter in the same way ...

I went ahead and just explicitly documented that 'g' is not in the
table.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

Attachments:

regex.diff (text/x-diff; charset=us-ascii), +5 -3