regexp_matches bug in 9.3.4 and 9.4.1
Repro:
drop table if exists regex_test;
create table regex_test(filter citext);
insert into regex_test(filter) values ('Filter Filter Filter');
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from regex_test;
Expected result:
Both select statements should return the same number of rows. Specifically, in this case, I expect to get back 3 rows.
Actual result:
When referencing the filter column, only one row is ever returned. When supplying the string that is the same as the data in the filter column, the correct number of rows are returned.
Result 1 (incorrect):
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
Filter Filter Filter {Filter}
Result 2 (correct):
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from regex_test;
Filter Filter Filter {Filter}
Filter Filter Filter {Filter}
Filter Filter Filter {Filter}
Versions
PostgreSQL 9.3.4, compiled by Visual C++ build 1600, 64-bit
PostgreSQL 9.4.1 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit
Please note...
On Mon, May 4, 2015 at 9:56 AM, Jeff Certain <gcertain@dynamicsignal.com>
wrote:
Repro:
drop table if exists regex_test;
create table regex_test(filter citext);
insert into regex_test(filter) values ('Filter Filter Filter');
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
"filter" is of type "citext"
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from
regex_test;
'Filter Filter Filter' is of type "text"
Expected result:
Both select statements should return the same number of rows.
Specifically, in this case, I expect to get back 3 rows.
The following non-citext version of your first query works just fine on 9.3
select filter, regexp_matches(filter, 'Filter', 'g') from (values ('Filter
Filter Filter'::text)) regex_test (filter);
regexp_matches is not mentioned on the following page in the documentation:
http://www.postgresql.org/docs/9.4/interactive/citext.html
Actual result:
When referencing the filter column, only one row is ever returned. When
supplying the string that is the same as the data in the filter column, the
correct number of rows are returned.Result 1 (incorrect):
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
Filter Filter Filter {Filter}
Result 2 (correct):
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from
regex_test;Filter Filter Filter {Filter}
Filter Filter Filter {Filter}
Filter Filter Filter {Filter}
This fails on 9.3 if the citext type is used instead of text.
Seems the however the citext version of regexp_matches is invoked it
ignores the global flag. Haven't explored deeper.
David J.
Thanks for the quick response.
From my end of things, I would expect either a) an exception; b) 3 matches; or c) 0 matches. The current behavior (returning one match and ignoring the global flag) strikes me as being misleading and challenging to debug.
Jeff
From: David G. Johnston [mailto:david.g.johnston@gmail.com]
Sent: Monday, May 4, 2015 3:37 PM
To: Jeff Certain
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] regexp_matches bug in 9.3.4 and 9.4.1
Please note...
On Mon, May 4, 2015 at 9:56 AM, Jeff Certain <gcertain@dynamicsignal.com<mailto:gcertain@dynamicsignal.com>> wrote:
Repro:
drop table if exists regex_test;
create table regex_test(filter citext);
insert into regex_test(filter) values ('Filter Filter Filter');
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
"filter" is of type "citext"
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from regex_test;
'Filter Filter Filter' is of type "text"
Expected result:
Both select statements should return the same number of rows. Specifically, in this case, I expect to get back 3 rows.
The following non-citext version of your first query works just fine on 9.3
select filter, regexp_matches(filter, 'Filter', 'g') from (values ('Filter Filter Filter'::text)) regex_test (filter);
regexp_matches is not mentioned on the following page in the documentation:
http://www.postgresql.org/docs/9.4/interactive/citext.html
Actual result:
When referencing the filter column, only one row is ever returned. When supplying the string that is the same as the data in the filter column, the correct number of rows are returned.
Result 1 (incorrect):
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
Filter Filter Filter {Filter}
Result 2 (correct):
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from regex_test;
Filter Filter Filter {Filter}
Filter Filter Filter {Filter}
Filter Filter Filter {Filter}
This fails on 9.3 if the citext type is used instead of text.
Seems the however the citext version of regexp_matches is invoked it ignores the global flag. Haven't explored deeper.
David J.
Please don't top-post...
On Mon, May 4, 2015 at 3:41 PM, Jeff Certain <gcertain@dynamicsignal.com>
wrote:
Thanks for the quick response.
From my end of things, I would expect either a) an exception; b) 3
matches; or c) 0 matches. The current behavior (returning one match and
ignoring the global flag) strikes me as being misleading and challenging to
debug.
Unless someone tries to argue that this is not a bug the solution will be
"b) 3 matches". I stopped where I did because I'm not a -hacker.
David J.
"David G. Johnston" <david.g.johnston@gmail.com> writes:
Seems the however the citext version of regexp_matches is invoked it
ignores the global flag. Haven't explored deeper.
Hm ...
regression=# create extension citext;
CREATE EXTENSION
regression=# \df regexp_matches
List of functions
Schema | Name | Result data type | Argument data types | Type
------------+----------------+------------------+----------------------+--------
pg_catalog | regexp_matches | SETOF text[] | text, text | normal
pg_catalog | regexp_matches | SETOF text[] | text, text, text | normal
public | regexp_matches | text[] | citext, citext | normal
public | regexp_matches | text[] | citext, citext, text | normal
(4 rows)
Looks like somebody just fat-fingered the declaration of citext's
regexp_matches() functions: they should be marked proretset and
they ain't. Sigh. I wonder if any of the other ones are wrong.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
Jeff Certain <gcertain@dynamicsignal.com> writes:
create table regex_test(filter citext);
insert into regex_test(filter) values ('Filter Filter Filter');
select filter, regexp_matches(filter, 'Filter', 'g') from regex_test;
select filter, regexp_matches('Filter Filter Filter', 'Filter', 'g') from regex_test;
Expected result:
Both select statements should return the same number of rows. Specifically, in this case, I expect to get back 3 rows.
Actual result:
When referencing the filter column, only one row is ever returned. When supplying the string that is the same as the data in the filter column, the correct number of rows are returned.
Fixed, although after discussion we concluded that we should not change
the default behavior in the back branches. See commit log message:
Thanks for the report!
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs