BUG #14245: Segfault on weird to_tsquery

Started by Noname over 9 years ago, 10 messages
#1Noname
david@gravitext.com

The following bug has been logged on the website:

Bug reference: 14245
Logged by: David Kellum
Email address: david@gravitext.com
PostgreSQL version: 9.6beta2
Operating system: Linux
Description:

I am doing some (fuzz) testing of full text queries and managed to
generate the following case which causes a SEGFAULT on PostgreSQL 9.6
beta1 and beta2:

select to_tsquery('!(a & !b) & c') as tsquery

This weird query outputs the following on 9.5.2, instead of crashing:

"!( !'b' ) & 'c'"

Below is my log output, which includes a stack trace:

Jul 12 10:04:01 klein kernel: postgres[22191]: segfault at 10 ip
00000000007754cd sp 00007ffc64b4a950 error 4 in postgres[400000+5f8000]
Jul 12 10:04:01 klein systemd[1]: Started Process Core Dump (PID 22192/UID
0).
Jul 12 10:04:01 klein postgres[482]: LOG: server process (PID 22191) was
terminated by signal 11: Segmentation fault
Jul 12 10:04:01 klein postgres[482]: DETAIL: Failed process was running:
select to_tsquery('!(a & !b) & c') as tsquery
Jul 12 10:04:01 klein postgres[482]: LOG: terminating any other active
server processes
Jul 12 10:04:01 klein postgres[482]: WARNING: terminating connection
because of crash of another server process
Jul 12 10:04:01 klein postgres[482]: DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
Jul 12 10:04:01 klein postgres[482]: HINT: In a moment you should be able
to reconnect to the database and repeat your command.
Jul 12 10:04:01 klein postgres[482]: LOG: all server processes terminated;
reinitializing
Jul 12 10:04:01 klein postgres[482]: LOG: database system was interrupted;
last known up at 2016-07-12 10:03:47 PDT
Jul 12 10:04:01 klein systemd-coredump[22193]: Process 22191 (postgres) of
user 88 dumped core.
Stack trace of thread 22191:
#0 0x00000000007754cd normalize_phrase_tree (postgres)
#1 0x00000000007756e1 normalize_phrase_tree (postgres)
#2 0x00000000007756d5 normalize_phrase_tree (postgres)
#3 0x00000000007759bb cleanup_fakeval_and_phrase (postgres)
#4 0x0000000000774613 parse_tsquery (postgres)
#5 0x00000000006ca21a to_tsquery_byid (postgres)
#6 0x00000000007ad5a7 DirectFunctionCall2Coll (postgres)
#7 0x00000000005b79c1 ExecMakeFunctionResultNoSets (postgres)
#8 0x00000000005bd285 ExecProject (postgres)
#9 0x00000000005d1722 ExecResult (postgres)
#10 0x00000000005b6a58 ExecProcNode (postgres)
#11 0x00000000005b2fef standard_ExecutorRun (postgres)
#12 0x00000000006bbaf8 PortalRunSelect (postgres)
#13 0x00000000006bcf1e PortalRun (postgres)
#14 0x00000000006ba979 PostgresMain (postgres)
#15 0x000000000046f35f ServerLoop (postgres)
#16 0x000000000066124c PostmasterMain (postgres)
#17 0x00000000004703ff main (postgres)
#18 0x00007fe114812741 __libc_start_main (libc.so.6)
#19 0x0000000000470499 _start (postgres)
Jul 12 10:04:02 klein postgres[482]: LOG: database system was not properly
shut down; automatic recovery in progress
Jul 12 10:04:02 klein postgres[482]: LOG: invalid record length at
1/2FA3E1C8: wanted 24, got 0
Jul 12 10:04:02 klein postgres[482]: LOG: redo is not required
Jul 12 10:04:02 klein postgres[482]: LOG: MultiXact member wraparound
protections are now enabled
Jul 12 10:04:02 klein postgres[482]: LOG: database system is ready to
accept connections
Jul 12 10:04:02 klein postgres[482]: LOG: autovacuum launcher started

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Peter Geoghegan
pg@heroku.com
In reply to: Noname (#1)
Re: BUG #14245: Segfault on weird to_tsquery

On Tue, Jul 12, 2016 at 10:58 AM, <david@gravitext.com> wrote:

The following bug has been logged on the website:

Bug reference: 14245
Logged by: David Kellum
Email address: david@gravitext.com
PostgreSQL version: 9.6beta2
Operating system: Linux
Description:

I am doing some (fuzz) testing of full text queries and managed to
generate the following case which causes a SEGFAULT on PostgreSQL 9.6
beta1 and beta2:

select to_tsquery('!(a & !b) & c') as tsquery

Interesting discovery. How did you fuzz test?

--
Peter Geoghegan


#3Peter Geoghegan
pg@heroku.com
In reply to: Peter Geoghegan (#2)
Re: BUG #14245: Segfault on weird to_tsquery

On Tue, Jul 12, 2016 at 11:40 AM, Peter Geoghegan <pg@heroku.com> wrote:

Interesting discovery. How did you fuzz test?

This appears to be a NULL pointer dereference. Here is a backtrace
with proper debug info:

#0 0x0000000000e45ada in normalize_phrase_tree (node=0x0) at
tsquery_cleanup.c:397
#1 0x0000000000e468f3 in normalize_phrase_tree (node=<optimized out>)
at tsquery_cleanup.c:416
#2 0x0000000000e4687f in normalize_phrase_tree (node=0x0) at
tsquery_cleanup.c:543
#3 0x0000000000e44ce9 in cleanup_fakeval_and_phrase (in=<optimized
out>) at tsquery_cleanup.c:603
#4 0x0000000000e3f528 in parse_tsquery (buf=<optimized out>,
pushval=0x6250002e9490, opaque=<optimized out>, isplain=<optimized
out>) at tsquery.c:695
#5 0x0000000000c8abcf in to_tsquery_byid (fcinfo=<optimized out>) at
to_tsany.c:372
#6 0x0000000000ee0cc6 in DirectFunctionCall2Coll (func=0xc8aac0
<to_tsquery_byid>, collation=1342381084, arg1=12126,
arg2=108095739809240) at fmgr.c:1049
#7 0x000000000093d2a9 in ExecMakeFunctionResultNoSets
(fcache=<optimized out>, econtext=0x6250002ee368, isNull=<optimized
out>, isDone=<optimized out>) at execQual.c:2041
#8 0x000000000093a89c in ExecTargetList (targetlist=0x6250002ef0e0,
tupdesc=<optimized out>, econtext=<optimized out>,
values=0x6250002eefb8, isnull=0x6250002eefd8 "\276~\276\276\276"...,
itemIsDone=0x6250002ef118, isDone=<optimized out>) at execQual.c:5376
#9 0x000000000093a5ab in ExecProject (projInfo=<optimized out>,
isDone=<optimized out>) at execQual.c:5600
***SNIP ***
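
As an illustration only of the failure mode the backtrace suggests (a recursive tree walker being handed node=0x0), here is a small Python analogy. It is not the code in tsquery_cleanup.c and says nothing about the actual fix; the Node class and both count functions are hypothetical names made up for this sketch. The unguarded walker assumes every child exists, so a hole in the tree makes it fail, while the guarded walker checks for a missing node first.

```python
class Node:
    """Hypothetical query-tree node: op is 'val', 'not', 'and' or 'or'."""

    def __init__(self, op, left=None, right=None, word=None):
        self.op = op
        self.left = left
        self.right = right
        self.word = word


def count_unguarded(node):
    """Count value nodes, assuming no child is ever missing."""
    if node.op == "val":  # raises AttributeError when node is None
        return 1
    if node.op == "not":
        return count_unguarded(node.left)
    return count_unguarded(node.left) + count_unguarded(node.right)


def count_guarded(node):
    """Same walk, but a missing node simply contributes nothing."""
    if node is None:
        return 0
    if node.op == "val":
        return 1
    if node.op == "not":
        return count_guarded(node.left)
    return count_guarded(node.left) + count_guarded(node.right)


# A tree with a hole where an operand used to be (assumed shape, for
# illustration): NOT(NOT(<missing>)) AND 'c'.
broken = Node("and", Node("not", Node("not", None)), Node("val", word="c"))

print(count_guarded(broken))  # 1
```

Calling count_unguarded(broken) raises AttributeError in Python, which is the closest analogue to dereferencing a NULL pointer in C.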

--
Peter Geoghegan


#4David Kellum
david@gravitext.com
In reply to: Peter Geoghegan (#2)
Re: BUG #14245: Segfault on weird to_tsquery

On Tue, Jul 12, 2016 at 11:40 AM, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, Jul 12, 2016 at 10:58 AM, <david@gravitext.com> wrote:

The following bug has been logged on the website:

Bug reference: 14245

I am doing some (fuzz) testing of full text queries and managed to
generate the following case which causes a SEGFAULT on PostgreSQL 9.6
beta1 and beta2:

select to_tsquery('!(a & !b) & c') as tsquery

Interesting discovery. How did you fuzz test?

Motivated by the new phrase search support in 9.6, I'm working on a
query language that is lenient toward any user input when parsed and can
be transformed and output as PG tsquery syntax. The fuzz testing works by
randomly permuting fragments of the custom query language. Using this,
I found and fixed a bunch of issues in my own parser, and identified
lots of characters to treat as whitespace and filter out before output
to tsquery, before stumbling on this Postgres crash.
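
A minimal sketch of this style of fuzzing, for readers who want to try something similar. This is not the harness described above (which permutes fragments of a custom query language); it is an assumed, simplified generator that builds random tsquery-style strings directly. TERMS, random_query, wrap and generate_cases are hypothetical names.

```python
import random

# Hypothetical vocabulary; "a" is included because it is an English stopword.
TERMS = ["a", "b", "c", "apple", "ball"]


def wrap(expr):
    """Parenthesize compound expressions so precedence is explicit."""
    return expr if expr.isalnum() else "(" + expr + ")"


def random_query(rng, depth=0, max_depth=3):
    """Build a random expression from terms, '!', '&' and '|'."""
    if depth >= max_depth or rng.random() < 0.3:
        return rng.choice(TERMS)
    kind = rng.choice(["not", "and", "or"])
    if kind == "not":
        return "!" + wrap(random_query(rng, depth + 1, max_depth))
    op = " & " if kind == "and" else " | "
    left = random_query(rng, depth + 1, max_depth)
    right = random_query(rng, depth + 1, max_depth)
    return wrap(left) + op + wrap(right)


def generate_cases(seed, count):
    """Return a reproducible list of query strings for a given seed."""
    rng = random.Random(seed)
    return [random_query(rng) for _ in range(count)]


for query in generate_cases(seed=14245, count=5):
    print(query)
```

In a real harness each generated string would be passed to to_tsquery() through a database client, with the seed logged so that a crashing case can be reproduced; that part is left out here.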

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: BUG #14245: Segfault on weird to_tsquery

david@gravitext.com writes:

I am doing some (fuzz) testing of full text queries and managed to
generate the following case which causes a SEGFAULT on PostgreSQL 9.6
beta1 and beta2:
select to_tsquery('!(a & !b) & c') as tsquery
This weird query outputs the following on 9.5.2, instead of crashing:
"!( !'b' ) & 'c'"

Note that while crashing is certainly not good, the pre-9.6 behavior
can hardly be called correct either. What happened to 'a'?

Also, it looks like this is specific to to_tsquery; if you just feed
the same thing to tsqueryin, it seems fine with it:

# select '!(a & !b) & c'::tsquery;
tsquery
-----------------------
!( 'a' & !'b' ) & 'c'
(1 row)

regards, tom lane


#6David Kellum
david@gravitext.com
In reply to: Tom Lane (#5)
Re: BUG #14245: Segfault on weird to_tsquery

On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

david@gravitext.com writes:

I am doing some (fuzz) testing of full text queries and managed to
generate the following case which causes a SEGFAULT on PostgreSQL 9.6
beta1 and beta2:
select to_tsquery('!(a & !b) & c') as tsquery
This weird query outputs the following on 9.5.2, instead of crashing:
"!( !'b' ) & 'c'"

Note that while crashing is certainly not good, the pre-9.6 behavior
can hardly be called correct either. What happened to 'a'?

'a' is a stopword, dropped by to_tsquery() as described here:

https://www.postgresql.org/docs/9.6/static/textsearch-controls.html#TEXTSEARCH-PARSING-QUERIES

The difference is that while basic tsquery input takes the tokens at
face value, to_tsquery normalizes each token into a lexeme using the
specified or default configuration, and discards any tokens that are
stop words according to the configuration.

...and I believe I want this behavior. Otherwise queries with a stopword
in an '&' condition will not match anything. In truth I have no reason to
want to support this kind of weird double negative on any version, and
will also look at filtering it out in my code before calling
to_tsquery().

It might be worth noting that these other slightly different cases are
fine on 9.6:

select to_tsquery('!(apple & !b) & c'); ---> !( 'appl' & !'b' ) & 'c'
select to_tsquery('!(apple & !a) & c'); ---> !'appl' & 'c'

Clearly a pretty obscure case, but a crash nonetheless.
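
To make the stopword dropping concrete, here is a toy model of it. It is a sketch under stated assumptions, not PostgreSQL's implementation: STOPWORDS is a tiny stand-in for a real stopword list, the tuple-based tree and the prune and render functions are hypothetical, there is no stemming (so 'apple' stays 'apple' rather than becoming 'appl'), and render does not add parentheses for mixed & and | precedence.

```python
STOPWORDS = {"a", "the"}  # assumption: stand-in for a real stopword list

# Nodes are tuples: ("val", word), ("not", child), ("and", l, r), ("or", l, r)


def prune(node):
    """Return the tree with stopwords removed, or None if nothing is left."""
    kind = node[0]
    if kind == "val":
        return None if node[1] in STOPWORDS else node
    if kind == "not":
        child = prune(node[1])
        return None if child is None else ("not", child)
    left, right = prune(node[1]), prune(node[2])
    if left is None:
        return right  # a binary operator with one operand collapses to it
    if right is None:
        return left
    return (kind, left, right)


def render(node):
    """Print a pruned tree in a tsquery-like notation."""
    kind = node[0]
    if kind == "val":
        return "'" + node[1] + "'"
    if kind == "not":
        inner = render(node[1])
        return "!" + inner if node[1][0] == "val" else "!( " + inner + " )"
    op = " & " if kind == "and" else " | "
    return render(node[1]) + op + render(node[2])


# !(a & !b) & c
crashing = ("and",
            ("not", ("and", ("val", "a"), ("not", ("val", "b")))),
            ("val", "c"))
print(render(prune(crashing)))  # !( !'b' ) & 'c'

# !(apple & !a) & c
working = ("and",
           ("not", ("and", ("val", "apple"), ("not", ("val", "a")))),
           ("val", "c"))
print(render(prune(working)))  # !'apple' & 'c'
```

In this model the first query reduces to the same double negative that 9.5.2 prints, which shows how removing the stopword leaves a NOT directly under another NOT.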

Also, it looks like this is specific to to_tsquery; if you just feed
the same thing to tsqueryin, it seems fine with it:

# select '!(a & !b) & c'::tsquery;
tsquery
-----------------------
!( 'a' & !'b' ) & 'c'
(1 row)

Against another test table, using the English search config, I confirmed
that 'a & ball'::tsquery doesn't match anything, but to_tsquery('a & ball')
does.

Thanks,
David

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Kellum (#6)
Re: BUG #14245: Segfault on weird to_tsquery

David Kellum <david@gravitext.com> writes:

On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Note that while crashing is certainly not good, the pre-9.6 behavior
can hardly be called correct either. What happened to 'a'?

'a' is a stopword, dropped by to_tsquery() as described here:

Ah! OK, so it's probably necessary to have a stopword there in order
to break it.

BTW, all these variants also crash:

select to_tsquery('!(a | !b) & c') as tsquery;
select to_tsquery('!( !b & a) & c') as tsquery;
select to_tsquery('!( !b | a) & c') as tsquery;

regards, tom lane


#8Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#7)
Re: BUG #14245: Segfault on weird to_tsquery

On Tue, Jul 12, 2016 at 05:11:32PM -0400, Tom Lane wrote:

David Kellum <david@gravitext.com> writes:

On Tue, Jul 12, 2016 at 12:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Note that while crashing is certainly not good, the pre-9.6 behavior
can hardly be called correct either. What happened to 'a'?

'a' is a stopword, dropped by to_tsquery() as described here:

Ah! OK, so it's probably necessary to have a stopword there in order
to break it.

BTW, all these variants also crash:

select to_tsquery('!(a | !b) & c') as tsquery;
select to_tsquery('!( !b & a) & c') as tsquery;
select to_tsquery('!( !b | a) & c') as tsquery;

[Action required within 72 hours. This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item. Teodor,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
9.6 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within 72 hours of this
message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping 9.6rc1. Consequently, I will appreciate your
efforts toward speedy resolution. Thanks.

[1]: /messages/by-id/20160527025039.GA447393@tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Teodor Sigaev
teodor@sigaev.ru
In reply to: Noah Misch (#8)
Re: [HACKERS] BUG #14245: Segfault on weird to_tsquery

The above-described topic is currently a PostgreSQL 9.6 open item. Teodor,

I'm working on it now and believe that a fix will be published today.

since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
9.6 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within 72 hours of this
message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping 9.6rc1. Consequently, I will appreciate your
efforts toward speedy resolution. Thanks.

[1] /messages/by-id/20160527025039.GA447393@tornado.leadboat.com

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/


#10Teodor Sigaev
teodor@sigaev.ru
In reply to: Noname (#1)
Re: BUG #14245: Segfault on weird to_tsquery

select to_tsquery('!(a & !b) & c') as tsquery

Thank you very much for your report, fixed.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
