BUG #14552: tsquery converts AND operator into OR when nested inside OR operations

#1 Noname
bjorn@eventmy.com

The following bug has been logged on the website:

Bug reference: 14552
Logged by: Bjorn Linder
Email address: bjorn@eventmy.com
PostgreSQL version: 9.4.5
Operating system: OS 10.11.6
Description:

Working correctly, no results:
SELECT ts_rank(to_tsvector('lets eat a cat'), ('fat & bat | rat'::tsquery &&
'cat'::tsquery));
ts_rank
---------
1e-20
(1 row)

Should also yield no results:
SELECT ts_rank(to_tsvector('lets eat a fat cat'),
               ('fat & bat | rat'::tsquery && 'cat'::tsquery));
ts_rank
-----------
0.0991032
(1 row)

Is this intended behavior? Is there a recommended way to nest AND operators
inside OR operations? The relevant documentation looks to be the same for
newer versions so I'm assuming this behavior hasn't been changed between
versions - let me know. Thanks!

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noname (#1)
Re: BUG #14552: tsquery converts AND operator into OR when nested inside OR operations

bjorn@eventmy.com writes:

> Working correctly, no results:
> SELECT ts_rank(to_tsvector('lets eat a cat'), ('fat & bat | rat'::tsquery &&
> 'cat'::tsquery));
> ts_rank
> ---------
> 1e-20
> (1 row)
>
> Should also yield no results:
> SELECT ts_rank(to_tsvector('lets eat a fat cat'),
>                ('fat & bat | rat'::tsquery && 'cat'::tsquery));
> ts_rank
> -----------
> 0.0991032
> (1 row)
>
> Is this intended behavior?

Don't see what you find surprising about it? ts_rank() is documented as

Ranks vectors based on the frequency of their matching lexemes.

The first example has one lexeme that matches the query's lexemes,
the second has two. It should get a higher ranking.

If you want to know whether the tsvector formally matches the query,
you should be applying the @@ operator. ts_rank() is not a binary
yes/no thing, it's trying to identify stuff that is more or less
relevant to the query's terms. At least from the documentation,
I'd suspect it pays no attention to the operators in the query.

In short: the intended use of ts_rank() is for sorting values that
have already passed an @@ match. It's not a substitute for @@.

regards, tom lane
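
[Editor's illustration, not part of the original message.] The advice above can be sketched roughly as follows: use @@ to decide whether a document matches, and call ts_rank() only to order the rows that passed. The 'english' configuration, the inline VALUES list, and the third sample sentence are assumptions added for illustration; the thread's queries rely on whatever default_text_search_config is set.

```sql
-- Show how the combined query is grouped: & binds tighter than |,
-- so 'fat & bat | rat' is read as (fat & bat) | rat before && adds 'cat'.
SELECT 'fat & bat | rat'::tsquery && 'cat'::tsquery AS combined_query;

-- Formal match test with @@. Under the assumed 'english' configuration this
-- should be false: the text has neither 'bat' nor 'rat'.
SELECT to_tsvector('english', 'lets eat a fat cat')
       @@ ('fat & bat | rat'::tsquery && 'cat'::tsquery) AS matches;

-- Filter with @@ first, then rank only the surviving rows.
-- The sample documents are hypothetical; the last one was added so that
-- at least one row contains both 'rat' and 'cat'.
SELECT d.doc,
       ts_rank(to_tsvector('english', d.doc), q.query) AS rank
FROM (VALUES ('lets eat a cat'),
             ('lets eat a fat cat'),
             ('a rat chased the cat')) AS d(doc),
     (SELECT 'fat & bat | rat'::tsquery && 'cat'::tsquery AS query) AS q
WHERE to_tsvector('english', d.doc) @@ q.query
ORDER BY rank DESC;
```

With this shape, a row that merely shares some lexemes with the query never reaches ts_rank(), so a nonzero rank is no longer mistaken for a match.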
