Question about partial functional indexes and the query planner
Hi everyone,
I am using a partial functional index on a table where F(a) = a. Querying
whre F(a) = a hits the index as expected. However the reverse statement a
= F(a) does not. I have verified this in 9.3.4.
Is this a deficiency with the query planner, or are these not actually
equivalent? I've been stumped on it.
-Brian Dunavant
Test script to display behavior below:
-- Setup the test data
CREATE OR REPLACE FUNCTION public.return_if_even(v_id integer) returns
integer
LANGUAGE sql AS
$$
SELECT case when v_id % 2 = 1 then 0 else v_id end;
$$;
create table public.partial_functional_index_test as
select id from generate_series(1,1000000) AS s(id);
create index partial_functional_idx ON public.partial_functional_index_test
USING btree ( public.return_if_even(id) )
WHERE public.return_if_even(id) = id;
-- This will hit the index
explain analyze select count(1) from public.partial_functional_index_test
where public.return_if_even(id) = id;
-- This will not hit the index
explain analyze select count(1) from public.partial_functional_index_test
where id = public.return_if_even(id);
-- To work around it, I can index both ways:
drop index partial_functional_idx;
create index partial_functional_idx ON public.partial_functional_index_test
USING btree ( public.return_if_even(id) )
WHERE public.return_if_even(id) = id OR id = public.return_if_even(id);
-- Now both versions will hit the index
explain analyze select count(1) from public.partial_functional_index_test
where public.return_if_even(id) = id;
explain analyze select count(1) from public.partial_functional_index_test
where id = public.return_if_even(id);
-- Cleanup test data
drop table public.partial_functional_index_test;
drop function public.return_if_even(integer);
Brian Dunavant <brian@omniti.com> writes:
I am using a partial functional index on a table where F(a) = a. Querying
whre F(a) = a hits the index as expected. However the reverse statement a
= F(a) does not. I have verified this in 9.3.4.
Is this a deficiency with the query planner, or are these not actually
equivalent? I've been stumped on it.
Interesting. The reason this doesn't work is that predicate_implied_by()
fails to prove "a = b" given "b = a", at least in the general case.
It will figure that out if either a or b is a constant, but not if
neither one is. The fact that it works with constants might explain
the lack of previous complaints, but this is still pretty surprising
given the amount of knowledge about equality that the system evinces
otherwise. (I'm also wondering whether the EquivalenceClass logic
might not sometimes rewrite "a = b" into "b = a", causing a failure
of this sort even when the user *had* phrased his query consistently.)
It would be fairly straightforward to add a proof rule along the lines of
"if both expressions are binary opclauses, and the left input expression
of each one is equal() to the right input of the other, and the operators
are each other's commutators, then the implication holds".
An objection might be made that this would add noticeably to the cost of
failing proofs, but I think it wouldn't be that bad, especially if we did
the syscache lookup for the commutator check last. In most scenarios the
equal() checks would fail pretty quickly when the proof rule doesn't
apply. Also, I believe that in the case where a or b is a constant,
even though we can make the proof already, this approach would succeed
noticeably more quickly than btree_predicate_proof() can. (Worth noting
also is that predicate_implied_by() isn't even used unless you have
things like partial indexes involved, so that its cost is not especially
relevant to "simple" queries.)
I'd be inclined to add some similar proof rules to predicate_refuted_by,
along the lines of "a op1 b refutes a op2 b if op1 is op2's negator" and
"a op1 b refutes b op2 a if op1 is the negator of op2's commutator".
Again, the code is currently unable to make such deductions unless a or b
is a constant.
Given the lack of previous complaints, I'm not sure this amounts to
a back-patchable bug, but it does seem like something worth fixing
going forward.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jun 10, 2014 at 7:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Given the lack of previous complaints, I'm not sure this amounts to
a back-patchable bug, but it does seem like something worth fixing
going forward.
Agreed, although I'd be willing to see us slip it into 9.4. It's
doubtful that anyone will get upset if their query plans change
between beta1 and beta2, but the same cannot be said for released
branches.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jun 10, 2014 at 7:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Given the lack of previous complaints, I'm not sure this amounts to
a back-patchable bug, but it does seem like something worth fixing
going forward.
Agreed, although I'd be willing to see us slip it into 9.4. It's
doubtful that anyone will get upset if their query plans change
between beta1 and beta2, but the same cannot be said for released
branches.
After further thought about this I realized that there's another category
of proof possibilities that is pretty simple to add while we're at it.
Namely, once we've found that both subexpressions of the two operator
clauses are equal(), we can use btree opclass relationships to prove that,
say, "x < y implies x <= y" or "x < y refutes x > y", independently of
just what x and y are. (Well, they have to be immutable expressions, but
we'd not get this far if they're not.) We already had pretty nearly all
of the machinery for that, but again it was only used for proving cases
involving comparisons to constants.
A little bit of refactoring later, I offer the attached draft patch.
I'm thinking this is probably more than we want to slip into 9.4
at this point, but I'll commit it to HEAD soon if there are not
objections.
regards, tom lane
Attachments:
better-predicate-proofs-1.patchtext/x-diff; charset=us-ascii; name=better-predicate-proofs-1.patchDownload+511-348
On Wed, Jun 11, 2014 at 7:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
On Tue, Jun 10, 2014 at 7:19 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Given the lack of previous complaints, I'm not sure this amounts to
a back-patchable bug, but it does seem like something worth fixing
going forward.Agreed, although I'd be willing to see us slip it into 9.4. It's
doubtful that anyone will get upset if their query plans change
between beta1 and beta2, but the same cannot be said for released
branches.After further thought about this I realized that there's another category
of proof possibilities that is pretty simple to add while we're at it.
Namely, once we've found that both subexpressions of the two operator
clauses are equal(), we can use btree opclass relationships to prove that,
say, "x < y implies x <= y" or "x < y refutes x > y", independently of
just what x and y are. (Well, they have to be immutable expressions, but
we'd not get this far if they're not.) We already had pretty nearly all
of the machinery for that, but again it was only used for proving cases
involving comparisons to constants.A little bit of refactoring later, I offer the attached draft patch.
I'm thinking this is probably more than we want to slip into 9.4
at this point, but I'll commit it to HEAD soon if there are not
objections.regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I applied Tom's patch to the latest HEAD
(e04a9ccd2ccd6e31cc4af6b08257a0a186d0fce8) and showed it to Brian. Looks to
solve the problem he originally reported
$ patch -p1 -i ../better-predicate-proofs-1.patch
(Stripping trailing CRs from patch.)
patching file src/backend/optimizer/util/predtest.c
$ /opt/pgsql_patch_review/bin/psql postgres
Timing is on.
Null display (null) is "«NULL»".
Expanded display (expanded) is used automatically.
psql (9.5devel)
Type "help" for help.
postgres=# CREATE OR REPLACE FUNCTION public.return_if_even(v_id integer)
returns integer
postgres-# LANGUAGE sql AS
postgres-# $$
postgres$# SELECT case when v_id % 2 = 1 then 0 else v_id end;
postgres$# $$;
CREATE FUNCTION
Time: 44.669 ms
postgres=# create table public.partial_functional_index_test as
postgres-# select id from generate_series(1,1000000) AS s(id);
SELECT 1000000
Time: 1037.993 ms
postgres=# create index partial_functional_idx ON
public.partial_functional_index_test
postgres-# USING btree ( public.return_if_even(id) )
postgres-# WHERE public.return_if_even(id) = id;
LOG: sending cancel to blocking autovacuum PID 12521
DETAIL: Process 12424 waits for ShareLock on relation 16385 of database
12217.
STATEMENT: create index partial_functional_idx ON
public.partial_functional_index_test
USING btree ( public.return_if_even(id) )
WHERE public.return_if_even(id) = id;
ERROR: canceling autovacuum task
CONTEXT: automatic analyze of table
"postgres.public.partial_functional_index_test"
CREATE INDEX
Time: 1658.245 ms
postgres=# explain analyze select count(1) from
public.partial_functional_index_test where public.return_if_even(id) = id;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=4818.05..4818.06 rows=1 width=0) (actual
time=2503.851..2503.854 rows=1 loops=1)
-> Bitmap Heap Scan on partial_functional_index_test
(cost=82.67..4805.55 rows=5000 width=0) (actual time=43.724..1309.309
rows=500000 loops=1)
Recheck Cond: (CASE WHEN ((id % 2) = 1) THEN 0 ELSE id END = id)
Heap Blocks: exact=4425
-> Bitmap Index Scan on partial_functional_idx (cost=0.00..81.42
rows=5000 width=0) (actual time=42.961..42.961 rows=500000 loops=1)
Planning time: 4.245 ms
Execution time: 2505.281 ms
(7 rows)
Time: 2515.344 ms
postgres=# explain analyze select count(1) from
public.partial_functional_index_test where id = public.return_if_even(id);
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=4818.05..4818.06 rows=1 width=0) (actual
time=2483.862..2483.866 rows=1 loops=1)
-> Bitmap Heap Scan on partial_functional_index_test
(cost=82.67..4805.55 rows=5000 width=0) (actual time=40.704..1282.955
rows=500000 loops=1)
Recheck Cond: (CASE WHEN ((id % 2) = 1) THEN 0 ELSE id END = id)
Heap Blocks: exact=4425
-> Bitmap Index Scan on partial_functional_idx (cost=0.00..81.42
rows=5000 width=0) (actual time=39.657..39.657 rows=500000 loops=1)
Planning time: 0.127 ms
Execution time: 2483.979 ms
(7 rows)
--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com