avoid bitmapOR-ing indexes with scan condition inconsistent with partition constraint
(resending to the list)
Hi All
I started looking into Konstantin's 30 month old thread/patch:
|Re: [HACKERS] Secondary index access optimizations
/messages/by-id/27516421-5afa-203c-e22a-8407e9187327@postgrespro.ru
..to which David directed me 12 months ago:
|Subject: Re: scans on table fail to be excluded by partition bounds
/messages/by-id/CAKJS1f_iOmCW11dFzifpDGUgSLAoSTDOjw2tcec=7Cgq+sR80Q@mail.gmail.com
My complaint at the time was for a query plan like:
CREATE TABLE p (i int, j int) PARTITION BY RANGE(i);
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 10*(i-1), 10*i) FROM generate_series(1,10)i; \gexec
INSERT INTO p SELECT i%99, i%9 FROM generate_series(1,99999)i;
VACUUM ANALYZE p;
CREATE INDEX ON p(i);
CREATE INDEX ON p(j);
postgres=# explain analyze SELECT * FROM p WHERE (i=10 OR i=20 OR i=30) AND j<2;
Append (cost=28.51..283.25 rows=546 width=12) (actual time=0.100..1.364 rows=546 loops=1)
-> Bitmap Heap Scan on p2 (cost=28.51..93.51 rows=182 width=12) (actual time=0.099..0.452 rows=182 loops=1)
Recheck Cond: ((i = 10) OR (i = 20) OR (i = 30))
Filter: (j < 2)
Rows Removed by Filter: 818
Heap Blocks: exact=45
-> BitmapOr (cost=28.51..28.51 rows=1000 width=0) (actual time=0.083..0.083 rows=0 loops=1)
-> Bitmap Index Scan on p2_i_idx (cost=0.00..19.79 rows=1000 width=0) (actual time=0.074..0.074 rows=1000 loops=1)
Index Cond: (i = 10)
-> Bitmap Index Scan on p2_i_idx (cost=0.00..4.29 rows=1 width=0) (actual time=0.008..0.008 rows=0 loops=1)
Index Cond: (i = 20)
-> Bitmap Index Scan on p2_i_idx (cost=0.00..4.29 rows=1 width=0) (actual time=0.001..0.001 rows=0 loops=1)
Index Cond: (i = 30)
...
This 2nd and 3rd index scan on p2_i_idx are useless, and benign here, but
harmful if we have a large OR list.
I tried rebasing Konstantin's patch, but it didn't handle the case of
"refuting" inconsistent arms of an "OR" list, so I came up with this. This
currently depends on the earlier patch only to call RelationGetPartitionQual,
so appears to be mostly a separate issue.
I believe the current behavior of "OR" lists is also causing another issue I
reported, which a customer hit again last week:
/messages/by-id/20191216184906.GA2082@telsasoft.com
|ERROR: could not resize shared memory segment...No space left on device
When I looked into it, their explain(format text) was 50MB, due to a list of
~500 "OR" conditions, *each* of which was causing an index scan for each of
~500 partitions, where only one index scan per partition was needed or useful,
all the others being inconsistent with the partition constraint. Thus the
query ultimately errors when it exceeds a resource limit (maybe no surprise
with 8500 index scans).
Here, I was trying to create a test case reproducing that error to see if this
resolves it, but so far hasn't failed. Tomas has a reproducer with a different
(much simpler) plan, though.
CREATE TABLE p (i int, j int) PARTITION BY RANGE(i);
\pset pager off
SELECT format('CREATE TABLE p%s PARTITION OF p FOR VALUES FROM(%s)TO(%s)', i, 10*(i-1), 10*i) FROM generate_series(1,500)i;
\timing off
\set quiet
\set echo always
\gexec
\timing on
INSERT INTO p SELECT i%5000, i%500 FROM generate_series(1,9999999)i;
VACUUM ANALYZE p;
CREATE INDEX ON p(i);
CREATE INDEX ON p(j);
SELECT format('explain analyze SELECT * FROM p WHERE %s', array_to_string(array_agg('i='||(i*10)::text),' OR ')) FROM generate_series(1,500)i;
--
Justin
Added here:
https://commitfest.postgresql.org/29/2644/
And updated tests to pass following:
|commit 689696c7110f148ede8004aae50d7543d05b5587
|Author: Tom Lane <tgl@sss.pgh.pa.us>
|Date: Tue Jul 14 18:56:49 2020 -0400
|
| Fix bitmap AND/OR scans on the inside of a nestloop partition-wise join.
Rebased and updated for tests added in 13838740f.
--
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581
Hi Justin,
Attached is a minimal patch that is rebased and removes the
dependency on Konstantin's earlier patch[1]/messages/by-id/attachment/112074/0001-Secondary-index-access-optimizations.patch, making it enough to remove
the extraneous index scans as Justin brought up. Is this the direction we
want to head in?
Tagging Konstantin in the email to hear his input on his old patch.
Since Justin's attached patch [1]/messages/by-id/attachment/112074/0001-Secondary-index-access-optimizations.patch does not include the work that was done
on the operator_predicate_proof() and as discussed here in [2]/messages/by-id/5A006016.1010009@postgrespro.ru, that
work is important to see real benefits? Just wanted to check before
reviewing [1]/messages/by-id/attachment/112074/0001-Secondary-index-access-optimizations.patch.
Regards,
Soumyadeep (VMware)
[1]: /messages/by-id/attachment/112074/0001-Secondary-index-access-optimizations.patch
[2]: /messages/by-id/5A006016.1010009@postgrespro.ru
Attachments:
v3-0001-Avoid-index-scan-inconsistent-with-partition-cons.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Avoid-index-scan-inconsistent-with-partition-cons.patchDownload+54-7
On Wed, Sep 30, 2020 at 04:52:02PM -0700, Soumyadeep Chakraborty wrote:
Hi Justin,
Attached is a minimal patch that is rebased and removes the
dependency on Konstantin's earlier patch[1], making it enough to remove
the extraneous index scans as Justin brought up. Is this the direction we
want to head in?
Yes, thanks for doing that. I hadn't dug into it yet to figure out what to put
where to separate the patches. It seems like my patch handles a different goal
than Konstantin's, but they both depend on having the constraints populated.
--
Justin
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
I think that work on improving operator_predicate_proof should really be done in separate patch.
And this minimal patch is doing it's work well.
The new status of this patch is: Ready for Committer
I started looking through this patch. I really quite dislike solving
this via a kluge in indxpath.c. There are multiple disadvantages
to that:
* It only helps for the very specific problem of redundant bitmap
index scans, whereas the problem of applying redundant qual checks
in partition scans seems pretty general.
* It's not unlikely that this will end up trying to make the same
proof multiple times (and the lack of any way to turn that off,
through constraint_exclusion or some other knob, isn't too cool).
* It does nothing to fix rowcount estimates in the light of the
knowledge that some of the restriction clauses are no-ops. Now,
if we have up-to-date stats we'll probably manage to come out with
an appropriate 0 or 1 selectivity anyway, but we might not have those.
In any case, spending significant effort to estimate a selectivity
when some other part of the code has taken the trouble to *prove* the
clause true or false seems very undesirable.
* I'm not even convinced that the logic is correct, specifically that
it's okay to just "continue" if we refute the OR clause. That seems
likely to break generate_bitmap_or_paths' surrounding loop logic about
"We must be able to match at least one index to each of the arms of
the OR". At least, if that still works it requires more than zero
commentary about why.
So I like much better the idea of Konstantin's old patch, that we modify
the rel's baserestrictinfo list by removing quals that we can prove
true. We could extend that to solve the bitmapscan problem by removing
OR arms that we can prove false. So I started to review that patch more
carefully, and after awhile realized that it has a really fundamental
problem: it is trying to use CHECK predicates to prove WHERE clauses.
But we don't know that CHECK predicates are true, only that they are
not-false, and there is no proof mode in predtest.c that will allow
proving some clauses true based only on other ones being not-false.
We can salvage something by restricting the input quals to be only
partition quals, since those are built to be guaranteed-true-or-false;
we can assume they don't yield NULL. There's a hole in that for
hashing, as I noted elsewhere, but we'll fail to prove anything anyway
from a satisfies_hash_partition() qual. (In principle we could also use
attnotnull quals, which also have that property. But I'm dubious that
that will help often enough to be worth the extra cycles for predtest.c
to process them.)
So after a bit of coding I had the attached. This follows Konstantin's
original patch in letting relation_excluded_by_constraints() change
the baserestrictinfo list. I read the comments in the older thread
about people not liking that, and I can see the point. But I'm not
convinced that the later iterations of the patch were an improvement,
because (a) the call locations for
remove_restrictions_implied_by_constraints() seemed pretty random, and
(b) it seems necessary to have relation_excluded_by_constraints() and
remove_restrictions_implied_by_constraints() pretty much in bed with
each other if we don't want to duplicate constraint-fetching work.
Note the comment on get_relation_constraints() that it's called at
most once per relation; that's not something I particularly desire
to give up, because a relcache open isn't terribly cheap. Also
(c) I think it's important that there be a way to suppress this
overhead when it's not useful. In the patch as attached, turning off
constraint_exclusion does that since relation_excluded_by_constraints()
falls out before getting to the new code. If we make
remove_restrictions_implied_by_constraints() independent then it
will need some possibly-quite-duplicative logic to check
constraint_exclusion. (Of course, if we'd rather condition this
on some other GUC then that argument falls down. But I think we
need something.) So, I'm not dead set on this code structure, but
I haven't seen one I like better.
Anyway, this seems to work, and if the regression test changes are
any guide then it may fire often enough in the real world to be useful.
Nonetheless, I'm concerned about performance, because predtest.c is a
pretty expensive thing and there will be a lot of cases where the work
is useless. I did a quick check using pgbench's option to partition
the tables, and observed that the -S (select only) test case seemed to
get about 2.5% slower with the patch than without. That's not far
outside the noise floor, so maybe it's not real, but if it is real then
it seems pretty disastrous. Perhaps we could avoid that problem by
removing the "predicate_implied_by" cases and only trying the
"predicate_refuted_by" case, so that no significant time is added
unless you've got an OR restriction clause on a partitioned table.
That seems like it'd lose a lot of the benefit though :-(.
So I'm not sure where to go from here. Thoughts? Anyone else
care to run some performance tests?
regards, tom lane
Attachments:
v5-0001-prune-restrictions-using-constraints.patchtext/x-diff; charset=us-ascii; name=v5-0001-prune-restrictions-using-constraints.patchDownload+511-547
Hi,
On Fri, Nov 13, 2020 at 5:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I started looking through this patch. I really quite dislike solving
this via a kluge in indxpath.c. There are multiple disadvantages
to that:* It only helps for the very specific problem of redundant bitmap
index scans, whereas the problem of applying redundant qual checks
in partition scans seems pretty general.* It's not unlikely that this will end up trying to make the same
proof multiple times (and the lack of any way to turn that off,
through constraint_exclusion or some other knob, isn't too cool).* It does nothing to fix rowcount estimates in the light of the
knowledge that some of the restriction clauses are no-ops. Now,
if we have up-to-date stats we'll probably manage to come out with
an appropriate 0 or 1 selectivity anyway, but we might not have those.
In any case, spending significant effort to estimate a selectivity
when some other part of the code has taken the trouble to *prove* the
clause true or false seems very undesirable.* I'm not even convinced that the logic is correct, specifically that
it's okay to just "continue" if we refute the OR clause. That seems
likely to break generate_bitmap_or_paths' surrounding loop logic about
"We must be able to match at least one index to each of the arms of
the OR". At least, if that still works it requires more than zero
commentary about why.So I like much better the idea of Konstantin's old patch, that we modify
the rel's baserestrictinfo list by removing quals that we can prove
true. We could extend that to solve the bitmapscan problem by removing
OR arms that we can prove false. So I started to review that patch more
carefully, and after awhile realized that it has a really fundamental
problem: it is trying to use CHECK predicates to prove WHERE clauses.
But we don't know that CHECK predicates are true, only that they are
not-false, and there is no proof mode in predtest.c that will allow
proving some clauses true based only on other ones being not-false.We can salvage something by restricting the input quals to be only
partition quals, since those are built to be guaranteed-true-or-false;
we can assume they don't yield NULL. There's a hole in that for
hashing, as I noted elsewhere, but we'll fail to prove anything anyway
from a satisfies_hash_partition() qual. (In principle we could also use
attnotnull quals, which also have that property. But I'm dubious that
that will help often enough to be worth the extra cycles for predtest.c
to process them.)So after a bit of coding I had the attached. This follows Konstantin's
original patch in letting relation_excluded_by_constraints() change
the baserestrictinfo list. I read the comments in the older thread
about people not liking that, and I can see the point. But I'm not
convinced that the later iterations of the patch were an improvement,
because (a) the call locations for
remove_restrictions_implied_by_constraints() seemed pretty random, and
(b) it seems necessary to have relation_excluded_by_constraints() and
remove_restrictions_implied_by_constraints() pretty much in bed with
each other if we don't want to duplicate constraint-fetching work.
Note the comment on get_relation_constraints() that it's called at
most once per relation; that's not something I particularly desire
to give up, because a relcache open isn't terribly cheap. Also
(c) I think it's important that there be a way to suppress this
overhead when it's not useful. In the patch as attached, turning off
constraint_exclusion does that since relation_excluded_by_constraints()
falls out before getting to the new code. If we make
remove_restrictions_implied_by_constraints() independent then it
will need some possibly-quite-duplicative logic to check
constraint_exclusion. (Of course, if we'd rather condition this
on some other GUC then that argument falls down. But I think we
need something.) So, I'm not dead set on this code structure, but
I haven't seen one I like better.Anyway, this seems to work, and if the regression test changes are
any guide then it may fire often enough in the real world to be useful.
Nonetheless, I'm concerned about performance, because predtest.c is a
pretty expensive thing and there will be a lot of cases where the work
is useless. I did a quick check using pgbench's option to partition
the tables, and observed that the -S (select only) test case seemed to
get about 2.5% slower with the patch than without. That's not far
outside the noise floor, so maybe it's not real, but if it is real then
it seems pretty disastrous. Perhaps we could avoid that problem by
removing the "predicate_implied_by" cases and only trying the
"predicate_refuted_by" case, so that no significant time is added
unless you've got an OR restriction clause on a partitioned table.
That seems like it'd lose a lot of the benefit though :-(.So I'm not sure where to go from here. Thoughts? Anyone else
care to run some performance tests?
Status update for a commitfest entry.
Reading through the discussion, several patches have been proposed and
it has been inactive for almost 2 months. Does anyone listed as the
author plan to work on this patch? It looks like we're waiting for
some reviews on the patch including from the performance perspective
but this patch entry has been set to "Waiting on Author" since
2021-01-12. If no one works on this and it's really waiting on the
author, I'm going to set it to "Returned with Feedback", barring
objections.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
On Mon, Feb 1, 2021 at 12:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
Hi,
On Fri, Nov 13, 2020 at 5:14 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I started looking through this patch. I really quite dislike solving
this via a kluge in indxpath.c. There are multiple disadvantages
to that:* It only helps for the very specific problem of redundant bitmap
index scans, whereas the problem of applying redundant qual checks
in partition scans seems pretty general.* It's not unlikely that this will end up trying to make the same
proof multiple times (and the lack of any way to turn that off,
through constraint_exclusion or some other knob, isn't too cool).* It does nothing to fix rowcount estimates in the light of the
knowledge that some of the restriction clauses are no-ops. Now,
if we have up-to-date stats we'll probably manage to come out with
an appropriate 0 or 1 selectivity anyway, but we might not have those.
In any case, spending significant effort to estimate a selectivity
when some other part of the code has taken the trouble to *prove* the
clause true or false seems very undesirable.* I'm not even convinced that the logic is correct, specifically that
it's okay to just "continue" if we refute the OR clause. That seems
likely to break generate_bitmap_or_paths' surrounding loop logic about
"We must be able to match at least one index to each of the arms of
the OR". At least, if that still works it requires more than zero
commentary about why.So I like much better the idea of Konstantin's old patch, that we modify
the rel's baserestrictinfo list by removing quals that we can prove
true. We could extend that to solve the bitmapscan problem by removing
OR arms that we can prove false. So I started to review that patch more
carefully, and after awhile realized that it has a really fundamental
problem: it is trying to use CHECK predicates to prove WHERE clauses.
But we don't know that CHECK predicates are true, only that they are
not-false, and there is no proof mode in predtest.c that will allow
proving some clauses true based only on other ones being not-false.We can salvage something by restricting the input quals to be only
partition quals, since those are built to be guaranteed-true-or-false;
we can assume they don't yield NULL. There's a hole in that for
hashing, as I noted elsewhere, but we'll fail to prove anything anyway
from a satisfies_hash_partition() qual. (In principle we could also use
attnotnull quals, which also have that property. But I'm dubious that
that will help often enough to be worth the extra cycles for predtest.c
to process them.)So after a bit of coding I had the attached. This follows Konstantin's
original patch in letting relation_excluded_by_constraints() change
the baserestrictinfo list. I read the comments in the older thread
about people not liking that, and I can see the point. But I'm not
convinced that the later iterations of the patch were an improvement,
because (a) the call locations for
remove_restrictions_implied_by_constraints() seemed pretty random, and
(b) it seems necessary to have relation_excluded_by_constraints() and
remove_restrictions_implied_by_constraints() pretty much in bed with
each other if we don't want to duplicate constraint-fetching work.
Note the comment on get_relation_constraints() that it's called at
most once per relation; that's not something I particularly desire
to give up, because a relcache open isn't terribly cheap. Also
(c) I think it's important that there be a way to suppress this
overhead when it's not useful. In the patch as attached, turning off
constraint_exclusion does that since relation_excluded_by_constraints()
falls out before getting to the new code. If we make
remove_restrictions_implied_by_constraints() independent then it
will need some possibly-quite-duplicative logic to check
constraint_exclusion. (Of course, if we'd rather condition this
on some other GUC then that argument falls down. But I think we
need something.) So, I'm not dead set on this code structure, but
I haven't seen one I like better.Anyway, this seems to work, and if the regression test changes are
any guide then it may fire often enough in the real world to be useful.
Nonetheless, I'm concerned about performance, because predtest.c is a
pretty expensive thing and there will be a lot of cases where the work
is useless. I did a quick check using pgbench's option to partition
the tables, and observed that the -S (select only) test case seemed to
get about 2.5% slower with the patch than without. That's not far
outside the noise floor, so maybe it's not real, but if it is real then
it seems pretty disastrous. Perhaps we could avoid that problem by
removing the "predicate_implied_by" cases and only trying the
"predicate_refuted_by" case, so that no significant time is added
unless you've got an OR restriction clause on a partitioned table.
That seems like it'd lose a lot of the benefit though :-(.So I'm not sure where to go from here. Thoughts? Anyone else
care to run some performance tests?Status update for a commitfest entry.
Reading through the discussion, several patches have been proposed and
it has been inactive for almost 2 months. Does anyone listed as the
author plan to work on this patch? It looks like we're waiting for
some reviews on the patch including from the performance perspective
but this patch entry has been set to "Waiting on Author" since
2021-01-12. If no one works on this and it's really waiting on the
author, I'm going to set it to "Returned with Feedback", barring
objections.
I've moved this patch to "Returned with Feedback". Depending on
timing, this may be reversable, so let us know if there are
extenuating circumstances. In any case, you are welcome to address
the feedback you have received, and resubmit the patch to the next CommitFest.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/