BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

Started by PG Bug reporting form · 35 messages · pgsql-hackers, pgsql-bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17540
Logged by: William Duclot
Email address: william.duclot@gmail.com
PostgreSQL version: 14.4
Operating system: GNU/Linux (Red Hat 8.5.0)
Description:

My application uses prepared statements. This section of the documentation
is going to be very relevant to the rest of this report:
https://www.postgresql.org/docs/current/sql-prepare.html#SQL-PREPARE-NOTES.

This is a minimal reproduction of the problem I observe, which I will
explain below:
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=6b01d161da27379844e7602a16543626

Scenario:
- I create a fairly simple table (id + timestamp). Timestamp is indexed.
- I create a simple-ish prepared statement for `SELECT MIN(id), MAX(id) from
relation_tuple_transaction WHERE timestamp >= $1;`
- I execute the prepared statement multiple times (> 5 times)
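
For convenience, here is a self-contained sketch of that reproduction in
case the fiddle link rots (same schema as the fiddle; the full DDL also
appears later in this thread, and the parameter value is illustrative):

    CREATE TABLE relation_tuple_transaction (
        id BIGSERIAL NOT NULL UNIQUE,
        timestamp TIMESTAMP WITHOUT TIME ZONE NOT NULL UNIQUE,
        CONSTRAINT pk_rttx PRIMARY KEY (id)
    );
    CREATE INDEX ix_relation_tuple_transaction_by_timestamp
        ON relation_tuple_transaction (timestamp);

    PREPARE q (timestamptz) AS
        SELECT MIN(id), MAX(id)
        FROM relation_tuple_transaction
        WHERE timestamp >= $1;

    -- The first 5 executions are planned with the actual parameter value
    -- (custom plans); from the 6th onwards the cached generic plan is used
    -- once its estimated cost looks no worse.
    EXPLAIN (ANALYZE) EXECUTE q(now() - interval '1 hour');  -- run 6+ times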

From the 6th execution onwards, the query plan used by Postgres changes. That
isn't entirely unexpected, as the documentation linked above makes it clear
that Postgres might decide to switch to a generic query plan
after the 5th execution. And indeed, the estimated "cost" of the generic
plan is lower than the custom plan's: therefore the query planner behaves
correctly according to the documentation.

Now, the problem: the execution of the generic plan is multiple orders of
magnitude slower than the custom query plan ("actual time" for the generic
plan is over 6500x slower), yet Postgres decides to stick with the generic
plan. Very unexpected for me: I was very happy with the first 5 plans, yet
Postgres switches to another plan that's enormously slower and sticks with
it.
Passing a different parameter to the prepared statement (eg `now() -
interval '5 days'`) does give a "slow" custom plan (similar to the generic
plan). This means that the query planner does not realise that the actual
parameter value matters a lot, and that the parameters used _in practice_
result in a faster plan than the generic plan (100% of the first 5
executions), and that therefore it shouldn't stick to the generic plan.

It is particularly insidious as actually I wasn't even aware I was using
prepared statements. Like most applications I use a database driver (pgx, in
Go) which I learnt uses `PQexecPrepared` under the hood, which creates a
sort of "unnamed prepared statement" behaving the same as this minimal
reproduction without me ever being aware that prepared statements are
involved anywhere between my code and the database. This makes debugging
very complex as there's no reason to suspect anything
prepared-statement-related and a manual EXPLAIN ANALYZE outside of a
prepared statement won't show the problem.

Note: setting `plan_cache_mode = force_custom_plan` database-wide solved the
immediate problem but is a workaround. It was a very welcome workaround,
though.
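
For reference, the workaround looks like this (the database name is a
placeholder):

    -- per session
    SET plan_cache_mode = force_custom_plan;
    -- or database-wide
    ALTER DATABASE mydb SET plan_cache_mode = force_custom_plan;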

#2 David G. Johnston
david.g.johnston@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Wed, Jul 6, 2022 at 2:41 PM PG Bug reporting form <noreply@postgresql.org>
wrote:

The following bug has been logged on the website:

Bug reference: 17540
Logged by: William Duclot
Email address: william.duclot@gmail.com
PostgreSQL version: 14.4
Operating system: GNU/Linux (Red Hat 8.5.0)
Description:

This means that the query planner does not realise that the actual
parameter value matters a lot, and that the parameters used _in practice_
result in a faster plan than the generic plan (100% of the first 5
executions), and that therefore it shouldn't stick to the generic plan.

I mean, it is the planner and so, no, it doesn't understand that the
executor encountered an issue.

It is particularly insidious as actually I wasn't even aware I was using
prepared statements. Like most applications I use a database driver (pgx,
in
Go) which I learnt uses `PQexecPrepared` under the hood, which creates a
sort of "unnamed prepared statement" behaving the same as this minimal
reproduction without me ever being aware that prepared statements are
involved anywhere between my code and the database.

Yep, and the core project pretty much says that if you don't like this you
need to complain to the driver writer and ask them to provide you an
interface to the unnamed parse-bind-execute API which lets you perform
parameterization without memory, just safety.

PostgreSQL has built the needed tools to make this less problematic, and
has made solid attempts to improve matters in the current state of things.
There doesn't seem to be a bug here. There is potentially room for
improvement but no one presently is working on things in this area.

David J.

#3 David Rowley
dgrowleyml@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Thu, 7 Jul 2022 at 09:41, PG Bug reporting form
<noreply@postgresql.org> wrote:

Scenario:
- I create a fairly simple table (id + timestamp). Timestamp is indexed.
- I create a simple-ish prepared statement for `SELECT MIN(id), MAX(id) from
relation_tuple_transaction WHERE timestamp >= $1;`
- I execute the prepared statement multiple times (> 5 times)

From the 6th execution onwards, the query plan used by Postgres changes. That
isn't entirely unexpected, as the documentation linked above makes it clear
that Postgres might decide to switch to a generic query plan
after the 5th execution. And indeed, the estimated "cost" of the generic
plan is lower than the custom plan's: therefore the query planner behaves
correctly according to the documentation.

It's a pretty narrow fix for a fairly generic problem, but I think the
planner wouldn't have picked the pk_rttx index if build_minmax_path()
hadn't added the "id IS NOT NULL" qual.

I know that Andy Fan has been proposing a patch [1] to add a Bitmapset
field to RelOptInfo to record the non-NULLable columns. That's a
fairly lightweight patch, so it might be worth adding that just so
build_minmax_path() can skip adding the NULL test if the column is a
NOT NULL column.

I see that preprocess_minmax_aggregates() won't touch anything that's
not a query on a single relation, so the Var can't be NULLable from
being on the outside of an outer join. So it looks like, to plumb in
Andy's patch, build_minmax_path() would need to be modified to check
whether mminfo->target is a plain Var and, if that Var is not
NULLable, skip adding the NullTest.

All seems fairly trivial. It's just a fairly narrow fix to side-step a
more generic costing problem we have for Params. I just don't have
any bright ideas on how to fix the more generic problem right now.

I've been looking for a good excuse to commit Andy's NOT NULL patch so
that he has some more foundations for the other work he's doing. This
might be that excuse.

Does anyone think differently?

David

[1]: /messages/by-id/CAKU4AWoZrFaWAkTn9tE2_dd4RYnUiQUiX8xc=ryUywhBWQv89w@mail.gmail.com

#4 David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#3)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower
#5 Andres Freund
andres@anarazel.de
In reply to: David G. Johnston (#2)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

Hi,

On 2022-07-06 15:07:46 -0700, David G. Johnston wrote:

On Wed, Jul 6, 2022 at 2:41 PM PG Bug reporting form <noreply@postgresql.org>
wrote:

It is particularly insidious as actually I wasn't even aware I was using
prepared statements. Like most applications I use a database driver (pgx,
in
Go) which I learnt uses `PQexecPrepared` under the hood, which creates a
sort of "unnamed prepared statement" behaving the same as this minimal
reproduction without me ever being aware that prepared statements are
involved anywhere between my code and the database.

Yep, and the core project pretty much says that if you don't like this you
need to complain to the driver writer and ask them to provide you an
interface to the unnamed parse-bind-execute API which lets you perform
parameterization without memory, just safety.

PostgreSQL has built the needed tools to make this less problematic, and
has made solid attempts to improve matters in the current state of things.
There doesn't seem to be a bug here. There is potentially room for
improvement but no one presently is working on things in this area.

I think the cost for the slow plan being so much cheaper can almost be
qualified as bug.

The slow plan seems pretty nonsensical to me. ISTM that something in the
costing there is at least almost broken.

 Result  (cost=1.06..1.07 rows=1 width=16) (actual time=148.732..148.734 rows=1 loops=1)
   Buffers: shared hit=4935
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.42..0.53 rows=1 width=8) (actual time=73.859..73.860 rows=0 loops=1)
           Buffers: shared hit=2113
           ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..9445.44 rows=86400 width=8) (actual time=73.857..73.858 rows=0 loops=1)
                 Index Cond: (id IS NOT NULL)
                 Filter: ("timestamp" >= $1)
                 Rows Removed by Filter: 259201
                 Buffers: shared hit=2113
   InitPlan 2 (returns $1)
     ->  Limit  (cost=0.42..0.53 rows=1 width=8) (actual time=74.869..74.870 rows=0 loops=1)
           Buffers: shared hit=2822
           ->  Index Scan Backward using pk_rttx on relation_tuple_transaction relation_tuple_transaction_1  (cost=0.42..9445.44 rows=86400 width=8) (actual time=74.868..74.868 rows=0 loops=1)
                 Index Cond: (id IS NOT NULL)
                 Filter: ("timestamp" >= $1)
                 Rows Removed by Filter: 259201
                 Buffers: shared hit=2822
 Planning Time: 0.224 ms
 Execution Time: 148.781 ms

The planner assumes the table has 259201 rows. Somehow we end up
assuming that an estimate-less filter reduces the number of rows to 86400,
on both a backward and a forward scan.

And for some reason we don't take the filter clause into account *at
all* for the cost of returning the first row.

SET enable_seqscan = false;
EXPLAIN SELECT * FROM relation_tuple_transaction WHERE id IS NOT NULL LIMIT 1;
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Limit (cost=0.42..0.45 rows=1 width=16) │
│ -> Index Scan using pk_rttx on relation_tuple_transaction (cost=0.42..8797.44 rows=259201 width=16) │
│ Index Cond: (id IS NOT NULL) │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(3 rows)

It's also pointless that we use "Index Cond: (id IS NOT NULL)" for a
primary key index, but that's a minor thing.

Greetings,

Andres Freund

#6 David Rowley
dgrowleyml@gmail.com
In reply to: Andres Freund (#5)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Thu, 7 Jul 2022 at 12:46, Andres Freund <andres@anarazel.de> wrote:

I think the cost for the slow plan being so much cheaper can almost be
qualified as bug.

The slow plan seems pretty nonsensical to me. ISTM that something in the
costing there is at least almost broken.

I forgot to mention what the "generic problem" is when I posted my
reply. I should have mentioned that this is how we cost LIMIT. We
assume that we'll find the LIMIT 1 row after incurring the scan cost
multiplied by (1 / 259201).

For the plan with WHERE timestamp >= $1, the seqscan plan looks pretty
cheap for fetching DEFAULT_INEQ_SEL of the 259201 rows, considering the
LIMIT multiplies the cost of the scan by (1 / 86400).
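
Concretely, with the generic plan's numbers upthread (index scan costed
0.42..9445.44 returning an estimated 86400 rows), fetching the first row
through the LIMIT is costed as roughly:

    0.42 + (9445.44 - 0.42) * (1 / 86400) ~= 0.53

which matches the 0.42..0.53 shown on the Limit nodes. The scan only looks
that cheap because the planner believes a qualifying row turns up almost
immediately.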

David

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#3)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

David Rowley <dgrowleyml@gmail.com> writes:

I've been looking for a good excuse to commit Andy's NOT NULL patch so
that he has some more foundations for the other work he's doing. This
might be that excuse.

Does anyone think differently?

While I don't have any problem with tracking column NOT NULL flags
in RelOptInfo once the planner has a use for that info, I'm not sure
that we have a solid use-case for it quite yet. In particular, the
fact that the table column is marked NOT NULL doesn't mean that any
particular occurrence of that column's Var can be freely assumed to be
non-null. The patch I'm working on to label Vars that have possibly
been nulled by outer joins [1] seems like essential infrastructure for
doing anything very useful with the info.

Maybe that objection doesn't apply to build_minmax_path's usage in
particular, but that's an awfully narrow use-case.

regards, tom lane

[1]: /messages/by-id/830269.1656693747@sss.pgh.pa.us

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#5)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

Andres Freund <andres@anarazel.de> writes:

I think the cost for the slow plan being so much cheaper can almost be
qualified as bug.
The slow plan seems pretty nonsensical to me. ISTM that something in the
costing there is at least almost broken.

I think this is probably an instance of the known problem that a generic
plan is made without knowledge of the actual parameter values, and that
can lead us to make statistical assumptions that are not valid for the
actual values, but nonetheless make one plan look cheaper than another
even though the opposite is true given the actual values. In essence,
comparing the cost estimate for the generic plan to the cost estimate
for a custom plan is not really logically valid, because those estimates
are founded on different statistics. I don't know how to fix that :-(.

regards, tom lane

#9 David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#7)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Thu, 7 Jul 2022 at 15:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

While I don't have any problem with tracking column NOT NULL flags
in RelOptInfo once the planner has a use for that info, I'm not sure
that we have a solid use-case for it quite yet. In particular, the
fact that the table column is marked NOT NULL doesn't mean that any
particular occurrence of that column's Var can be freely assumed to be
non-null. The patch I'm working on to label Vars that have possibly
been nulled by outer joins [1] seems like essential infrastructure for
doing anything very useful with the info.

I was aware that you'd done that work. I'm interested in it, but just
haven't found the time to look yet.

Maybe that objection doesn't apply to build_minmax_path's usage in
particular, but that's an awfully narrow use-case.

I thought I'd quickly put the idea together and soon noticed
that we do preprocess_minmax_aggregates() in grouping_planner(), which
is long before we load the RelOptInfo data in
add_base_rels_to_query(), which is called in query_planner(). I
considered whether we could move preprocess_minmax_aggregates(), but
that does not seem right, although, surprisingly, no tests seem to
fail from doing so. I'd have expected at least some EXPLAIN outputs to
have changed from the no-longer-present IS NOT NULL quals.

I imagine a much less narrow case would be to check for redundant
RestrictInfos in distribute_restrictinfo_to_rels(). That would also
catch cases such as WHERE non_nullable_col IS NULL, provided that qual
made it down to baserestrictinfo. When I realised that, I thought I
might be starting to overlap with your work in the link below.

[1] /messages/by-id/830269.1656693747@sss.pgh.pa.us

The two attached patches do fix the reported bad plan, it's just that
it's a very roundabout way of fixing it.

Anyway, I've no current plans to take the attached any further. I
think it'll be better to pursue your NULLable-Var stuff and see if we
can do something more generic like remove provably redundant NullTests
from baserestrictinfo.

David

Attachments:

v1-0001-Track-NOT-NULL-columns-in-RelOptInfo-for-base-rel.patch (text/plain, +14/-1)
v1-0002-Don-t-add-is-NOT-NULL-clause-to-MinMaxAggInfo-tra.patch (text/plain, +27/-22)
#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#9)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

David Rowley <dgrowleyml@gmail.com> writes:

Anyway, I've no current plans to take the attached any further. I
think it'll be better to pursue your NULLable-Var stuff and see if we
can do something more generic like remove provably redundant NullTests
from baserestrictinfo.

Yeah, I suspect that the way forward is to allow
preprocess_minmax_aggregates to do what it does now, and then
remove the IS NOT NULL clause again later when we have the
info available to let us do that in a generic way.

In any case, as you said, it's just a band-aid that happens to
help in this exact scenario. It's not doing much for the bad
cost estimation that's the root of the problem.

regards, tom lane

#11 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#8)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

Hi,

On 2022-07-06 23:13:18 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I think the cost for the slow plan being so much cheaper can almost be
qualified as bug.
The slow plan seems pretty nonsensical to me. ISTM that something in the
costing there is at least almost broken.

I think this is probably an instance of the known problem that a generic
plan is made without knowledge of the actual parameter values, and that
can lead us to make statistical assumptions that are not valid for the
actual values, but nonetheless make one plan look cheaper than another
even though the opposite is true given the actual values. In essence,
comparing the cost estimate for the generic plan to the cost estimate
for a custom plan is not really logically valid, because those estimates
are founded on different statistics. I don't know how to fix that :-(.

I think there's something more fundamentally wrong - somehow we end up
assuming > 50% selectivity on both the min and the max initplan, for the
same condition! And afaics (although it's a bit hard to see with the
precision EXPLAIN prints floating point values at) we don't charge
cpu_operator_cost / cpu_tuple_cost. And this is on a table where we can
know, despite not knowing the parameter value, that the column being
compared has a correlation of 1.

In this case the whole generic plan part seems like a red herring. The generic
plan is *awful* and would still be awful if the value were known, but
somewhere around the middle of the value range.

Here's the op's tables + query, but without the prepared statement part:

CREATE TABLE relation_tuple_transaction (
    id BIGSERIAL NOT NULL UNIQUE,
    timestamp TIMESTAMP WITHOUT TIME ZONE NOT NULL UNIQUE,
    CONSTRAINT pk_rttx PRIMARY KEY (id)
);
CREATE INDEX ix_relation_tuple_transaction_by_timestamp ON relation_tuple_transaction(timestamp);
INSERT INTO relation_tuple_transaction(timestamp)
SELECT * FROM generate_series( now() - interval '3 days'
                             , now()
                             , '1 second'::interval) dd;
VACUUM FREEZE ANALYZE;
EXPLAIN ANALYZE SELECT MIN(id), MAX(id) from relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days');

postgres[631148][1]=# EXPLAIN ANALYZE SELECT MIN(id), MAX(id) from relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days');

 Result  (cost=1.01..1.02 rows=1 width=16) (actual time=113.379..113.381 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.42..0.50 rows=1 width=8) (actual time=113.347..113.348 rows=1 loops=1)
           ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..10741.45 rows=127009 width=8) (actual time=113.345..113.345 rows=1 loops=1)
                 Index Cond: (id IS NOT NULL)
                 Filter: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
                 Rows Removed by Filter: 129746
   InitPlan 2 (returns $1)
     ->  Limit  (cost=0.42..0.50 rows=1 width=8) (actual time=0.024..0.024 rows=1 loops=1)
           ->  Index Scan Backward using pk_rttx on relation_tuple_transaction relation_tuple_transaction_1  (cost=0.42..10741.45 rows=127009 width=8) (actual time=0.023..0.023 rows=1 loops=1)
                 Index Cond: (id IS NOT NULL)
                 Filter: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
 Planning Time: 0.370 ms
 Execution Time: 113.441 ms
(14 rows)

We're pretty much by definition scanning half the table via the index scans,
and end up with a cost of 1.02 (yes, I'm aware that the paths are costed
separately).

FWIW, manually writing the min/max as ORDER BY timestamp ASC/DESC LIMIT 1
queries yields a *vastly* better plan:

EXPLAIN ANALYZE SELECT
    (SELECT id FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days') ORDER BY timestamp ASC LIMIT 1),
    (SELECT id FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days') ORDER BY timestamp DESC LIMIT 1);

 Result  (cost=0.92..0.93 rows=1 width=16) (actual time=0.110..0.111 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.42..0.46 rows=1 width=16) (actual time=0.079..0.079 rows=1 loops=1)
           ->  Index Scan using ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction  (cost=0.42..4405.46 rows=129602 width=16) (actual time=0.077..0.078 rows=1 loops=1)
                 Index Cond: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
   InitPlan 2 (returns $1)
     ->  Limit  (cost=0.42..0.46 rows=1 width=16) (actual time=0.028..0.028 rows=1 loops=1)
           ->  Index Scan Backward using ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction relation_tuple_transaction_1  (cost=0.42..4405.46 rows=129602 width=16) (actual time=0.027..0.027 rows=1 loops=1)
                 Index Cond: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
 Planning Time: 0.270 ms
 Execution Time: 0.159 ms
(11 rows)

And it stays sane even if you add a (redundantly evaluated) AND id IS NOT NULL.

EXPLAIN SELECT id FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days') AND id IS NOT NULL ORDER BY timestamp ASC LIMIT 1;
                                   QUERY PLAN
 Limit  (cost=0.42..0.46 rows=1 width=16)
   ->  Index Scan using ix_relation_tuple_transaction_by_timestamp on relation_tuple_transaction  (cost=0.42..4405.46 rows=129602 width=16)
         Index Cond: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
         Filter: (id IS NOT NULL)
(4 rows)

EXPLAIN SELECT min(id) FROM relation_tuple_transaction WHERE timestamp >= (now() - interval '1.5 days');
                                   QUERY PLAN
 Result  (cost=0.50..0.51 rows=1 width=8)
   InitPlan 1 (returns $0)
     ->  Limit  (cost=0.42..0.50 rows=1 width=8)
           ->  Index Scan using pk_rttx on relation_tuple_transaction  (cost=0.42..10741.45 rows=129602 width=8)
                 Index Cond: (id IS NOT NULL)
                 Filter: ("timestamp" >= (now() - '1 day 12:00:00'::interval))
(6 rows)

Greetings,

Andres Freund

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#11)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

Andres Freund <andres@anarazel.de> writes:

On 2022-07-06 23:13:18 -0400, Tom Lane wrote:

comparing the cost estimate for the generic plan to the cost estimate
for a custom plan is not really logically valid, because those estimates
are founded on different statistics. I don't know how to fix that :-(.

I think there's something more fundamentally wrong - somehow we end up
assuming > 50% selectivity on both the min and the max initplan, for the
same condition!

Well, sure, because it *is* the same condition. AFAICS this is operating
as designed. Do I wish it were better? Sure, but there is no simple fix
here.

The reasoning that's being applied in the generic plan is

(1) default selectivity estimate for a scalar inequality is
#define DEFAULT_INEQ_SEL 0.3333333333333333

(2) therefore, the filter condition on the indexscan will select a random
one-third of the table;

(3) therefore, the LIMIT will be able to stop after about three rows,
whichever direction we scan in.
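
Plugging in the numbers from the plan upthread: 259201 rows * (1/3) = 86400,
which is exactly the rows= estimate on the index scans, and under an assumed
uniform one-third selectivity the scan is expected to hit a qualifying row
after about 3 rows, hence the tiny Limit cost.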

The information that is lacking is that the "id" and "timestamp"
columns are heavily correlated, so that we may have to scan far more
than three rows in "id" order before finding a row satisfying the
inequality on "timestamp". This is a problem we've understood for
a long time --- I recall talking about it at PGCon a decade ago.
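
(The per-column physical-order correlation the planner does have is visible
in pg_stats; for this toy table both columns come back at or near 1, but
nothing records that the two columns are correlated with each other:

    SELECT attname, correlation
    FROM pg_stats
    WHERE tablename = 'relation_tuple_transaction';
)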

The extended stats machinery provides a framework wherein we could
calculate and save the ordering correlation between the two columns,
but I don't believe it actually calculates that number yet --- I think
the functional-dependency stuff is close but not the right thing.
Even if we had the stats, it's not very clear where to fit this
type of consideration into the planner's estimates.

In this case the whole generic plan part seems like a red herring. The generic
plan is *awful* and would still be awful if the value were known, but
somewhere around the middle of the value range.

If the value were somewhere around the middle (which is more or less
what we're assuming for the generic plan), then an indexscan on the
timestamp column isn't going to be that great either; you'd still
be scanning half the table.

FWIW, manually writing the min/max as ORDER BY timestamp ASC/DESC LIMIT 1
queries yields a *vastly* better plan:

Those queries give the wrong answers. We're looking for the min or max
id, not the id associated with the min or max timestamp. (They're
accidentally the same with this toy dataset.)
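
For completeness, a manual rewrite that does preserve the MIN/MAX semantics
orders by id instead (a sketch; note it will typically produce the same
pk_rttx scan shape as the generic plan above, so it runs into exactly the
costing problem under discussion):

    SELECT
      (SELECT id FROM relation_tuple_transaction
        WHERE timestamp >= (now() - interval '1.5 days')
        ORDER BY id ASC LIMIT 1) AS min_id,
      (SELECT id FROM relation_tuple_transaction
        WHERE timestamp >= (now() - interval '1.5 days')
        ORDER BY id DESC LIMIT 1) AS max_id;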

regards, tom lane

#13 David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#10)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Thu, 7 Jul 2022 at 15:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

Anyway, I've no current plans to take the attached any further. I
think it'll be better to pursue your NULLable-Var stuff and see if we
can do something more generic like remove provably redundant NullTests
from baserestrictinfo.

Yeah, I suspect that the way forward is to allow
preprocess_minmax_aggregates to do what it does now, and then
remove the IS NOT NULL clause again later when we have the
info available to let us do that in a generic way.

I started looking at a more generic way to fix this. In the attached
I'm catching quals being added to baserestrictinfo in
distribute_restrictinfo_to_rels() and checking for IS NOT NULL quals
on columns defined with NOT NULL.

I did this by adding a new function add_baserestrictinfo_to_rel()
which can be the place where we add any future logic to ignore other
always-true quals. Perhaps in the future, we can add some logic there
to look for quals on partitions which are always true based on the
partition constraint.
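
As a tiny illustration of what that catches (hypothetical table; with a
patch along these lines the always-true NullTest would be dropped instead of
being evaluated per row):

    CREATE TABLE t (a INT NOT NULL, b INT);
    -- unpatched: Seq Scan on t, Filter: (a IS NOT NULL)
    -- patched:   Seq Scan on t, no filter at all
    EXPLAIN SELECT * FROM t WHERE a IS NOT NULL;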

I also took the opportunity here to slightly modernise the Bitmapset
code in this area. We previously called bms_membership() and then
bms_singleton_member(), which is not quite optimal; we invented
bms_get_singleton_member() as a more efficient way of getting that.
The empty set case can just be handled more easily now, since you
changed empty sets to always be NULL. If it's not an empty set and not
a singleton, then it must contain multiple members.

I'm quite keen to see some forward progress on improving things for
this bug report. It would be good to take some more measures to stop
the planner being tricked into making silly mistakes. This is one
example of somewhere we could do better.

David

Attachments:

ignore_is_not_null_base_quals_on_not_null_columns.patch (text/plain, +83/-46)
#14 Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#13)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Thu, Jul 6, 2023 at 7:55 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Thu, 7 Jul 2022 at 15:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

Anyway, I've no current plans to take the attached any further. I
think it'll be better to pursue your NULLable-Var stuff and see if we
can do something more generic like remove provably redundant NullTests
from baserestrictinfo.

Yeah, I suspect that the way forward is to allow
preprocess_minmax_aggregates to do what it does now, and then
remove the IS NOT NULL clause again later when we have the
info available to let us do that in a generic way.

I started looking at a more generic way to fix this. In the attached
I'm catching quals being added to baserestrictinfo in
distribute_restrictinfo_to_rels() and checking for IS NOT NULL quals
on columns defined with NOT NULL.

I did this by adding a new function add_baserestrictinfo_to_rel()
which can be the place where we add any future logic to ignore other
always-true quals. Perhaps in the future, we can add some logic there
to look for quals on partitions which are always true based on the
partition constraint.

I think this is a good start. Maybe we can extend it with little effort
to cover OR clauses. For an OR clause, we can test its sub-clauses, and
if one of them is an IS NOT NULL qual on a NOT NULL column then we know
that the whole OR clause is always true.
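
For example (a sketch, assuming a table t with a column a declared NOT NULL,
as in the sketch upthread):

    -- a IS NOT NULL is provably true, so the whole OR clause is always
    -- true and could be ignored entirely:
    SELECT * FROM t WHERE a IS NOT NULL OR b = 1;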

Maybe we can also test if the qual is always true according to the
applicable constraint expressions of the given relation, something that
is like the opposite of relation_excluded_by_constraints(). Of course
that would require much more effort.

Another thing I'm wondering is that since we already have the
outer-join-aware-Var infrastructure, maybe we can also test whether an IS
NOT NULL qual in join clauses is always true. I imagine we need to test
whether the Var in the IS NOT NULL qual has an empty varnullingrels, in
addition to the Var being a NOT NULL column.

BTW, with this patch the variable ‘rel’ in function
distribute_restrictinfo_to_rels is unused.

Thanks
Richard

#15 Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#14)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Thu, Jul 6, 2023 at 5:26 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Thu, Jul 6, 2023 at 7:55 AM David Rowley <dgrowleyml@gmail.com> wrote:

I started looking at a more generic way to fix this. In the attached
I'm catching quals being added to baserestrictinfo in
distribute_restrictinfo_to_rels() and checking for IS NOT NULL quals
on columns defined with NOT NULL.

I did this by adding a new function add_baserestrictinfo_to_rel()
which can be the place where we add any future logic to ignore other
always-true quals. Perhaps in the future, we can add some logic there
to look for quals on partitions which are always true based on the
partition constraint.

I think this is a good start. Maybe we can extend it with little effort
to cover OR clauses. For an OR clause, we can test its sub-clauses, and
if one of them is an IS NOT NULL qual on a NOT NULL column then we know
that the whole OR clause is always true.

Maybe we can also test if the qual is always true according to the
applicable constraint expressions of the given relation, something that
is like the opposite of relation_excluded_by_constraints(). Of course
that would require much more effort.

Another thing I'm wondering is that since we already have the
outer-join-aware-Var infrastructure, maybe we can also test whether an IS
NOT NULL qual in join clauses is always true. I imagine we need to test
whether the Var in the IS NOT NULL qual has an empty varnullingrels, in
addition to the Var being a NOT NULL column.

BTW, with this patch the variable ‘rel’ in function
distribute_restrictinfo_to_rels is unused.

Attached is what I have in mind. The patch extends the logic in two
ways.

* it also checks OR clauses to see if they are always true.

* it also checks join clauses, by additionally testing whether the nulling
bitmap is empty.

I did not try the logic about testing a qual against the relation's
constraints though.

Thanks
Richard

Attachments:

v2-0001-ignore-is-not-null-quals-on-not-null-columns.patch (+150/-48)
#16 David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#15)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Fri, 7 Jul 2023 at 19:03, Richard Guo <guofenglinux@gmail.com> wrote:

Attached is what I have in mind. The patch extends the logic in two
ways.

* it also checks OR clauses to see if they are always true.

* it also checks join clauses, by additionally testing whether the nulling
bitmap is empty.

Do you mind writing some regression tests for this?

I don't really see an existing test file that would suit; maybe it's
worth adding something like predicate.sql.

David

#17 Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#16)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Mon, Jul 10, 2023 at 10:14 AM David Rowley <dgrowleyml@gmail.com> wrote:

On Fri, 7 Jul 2023 at 19:03, Richard Guo <guofenglinux@gmail.com> wrote:

Attached is what I have in mind. The patch extends the logic in two
ways.

* it also checks OR clauses to see if they are always true.

* it also checks join clauses, by additionally testing whether the nulling
bitmap is empty.

Do you mind writing some regression tests for this?

I don't really see an existing test file that would suit; maybe it's
worth adding something like predicate.sql.

Here is the v3 patch with regression tests. I added the new test to the
group the stats test is in, but I'm not sure if this is the right
place.

Thanks
Richard

Attachments:

v3-0001-ignore-is-not-null-quals-on-not-null-columns.patch (+308/-49)
#18 Richard Guo
guofenglinux@gmail.com
In reply to: Richard Guo (#17)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Mon, Jul 10, 2023 at 2:39 PM Richard Guo <guofenglinux@gmail.com> wrote:

Here is the v3 patch with regression tests. I added the new test to the
group the stats test is in, but I'm not sure if this is the right
place.

cfbot says there is a test failure in postgres_fdw, so I've updated to v4
to fix that.

Thanks
Richard

Attachments:

v4-0001-ignore-is-not-null-quals-on-not-null-columns.patch (+313/-53)
#19 David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#17)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Mon, 10 Jul 2023 at 18:39, Richard Guo <guofenglinux@gmail.com> wrote:

Here is the v3 patch with regression tests. I added the new test to the
group the stats test is in, but I'm not sure if this is the right
place.

Thanks for taking an interest in this.

I spent more time looking at the idea, and I wondered why we should
just have it skip distributing IS NOT NULL quals to the relations.
Should we also allow IS NULL quals on non-nullable Vars to be
detected as false?

I did some work on your v3 patch to see if that could be made to work.
I ended up just trying to make a new RestrictInfo with a "false"
clause, but quickly realised that it's not safe to go making new
RestrictInfos during deconstruct_distribute_oj_quals(). A comment
there mentions:

/*
* Each time we produce RestrictInfo(s) from these quals, reset the
* last_rinfo_serial counter, so that the RestrictInfos for the "same"
* qual condition get identical serial numbers. (This relies on the
* fact that we're not changing the qual list in any way that'd affect
* the number of RestrictInfos built from it.) This'll allow us to
* detect duplicative qual usage later.
*/

I ended up moving the function that looks for the NullTest quals in
the joinlist out so it's done after the quals have been distributed to
the relations. I'm not really that happy with this, as if we ever
found some way to optimise quals that could be made part of an
EquivalenceClass then those quals would already have been
processed to become EquivalenceClasses. I just don't see how to do it
earlier, as deconstruct_distribute_oj_quals() calls
remove_nulling_relids(), which changes the Var's varnullingrels, causing
them to be empty during the processing of the NullTest qual.

It's also not so great that the RestrictInfo gets duplicated in:

CREATE TABLE t1 (a INT NOT NULL, b INT);
CREATE TABLE t2 (c INT NOT NULL, d INT);
CREATE TABLE t3 (e INT NOT NULL, f INT);

postgres=# EXPLAIN (costs off) SELECT * FROM t1 JOIN t2 ON t1.a = 1
LEFT JOIN t3 ON t2.c IS NULL AND t2.d = 1;
                       QUERY PLAN
-------------------------------------------------------
 Nested Loop
   ->  Nested Loop Left Join
         Join Filter: (false AND false AND (t2.d = 1))
         ->  Seq Scan on t2
         ->  Result
               One-Time Filter: false
   ->  Materialize
         ->  Seq Scan on t1
               Filter: (a = 1)
(9 rows)

Adjusting the code to build a new false clause and setting that in the
existing RestrictInfo rather than building a new RestrictInfo seems to
fix that. I wondered if the duplication was a result of the
rinfo_serial number changing.

Checking back to the original MinMaxAgg I'm not sure if this is all
getting more complex than it's worth or not.

I've attached what I've ended up with so far.

David

Attachments:

v4_transform_NullTest_quals.patch (+510/-60)
#20 Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#19)
Re: BUG #17540: Prepared statement: PG switches to a generic query plan which is consistently much slower

On Tue, Sep 26, 2023 at 9:42 PM David Rowley <dgrowleyml@gmail.com> wrote:

I ended up moving the function that looks for the NullTest quals in
the joinlist out so it's done after the quals have been distributed to
the relations.

It seems that the RestrictInfos for the "same" qual condition still may
get different serial numbers even if transform_join_clauses() is called
after we've distributed all the quals. For example,

select * from t1
left join t2 on t1.a = t2.c
left join t3 on t2.c = t3.e and t2.c is null;

There are two versions of the qual 't2.c is null': with and without being
marked nullable by the t1/t2 join. Let's write them as 'c* is null' and 'c
is null'. They are supposed to have identical serial numbers. But after
we've transformed 'c is null' to 'false', they no longer have identical
serial numbers. Might this cause problems wherever the serial-number
logic is relied on?

I'm not really that happy with this, as if we ever
found some way to optimise quals that could be made part of an
EquivalenceClass then those quals would already have been
processed to become EquivalenceClasses. I just don't see how to do it
earlier, as deconstruct_distribute_oj_quals() calls
remove_nulling_relids(), which changes the Var's varnullingrels, causing
them to be empty during the processing of the NullTest qual.

Hmm, I don't think it's a problem that deconstruct_distribute_oj_quals
changes the nullingrels. It will, in the end, compute the correct
nullingrels for the different clones of the same qual condition. We can
just check whatever nullingrels it computes.

It's also not so great that the RestrictInfo gets duplicated in:

Adjusting the code to build a new false clause and setting that in the
existing RestrictInfo rather than building a new RestrictInfo seems to
fix that. I wondered if the duplication was a result of the
rinfo_serial number changing.

The RestrictInfo nodes in different joinlists are multiply-linked rather
than copied, so when building restrictlist for a joinrel we use pointer
equality to remove any duplication. In your patch new RestrictInfo
nodes are created in transform_join_clauses(), so pointer equality no
longer works and we see duplication in the plan.

Checking back to the original MinMaxAgg I'm not sure if this is all
getting more complex than it's worth or not.

It seems that optimizing IS NULL quals is more complex than optimizing
IS NOT NULL quals. I also wonder if it's worth the trouble to optimize
IS NULL quals.

BTW, there is an Assert failure running regression tests with your
patch. I haven't looked into it.

Thanks
Richard

#21 David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#20)

#22 Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#21)

#23 Andrei Lepikhov
lepihov@gmail.com
In reply to: Richard Guo (#22)

#24 Richard Guo
guofenglinux@gmail.com
In reply to: Andrei Lepikhov (#23)

#25 David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#24)

#26 Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#25)

#27 David Rowley
dgrowleyml@gmail.com
In reply to: Richard Guo (#26)

#28 Andy Fan
zhihui.fan1213@gmail.com
In reply to: David Rowley (#27)

#29 Richard Guo
guofenglinux@gmail.com
In reply to: Andy Fan (#28)

#30 Andy Fan
zhihui.fan1213@gmail.com
In reply to: Richard Guo (#29)

#31 Peter Smith
smithpb2250@gmail.com
In reply to: David Rowley (#27)

#32 David Rowley
dgrowleyml@gmail.com
In reply to: Andy Fan (#28)

#33 David Rowley
dgrowleyml@gmail.com
In reply to: Peter Smith (#31)

#34 David Rowley
dgrowleyml@gmail.com
In reply to: David Rowley (#33)

#35 Richard Guo
guofenglinux@gmail.com
In reply to: David Rowley (#34)