analyze causes query planner to choose suboptimal plan for a select query in separate transaction

Started by Martin Kováčik · 12 messages · general
#1Martin Kováčik
kovacik@redbyte.eu

Hi group,

I have a Jenkins build server which runs integration tests of a Java
application against PostgreSQL 11.2. The tests run serially and are
reproducible because each test runs in a separate DB transaction which is
rolled back at the end of the test. Each full build (with 1400+ tests)
takes approximately 1h:45m.

My problem is that sometimes the build gets stuck (for several hours, often
10+) in a random test, waiting for a random select query, with a postgres
process consuming 100% CPU (1 core).
Well, at least I thought it was random; then I figured out it was triggered
by the autovacuum daemon. The problem is not the vacuum process itself, but
the analyze step that runs together with autovacuum.

To illustrate my situation let's consider my tests look like this:

BEGIN;

-- A: insert data for the test

-- B: i'll refer to this point later

-- C: select(s)

ROLLBACK;

Everything is fine until autovacuum (analyze) runs while the test is at
point B. After that, the query planner at point C chooses the wrong plan and
the query takes a long time to complete, blocking one CPU core for a long
time. It seems the planner statistics inside the running transaction are
affected by an analyze task running outside of the transaction: running
analyze (outside the transaction) while the transaction is at B causes the
query planner to think there are no rows (because the inserts at point A
have not yet been committed).

I did prepare a simple test case to reproduce this behavior:

First you need to create a table:

create table a (id bigint primary key);

Then run this transaction:

begin;
insert into a
select * from generate_series(1, 1000);

-- during sleep execute analyze on this db in separate connection
select pg_sleep(10);

explain analyze select count(*) from (
select distinct a1, a2, a3, a4 from a a1
left join a a2 on a1.id > 900
left join a a3 on a2.id = a3.id
left join a a4 on a3.id = a4.id
left join a a5 on a5.id = a4.id
left join a a6 on a6.id = a5.id
left join a a7 on a7.id = a6.id
left join a a8 on a8.id = a7.id) temp;

rollback;

The query plan for the select is as follows and the query takes 865ms to
finish.

                                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1987304286213.68..1987304286213.69 rows=1 width=8)
(actual time=862.406..862.407 rows=1 loops=1)
-> Unique (cost=1942976891422.08..1987284286213.68 rows=1600000000
width=128) (actual time=783.281..858.261 rows=100900 loops=1)
-> Sort (cost=1942976891422.08..1951838370380.40
rows=3544591583328 width=128) (actual time=783.280..825.055 rows=100900
loops=1)
Sort Key: a1.*, a2.*, a3.*, a4.*
Sort Method: external merge Disk: 12368kB
-> Merge Right Join (cost=5497387.28..53175003562.02
rows=3544591583328 width=128) (actual time=171.612..196.342 rows=100900
loops=1)
Merge Cond: (a4.id = a3.id)
-> Merge Left Join (cost=449798.05..1010691.54
rows=36848656 width=40) (actual time=8.257..8.815 rows=1000 loops=1)
Merge Cond: (a4.id = a5.id)
-> Sort (cost=158.51..164.16 rows=2260
width=40) (actual time=1.174..1.241 rows=1000 loops=1)
Sort Key: a4.id
Sort Method: quicksort Memory: 103kB
-> Seq Scan on a a4 (cost=0.00..32.60
rows=2260 width=40) (actual time=0.040..0.639 rows=1000 loops=1)
-> Materialize (cost=449639.54..465944.26
rows=3260943 width=8) (actual time=7.077..7.254 rows=1000 loops=1)
-> Sort (cost=449639.54..457791.90
rows=3260943 width=8) (actual time=7.066..7.148 rows=1000 loops=1)
Sort Key: a5.id
Sort Method: quicksort Memory: 71kB
-> Merge Right Join
(cost=2897.82..52270.24 rows=3260943 width=8) (actual time=4.352..6.630
rows=1000 loops=1)
Merge Cond: (a7.id = a6.id)
-> Merge Left Join
(cost=317.01..711.38 rows=25538 width=8) (actual time=1.236..2.482
rows=1000 loops=1)
Merge Cond: (a7.id =
a8.id)
-> Sort
(cost=158.51..164.16 rows=2260 width=8) (actual time=0.618..0.792 rows=1000
loops=1)
Sort Key: a7.id
Sort Method:
quicksort Memory: 71kB
-> Seq Scan on a
a7 (cost=0.00..32.60 rows=2260 width=8) (actual time=0.017..0.265
rows=1000 loops=1)
-> Sort
(cost=158.51..164.16 rows=2260 width=8) (actual time=0.605..0.773 rows=1000
loops=1)
Sort Key: a8.id
Sort Method:
quicksort Memory: 71kB
-> Seq Scan on a
a8 (cost=0.00..32.60 rows=2260 width=8) (actual time=0.024..0.261
rows=1000 loops=1)
-> Sort
(cost=2580.81..2644.65 rows=25538 width=16) (actual time=3.112..3.278
rows=1000 loops=1)
Sort Key: a6.id
Sort Method: quicksort
Memory: 71kB
-> Merge Left Join
(cost=317.01..711.38 rows=25538 width=16) (actual time=1.386..2.653
rows=1000 loops=1)
Merge Cond: (a5.id
= a6.id)
-> Sort
(cost=158.51..164.16 rows=2260 width=8) (actual time=0.697..0.870 rows=1000
loops=1)
Sort Key:
a5.id
Sort Method:
quicksort Memory: 71kB
-> Seq Scan
on a a5 (cost=0.00..32.60 rows=2260 width=8) (actual time=0.026..0.303
rows=1000 loops=1)
-> Sort
(cost=158.51..164.16 rows=2260 width=8) (actual time=0.681..0.851 rows=1000
loops=1)
Sort Key:
a6.id
Sort Method:
quicksort Memory: 71kB
-> Seq Scan
on a a6 (cost=0.00..32.60 rows=2260 width=8) (actual time=0.018..0.286
rows=1000 loops=1)
-> Materialize (cost=5047589.24..5143782.35
rows=19238623 width=104) (actual time=163.351..175.058 rows=100900 loops=1)
-> Sort (cost=5047589.24..5095685.80
rows=19238623 width=104) (actual time=163.343..167.508 rows=100900 loops=1)
Sort Key: a3.id
Sort Method: quicksort Memory: 16766kB
-> Merge Right Join
(cost=322881.15..615728.13 rows=19238623 width=104) (actual
time=119.338..144.270 rows=100900 loops=1)
Merge Cond: (a3.id = a2.id)
-> Sort (cost=158.51..164.16
rows=2260 width=40) (actual time=1.153..1.230 rows=1000 loops=1)
Sort Key: a3.id
Sort Method: quicksort
Memory: 103kB
-> Seq Scan on a a3
(cost=0.00..32.60 rows=2260 width=40) (actual time=0.039..0.629 rows=1000
loops=1)
-> Materialize
(cost=322722.64..331235.31 rows=1702533 width=72) (actual
time=118.179..131.501 rows=100900 loops=1)
-> Sort
(cost=322722.64..326978.97 rows=1702533 width=72) (actual
time=118.173..122.691 rows=100900 loops=1)
Sort Key: a2.id
Sort Method: quicksort
Memory: 16780kB
-> Nested Loop Left
Join (cost=0.00..76684.85 rows=1702533 width=72) (actual
time=1.331..100.252 rows=100900 loops=1)
                                             Join Filter: (a1.id > 900)
                                             Rows Removed by Join Filter: 900000
-> Seq Scan on a
a1 (cost=0.00..32.60 rows=2260 width=40) (actual time=0.019..0.292
rows=1000 loops=1)
-> Materialize
(cost=0.00..43.90 rows=2260 width=40) (actual time=0.000..0.047 rows=1000
loops=1000)
-> Seq Scan
on a a2 (cost=0.00..32.60 rows=2260 width=40) (actual time=0.016..0.606
rows=1000 loops=1)
Planning Time: 2.234 ms
Execution Time: 865.635 ms
(64 rows)

Now run the transaction again, but this time run analyze in a separate DB
connection while the transaction sleeps at select pg_sleep(10).
The query plan is as follows and the select query takes 81s to complete. As
you can see, the planner greatly underestimated the number of rows (1 vs
100900) and chose a nested loop plan which took a long time to finish.

                                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=40.12..40.13 rows=1 width=8) (actual
time=81482.190..81482.190 rows=1 loops=1)
-> Unique (cost=40.10..40.11 rows=1 width=128) (actual
time=81418.552..81477.211 rows=100900 loops=1)
-> Sort (cost=40.10..40.10 rows=1 width=128) (actual
time=81418.551..81440.453 rows=100900 loops=1)
Sort Key: a1.*, a2.*, a3.*, a4.*
Sort Method: external merge Disk: 12376kB
-> Nested Loop Left Join (cost=0.00..40.09 rows=1
width=128) (actual time=1.424..81324.707 rows=100900 loops=1)
Join Filter: (a8.id = a7.id)
Rows Removed by Join Filter: 100800000
-> Nested Loop Left Join (cost=0.00..35.08 rows=1
width=136) (actual time=1.271..70436.483 rows=100900 loops=1)
Join Filter: (a7.id = a6.id)
Rows Removed by Join Filter: 100800000
-> Nested Loop Left Join (cost=0.00..30.06
rows=1 width=136) (actual time=1.117..59388.674 rows=100900 loops=1)
Join Filter: (a6.id = a5.id)
Rows Removed by Join Filter: 100800000
-> Nested Loop Left Join
(cost=0.00..25.05 rows=1 width=136) (actual time=0.962..48361.409
rows=100900 loops=1)
Join Filter: (a5.id = a4.id)
Rows Removed by Join Filter:
100800000
-> Nested Loop Left Join
(cost=0.00..20.04 rows=1 width=136) (actual time=0.807..37348.411
rows=100900 loops=1)
Join Filter: (a3.id = a4.id)
Rows Removed by Join Filter:
100800000
-> Nested Loop Left Join
(cost=0.00..15.02 rows=1 width=104) (actual time=0.544..18687.670
rows=100900 loops=1)
Join Filter: (a2.id =
a3.id)
Rows Removed by Join
Filter: 100800000
-> Nested Loop Left
Join (cost=0.00..10.01 rows=1 width=72) (actual time=0.283..190.095
rows=100900 loops=1)
                                             Join Filter: (a1.id > 900)
                                             Rows Removed by Join Filter: 900000
-> Seq Scan on a
a1 (cost=0.00..5.00 rows=1 width=40) (actual time=0.013..0.352 rows=1000
loops=1)
-> Seq Scan on a
a2 (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.138 rows=1000
loops=1000)
-> Seq Scan on a a3
(cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.140 rows=1000
loops=100900)
-> Seq Scan on a a4
(cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.141 rows=1000
loops=100900)
-> Seq Scan on a a5
(cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000
loops=100900)
-> Seq Scan on a a6 (cost=0.00..5.00
rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
-> Seq Scan on a a7 (cost=0.00..5.00 rows=1
width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
-> Seq Scan on a a8 (cost=0.00..5.00 rows=1 width=8)
(actual time=0.003..0.057 rows=1000 loops=100900)
Planning Time: 0.522 ms
Execution Time: 81483.797 ms
(36 rows)

What do you think: is this intended behavior or a potential bug?

Regards, Martin
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu, www.redbyte.eu

#2Michael Lewis
mlewis@entrata.com
In reply to: Martin Kováčik (#1)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

I assume it is in the documentation, but I am not aware of how stats are
handled for uncommitted work. Obviously, in the example you provided the
table would be empty, but in your real tests do they start out empty? Would
it suffice to use temp tables created like the regular ones and analyzed
after insert to ensure stats are up to date? Or would it make sense to turn
off autovacuum/analyze during these tests? I would think the workload is not
expected to be similar to the real world, so perhaps you don't need the
vacuum and the stats updates.

#3Martin Kováčik
kovacik@redbyte.eu
In reply to: Michael Lewis (#2)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

Turning off autovacuum for the tests is a valid option and I will
definitely do this as a workaround. Each test pretty much starts with an
empty schema; data for it is generated during the run and rolled back at the
end. I have a lot of tests and at the moment it is not feasible to modify
them.

The real workload for the application is different, but there are some
cases when we import data from a remote web service in a transaction, do
some work with it, and then commit. If there is an autovacuum during this
process I assume there will be a similar problem with planner statistics.

The real problem here is that the statistics seen from one (uncommitted)
transaction are affected by an analyze statement from another (committed)
transaction. I've seen similar behavior in production for other
applications using PostgreSQL too (10.x and 11.x): 100% CPU consumed by an
otherwise fine select query. I was puzzled by this behavior back then, but
now it makes sense.
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu, www.redbyte.eu


#4Michael Lewis
mlewis@entrata.com
In reply to: Martin Kováčik (#3)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

On Thu, Apr 25, 2019, 11:34 AM Martin Kováčik <kovacik@redbyte.eu> wrote:

Turning off autovacuum for the tests is a valid option and I will
definitely do this as a workaround. Each test pretty much starts with empty
schema and data for it is generated during the run and rolled back at the
end. I have a lot of tests and at the moment it is not feasible to modify
them.

The real workload for the application is different, but there are some
cases, when we import data from remote web service in a transaction do some
work with it and then we do a commit. If there is an autovacuum during this
process I assume there will be similar problem regarding planner statistics.

Unless you are importing a huge amount of data relative to what is already
there, it seems likely to be significantly less impactful than adding data
to a completely empty table. The stats on a table with 0 rows and then 5000
rows are going to be night and day, while the difference between stats on
100,000 rows and 105,000 is not as impactful. Musing here; I expect others
will chime in.

Stats are not versioned with MVCC, so it would be expected that a commit in
another transaction that updates stats would influence the query plan of
another transaction that is active.

#5Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Martin Kováčik (#1)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

On 4/25/19 7:37 AM, Martin Kováčik wrote:

Hi group,

See comments inline below

To illustrate my situation let's consider my tests look like this:

BEGIN;

-- A: insert data for the test

-- B: i'll refer to this point later

-- C: select(s)

ROLLBACK;

Everything is fine, until autovacuum (analyze) runs when the test is at
point B. After that the query planner at point C chooses wrong plan and
the query takes a long time to complete, blocking one CPU core for a
long time. It seems like the planner statistics inside running
transaction are affected by analyze task running outside of the
transaction. In this case after running analyze (outside the
transaction) when the transaction is at B, causes query planner to think
there are no rows (because the inserts at point A were not yet committed).

I did prepare a simple test case to reproduce this behavior:

First you need to create a table:

create table a (id bigint primary key);

Then run this transaction:

begin;
insert into a
select * from generate_series(1, 1000);

-- during sleep execute analyze on this db in separate connection
select pg_sleep(10);

analyze a;

On my machine that changes the time from 29715.763 ms to 291.765 ms when
running ANALYZE in the concurrent connection during the sleep.

explain analyze select count(*) from (
select distinct a1, a2, a3, a4 from a a1
left join a a2 on a1.id > 900
left join a a3 on a2.id = a3.id
left join a a4 on a3.id = a4.id
left join a a5 on a5.id = a4.id
left join a a6 on a6.id = a5.id
left join a a7 on a7.id = a6.id
left join a a8 on a8.id = a7.id) temp;

rollback;
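[Editor's note: for reference, the whole repro with the in-transaction ANALYZE spliced in would look like this. A sketch; the sleep is dropped, since the point is that fresh in-transaction stats make the concurrent ANALYZE harmless.]

```sql
begin;

insert into a
select * from generate_series(1, 1000);

-- Collect statistics inside the transaction, right after the insert,
-- so the planner sees row estimates that match the uncommitted data.
analyze a;

explain analyze select count(*) from (
  select distinct a1, a2, a3, a4 from a a1
  left join a a2 on a1.id > 900
  left join a a3 on a2.id = a3.id
  left join a a4 on a3.id = a4.id
  left join a a5 on a5.id = a4.id
  left join a a6 on a6.id = a5.id
  left join a a7 on a7.id = a6.id
  left join a a8 on a8.id = a7.id) temp;

rollback;
```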

--
Adrian Klaver
adrian.klaver@aklaver.com

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martin Kováčik (#1)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

Martin Kováčik <kovacik@redbyte.eu> writes:

To illustrate my situation let's consider my tests look like this:

BEGIN;

-- A: insert data for the test

-- B: i'll refer to this point later

-- C: select(s)

ROLLBACK;

Everything is fine, until autovacuum (analyze) runs when the test is at
point B. After that the query planner at point C chooses wrong plan and the
query takes a long time to complete, blocking one CPU core for a long time.
It seems like the planner statistics inside running transaction are
affected by analyze task running outside of the transaction.

Yup, they are. However, you're already at risk of a pretty bad plan for
a case like this, since (by assumption) the stats before you did the
insert at step A are radically different from what they should be after
the insert.

The standard recommendation, when you need the results of a data change
to be understood by the planner immediately, is to do your own ANALYZE:

BEGIN;

-- A: insert data for the test

ANALYZE test_table;

-- C: select(s)

ROLLBACK;

This should protect step C against seeing any irrelevant stats, because

(a) once your transaction has done an ANALYZE, autovacuum shouldn't
think the stats are out of date, and

(b) even if it does, your transaction is now holding
ShareUpdateExclusiveLock on the table so auto-ANALYZE can't get
that lock to do a fresh ANALYZE.
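[Editor's note: point (b) can be observed from another session while the test transaction is open. A sketch; `test_table` is the placeholder name used above.]

```sql
-- Session 1:
BEGIN;
-- ... inserts ...
ANALYZE test_table;
-- (transaction left open)

-- Session 2: inspect the locks currently held on the table.
SELECT l.mode, l.granted
FROM pg_locks l
JOIN pg_class c ON c.oid = l.relation
WHERE c.relname = 'test_table';
-- A granted ShareUpdateExclusiveLock row should be listed: the very
-- lock auto-ANALYZE would need to process the table.
```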

Another thing you could do is not even allow the test table to exist
outside your transaction:

BEGIN;

CREATE TABLE test_table (...);

-- A: insert data for the test

ANALYZE test_table; -- this is now somewhat optional

-- C: select(s)

ROLLBACK;

Then there's nothing for auto-ANALYZE to get its hands on. If you're
satisfied with the planner's default behavior in the absence of any
stats, you could omit the post-insertion ANALYZE in this case. But
I'm not sure that that would represent a test that has much to do with
production situations.

regards, tom lane

#7Martin Kováčik
kovacik@redbyte.eu
In reply to: Michael Lewis (#4)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

As my example shows, you don't have to import a lot of rows: 1000 is enough
to make a difference, and it all depends on the query. When a cartesian
product is involved, only a few records are enough.
I think the stats should be MVCC-versioned, otherwise the planner is using
wrong statistics and chooses wrong plans.
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu, www.redbyte.eu


#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Martin Kováčik (#7)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

On 4/25/19 12:47 PM, Martin Kováčik wrote:

As my example shows you don't have to import a lot of rows - 1000 is
enough to make a difference - it all depends on the query. When a
cartesian product is involved only a few records is enough.
I think that stats should be MVCC versioned otherwise the planner is
using wrong statistics and chooses wrong plans.

Then you are looking at moving the choke point to looking up the correct
stats across possibly hundreds/thousands of transactions in flight.


--
Adrian Klaver
adrian.klaver@aklaver.com

#9Martin Kováčik
kovacik@redbyte.eu
In reply to: Tom Lane (#6)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

Thanks Tom,

that is a great explanation. The default plan, which overestimates the
number of rows, is much better than the plan (based on wrong stats) that
underestimates it. This is true for my particular case but may be bad for
others.

The schema is created before the tests begin, so the suggestion to create
the required tables inside a transaction isn't feasible in my case. For the
build server I'm going with Michael's suggestion to disable autovacuum;
instead I'll do manual vacuuming/analyzing before the build starts.
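[Editor's note: disabling autovacuum for a CI database can be done per table or instance-wide. A sketch; `some_table` is a placeholder, and the instance-wide form assumes a dedicated test cluster.]

```sql
-- Per table: keep autovacuum/auto-analyze away from the test tables.
ALTER TABLE some_table SET (autovacuum_enabled = false);

-- Or instance-wide (superuser, followed by a config reload);
-- not advisable on a shared server.
ALTER SYSTEM SET autovacuum = off;
SELECT pg_reload_conf();

-- Manual maintenance before the build starts:
VACUUM ANALYZE;
```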

For the production use case I'll do my own analyze after bulk insert as you
suggested.

Initially I thought that inside a transaction the stats are automatically
updated after the inserts, because the query ran much faster than the query
that was based on statistics with no rows. I did not realize the planner
chooses a "default" plan when there are no stats.

*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu, www.redbyte.eu


#10Martin Kováčik
kovacik@redbyte.eu
In reply to: Adrian Klaver (#8)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

I'm not sure I understand what you mean. My initial thought was that
stats are fixed per transaction, i.e. an analyze from a committed
transaction doesn't interfere with another running transaction. Maybe I was
confused because analyze can be run inside a transaction, so my assumption
was that it is isolated like other statements.
*Martin Kováčik*
*CEO*
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu, www.redbyte.eu


#11Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Martin Kováčik (#10)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

On 4/25/19 1:28 PM, Martin Kováčik wrote:

I'm not sure if I understand what you mean. My initial thought was that
stats are fixed per transaction, i.e. analyze from comitted transaction
doesn't interfere with another running transaction. Maybe I was confused
by this because analyze can be run inside a transaction, so my
assumption was it is isolated like other statements.

To make it clearer I would take a look at:

https://www.postgresql.org/docs/11/sql-analyze.html
https://www.postgresql.org/docs/11/catalog-pg-statistic.html

The gist is that currently there is one entry (with the exception noted in
the second link) per column in each table. This is the most recent
information available, subject to the conditions Tom pointed out in his
post. To make it MVCC-aware like you suggest would mean tracking the state
of all transactions currently open and recording that information, so each
transaction could find its stats. The planning problem could then move to
the stats table itself, as it would need to be analyzed to work
efficiently, and that analysis would probably also need to be MVCC-aware to
be relevant, which would add to the overhead. I could see this turning into
a 'hall of mirrors' problem quickly.

*Martin Kováčik*
/CEO/
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu <mailto:kovacik@redbyte.eu>, www.redbyte.eu
<http://redbyte.eu&gt;

On Thu, Apr 25, 2019 at 9:58 PM Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>> wrote:

On 4/25/19 12:47 PM, Martin Kováčik wrote:

As my example shows you don't have to import a lot of rows - 1000 is
enough to make a difference - it all depends on the query. When a
cartesian product is involved only a few records is enough.
I think that stats should be MVCC versioned otherwise the planner is
using wrong statistics and chooses wrong plans.

Then you are looking at moving the choke point to looking up the
correct
stats across possibly hundreds/thousands of transactions in flight.

*Martin Kováčik*
/CEO/
*redByte*, s.r.o.
+421 904 236 791
kovacik@redbyte.eu <mailto:kovacik@redbyte.eu>

<mailto:kovacik@redbyte.eu <mailto:kovacik@redbyte.eu>>,
www.redbyte.eu <http://www.redbyte.eu&gt;

<http://redbyte.eu&gt;

On Thu, Apr 25, 2019 at 9:28 PM Michael Lewis <mlewis@entrata.com

<mailto:mlewis@entrata.com>

<mailto:mlewis@entrata.com <mailto:mlewis@entrata.com>>> wrote:

     On Thu, Apr 25, 2019, 11:34 AM Martin Kováčik

<kovacik@redbyte.eu <mailto:kovacik@redbyte.eu>

     <mailto:kovacik@redbyte.eu <mailto:kovacik@redbyte.eu>>> wrote:

         Turning off autovacuum for the tests is a valid option and I
         will definitely do this as a workaround. Each test pretty

much

         starts with empty schema and data for it is generated

during the

         run and rolled back at the end. I have a lot of tests and

at the

         moment it is not feasible to modify them.

         The real workload for the application is different, but there
         are some cases, when we import data from remote web

service in a

         transaction do some work with it and then we do a commit. If
         there is an autovacuum during this process I assume there

will

         be similar problem regarding planner statistics.

     Unless you are importing a huge amount of data relative to what is
     already there, it seems likely to be significantly less impactful
     than adding data to a completely empty table. The stats on a table
     with 0 rows and then 5,000 rows are going to be night and day, while
     the difference between stats on 100,000 rows and 105,000 is not as
     impactful. Musing here; I expect others will chime in.

     Stats are not versioned with MVCC, so it would be expected that a
     commit in another transaction that updates stats would influence
     the query plan for another transaction that is active.
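[Editor's note: since the shared stats can change under a running transaction, another workaround (a sketch, not proposed verbatim in the messages shown here) is to run ANALYZE inside the test transaction itself, right after the inserts. The update to `pg_statistic` is transactional, so the ROLLBACK that ends each test discards it along with the data; table name `a` is taken from the original test case:]

```sql
BEGIN;

INSERT INTO a SELECT * FROM generate_series(1, 1000);

-- Refresh stats inside the transaction: the planner in this session
-- now sees row estimates that include the uncommitted inserts, so a
-- concurrent autovacuum ANALYZE can no longer leave it planning
-- against "empty table" statistics.
ANALYZE a;

EXPLAIN ANALYZE SELECT count(*) FROM a;

ROLLBACK;  -- discards both the rows and the in-transaction stats update
```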

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Adrian Klaver
adrian.klaver@aklaver.com

#12Martin Kováčik
kovacik@redbyte.eu
In reply to: Adrian Klaver (#11)
Re: analyze causes query planner to choose suboptimal plan for a select query in separate transaction

I’ve got it now, thank you for the clarification. You are right.

Martin Kováčik
+421904236791


On 26 Apr 2019, at 00:25, Adrian Klaver <adrian.klaver@aklaver.com> wrote:

On 4/25/19 1:28 PM, Martin Kováčik wrote:
I'm not sure I understand what you mean. My initial thought was that stats are fixed per transaction, i.e. an analyze from a committed transaction doesn't interfere with another running transaction. Maybe I was confused because analyze can be run inside a transaction, so my assumption was that it is isolated like other statements.

To make it clearer I would take a look at:

https://www.postgresql.org/docs/11/sql-analyze.html
https://www.postgresql.org/docs/11/catalog-pg-statistic.html

The gist is that currently there is one entry (with the exception noted in the second link) per column in each table. This is the most recent information available, subject to the conditions Tom pointed out in his post. To make it MVCC-aware as you suggest would mean tracking the state of all transactions currently open and recording that information, so that each transaction could find its own stats. The planning problem could then move to the stats table itself, as it would need to be analyzed to work efficiently. That analysis would probably also need to be MVCC-aware to be relevant, which would add to the overhead. I could see this turning into a 'hall of mirrors' problem quickly.
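[Editor's note: the "one entry per column" point is easy to see directly. A small sketch against the `a` table from the original test case, using the `pg_stats` view, which is PostgreSQL's readable wrapper over `pg_statistic`:]

```sql
-- One row per (table, column): the single shared statistics entry
-- that every transaction's planner reads, regardless of snapshot.
SELECT tablename, attname, null_frac, n_distinct
FROM pg_stats
WHERE tablename = 'a';
```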
