Strange influence of default_statistics_target
Hello!
I have encountered a problem with query plan building:
I set default_statistics_target = 700 and ran ANALYZE. The Postgres
optimizer chose a plan with a hash join, and the query took about a minute
to complete.
Then I set default_statistics_target = 500; the plan changed significantly
and used a merge join instead. Completion time dropped by a factor of
hundreds, and the estimated cost dropped drastically.
Now I can't understand why more precise statistics lead to a less optimized
plan, and what the right way to use the default_statistics_target parameter is.
I attach both the good and the bad query plans, and the query itself.
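For reference, the sequence of steps was roughly the following (the actual
query is in the attachment; SELECT ... stands in for it):

```sql
-- Raise the statistics target for this session (in our case it was
-- changed in postgresql.conf and the config reloaded, which has the
-- same effect on ANALYZE).
SET default_statistics_target = 700;
ANALYZE;                                  -- rebuild statistics with the new target
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;    -- hash-join plan, ~1 min

SET default_statistics_target = 500;
ANALYZE;
EXPLAIN (ANALYZE, BUFFERS) SELECT ...;    -- merge-join plan, far faster
```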
Вадим Акбашев (Vadim Akbashev) <ufaowl@gmail.com> writes:
> I have encountered a problem with query plan building:
> I set default_statistics_target = 700 and ran ANALYZE. The Postgres
> optimizer chose a plan with a hash join, and the query took about a
> minute to complete.
> Then I set default_statistics_target = 500; the plan changed
> significantly and used a merge join instead. Completion time dropped by
> a factor of hundreds, and the estimated cost dropped drastically.
> Now I can't understand why more precise statistics lead to a less
> optimized plan, and what the right way to use the
> default_statistics_target parameter is.
> I attach both the good and the bad query plans, and the query itself.
Are those really the same query? Plan 2 is enforcing a "number_value IS
NOT NULL" condition on "attribute_value av1" that I don't see in plan 1.
And neither plan seems to have much to do with the query, since the
query has UNIONs that aren't in the plans.
But the short answer seems to be that in both cases, the only reason that
the plan doesn't take forever to run is that one sub-join chances to yield
precisely zero rows, and the PG executor happens to be more efficient
about that corner case in the one plan shape than the other. The planner
doesn't take the possibility of that short-circuit happening into account,
since it generally cannot be sure that a sub-join wouldn't yield any rows.
So it's just luck that one plan is noticeably faster in this case.
regards, tom lane
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
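As a general aside (not advice given in this thread): the statistics target
can also be set per column instead of cluster-wide, which makes experiments
like this less disruptive and lets you inspect what ANALYZE actually
gathered. A sketch, reusing the attribute_value table and number_value
column mentioned above as placeholders:

```sql
-- Raise the target only for the column whose estimates look off
-- (table and column names here are stand-ins from this thread).
ALTER TABLE attribute_value ALTER COLUMN number_value SET STATISTICS 700;
ANALYZE attribute_value;

-- Inspect the gathered statistics for that table.
SELECT attname, n_distinct, null_frac
FROM pg_stats
WHERE tablename = 'attribute_value';
```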
Hi, Tom!
Thank you for your answer.
Those plans were indeed for the query I sent, but they were not complete:
I sent only the parts that, as I thought, contained the problem. In this
letter I attach the full plans, and also the statistics for the tables the
query works with, for both default_statistics_target=700
and default_statistics_target=500.
This query runs much faster in our test environment, which is about 75% of
our production server in size.
I've also noticed that the plan becomes "bad" 9-10 hours after it was
fixed, and if I run ANALYZE again without changing effective_cache_size
in postgresql.conf it remains "bad".
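One way to check whether something (for example autoanalyze) is
re-sampling the statistics around that time is to look at the per-table
statistics timestamps; a sketch, with a placeholder table list:

```sql
-- When were statistics last refreshed, manually and by autovacuum?
SELECT relname, last_analyze, last_autoanalyze,
       analyze_count, autoanalyze_count
FROM pg_stat_user_tables
WHERE relname IN ('attribute_value');   -- placeholder table list
```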
Thank you in advance
Vadim
2017-01-18 19:18 GMT+05:00 Tom Lane <tgl@sss.pgh.pa.us>:
Sorry, I forgot to attach the files themselves.
Hello.
Is the information I've sent sufficient?
Thank you.
2017-01-19 14:11 GMT+05:00 Вадим Акбашев (Vadim Akbashev) <ufaowl@gmail.com>: