improve EXPLAIN for wide tables
Recently, a user reported that running an explain for a query joining
many wide tables taking more than 1 minute to complete. Running the
query without explain takes only a few seconds.
Further research showed that this is similar to a report from
2018 [1]/messages/by-id/1537818224423-0.post@n3.nabble.com. colname_is_unique is used to assign unique
column names during deparse and this has O(N^2) behavior.
The good news is this behavior was optimized with commit [2]https://github.com/postgres/postgres/commit/52c707483ce4d0161127e4958d981d1b5655865e with the
help of a hash table. While the main stated purpose of this commit was to
improved eparse for views and rules, it also significantly improved
EXPLAIN which goes through the same routine via ExplainPrintPlan and
deparse_context_for_plan_tree.
Looking further into this improvement, I started to question if it is
necessary to make columns unique for EXPLAIN purposes?
An experimental patch ( attached ) to skip making columns unique
for EXPLAIN breaks only one test case for columns with default
column names from a function [3]https://github.com/postgres/postgres/blob/master/src/test/regress/sql/rangefuncs.sql#L586-L588. I have not had success
finding other cases that break.
src/test/regress/results/rangefuncs.out
@@ -2130,10 +2130,10 @@
explain (verbose, costs off)
select * from testrngfunc();
- QUERY PLAN
-----------------------------------------------------------
+ QUERY PLAN
+--------------------------------------------------------
Subquery Scan on "*SELECT*"
- Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1"
+ Output: "*SELECT*"."?column?", "*SELECT*"."?column?"
-> Unique
Output: (1), (2)
-> Sort
While the performance impact will likely only be noticeable
on rare use cases, removing the make columns unique
work appears to be a good idea overall. There are probably
a few good ways to fix the broken case above.
running the attached measure.sql
query involving 60 tables with 150 columns
# 17.2 without commit [2]https://github.com/postgres/postgres/commit/52c707483ce4d0161127e4958d981d1b5655865e
Time: 58950.505 ms
Time: 472.912 ms
## HEAD with commit [2]https://github.com/postgres/postgres/commit/52c707483ce4d0161127e4958d981d1b5655865e
Time: 1229.704 ms
Time: 473.494 ms
## without uniquifying ( attached experimental patch )
Time: 499.185 ms
Time: 473.118 ms
[1]: /messages/by-id/1537818224423-0.post@n3.nabble.com
[2]: https://github.com/postgres/postgres/commit/52c707483ce4d0161127e4958d981d1b5655865e
[3]: https://github.com/postgres/postgres/blob/master/src/test/regress/sql/rangefuncs.sql#L586-L588
Regards,
Sami Imseih
Amazon Web Services (AWS)
Sami Imseih <samimseih@gmail.com> writes:
Looking further into this improvement, I started to question if it is
necessary to make columns unique for EXPLAIN purposes?
Yes, otherwise references to them elsewhere in the plan will be
ambiguous.
It looks like your proposal tries to dodge that by unique-ifying
in some cases but not others, which strikes me as a totally
random and confusing thing to do.
Is there any reason to think that 52c707483 wasn't a sufficient
response to this issue?
regards, tom lane
Looking further into this improvement, I started to question if it is
necessary to make columns unique for EXPLAIN purposes?
Yes, otherwise references to them elsewhere in the plan will be
ambiguous.
Explain will qualify the column name with the table name such as the simple
example below with the experimental patch. I have not been able to find cases,
except for the failed test case that involves "?column?", in which the
plan will result
in ambiguous column names.
postgres=# create table test (id int, id2 int);
CREATE TABLE
postgres=# create table test2 (id int, id2 int);
CREATE TABLE
postgres=# explain select * from test, test2 where test.id = test2.id
order by 1;
QUERY PLAN
---------------------------------------------------------------------
Merge Join (cost=317.01..711.38 rows=25538 width=16)
Merge Cond: (test.id = test2.id)
-> Sort (cost=158.51..164.16 rows=2260 width=8)
Sort Key: test.id
-> Seq Scan on test (cost=0.00..32.60 rows=2260 width=8)
-> Sort (cost=158.51..164.16 rows=2260 width=8)
Sort Key: test2.id
-> Seq Scan on test2 (cost=0.00..32.60 rows=2260 width=8)
(8 rows)
postgres=# explain verbose select * from test, test2 where test.id =
test2.id order by 1;
QUERY PLAN
----------------------------------------------------------------------------
Merge Join (cost=317.01..711.38 rows=25538 width=16)
Output: test.id, test.id2, test2.id, test2.id2
Merge Cond: (test.id = test2.id)
-> Sort (cost=158.51..164.16 rows=2260 width=8)
Output: test.id, test.id2
Sort Key: test.id
-> Seq Scan on public.test (cost=0.00..32.60 rows=2260 width=8)
Output: test.id, test.id2
-> Sort (cost=158.51..164.16 rows=2260 width=8)
Output: test2.id, test2.id2
Sort Key: test2.id
-> Seq Scan on public.test2 (cost=0.00..32.60 rows=2260 width=8)
Output: test2.id, test2.id2
Query Identifier: 2523359249885908438
(14 rows)
It looks like your proposal tries to dodge that by unique-ifying
in some cases but not others, which strikes me as a totally
random and confusing thing to do.
The work is required for things like deparsing views because Postgres
has to construct the underlying sql definition and part of that is it has to
make columns unique to generate a valid sql, such as the case here [1]https://github.com/postgres/postgres/blob/master/src/test/regress/sql/create_view.sql#L550-L565.
For planning purposes, the column name must already be unambiguous.
right?
Is there any reason to think that 52c707483 wasn't a sufficient
response to this issue?regards, tom lane
In the case reported, similar to the one in the earlier attached
experiment.sql, even with the hash table optimization, there is still
significant
overhead from the unique-ifying work. for such users, this can impact real-world
performance particularly if they have enabled extensions such as auto_explain.
[1]: https://github.com/postgres/postgres/blob/master/src/test/regress/sql/create_view.sql#L550-L565
Regards,
Sami
Sami Imseih <samimseih@gmail.com> writes:
Looking further into this improvement, I started to question if it is
necessary to make columns unique for EXPLAIN purposes?
Yes, otherwise references to them elsewhere in the plan will be
ambiguous.
Explain will qualify the column name with the table name such as the simple
example below with the experimental patch.
But if the column names are ambiguous within the same RTE, how does
table-qualification fix that? And it's within-the-same-RTE that
we're concerned with here.
I have not been able to find cases,
except for the failed test case that involves "?column?", in which the
plan will result in ambiguous column names.
It only takes one case to mean we have to deal with it ;-). But I'm
fairly sure that there are many other cases, since the parser doesn't
restrict the output names of a sub-SELECT to be unique.
Is there any reason to think that 52c707483 wasn't a sufficient
response to this issue?
In the case reported, similar to the one in the earlier attached
experiment.sql, even with the hash table optimization, there is still
significant
overhead from the unique-ifying work. for such users, this can impact real-world
performance particularly if they have enabled extensions such as auto_explain.
This detail-free claim doesn't persuade me that we have to do
something more. (And I'm not buying the idea that people
should expect auto_explain to be free, anyway.)
regards, tom lane
I had a thought about this: I don't think EXPLAIN is ever required
to print the names of join alias variables (since the planner flattens
all join alias variables to some kind of expression over their
underlying columns). So we could skip assigning column names to
join RTEs at all, if we know that it's EXPLAIN rather than view/rule
decompilation. That might let us skip all the mess around
unique-ifying JOIN USING column names, too.
regards, tom lane
But if the column names are ambiguous within the same RTE, how does
table-qualification fix that? And it's within-the-same-RTE that
we're concerned with here
It only takes one case to mean we have to deal with it ;-). But I'm
fairly sure that there are many other cases, since the parser doesn't
restrict the output names of a sub-SELECT to be unique.
good point. I see the error in my original line of thinking now.
In fact, it's this simple to prove that we still need to unique-ify
something like this subquery is valid:
select * from (select 1 a, 2 a) as s
I had a thought about this: I don't think EXPLAIN is ever required
to print the names of join alias variables (since the planner flattens
all join alias variables to some kind of expression over their
underlying columns). So we could skip assigning column names to
join RTEs at all, if we know that it's EXPLAIN rather than view/rule
decompilation. That might let us skip all the mess around
unique-ifying JOIN USING column names, too.
That makes sense and a comment inside deparse_context_for_plan_tree
describes this from what I can tell.
/*
* Set up column name aliases. We will get rather bogus results for join
* RTEs, but that doesn't matter because plan trees don't contain any join
* alias Vars.
*/
set_simple_column_names(dpns);
I suspect that we can also skip RTE_RELATION, since columns must
be unique, but I am not sure it's worth the extra effort. At least my test
does not show any real benefit.
I am attaching a patch that deals with the RTE_JOIN case.
# HEAD
postgres=# SELECT FROM v1 WHERE false;
--
(0 rows)
Time: 494.572 ms
postgres=# EXPLAIN SELECT FROM v1 WHERE false;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
Time: 1236.127 ms (00:01.236)
# with the patch applied
postgres=# SELECT FROM v1 WHERE false;
--
(0 rows)
Time: 503.049 ms
postgres=# EXPLAIN SELECT FROM v1 WHERE false;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)
Time: 525.812 ms
Regards,
Sami
Attachments:
0001-v1-skip-join-RTE-in-deparse_context_for_plan_tree.patchapplication/octet-stream; name=0001-v1-skip-join-RTE-in-deparse_context_for_plan_tree.patchDownload+9-6
Sami Imseih <samimseih@gmail.com> writes:
It only takes one case to mean we have to deal with it ;-). But I'm
fairly sure that there are many other cases, since the parser doesn't
restrict the output names of a sub-SELECT to be unique.
good point. I see the error in my original line of thinking now.
In fact, it's this simple to prove that we still need to unique-ify
something like this subquery is valid:
select * from (select 1 a, 2 a) as s
Actually, I was expecting you to cite that as a counterexample ;-)
because EXPLAIN doesn't show the sub-select's column names in
such cases:
=# explain verbose select * from (select 1 a, 2 a) as s;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
Output: 1, 2
(2 rows)
You need something where we don't elide the SubqueryScan node
to show there's an issue, for example:
=# explain verbose select * from (select 1 a, 2 b, 3 b limit 4) as s where a < 3;
QUERY PLAN
-------------------------------------------------------
Subquery Scan on s (cost=0.00..0.02 rows=1 width=12)
Output: s.a, s.b, s.b_1
Filter: (s.a < 3)
-> Limit (cost=0.00..0.01 rows=1 width=12)
Output: 1, 2, 3
-> Result (cost=0.00..0.01 rows=1 width=12)
Output: 1, 2, 3
(7 rows)
I suspect that we can also skip RTE_RELATION, since columns must
be unique, but I am not sure it's worth the extra effort. At least my test
does not show any real benefit.
Yeah, I was thinking of that too. Seems like it ought to be
a noticeable improvement if the join case is.
I am attaching a patch that deals with the RTE_JOIN case.
I'll take a look. Thanks for the test demonstrating that
this makes a visible performance difference.
regards, tom lane
I wrote:
Sami Imseih <samimseih@gmail.com> writes:
I am attaching a patch that deals with the RTE_JOIN case.
I'll take a look. Thanks for the test demonstrating that
this makes a visible performance difference.
Pushed with some simplification: we don't need a new flag,
because none of the callers of set_simple_column_names need it
to do anything with join RTEs. This is better anyway because
set_relation_column_names' comment explicitly says it is not
for join RTEs, and now we don't use it on them ever.
I poked at the question of whether it's worth skipping
unique-ification for relation RTEs, and I came to the same
conclusion as you: it doesn't seem to be. The related code
is down in the noise according to "perf" once we skip join
RTEs. I think the reason the join RTEs are so expensive for
this is that the upper ones get very wide in join nests like
the example query.
regards, tom lane
I wrote:
Sami Imseih <samimseih@gmail.com> writes:
I am attaching a patch that deals with the RTE_JOIN case.
I'll take a look. Thanks for the test demonstrating that
this makes a visible performance difference.Pushed with some simplification: we don't need a new flag,
because none of the callers of set_simple_column_names need it
to do anything with join RTEs. This is better anyway because
set_relation_column_names' comment explicitly says it is not
for join RTEs, and now we don't use it on them ever.
Thank you for pushing the enhancement!
Regards,
Sami Imseih
Amazon Web Services (AWS)