Excessive disk usage in WindowAgg
This one just came up on IRC:
create table tltest(a integer, b text, c text, d text);
insert into tltest
select i, repeat('foo',100), repeat('foo',100), repeat('foo',100)
from generate_series(1,100000) i;
set log_temp_files=0;
set client_min_messages=log;
select count(a+c) from (select a, count(*) over () as c from tltest s1) s;
LOG: temporary file: path "base/pgsql_tmp/pgsql_tmp82513.3", size 92600000
Using 92MB of disk for one integer seems excessive; the reason is clear
from the explain:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=16250.00..16250.01 rows=1 width=8) (actual time=1236.260..1236.260 rows=1 loops=1)
Output: count((tltest.a + (count(*) OVER (?))))
-> WindowAgg (cost=0.00..14750.00 rows=100000 width=12) (actual time=1193.846..1231.216 rows=100000 loops=1)
Output: tltest.a, count(*) OVER (?)
-> Seq Scan on public.tltest (cost=0.00..13500.00 rows=100000 width=4) (actual time=0.006..14.361 rows=100000 loops=1)
Output: tltest.a, tltest.b, tltest.c, tltest.d
so the whole width of the table is being stored in the tuplestore used
by the windowagg.
In create_windowagg_plan, we have:
/*
* WindowAgg can project, so no need to be terribly picky about child
* tlist, but we do need grouping columns to be available
*/
subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
Obviously we _do_ need to be more picky about this; it seems clear that
using CP_SMALL_TLIST | CP_LABEL_TLIST would be a win in many cases.
Opinions?
--
Andrew (irc:RhodiumToad)
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Using 92MB of disk for one integer seems excessive; the reason is clear
from the explain:
...
so the whole width of the table is being stored in the tuplestore used
by the windowagg.
In create_windowagg_plan, we have:
/*
* WindowAgg can project, so no need to be terribly picky about child
* tlist, but we do need grouping columns to be available
*/
subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);
Obviously we _do_ need to be more picky about this; it seems clear that
using CP_SMALL_TLIST | CP_LABEL_TLIST would be a win in many cases.
Opinions?
Seems reasonable to me, do you want to do the honors?
regards, tom lane
Hi,
On 2019-11-04 12:18:48 -0500, Tom Lane wrote:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Using 92MB of disk for one integer seems excessive; the reason is clear
from the explain:
...
so the whole width of the table is being stored in the tuplestore used
by the windowagg.In create_windowagg_plan, we have:
/*
* WindowAgg can project, so no need to be terribly picky about child
* tlist, but we do need grouping columns to be available
*/
subplan = create_plan_recurse(root, best_path->subpath, CP_LABEL_TLIST);Obviously we _do_ need to be more picky about this; it seems clear that
using CP_SMALL_TLIST | CP_LABEL_TLIST would be a win in many cases.
Opinions?Seems reasonable to me, do you want to do the honors?
I was briefly wondering if this ought to be backpatched. -0 here, but...
Greetings,
Andres Freund
"Andres" == Andres Freund <andres@anarazel.de> writes:
Obviously we _do_ need to be more picky about this; it seems clear
that using CP_SMALL_TLIST | CP_LABEL_TLIST would be a win in many
cases. Opinions?
Seems reasonable to me, do you want to do the honors?
Andres> I was briefly wondering if this ought to be backpatched. -0
Andres> here, but...
Uh, it seems obvious to me that it should be backpatched?
--
Andrew (irc:RhodiumToad)
Hi,
On 2019-11-04 19:04:52 +0000, Andrew Gierth wrote:
"Andres" == Andres Freund <andres@anarazel.de> writes:
Obviously we _do_ need to be more picky about this; it seems clear
that using CP_SMALL_TLIST | CP_LABEL_TLIST would be a win in many
cases. Opinions?Seems reasonable to me, do you want to do the honors?
Andres> I was briefly wondering if this ought to be backpatched. -0
Andres> here, but...Uh, it seems obvious to me that it should be backpatched?
Fine with me. But I don't think it's just plainly obvious, it's
essentailly a change in query plans etc, and we've been getting more
hesitant with those over time.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2019-11-04 19:04:52 +0000, Andrew Gierth wrote:
Uh, it seems obvious to me that it should be backpatched?
Fine with me. But I don't think it's just plainly obvious, it's
essentailly a change in query plans etc, and we've been getting more
hesitant with those over time.
Since this is happening during create_plan(), it affects no planner
decisions; it's just a pointless inefficiency AFAICS. Back-patching
seems fine.
regards, tom lane
"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
On 2019-11-04 19:04:52 +0000, Andrew Gierth wrote:
Uh, it seems obvious to me that it should be backpatched?
Fine with me. But I don't think it's just plainly obvious, it's
essentailly a change in query plans etc, and we've been getting more
hesitant with those over time.
Tom> Since this is happening during create_plan(), it affects no
Tom> planner decisions; it's just a pointless inefficiency AFAICS.
Tom> Back-patching seems fine.
I will deal with it then. (probably tomorrow or so)
--
Andrew (irc:RhodiumToad)