possible optimization: push down aggregates
Hi,

One user asked about using partitioning for faster aggregate queries.
I found that there is no such optimization.
create table x1(a int, d date);
create table x_1 ( check(d = '2014-01-01'::date)) inherits(x1);
create table x_2 ( check(d = '2014-01-02'::date)) inherits(x1);
create table x_3 ( check(d = '2014-01-03'::date)) inherits(x1);
With this schema, the optimizer produces:
postgres=# explain verbose select max(a) from x1 group by d order by d;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 GroupAggregate  (cost=684.79..750.99 rows=200 width=8)
   Output: max(x1.a), x1.d
   Group Key: x1.d
   ->  Sort  (cost=684.79..706.19 rows=8561 width=8)
         Output: x1.d, x1.a
         Sort Key: x1.d
         ->  Append  (cost=0.00..125.60 rows=8561 width=8)
               ->  Seq Scan on public.x1  (cost=0.00..0.00 rows=1 width=8)
                     Output: x1.d, x1.a
               ->  Seq Scan on public.x_1  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_1.d, x_1.a
               ->  Seq Scan on public.x_2  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_2.d, x_2.a
               ->  Seq Scan on public.x_3  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_3.d, x_3.a
               ->  Seq Scan on public.x_4  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_4.d, x_4.a
 Planning time: 0.333 ms
It could be reduced to:

Sort by d
  Append
    Aggregate (a), d
      Seq Scan on x_1
    Aggregate (a), d
      Seq Scan on x_2
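
Today the same effect can be simulated by hand, aggregating each partition
separately and appending the results -- a sketch against the tables above
(one UNION ALL branch per partition; each group lies entirely within a
single partition, so no re-aggregation step is needed):

select max(a) as max_a, d from x_1 group by d
union all
select max(a) as max_a, d from x_2 group by d
union all
select max(a) as max_a, d from x_3 group by d
order by d;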
Are there any plans to use partitioning for aggregation?
Regards
Pavel
On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Are there any plans to use partitioning for aggregation?
Besides min/max, what other aggregates (mean/stddev come to mind)
would you optimize and how would you determine which ones could be?
Where is that decision made?
For example, could user defined aggregates be pushed down if you had a
reaggregation routine broken out from the main one?
merlin
On Wed, Aug 27, 2014 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?
You can't with mean and stddev, only with associative aggregates.
That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
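
For illustration (a sketch using the tables from upthread): an associative
aggregate like max can be recombined from per-partition results,

select max(m) from (
  select max(a) as m from x_1
  union all
  select max(a) as m from x_2
  union all
  select max(a) as m from x_3
) s;

which gives the same answer as max(a) over the whole hierarchy. Taking the
avg of per-partition avgs, by contrast, gives the wrong answer whenever the
partitions differ in size.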
2014-08-27 21:41 GMT+02:00 Merlin Moncure <mmoncure@gmail.com>:
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?
I am thinking that all aggregates are possible: when you have partitions by
column X, then you have natural sets split by X, so you can directly
calculate any aggregate on any column when the GROUP BY clause is
"GROUP BY X".

Isn't it so?

Probably some similar optimizations are possible when you have "GROUP BY
X, Y" -- at minimum you have more sets, and you can do the aggregations on
smaller sets.
Pavel
2014-08-27 21:46 GMT+02:00 Claudio Freire <klaussfreire@gmail.com>:
> You can't with mean and stddev, only with associative aggregates.
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
I don't think so. If I have partitions by X and my query has the clause
GROUP BY X, then I can calculate any aggregate, because every group lies
entirely within a single partition -- no merging of partial states across
partitions is ever needed.
Pavel
On Wed, Aug 27, 2014 at 2:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> You can't with mean and stddev, only with associative aggregates.
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
The associative bit just makes it easier (which is important, of course!).
Mean, for example, can be pushed down if the pushed-down aggregates return
the count to the "reaggregator" so that you can weight the final average.
That's a lot more complicated, though.
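
A manual version of that reaggregation for the mean, as a sketch (each
branch returns a partial sum and count, and the final average is weighted
by the counts):

select sum(s) / sum(n) as mean_a
from (
  select sum(a)::numeric as s, count(a) as n from x_1
  union all
  select sum(a)::numeric as s, count(a) as n from x_2
) partials;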
merlin
Merlin Moncure <mmoncure@gmail.com> writes:
> The associative bit just makes it easier (which is important, of course!).
> Mean, for example, can be pushed down if the pushed-down aggregates return
> the count to the "reaggregator" so that you can weight the final average.
> That's a lot more complicated, though.
The real question is what you're expecting to get out of such an
"optimization". If the aggregate has to visit all rows then it's
not apparent to me that any win emerges from the extra complication.
We do already have optimization of min/max across inheritance trees,
and that's certainly a win because you don't have to visit all rows.
regression=# create table pp(f1 int unique);
CREATE TABLE
regression=# create table cc(unique(f1)) inherits(pp);
CREATE TABLE
regression=# create table cc2(unique(f1)) inherits(pp);
CREATE TABLE
regression=# explain select max(f1) from pp;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Result (cost=0.51..0.52 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.46..0.51 rows=1 width=4)
-> Merge Append (cost=0.46..267.71 rows=4777 width=4)
Sort Key: pp.f1
-> Index Only Scan Backward using pp_f1_key on pp (cost=0.12..8.14 rows=1 width=4)
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using cc_f1_key on cc (cost=0.15..85.94 rows=2388 width=4)
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using cc2_f1_key on cc2 (cost=0.15..85.94 rows=2388 width=4)
Index Cond: (f1 IS NOT NULL)
Planning time: 0.392 ms
(12 rows)
That doesn't currently extend to the GROUP BY case unfortunately.
regards, tom lane
2014-08-27 22:27 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> The real question is what you're expecting to get out of such an
> "optimization".  If the aggregate has to visit all rows then it's
> not apparent to me that any win emerges from the extra complication.
I expect to remove the hashing or sorting part of the aggregation. In my
example each partition holds a single value of d, so the aggregation can be
reduced to just a seq scan.
Pavel
On 27 August 2014, 21:41, Merlin Moncure wrote:
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made? For example, could user defined aggregates
> be pushed down if you had a reaggregation routine broken out from the
> main one?
I think what Pavel suggests is that when you are aggregating by

    GROUP BY x

and 'x' happens to be the partitioning column (making it impossible for
groups from different partitions to overlap), then it's perfectly fine to
perform the aggregation per partition and just append the results.

If you need sorted output, you can sort the appended results (assuming the
cardinality of the output is much lower than that of the actual data).

The "append first, then aggregate" approach may be what causes the switch
to sort (out of fear that the number of groups will exceed work_mem), while
we could just as well process each partition with a separate hash
aggregate, as sketched below.
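
In plan terms, instead of GroupAggregate over Sort over Append of the raw
rows, that would look roughly like this (an illustrative sketch, not real
EXPLAIN output):

  Sort
    Sort Key: d
    ->  Append
          ->  HashAggregate  (Group Key: d)
                ->  Seq Scan on x_1
          ->  HashAggregate  (Group Key: d)
                ->  Seq Scan on x_2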
Tomas
On Wed, Aug 27, 2014 at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We do already have optimization of min/max across inheritance trees,
> and that's certainly a win because you don't have to visit all rows.
> [...]
> That doesn't currently extend to the GROUP BY case unfortunately.
Yeah: I was overthinking it. My mind was on parallel processing of the
aggregate (which is not what Pavel was proposing), because that just
happens to be what I'm working on currently -- using dblink to decompose
various aggregates and distribute the calculation across servers.
"Wouldn't it be nice to have the server do that itself," I impulsively
thought.
merlin
On Wed, Aug 27, 2014 at 6:46 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Yeah: I was overthinking it. My mind was on parallel processing of the
> aggregate (which is not what Pavel was proposing). [...] "Wouldn't it be
> nice to have the server do that itself," I impulsively thought.
But you'd have part of it too, because then you'd have semantically
independent parallel nodes in the plan that do some meaningful data
wrangling and emit little output, whereas the previous plan did not do much
with the data and emitted loads of rows as a result. That really is a big
preliminary step toward parallel execution.
On 08/28/2014 03:46 AM, Claudio Freire wrote:
> You can't with mean and stddev, only with associative aggregates.
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
You could with a new helper function to merge the temporary states for
each scan though.
In the case of mean, for example, it'd just mean adding the counts and sums.
However, I'm not sure how interesting that is without the ability to
execute the subplans in parallel.
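
As an illustration of such a helper, a hypothetical state-merge function
for avg over integers might look like the sketch below (the names are made
up; no such combine-function hook exists in core today):

-- partial aggregate state: running sum and row count
create type avg_state as (s numeric, n bigint);

-- merge two partial states by adding their components;
-- the final result would then be s / n
create function merge_avg_state(a avg_state, b avg_state)
returns avg_state
language sql immutable
as $$ select row((a).s + (b).s, (a).n + (b).n)::avg_state $$;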
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services