possible optimization: push down aggregates
Hi,

One user asked about using partitioning for faster aggregate queries.
I found that there is no such optimization.
create table x1(a int, d date);
create table x_1 ( check(d = '2014-01-01'::date)) inherits(x1);
create table x_2 ( check(d = '2014-01-02'::date)) inherits(x1);
create table x_3 ( check(d = '2014-01-03'::date)) inherits(x1);
With this schema, the optimizer produces:
postgres=# explain verbose select max(a) from x1 group by d order by d;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 GroupAggregate  (cost=684.79..750.99 rows=200 width=8)
   Output: max(x1.a), x1.d
   Group Key: x1.d
   ->  Sort  (cost=684.79..706.19 rows=8561 width=8)
         Output: x1.d, x1.a
         Sort Key: x1.d
         ->  Append  (cost=0.00..125.60 rows=8561 width=8)
               ->  Seq Scan on public.x1  (cost=0.00..0.00 rows=1 width=8)
                     Output: x1.d, x1.a
               ->  Seq Scan on public.x_1  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_1.d, x_1.a
               ->  Seq Scan on public.x_2  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_2.d, x_2.a
               ->  Seq Scan on public.x_3  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_3.d, x_3.a
               ->  Seq Scan on public.x_4  (cost=0.00..31.40 rows=2140 width=8)
                     Output: x_4.d, x_4.a
 Planning time: 0.333 ms
It could be reduced to:

Sort by d
  Append
    Aggregate (a), d
      Seq Scan on x_1
    Aggregate (a), d
      Seq Scan on x_2
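
Today the same effect can be simulated by hand, aggregating each partition
separately and appending the results -- a sketch against the tables above
(one UNION ALL branch per partition; each group lies entirely within a
single partition, so no re-aggregation step is needed):

select max(a) as max_a, d from x_1 group by d
union all
select max(a) as max_a, d from x_2 group by d
union all
select max(a) as max_a, d from x_3 group by d
order by d;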
Are there any plans to use partitioning for aggregation?
Regards
Pavel
On Wed, Aug 27, 2014 at 2:07 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> Are there any plans to use partitioning for aggregation?
Besides min/max, what other aggregates (mean/stddev come to mind)
would you optimize and how would you determine which ones could be?
Where is that decision made?
For example, could user defined aggregates be pushed down if you had a
reaggregation routine broken out from the main one?
merlin
On Wed, Aug 27, 2014 at 4:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?
You can't with mean and stddev, only with associative aggregates.
That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
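
For illustration (a sketch using the tables from upthread): an associative
aggregate like max can be recombined from per-partition results,

select max(m) from (
  select max(a) as m from x_1
  union all
  select max(a) as m from x_2
  union all
  select max(a) as m from x_3
) s;

which gives the same answer as max(a) over the whole hierarchy. Taking the
avg of per-partition avgs, by contrast, gives the wrong answer whenever the
partitions differ in size.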
2014-08-27 21:41 GMT+02:00 Merlin Moncure <mmoncure@gmail.com>:
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made?
I am thinking that all aggregates are possible: when you have partitions by
column X, then you have natural sets split by X, so you can directly
calculate any aggregate on any column when the GROUP BY clause is
"GROUP BY X".

Isn't it so?

Probably some similar optimizations are possible when you have "GROUP BY
X, Y" -- at minimum you have more sets, and you can do the aggregations on
smaller sets.
Pavel
2014-08-27 21:46 GMT+02:00 Claudio Freire <klaussfreire@gmail.com>:
> You can't with mean and stddev, only with associative aggregates.
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
I don't think so. If I have partitions by X and my query has the clause
GROUP BY X, then I can calculate any aggregate, because every group lies
entirely within a single partition -- no merging of partial states across
partitions is ever needed.
Pavel
On Wed, Aug 27, 2014 at 2:46 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> You can't with mean and stddev, only with associative aggregates.
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
The associative bit just makes it easier (which is important, of course!).
Mean, for example, can be pushed down if the pushed-down aggregates return
the count to the "reaggregator" so that you can weight the final average.
That's a lot more complicated, though.
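
A manual version of that reaggregation for the mean, as a sketch (each
branch returns a partial sum and count, and the final average is weighted
by the counts):

select sum(s) / sum(n) as mean_a
from (
  select sum(a)::numeric as s, count(a) as n from x_1
  union all
  select sum(a)::numeric as s, count(a) as n from x_2
) partials;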
merlin
Merlin Moncure <mmoncure@gmail.com> writes:
> The associative bit just makes it easier (which is important, of course!).
> Mean, for example, can be pushed down if the pushed-down aggregates return
> the count to the "reaggregator" so that you can weight the final average.
> That's a lot more complicated, though.
The real question is what you're expecting to get out of such an
"optimization". If the aggregate has to visit all rows then it's
not apparent to me that any win emerges from the extra complication.
We do already have optimization of min/max across inheritance trees,
and that's certainly a win because you don't have to visit all rows.
regression=# create table pp(f1 int unique);
CREATE TABLE
regression=# create table cc(unique(f1)) inherits(pp);
CREATE TABLE
regression=# create table cc2(unique(f1)) inherits(pp);
CREATE TABLE
regression=# explain select max(f1) from pp;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Result (cost=0.51..0.52 rows=1 width=0)
InitPlan 1 (returns $0)
-> Limit (cost=0.46..0.51 rows=1 width=4)
-> Merge Append (cost=0.46..267.71 rows=4777 width=4)
Sort Key: pp.f1
-> Index Only Scan Backward using pp_f1_key on pp (cost=0.12..8.14 rows=1 width=4)
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using cc_f1_key on cc (cost=0.15..85.94 rows=2388 width=4)
Index Cond: (f1 IS NOT NULL)
-> Index Only Scan Backward using cc2_f1_key on cc2 (cost=0.15..85.94 rows=2388 width=4)
Index Cond: (f1 IS NOT NULL)
Planning time: 0.392 ms
(12 rows)
That doesn't currently extend to the GROUP BY case unfortunately.
regards, tom lane
2014-08-27 22:27 GMT+02:00 Tom Lane <tgl@sss.pgh.pa.us>:
> The real question is what you're expecting to get out of such an
> "optimization".  If the aggregate has to visit all rows then it's
> not apparent to me that any win emerges from the extra complication.
I expect to remove the hashing or sorting part of the aggregation. In my
example each partition holds a single value of d, so the aggregation can be
reduced to just a seq scan.
Pavel
On 27 August 2014, 21:41, Merlin Moncure wrote:
> Besides min/max, what other aggregates (mean/stddev come to mind)
> would you optimize and how would you determine which ones could be?
> Where is that decision made? For example, could user defined aggregates
> be pushed down if you had a reaggregation routine broken out from the
> main one?
I think what Pavel suggests is that when you are aggregating by

    GROUP BY x

and 'x' happens to be the partitioning column (making it impossible for
groups from different partitions to overlap), then it's perfectly fine to
perform the aggregation per partition and just append the results.

If you need sorted output, you can sort the appended results (assuming the
cardinality of the output is much lower than that of the actual data).

The "append first, then aggregate" approach may be what causes the switch
to sort (out of fear that the number of groups will exceed work_mem), while
we could just as well process each partition with a separate hash
aggregate, as sketched below.
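
In plan terms, instead of GroupAggregate over Sort over Append of the raw
rows, that would look roughly like this (an illustrative sketch, not real
EXPLAIN output):

  Sort
    Sort Key: d
    ->  Append
          ->  HashAggregate  (Group Key: d)
                ->  Seq Scan on x_1
          ->  HashAggregate  (Group Key: d)
                ->  Seq Scan on x_2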
Tomas
On Wed, Aug 27, 2014 at 3:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We do already have optimization of min/max across inheritance trees,
> and that's certainly a win because you don't have to visit all rows.
> [...]
> That doesn't currently extend to the GROUP BY case unfortunately.
Yeah: I was overthinking it. My mind was on parallel processing of the
aggregate (which is not what Pavel was proposing), because that just
happens to be what I'm working on currently -- using dblink to decompose
various aggregates and distribute the calculation across servers.
"Wouldn't it be nice to have the server do that itself," I impulsively
thought.
merlin
On Wed, Aug 27, 2014 at 6:46 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Yeah: I was overthinking it. My mind was on parallel processing of the
> aggregate (which is not what Pavel was proposing). [...] "Wouldn't it be
> nice to have the server do that itself," I impulsively thought.
But you'd have part of it too, because then you'd have semantically
independent parallel nodes in the plan that do some meaningful data
wrangling and emit little output, whereas the previous plan did not do much
with the data and emitted loads of rows as a result. That really is a big
preliminary step toward parallel execution.
On 08/28/2014 03:46 AM, Claudio Freire wrote:
> You can't with mean and stddev, only with associative aggregates.
> That's min, max, sum, bit_and, bit_or, bool_and, bool_or, count.
You could with a new helper function to merge the temporary states for
each scan though.
In the case of mean, for example, it'd just mean adding the counts and sums.
However, I'm not sure how interesting that is without the ability to
execute the subplans in parallel.
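
As an illustration of such a helper, a hypothetical state-merge function
for avg over integers might look like the sketch below (the names are made
up; no such combine-function hook exists in core today):

-- partial aggregate state: running sum and row count
create type avg_state as (s numeric, n bigint);

-- merge two partial states by adding their components;
-- the final result would then be s / n
create function merge_avg_state(a avg_state, b avg_state)
returns avg_state
language sql immutable
as $$ select row((a).s + (b).s, (a).n + (b).n)::avg_state $$;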
--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services