Parallel Aggregate
Parallel aggregate is a feature that performs the aggregation job in
parallel with the help of Gather and partial seq scan nodes. The
following is a basic overview of the parallel aggregate changes.
Decision phase:
A parallel aggregate plan is generated only when the following conditions hold.
- Check whether the plan below the aggregate consists of a Gather node
plus a partial seq scan only. This verifies that all of the plan nodes
involved are parallel-aware.
- Check whether any projection or qual condition is attached to the
Gather node. If the Gather node carries quals or projection expressions
that must be evaluated there, because they contain functions that can
run only in the master backend, the parallel aggregate plan is not
chosen.
- Check whether the aggregate supports parallelism. For the first patch,
I thought of supporting only some aggregates. The supported aggregates
are mainly the aggregate functions that have variable-length data types
as their final and transition types. This avoids changing the target
list return types: because the types are variable-length, the transition
value can be returned to the backend without applying the aggregate's
final function. To identify the aggregates that support parallelism, a
new column is added to the pg_aggregate system catalog.
- Currently only group and plain aggregates are supported, for
simplicity. This patch doesn't change the aggregate plan decision
itself: if the planner chooses a group or plain aggregate as the best
plan, we then check whether it can be converted into a parallel
aggregate.
Planning phase:
- Generate the target list items that need to be passed to the child
aggregate node by separating bare aggregates from GROUP BY expressions.
This is required to handle any expressions that wrap aggregates in the
target list.
Example:
Output: (sum(id1)), (3 + (sum((id2 - 3)))), (max(id1)), ((count(id1)) - (max(id1)))
  -> Aggregate
       Output: sum(id1), sum((id2 - 3)), max(id1), count(id1)
- Don't push the HAVING clause down to the child aggregate node; it
needs to be executed at the Gather node only, after all results from
the workers have been combined on the matching key (and after the
aggregate's final function, if any, has been applied).
- Get the details of the Gather plan, remove that node from the actual
plan, and rebuild the Gather plan on top of the aggregate plan.
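To make the target-list split concrete, here is a toy Python sketch of the
planning-phase example above (my own illustration, not code from the patch):
the child aggregate evaluates only the bare aggregates, and the parent
target list evaluates the wrapping expressions over those results.

```python
# Hypothetical sketch of the target-list separation: the child (worker-side)
# aggregate computes only the bare aggregates; the parent applies wrapping
# expressions such as "3 + sum(id2 - 3)" over the child's outputs.
rows = [{"id1": 1, "id2": 5}, {"id1": 2, "id2": 7}, {"id1": 3, "id2": 9}]

# Child target list: bare aggregates only.
child = {
    "sum_id1": sum(r["id1"] for r in rows),
    "sum_id2m3": sum(r["id2"] - 3 for r in rows),
    "max_id1": max(r["id1"] for r in rows),
    "count_id1": len(rows),
}

# Parent target list: expressions evaluated over the child's results.
parent = (
    child["sum_id1"],                       # sum(id1)
    3 + child["sum_id2m3"],                 # 3 + sum(id2 - 3)
    child["max_id1"],                       # max(id1)
    child["count_id1"] - child["max_id1"],  # count(id1) - max(id1)
)
print(parent)
```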
Execution phase:
- By passing an execution flag such as EXEC_PARALLEL, the aggregate node
skips the final-function calculation on the worker side.
- Set single_copy mode to true when the node below the Gather is a
parallel aggregate.
- Add support for fetching a slot from a particular worker. This is
required to merge slots from different workers based on the grouping
key.
- Merge the slots received from the workers based on the grouping key.
If there is no grouping key, merge all slots without waiting to receive
a slot from every worker.
- If there is a grouping key, the backend has to wait until it has
received slots from all running workers. Once all slots are received,
they are compared on the grouping key and merged accordingly. The
merged slot is then processed further to apply the final function,
qualification, and projection.
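As a rough illustration of the merge step above, here is a Python toy model
with a (count, sum) pair standing in for an aggregate transition value; none
of these names come from the patch.

```python
# Each worker returns (group_key, partial_count, partial_sum) "slots"; the
# backend waits for every worker's output, then combines slots that share
# a grouping key before applying the final function.
worker_slots = [
    [("a", 2, 10), ("b", 1, 7)],  # worker 1
    [("a", 1, 5),  ("c", 3, 9)],  # worker 2
]

merged = {}
for slots in worker_slots:                   # wait for all workers
    for key, count, total in slots:
        c, s = merged.get(key, (0, 0))
        merged[key] = (c + count, s + total)  # combine partial states

# Only now apply the final function (avg = sum / count here), followed by
# HAVING qualification and projection.
finalized = {key: s / c for key, (c, s) in sorted(merged.items())}
print(finalized)
```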
I will try to provide a POC patch by the next commitfest.
Comments?
Regards,
Hari Babu
Fujitsu Australia
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12 October 2015 at 15:07, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
Hi,
I've never previously proposed any implementation for parallel aggregation,
but I have previously proposed infrastructure to allow aggregation to
happen in multiple steps. Your plan sounds very different from what
I've proposed.
I attempted to convey my idea on this to the community here
/messages/by-id/CAKJS1f-TmWi-4c5K6CBLRdTfGsVxOJhadefzjE7SWuVBgMSkXA@mail.gmail.com
which Simon and I proposed an actual proof of concept patch here
https://commitfest.postgresql.org/5/131/
I've since expanded on that work in the form of a WIP patch which
implements GROUP BY before JOIN here
/messages/by-id/CAKJS1f9kw95K2pnCKAoPmNw==7fgjSjC-82cy1RB+-x-Jz0QHA@mail.gmail.com
It's pretty evident that we both need to align the way we plan to handle
this multiple-step aggregation; there's no sense at all in having 2
different ways of doing this. Perhaps you could look over my patch and let
me know the parts which you disagree with, then we can resolve these
together and come up with the best solution for each of us.
It may also be useful for you to glance at how Postgres-XL handles this
partial aggregation problem, as it, where possible, will partially
aggregate the results on each node, pass the partially aggregates state to
the master node to have it perform the final aggregate stage on each of the
individual aggregate states from each node. Note that this requires giving
the aggregates with internal aggregate states an SQL level type and it also
means implementing an input and output function for these types. I've
noticed that XL mostly handles this by making the output function build a
string something along the lines of <count>:<sum> for aggregates such as
AVG(). I believe you'll need something very similar to this to pass the
partial states between worker and master process.
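For illustration, a toy Python model of that textual state format (the exact
separator and layout Postgres-XL uses may differ; this is just the shape of
the idea): the worker serialises its partial avg() state, and the master
parses and combines the received states.

```python
# Hedged sketch of a "<count>:<sum>" textual partial state for avg().
def avg_state_out(count, total):
    """Output function: serialise the internal state as text."""
    return f"{count}:{total}"

def avg_state_in(text):
    """Input function: parse the textual state back into (count, sum)."""
    count, total = text.split(":")
    return int(count), float(total)

# Each worker ships its partial state to the master as text.
shipped = [avg_state_out(2, 30.0), avg_state_out(3, 45.0)]

count = total = 0
for text in shipped:
    c, s = avg_state_in(text)  # deserialise
    count += c                 # combine partial states
    total += s

print(total / count)           # final function applied on the master
```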
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Oct 12, 2015 at 2:25 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Hi,

I've never previously proposed any implementation for parallel
aggregation, but I have previously proposed infrastructure to allow
aggregation to happen in multiple steps. Your plan sounds very
different from what I've proposed.

I attempted to convey my idea on this to the community here
/messages/by-id/CAKJS1f-TmWi-4c5K6CBLRdTfGsVxOJhadefzjE7SWuVBgMSkXA@mail.gmail.com
which Simon and I proposed an actual proof of concept patch here
https://commitfest.postgresql.org/5/131/
My plan is also to use the combine_aggregate_state_v2.patch, or
something similar to what you have proposed, to merge the partial
aggregate results and combine them in the backend process. As a POC
patch, I just want to limit this functionality to aggregates that have
variable-length data types as their transition and final arguments.
I've since expanded on that work in the form of a WIP patch which
implements GROUP BY before JOIN here
/messages/by-id/CAKJS1f9kw95K2pnCKAoPmNw==7fgjSjC-82cy1RB+-x-Jz0QHA@mail.gmail.com

It's pretty evident that we both need to align the way we plan to handle
this multiple-step aggregation; there's no sense at all in having 2
different ways of doing this. Perhaps you could look over my patch and let
me know the parts which you disagree with, then we can resolve these
together and come up with the best solution for each of us.
Thanks for the details. I will go through it. At first glance, this
patch is an enhancement of combine_aggregate_state_v2.patch.
It may also be useful for you to glance at how Postgres-XL handles this
partial aggregation problem, as it, where possible, will partially aggregate
the results on each node, pass the partially aggregates state to the master
node to have it perform the final aggregate stage on each of the individual
aggregate states from each node. Note that this requires giving the
aggregates with internal aggregate states an SQL level type and it also
means implementing an input and output function for these types. I've
noticed that XL mostly handles this by making the output function build a
string something along the lines of <count>:<sum> for aggregates such as
AVG(). I believe you'll need something very similar to this to pass the
partial states between worker and master process.
Yes, we may need something like this, or else support for passing
internal data types between the worker and backend processes, in order
to support all aggregate functions.
Regards,
Hari Babu
Fujitsu Australia
On Sun, Oct 11, 2015 at 10:07 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
- check whether the below plan node is Gather + partial seq scan only.
This is because to check whether the plan nodes that are present are
aware of parallelism or not?
This is really not the right way of doing this. We should do
something more general. Most likely, parallel aggregate should wait
for Tom's work refactoring the upper planner to use paths. But either
way, it's not a good idea to limit ourselves to parallel aggregation
only in the case where there is exactly one base table.
One of the things I want to do pretty early on, perhaps in time for
9.6, is create a general notion of partial paths. A Partial Seq Scan
node creates a partial path. A Gather node turns a partial path into
a complete path. A join between a partial path and a complete path
creates a new partial path. This concept lets us consider,
essentially, pushing joins below Gather nodes. That's quite powerful
and could make Partial Seq Scan applicable to a much broader variety
of use cases. If there are worthwhile partial paths for the final
joinrel, and aggregation of that joinrel is needed, we can consider
parallel aggregation using that partial path as an alternative to
sticking a Gather node on there and then aggregating.
- Set single_copy mode to true when the node below the Gather is a
parallel aggregate.
That sounds wrong. Single-copy mode is for when we need to be certain
of running exactly one copy of the plan. If you're trying to have
several workers aggregate in parallel, that's exactly what you don't
want.
Also, I think the path for parallel aggregation should probably be
something like FinalizeAgg -> Gather -> PartialAgg -> some partial
path here. I'm not clear whether that is what you are thinking or
not.
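A toy model of that plan shape, with Python standing in for the executor and
function names chosen only to mirror the suggested node names (not any real
API): each worker runs the partial aggregate, Gather collects the partial
states, and the finalize stage combines them and applies the final function.

```python
# FinalizeAgg -> Gather -> PartialAgg, modelled for avg() with a
# (count, sum) transition state.
def partial_agg(rows):
    """PartialAgg: run the transition function over one worker's rows."""
    state = (0, 0)
    for value in rows:
        state = (state[0] + 1, state[1] + value)
    return state

def gather(states):
    """Gather: simply collects the partial states from all workers."""
    return list(states)

def finalize_agg(states):
    """FinalizeAgg: combine partial states, then apply the final function."""
    count = sum(c for c, _ in states)
    total = sum(s for _, s in states)
    return total / count

chunks = [[1, 2, 3], [4, 5], [6]]  # rows split across three workers
print(finalize_agg(gather(partial_agg(c) for c in chunks)))
```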
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Oct 13, 2015 at 12:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
This is really not the right way of doing this. We should do
something more general. Most likely, parallel aggregate should wait
for Tom's work refactoring the upper planner to use paths. But either
way, it's not a good idea to limit ourselves to parallel aggregation
only in the case where there is exactly one base table.
Ok. Thanks for the details.
- Set the single_copy mode as true, in case if the below node of
Gather is a parallel aggregate.

That sounds wrong. Single-copy mode is for when we need to be certain
of running exactly one copy of the plan. If you're trying to have
several workers aggregate in parallel, that's exactly what you don't
want.
My intent in setting the flag was to prevent the backend itself from
executing the child plan.
Also, I think the path for parallel aggregation should probably be
something like FinalizeAgg -> Gather -> PartialAgg -> some partial
path here. I'm not clear whether that is what you are thinking or
not.
No, I am thinking of the following way:

Gather -> PartialAgg -> some partial path

I want the Gather node to merge the results coming from all workers;
otherwise it may be difficult to merge at the parent of the Gather
node. If the partial group aggregate is under the Gather node and any
two workers return data for the same group key, we need to compare
them and combine them into a single group. At the Gather node, we can
wait until we get slots from all workers. Once all workers have
returned their slots, we can compare and merge the necessary slots and
return the result. Am I missing something?
Regards,
Hari Babu
Fujitsu Australia
On 13 October 2015 at 17:09, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
My assumption is the same as Robert's here.
Unless I've misunderstood, it sounds like you're proposing to add logic
into the Gather node to handle final aggregation? That sounds like a
modularity violation of the whole node concept.

The handling of the final aggregate stage is not all that different from
the initial aggregate stage. The primary difference is just that you're
calling the combine function instead of the transition function, and the
values being aggregated are aggregate states rather than the type of the
values which were initially aggregated. The handling of GROUP BY is all
the same, yet you only apply the HAVING clause during final aggregation.
This is why I ended up implementing this in nodeAgg.c instead of
inventing some new node type that's mostly a copy and paste of
nodeAgg.c [1].
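To make the parallel between the two stages concrete, here is a toy Python
sketch (illustrative only; the actual patch works inside nodeAgg.c, not like
this): one grouped-aggregation routine serves both stages, differing only in
which "advance" function it is given.

```python
def grouped_agg(rows, init, advance):
    """One aggregation routine for both stages; only `advance` differs."""
    groups = {}
    for key, value in rows:
        groups[key] = advance(groups.get(key, init()), value)
    return sorted(groups.items())

# avg(): transition state is (count, sum).
init = lambda: (0, 0)
transfn = lambda state, v: (state[0] + 1, state[1] + v)          # raw values
combinefn = lambda state, s: (state[0] + s[0], state[1] + s[1])  # partial states

# Partial stage: each worker aggregates its own rows with the transition fn.
partial1 = grouped_agg([("a", 1), ("b", 2)], init, transfn)
partial2 = grouped_agg([("a", 3)], init, transfn)

# Final stage: the very same routine, now driven by the combine function.
final_states = grouped_agg(partial1 + partial2, init, combinefn)
print(final_states)
```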
If you're performing a hash aggregate you need to wait until all the
partially aggregated groups are received anyway. If you're doing a sort/agg
then you'll need to sort again after the Gather node.
[1]: /messages/by-id/CAKJS1f9kw95K2pnCKAoPmNw==7fgjSjC-82cy1RB+-x-Jz0QHA@mail.gmail.com
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 13 October 2015 at 02:14, Robert Haas <robertmhaas@gmail.com> wrote:
This is really not the right way of doing this. We should do
something more general. Most likely, parallel aggregate should wait
for Tom's work refactoring the upper planner to use paths. But either
way, it's not a good idea to limit ourselves to parallel aggregation
only in the case where there is exactly one base table.
What we discussed at PgCon was this rough flow of work:
* Pathify upper planner (Tom) - WIP
* Aggregation push down (David) - prototype
* Parallel aggregates
Parallel infrastructure is also required for aggregation, though that
dependency looks further ahead than the above at present.
Parallel aggregates do look like they can make it into 9.6, but there's not
much slack left in the critical path.
One of the things I want to do pretty early on, perhaps in time for
9.6, is create a general notion of partial paths.
Some form of partial plan makes sense. A better word might be "strand".
Also, I think the path for parallel aggregation should probably be
something like FinalizeAgg -> Gather -> PartialAgg -> some partial
path here. I'm not clear whether that is what you are thinking or
not.
Yes, but not sure of names.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Oct 13, 2015 at 5:53 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
My assumption is the same as Robert's here.

Unless I've misunderstood, it sounds like you're proposing to add logic
into the Gather node to handle final aggregation? That sounds like a
modularity violation of the whole node concept.

The handling of the final aggregate stage is not all that different from
the initial aggregate stage. The primary difference is just that you're
calling the combine function instead of the transition function, and the
values
Yes, you are correct. Until now I was thinking of using transition
types as the approach, and that is the only reason I proposed having
the Gather node handle the finalize aggregation.
being aggregated are aggregate states rather than the type of the
values which were initially aggregated. The handling of GROUP BY is
all the same, yet you only apply the HAVING clause during final
aggregation. This is why I ended up implementing this in nodeAgg.c
instead of inventing some new node type that's mostly a copy and paste
of nodeAgg.c [1]
After going through your Partial Aggregation / GROUP BY before JOIN
patch, the following is my understanding of parallel aggregate:

Finalize [hash] aggregate
  -> Gather
       -> Partial [hash] aggregate

The data that comes from the Gather node contains the group key and the
partial grouping results. Based on these, in the hash aggregate case
the finalize aggregate can build another hash table and return the
final results. This approach works for both plain and hash aggregates.
For group aggregate support, the plan should be as follows:

Finalize Group aggregate
  -> Sort
       -> Gather
            -> Partial group aggregate
                 -> Sort

The data that comes from the Gather node needs to be sorted again on
the grouping key, then merged to generate the final grouping result.
With this approach, we don't need to change anything in the Gather
node. Is my understanding correct?
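A toy Python model of that group-aggregate flow (names are illustrative, not
from any patch): each worker emits its partial groups in sorted order, Gather
interleaves the workers arbitrarily, and a re-sort above Gather lets the
finalize stage merge adjacent equal keys.

```python
from itertools import groupby

# Each worker's output: (group_key, partial_sum) pairs, sorted by key.
worker_outputs = [
    [("a", 10), ("c", 4)],  # worker 1
    [("a", 5), ("b", 7)],   # worker 2
]

# Gather: rows from different workers arrive interleaved, losing the order.
gathered = [row for out in worker_outputs for row in out]

# Sort above Gather: restore grouping-key order across all workers.
gathered.sort(key=lambda kv: kv[0])

# Finalize Group aggregate: adjacent rows with equal keys are combined.
final = [(key, sum(v for _, v in grp))
         for key, grp in groupby(gathered, key=lambda kv: kv[0])]
print(final)
```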
Regards,
Hari Babu
Fujitsu Australia
On 13 October 2015 at 20:57, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
With this approach, we don't need to change anything in the Gather
node. Is my understanding correct?
Our understandings are aligned.
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 20 October 2015 at 23:23, David Rowley <david.rowley@2ndquadrant.com>
wrote:
Hi,
I just wanted to cross post here to mark that I've posted an updated patch
for combining aggregate states:
/messages/by-id/CAKJS1f9wfPKSYt8CG=T271xbyMZjRzWQBjEixiqRF-oLH_u-Zw@mail.gmail.com
I also wanted to check if you've managed to make any progress on Parallel
Aggregation? I'm very interested in this myself and would like to progress
with it, if you're not already doing so.
My current thinking is that most of the remaining changes required for
parallel aggregation, after applying the combine aggregate state patch,
will be in the exact area that Tom will be making changes for the upper
planner path-ification work. I'm not all that certain if we should hold off
for that or not.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 3, 2015 at 4:18 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
I just wanted to cross post here to mark that I've posted an updated patch
for combining aggregate states:
/messages/by-id/CAKJS1f9wfPKSYt8CG=T271xbyMZjRzWQBjEixiqRF-oLH_u-Zw@mail.gmail.com

I also wanted to check if you've managed to make any progress on Parallel
Aggregation? I'm very interested in this myself and would like to progress
with it, if you're not already doing so.
Yes, the parallel aggregate basic patch is almost ready.
This patch is based on your earlier combine state patch.
I will post it to the community within a week or so.
Regards,
Hari Babu
Fujitsu Australia
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 3 December 2015 at 19:24, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Thu, Dec 3, 2015 at 4:18 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Hi,
I just wanted to cross post here to mark that I've posted an updated
patch
for combining aggregate states:
/messages/by-id/CAKJS1f9wfPKSYt8CG=T271xbyMZjRzWQBjEixiqRF-oLH_u-Zw@mail.gmail.com
I also wanted to check if you've managed to make any progress on Parallel
Aggregation? I'm very interested in this myself and would like to progress
with it, if you're not already doing so.
Yes, the parallel aggregate basic patch is almost ready.
This patch is based on your earlier combine state patch.
I will post it to community with in a week or so.
That's great news!
Also note that there's some bug fixes in the patch I just posted on the
other thread for combining aggregate states:
For example: values[Anum_pg_aggregate_aggcombinefn - 1] =
ObjectIdGetDatum(combinefn);
was missing from AggregateCreate().
It might be worth diffing to the updated patch just to pull in anything
else that's changed.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 3, 2015 at 6:06 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 3 December 2015 at 19:24, Haribabu Kommi <kommi.haribabu@gmail.com>
wrote:
On Thu, Dec 3, 2015 at 4:18 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
Hi,
I just wanted to cross post here to mark that I've posted an updated
patch
for combining aggregate states:
/messages/by-id/CAKJS1f9wfPKSYt8CG=T271xbyMZjRzWQBjEixiqRF-oLH_u-Zw@mail.gmail.com
I also wanted to check if you've managed to make any progress on
Parallel
Aggregation? I'm very interested in this myself and would like to
progress
with it, if you're not already doing so.
Yes, the parallel aggregate basic patch is almost ready.
This patch is based on your earlier combine state patch.
I will post it to the community within a week or so.
That's great news!
Also note that there's some bug fixes in the patch I just posted on the
other thread for combining aggregate states:
For example: values[Anum_pg_aggregate_aggcombinefn - 1] =
ObjectIdGetDatum(combinefn);
was missing from AggregateCreate().
It might be worth diffing to the updated patch just to pull in anything else
that's changed.
Here I attached a POC patch of parallel aggregate based on combine
aggregate patch. This patch contains the combine aggregate changes
also. This patch generates and executes the parallel aggregate plan
as discussed in earlier threads.
Changes:
1. The aggregate reference in the Finalize aggregate node gets overwritten
with an OUTER_VAR reference. But we need the aggregate itself to do the
final aggregation there, so currently this is avoided by checking the
combine states.
2. Check whether the aggregate functions present in the targetlist and
qual can be executed in parallel or not. Based on this, the targetlist is
formed to pass to the partial aggregate.
3. Replace the seq scan with a partial aggregate plan as the lefttree and
generate the full parallel aggregate plan.
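To illustrate the targetlist split in change 2 (the table and column names
here are hypothetical): only the bare aggregates are pushed down into the
partial aggregate's targetlist, while any wrapping expressions are evaluated
at the finalize stage on the combined results.

```sql
-- Hypothetical query against t(id1 int, id2 int):
SELECT 3 + sum(id2 - 3), max(id1) FROM t;
-- Partial aggregate targetlist (per worker): the bare aggregates only
--   sum(id2 - 3), max(id1)
-- Finalize aggregate targetlist: combine the partial states, then
-- apply the surrounding expression
--   3 + sum(id2 - 3), max(id1)
```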
Todo:
1. Needs a code cleanup; it is just a prototype.
2. Explain plan with proper instrumentation data.
3. Performance tests to observe the effect of parallel aggregate.
4. Separate out the combine aggregate patch with the additional changes.
Regards,
Hari Babu
Fujitsu Australia
Attachment: parallelagg_poc.patch
On Fri, Dec 11, 2015 at 1:42 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
Here I attached a POC patch of parallel aggregate based on combine
aggregate patch. This patch contains the combine aggregate changes
also. This patch generates and executes the parallel aggregate plan
as discussed in earlier threads.
Pretty cool. I'm pretty sure there's some stuff in this patch that's
not right in detail, but I think this is an awfully exciting
direction.
I'd like to commit David Rowley's patch from the other thread first,
and then deal with this one afterwards. The only thing I feel
strongly needs to be changed in that patch is CFUNC -> COMBINEFUNC,
for clarity.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 12 December 2015 at 04:00, Robert Haas <robertmhaas@gmail.com> wrote:
I'd like to commit David Rowley's patch from the other thread first,
and then deal with this one afterwards. The only thing I feel
strongly needs to be changed in that patch is CFUNC -> COMBINEFUNC,
for clarity.
I have addressed that in my local copy. I'm now just working on adding some
test code which uses the new infrastructure. Perhaps I'll just experiment
with the parallel aggregate stuff instead now.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sat, Dec 12, 2015 at 8:42 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 12 December 2015 at 04:00, Robert Haas <robertmhaas@gmail.com> wrote:
I'd like to commit David Rowley's patch from the other thread first,
and then deal with this one afterwards. The only thing I feel
strongly needs to be changed in that patch is CFUNC -> COMBINEFUNC,
for clarity.
I have addressed that in my local copy. I'm now just working on adding some
test code which uses the new infrastructure. Perhaps I'll just experiment
with the parallel aggregate stuff instead now.
Here I attached a patch with the following changes; I feel it is better to
include them as part of the combine aggregate patch.
1. Added missing outfuncs.c changes for the newly added variables in the
Aggref structure.
2. Kept the aggregate function in the final aggregate stage to do the
final aggregation on the tuples received from all workers.
The patch still needs a fix for the explain plan output issue below.
postgres=# explain analyze verbose select count(*), sum(f1) from tbl
where f1 % 100 = 0 group by f3;
                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=1853.75..1853.76 rows=1 width=12) (actual time=92.428..92.429 rows=1 loops=1)
   Output: pg_catalog.count(*), pg_catalog.sum((sum(f1))), f3
   Group Key: tbl.f3
   ->  Gather  (cost=0.00..1850.00 rows=500 width=12) (actual time=92.408..92.416 rows=3 loops=1)
         Output: f3, (count(*)), (sum(f1))
Regards,
Hari Babu
Fujitsu Australia
Attachment: set_ref_final_agg.patch
On Fri, Dec 11, 2015 at 4:42 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 12 December 2015 at 04:00, Robert Haas <robertmhaas@gmail.com> wrote:
I'd like to commit David Rowley's patch from the other thread first,
and then deal with this one afterwards. The only thing I feel
strongly needs to be changed in that patch is CFUNC -> COMBINEFUNC,
for clarity.
I have addressed that in my local copy. I'm now just working on adding some
test code which uses the new infrastructure.
Excellent.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Dec 11, 2015 at 5:42 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
3. Performance test to observe the effect of parallel aggregate.
Here I attached the performance test report of parallel aggregate.
Summary of the results:
1. Parallel aggregate gives no improvement, or only very little overhead,
compared to parallel scan in the low-selectivity case.
2. Parallel aggregate performs more than 60% better than parallel scan
because of the much lower data transfer overhead: the hash aggregate
operation reduces the number of tuples that need to be transferred from
the workers to the backend.
The parallel aggregate plan depends on an underlying parallel seq scan.
If the parallel seq scan plan is not generated because of the higher
tuple transfer overhead at higher selectivity, then parallel aggregate
is also not possible. But with parallel aggregate, the number of records
that need to be transferred from the workers to the backend may be lower
than with a plain parallel seq scan, so the overall cost of parallel
aggregate may still be better.
To handle this problem, how about the following approach?
Add one more member to RelOptInfo, called cheapest_parallel_path, to
store the parallel path where one is possible. Wherever a parallel plan
is possible, this value is set to that plan; if a parallel plan is not
possible in the parent nodes, it is set to NULL; otherwise the parallel
plan at this node is recalculated based on the parallel plan node below.
Once all the paths are finalized, the grouping planner prepares a plan
for both the normal aggregate and the parallel aggregate, compares the
two costs, and chooses the cheaper plan.
I haven't yet evaluated the feasibility of the above solution. Suggestions?
Regards,
Hari Babu
Fujitsu Australia
On Thu, Dec 10, 2015 at 10:42 PM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
Here I attached a POC patch of parallel aggregate based on combine
aggregate patch. This patch contains the combine aggregate changes
also. This patch generates and executes the parallel aggregate plan
as discussed in earlier threads.
I tried this out using PostGIS with no great success. I used a very
simple aggregate for geometry union, because the combine function is
just the same as the transition function for this case. (I also marked
ST_Area() as parallel safe, so that the planner will attempt to
parallelize the query.)
CREATE AGGREGATE ST_MemUnion (
  basetype = geometry,
  sfunc = ST_Union,
  cfunc = ST_Union,
  stype = geometry
);
Unfortunately attempting a test causes memory corruption and death.
select riding,
st_area(st_memunion(geom))
from vada group by riding;
The explain looks OK:
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=220629.47..240380.26 rows=79 width=1189)
   Group Key: riding
   ->  Gather  (cost=0.00..807.49 rows=8792 width=1189)
         Number of Workers: 1
         ->  Partial HashAggregate  (cost=220628.59..220629.38 rows=79 width=1189)
               Group Key: riding
               ->  Parallel Seq Scan on vada  (cost=0.00..806.61 rows=8792 width=1189)
But the run dies.
NOTICE: SRID value -32897 converted to the officially unknown SRID value 0
ERROR: Unknown geometry type: 2139062143 - Invalid type
From the message it looks like geometry gets corrupted at some point,
causing a read to fail on very screwed up metadata.
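(Worth noting why cfunc = sfunc happens to be valid for ST_MemUnion: the
state type equals the input type, so unioning two partial states is the same
operation as unioning a state with an input row. That is not generally true.
A minimal hypothetical sketch, using the patch's cfunc syntax, of a
count-style aggregate where the two functions must differ:)

```sql
-- Hypothetical sketch: the transition function adds one per input row
-- (int8inc), but combining two partial counts must add the two partial
-- states together (int8pl), so cfunc cannot reuse sfunc here.
CREATE AGGREGATE my_count (*) (
  sfunc = int8inc,
  stype = int8,
  cfunc = int8pl,
  initcond = '0'
);
```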
P.
On Tue, Dec 15, 2015 at 8:04 AM, Paul Ramsey <pramsey@cleverelephant.ca> wrote:
But the run dies.
NOTICE: SRID value -32897 converted to the officially unknown SRID value 0
ERROR: Unknown geometry type: 2139062143 - Invalid type
From the message it looks like geometry gets corrupted at some point,
causing a read to fail on very screwed up metadata.
Thanks for the test. There was a problem in advance_combination_function
in handling pass-by-reference data. Here I attached an updated patch with
the fix.
Regards,
Hari Babu
Fujitsu Australia