MinMaxAggPath vs. parallel-safety

Started by Robert Haasalmost 10 years ago3 messageshackers
Jump to latest
#1Robert Haas
robertmhaas@gmail.com

On Sun, Jun 26, 2016 at 10:35 PM, Noah Misch <noah@leadboat.com> wrote:

I looked into this and found that the costs are considered fuzzily the
same, and then add_path prefers the slightly-worse path on the grounds
that it is marked parallel_safe while the MinMaxAgg path is not. It seems
to me that there is some fuzzy thinking going on there. On exactly what
grounds is a path to be preferred merely because it is parallel safe, and
not actually parallelized? Or perhaps the question to ask is whether a
MinMaxAgg path can be marked parallel-safe.

[Action required within 72 hours. This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item ("consider
whether MinMaxAggPath might fail to be parallel-safe"). Robert, since you
committed the patch believed to have created it, you own this open item. If
some other commit is more relevant or if this does not belong as a 9.6 open
item, please let us know. Otherwise, please observe the policy on open item
ownership[1] and send a status update within 72 hours of this message.
Include a date for your subsequent status update. Testers may discover new
open items at any time, and I want to plan to get them all fixed well in
advance of shipping 9.6rc1. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

It turns out that this open item is phased incorrectly. I'll update
the phrasing.

/* A MinMaxAggPath implies use of subplans, so cannot be
parallel-safe */
pathnode->path.parallel_safe = false;

Currently, MinMaxAggPath is never parallel-safe; the question is
whether we could allow it to be parallel-safe (not, as the current
phrasing implies, whether it might ever need to be other than
parallel-safe). It appears to me that the answer is "no", because a
MinMaxAggPath contains a list of MinMaxAggInfo objects, and there we
have this:

Param *param; /* param for subplan's output */

Since subplans aren't passed down to parallel workers, no
MinMaxAggPath can be parallel-safe. Therefore, I think there's
nothing to do here right now. Comments?

See also /messages/by-id/CA+TgmoZ7wvMPMTSNtk+dfDUNWmc8kK5pUtLDnzOLvJ9DVeAF_A@mail.gmail.com

(Official status update: I'll remove this open item in 3 days unless
the above analysis is shown to be incorrect.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#1)
Re: MinMaxAggPath vs. parallel-safety

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, Jun 26, 2016 at 10:35 PM, Noah Misch <noah@leadboat.com> wrote:

The above-described topic is currently a PostgreSQL 9.6 open item ("consider
whether MinMaxAggPath might fail to be parallel-safe").

Currently, MinMaxAggPath is never parallel-safe; the question is
whether we could allow it to be parallel-safe (not, as the current
phrasing implies, whether it might ever need to be other than
parallel-safe).

Check.

It appears to me that the answer is "no", because a
MinMaxAggPath contains a list of MinMaxAggInfo objects, and there we
have this:
Param *param; /* param for subplan's output */
Since subplans aren't passed down to parallel workers, no
MinMaxAggPath can be parallel-safe. Therefore, I think there's
nothing to do here right now. Comments?

Hm. In principle, this could be made to work, since I don't think it
would be necessary for the Param's value to pass across process
boundaries. (It could be locally generated within a worker, and then also
consumed within the worker, if the worker's plan looked like a Result with
a subplan attached.) However, if we don't even pass down the plan trees
for subplans, then I agree that it can't work at the moment.

In any case, this is an optimization opportunity not a bug. If you want
to kick this can down the road until parallel query is generally smarter
about subplans, that's OK with me.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: MinMaxAggPath vs. parallel-safety

On Mon, Jun 27, 2016 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, Jun 26, 2016 at 10:35 PM, Noah Misch <noah@leadboat.com> wrote:

The above-described topic is currently a PostgreSQL 9.6 open item ("consider
whether MinMaxAggPath might fail to be parallel-safe").

Currently, MinMaxAggPath is never parallel-safe; the question is
whether we could allow it to be parallel-safe (not, as the current
phrasing implies, whether it might ever need to be other than
parallel-safe).

Check.

It appears to me that the answer is "no", because a
MinMaxAggPath contains a list of MinMaxAggInfo objects, and there we
have this:
Param *param; /* param for subplan's output */
Since subplans aren't passed down to parallel workers, no
MinMaxAggPath can be parallel-safe. Therefore, I think there's
nothing to do here right now. Comments?

Hm. In principle, this could be made to work, since I don't think it
would be necessary for the Param's value to pass across process
boundaries. (It could be locally generated within a worker, and then also
consumed within the worker, if the worker's plan looked like a Result with
a subplan attached.) However, if we don't even pass down the plan trees
for subplans, then I agree that it can't work at the moment.

We don't. See ExecSerializePlan().

In any case, this is an optimization opportunity not a bug. If you want
to kick this can down the road until parallel query is generally smarter
about subplans, that's OK with me.

I don't really see another option. I don't think it would be a lot of
work to pass subplans to workers along with the main plan, but finding
all of the places that can then benefit as a result of that change and
figuring out which cases are allowable sounds to me like development
work, not stabilization.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers