cost_agg() with AGG_HASHED does not account for startup costs

Started by David Rowleyover 10 years ago3 messageshackers
Jump to latest
#1David Rowley
dgrowleyml@gmail.com

During working on allowing the planner to perform GROUP BY before joining
I've noticed that cost_agg() completely ignores input_startup_cost
when aggstrategy == AGG_HASHED.

I can see at least 3 call to cost_agg() which pass aggstrategy as
AGG_HASHED that are also passing a possible non-zero startup cost.

The attached changes cost_agg() to include the startup cost, but does
nothing for the changed plans in the regression tests.

Is this really intended?

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

cost_agg_startup_cost.difftext/plain; charset=US-ASCII; name=cost_agg_startup_cost.diffDownload+2-1
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#1)
Re: cost_agg() with AGG_HASHED does not account for startup costs

David Rowley <david.rowley@2ndquadrant.com> writes:

During working on allowing the planner to perform GROUP BY before joining
I've noticed that cost_agg() completely ignores input_startup_cost
when aggstrategy == AGG_HASHED.

Isn't your proposed patch double-counting the input startup cost?
input_total_cost already includes that charge. The calculation
reflects the fact that we have to read all of the input before we
can deliver any aggregated results, so the time to get the first
input row isn't really interesting.

If this were wrong, the PLAIN costing path would also be wrong, but
I don't think that either one is.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3David Rowley
dgrowleyml@gmail.com
In reply to: Tom Lane (#2)
Re: cost_agg() with AGG_HASHED does not account for startup costs

On 5 August 2015 at 01:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

During working on allowing the planner to perform GROUP BY before joining
I've noticed that cost_agg() completely ignores input_startup_cost
when aggstrategy == AGG_HASHED.

Isn't your proposed patch double-counting the input startup cost?
input_total_cost already includes that charge. The calculation
reflects the fact that we have to read all of the input before we
can deliver any aggregated results, so the time to get the first
input row isn't really interesting.

If this were wrong, the PLAIN costing path would also be wrong, but
I don't think that either one is.

Sorry, false alarm, you're right.

This was a bug in my code where I was adding disable_cost to the
startup_cost, but not to the total_cost.

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Training & Services