Suspicious call of initial_cost_hashjoin()

Started by Antonin Houskaover 8 years ago6 messageshackers
Jump to latest
#1Antonin Houska
ah@cybertec.at

try_partial_hashjoin_path() passes constant true to for the parallel_hash
argument of initial_cost_hashjoin(). Shouldn't it instead pass the
parallel_hash argument that it receives?

This is related to commit 1804284042e659e7d16904e7bbb0ad546394b6a3.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at

#2Thomas Munro
thomas.munro@gmail.com
In reply to: Antonin Houska (#1)
Re: Suspicious call of initial_cost_hashjoin()

On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah@cybertec.at> wrote:

try_partial_hashjoin_path() passes constant true to for the parallel_hash
argument of initial_cost_hashjoin(). Shouldn't it instead pass the
parallel_hash argument that it receives?

Thanks. Yeah. When initial_cost_hashjoin() calls
get_parallel_divisor() on a non-partial inner path I think it would
return 1.0, so no damage was done there, but when
ExecChooseHashTableSize() receives try_combined_work_mem == true it
might underestimate the number of batches required for a partial hash
join without parallel hash, because it would incorrectly assume that a
single batch join could use the combined work_mem budget. This was
quite well hidden because ExecHashTableCreate() calls
ExecChooseHashTableSize() again (rather than reusing the results from
planning time), so the bad nbatch estimate doesn't show up anywhere.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

fix.patchapplication/octet-stream; name=fix.patchDownload+1-2
#3David Steele
david@pgmasters.net
In reply to: Thomas Munro (#2)
Re: Re: Suspicious call of initial_cost_hashjoin()

Hi Antonin,

On 12/22/17 6:13 AM, Thomas Munro wrote:

On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah@cybertec.at> wrote:

try_partial_hashjoin_path() passes constant true to for the parallel_hash
argument of initial_cost_hashjoin(). Shouldn't it instead pass the
parallel_hash argument that it receives?

Thanks. Yeah. When initial_cost_hashjoin() calls
get_parallel_divisor() on a non-partial inner path I think it would
return 1.0, so no damage was done there, but when
ExecChooseHashTableSize() receives try_combined_work_mem == true it
might underestimate the number of batches required for a partial hash
join without parallel hash, because it would incorrectly assume that a
single batch join could use the combined work_mem budget. This was
quite well hidden because ExecHashTableCreate() calls
ExecChooseHashTableSize() again (rather than reusing the results from
planning time), so the bad nbatch estimate doesn't show up anywhere.

Does this look right to you? If so, can you sign up as a reviewer and
mark it Ready for Committer?

Thanks,
--
-David
david@pgmasters.net

#4Antonin Houska
ah@cybertec.at
In reply to: David Steele (#3)
Re: Suspicious call of initial_cost_hashjoin()

David Steele <david@pgmasters.net> wrote:

On 12/22/17 6:13 AM, Thomas Munro wrote:

On Fri, Dec 22, 2017 at 10:45 PM, Antonin Houska <ah@cybertec.at> wrote:

try_partial_hashjoin_path() passes constant true to for the parallel_hash
argument of initial_cost_hashjoin(). Shouldn't it instead pass the
parallel_hash argument that it receives?

Thanks. Yeah. When initial_cost_hashjoin() calls
get_parallel_divisor() on a non-partial inner path I think it would
return 1.0, so no damage was done there, but when
ExecChooseHashTableSize() receives try_combined_work_mem == true it
might underestimate the number of batches required for a partial hash
join without parallel hash, because it would incorrectly assume that a
single batch join could use the combined work_mem budget. This was
quite well hidden because ExecHashTableCreate() calls
ExecChooseHashTableSize() again (rather than reusing the results from
planning time), so the bad nbatch estimate doesn't show up anywhere.

Does this look right to you?

Yes, this is what I meant. The patch applies cleanly and the code compiles
well.

If so, can you sign up as a reviewer and mark it Ready for Committer?

Done.

Actually I think it'd be nice if the "parallel_hash" argument was mentioned in
the header comment of initial_cost_hashjoin() function, but not sure this is
worth returning the patch to the author.

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com

#5Thomas Munro
thomas.munro@gmail.com
In reply to: Antonin Houska (#4)
Re: Suspicious call of initial_cost_hashjoin()

On Fri, Mar 2, 2018 at 9:06 PM, Antonin Houska <ah@cybertec.at> wrote:

David Steele <david@pgmasters.net> wrote:

Does this look right to you?

Yes, this is what I meant. The patch applies cleanly and the code compiles
well.

If so, can you sign up as a reviewer and mark it Ready for Committer?

Done.

Thanks.

Actually I think it'd be nice if the "parallel_hash" argument was mentioned in
the header comment of initial_cost_hashjoin() function, but not sure this is
worth returning the patch to the author.

Done.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

0001-Fix-costing-of-parallel-hash-joins-v2.patchapplication/octet-stream; name=0001-Fix-costing-of-parallel-hash-joins-v2.patchDownload+3-2
#6Peter Eisentraut
peter_e@gmx.net
In reply to: Thomas Munro (#5)
Re: Suspicious call of initial_cost_hashjoin()

On 3/2/18 05:01, Thomas Munro wrote:

On Fri, Mar 2, 2018 at 9:06 PM, Antonin Houska <ah@cybertec.at> wrote:

David Steele <david@pgmasters.net> wrote:

Does this look right to you?

Yes, this is what I meant. The patch applies cleanly and the code compiles
well.

If so, can you sign up as a reviewer and mark it Ready for Committer?

Done.

Thanks.

Actually I think it'd be nice if the "parallel_hash" argument was mentioned in
the header comment of initial_cost_hashjoin() function, but not sure this is
worth returning the patch to the author.

Done.

committed

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services