to-do item for explain analyze of hash aggregates?

Started by Jeff Janes, over 8 years ago, 7 messages
#1 Jeff Janes
jeff.janes@gmail.com

The explain analyze of the hash step of a hash join reports something like
this:

  -> Hash (cost=458287.68..458287.68 rows=24995368 width=37) (actual rows=24995353 loops=1)
        Buckets: 33554432 Batches: 1 Memory Usage: 2019630kB

Should the HashAggregate node also report on Buckets and Memory Usage? I
would have found that useful several times. Is there some reason this is
not wanted, or not possible?
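
For illustration, the extra line I have in mind would look something like
this (mocked up by hand, not output from any real build):

  HashAggregate  (cost=... rows=... width=...) (actual rows=... loops=1)
        Buckets: ...  Memory Usage: ...kB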

Cheers,

Jeff

#2 Andres Freund
andres@anarazel.de
In reply to: Jeff Janes (#1)
Re: to-do item for explain analyze of hash aggregates?

On 2017-04-24 11:42:12 -0700, Jeff Janes wrote:
> The explain analyze of the hash step of a hash join reports something like
> this:
>
>   -> Hash (cost=458287.68..458287.68 rows=24995368 width=37) (actual rows=24995353 loops=1)
>         Buckets: 33554432 Batches: 1 Memory Usage: 2019630kB
>
> Should the HashAggregate node also report on Buckets and Memory Usage? I
> would have found that useful several times. Is there some reason this is
> not wanted, or not possible?

I've wanted that too. It's not impossible at all.

- Andres


#3 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andres Freund (#2)
Re: to-do item for explain analyze of hash aggregates?

On 04/24/2017 08:52 PM, Andres Freund wrote:
> On 2017-04-24 11:42:12 -0700, Jeff Janes wrote:
>> The explain analyze of the hash step of a hash join reports something like
>> this:
>>
>>   -> Hash (cost=458287.68..458287.68 rows=24995368 width=37) (actual rows=24995353 loops=1)
>>         Buckets: 33554432 Batches: 1 Memory Usage: 2019630kB
>>
>> Should the HashAggregate node also report on Buckets and Memory Usage? I
>> would have found that useful several times. Is there some reason this is
>> not wanted, or not possible?
>
> I've wanted that too. It's not impossible at all.

Why wouldn't that be possible? We probably can't use exactly the same
approach as Hash, because hash joins use a custom hash table while hashagg
uses dynahash IIRC. But why couldn't we measure the amount of memory by
looking at the memory context, for example?
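
For example, an untested sketch of what I mean, sitting next to
show_hash_info() in explain.c; the "hashcontext" field and the helper that
sums up a context's allocations are assumptions here, not necessarily the
real names:

/*
 * Untested sketch: report hash aggregate memory in EXPLAIN ANALYZE by
 * asking the memory context behind the hash table how much it has
 * allocated.  Both "hashcontext" and MemoryContextMemAllocated() are
 * assumptions -- if no such helper exists we'd have to add one that
 * walks the context tree, MemoryContextStats-style.
 */
static void
show_hashagg_memory(AggState *aggstate, ExplainState *es)
{
    MemoryContext hashcxt = aggstate->hashcontext->ecxt_per_tuple_memory;
    Size          mem = MemoryContextMemAllocated(hashcxt, true);

    if (es->analyze && es->format == EXPLAIN_FORMAT_TEXT)
    {
        appendStringInfoSpaces(es->str, es->indent * 2);
        appendStringInfo(es->str, "Memory Usage: %ldkB\n",
                         (long) ((mem + 1023) / 1024));
    }
}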

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#4 Jeff Janes
jeff.janes@gmail.com
In reply to: Tomas Vondra (#3)
Re: to-do item for explain analyze of hash aggregates?

On Mon, Apr 24, 2017 at 12:13 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

> On 04/24/2017 08:52 PM, Andres Freund wrote:
>> On 2017-04-24 11:42:12 -0700, Jeff Janes wrote:
>>> The explain analyze of the hash step of a hash join reports something
>>> like this:
>>>
>>>   -> Hash (cost=458287.68..458287.68 rows=24995368 width=37) (actual rows=24995353 loops=1)
>>>         Buckets: 33554432 Batches: 1 Memory Usage: 2019630kB
>>>
>>> Should the HashAggregate node also report on Buckets and Memory Usage? I
>>> would have found that useful several times. Is there some reason this is
>>> not wanted, or not possible?
>>
>> I've wanted that too. It's not impossible at all.
>
> Why wouldn't that be possible? We probably can't use exactly the same
> approach as Hash, because hash joins use a custom hash table while hashagg
> uses dynahash IIRC. But why couldn't we measure the amount of memory by
> looking at the memory context, for example?

He said "not impossible", meaning it is possible.

I've added it to the wiki Todo page. (Hopefully that has not doomed it to
be forgotten about)

Cheers,

Jeff

#5 Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Jeff Janes (#4)
Re: to-do item for explain analyze of hash aggregates?

On 04/24/2017 10:55 PM, Jeff Janes wrote:
> On Mon, Apr 24, 2017 at 12:13 PM, Tomas Vondra
> <tomas.vondra@2ndquadrant.com> wrote:
>
>> On 04/24/2017 08:52 PM, Andres Freund wrote:
>>
>>> ...
>>>
>>> I've wanted that too. It's not impossible at all.
>>
>> Why wouldn't that be possible? We probably can't use exactly the
>> same approach as Hash, because hash joins use a custom hash table while
>> hashagg uses dynahash IIRC. But why couldn't we measure the amount of
>> memory by looking at the memory context, for example?
>
> He said "not impossible", meaning it is possible.

Ah, the dreaded double negative ...

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


#6 Andres Freund
andres@anarazel.de
In reply to: Jeff Janes (#4)
Re: to-do item for explain analyze of hash aggregates?

On 2017-04-24 13:55:57 -0700, Jeff Janes wrote:

> I've added it to the wiki Todo page. (Hopefully that has not doomed it to
> be forgotten about)

The easiest way to avoid that fate is to implement it yourself ;)


#7 Andres Freund
andres@anarazel.de
In reply to: Tomas Vondra (#3)
Re: to-do item for explain analyze of hash aggregates?

On 2017-04-24 21:13:16 +0200, Tomas Vondra wrote:
> On 04/24/2017 08:52 PM, Andres Freund wrote:
>> On 2017-04-24 11:42:12 -0700, Jeff Janes wrote:
>>> The explain analyze of the hash step of a hash join reports something like
>>> this:
>>>
>>>   -> Hash (cost=458287.68..458287.68 rows=24995368 width=37) (actual rows=24995353 loops=1)
>>>         Buckets: 33554432 Batches: 1 Memory Usage: 2019630kB
>>>
>>> Should the HashAggregate node also report on Buckets and Memory Usage? I
>>> would have found that useful several times. Is there some reason this is
>>> not wanted, or not possible?
>>
>> I've wanted that too. It's not impossible at all.
>
> Why wouldn't that be possible? We probably can't use exactly the same
> approach as Hash, because hash joins use a custom hash table while hashagg
> uses dynahash IIRC. But why couldn't we measure the amount of memory by
> looking at the memory context, for example?

It doesn't use dynahash anymore (it's a simplehash.h-style table these days),
but that should actually make it simpler, not harder.
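
Roughly, the numbers we'd want are plain struct fields on the simplehash
table, so the EXPLAIN side could be about as small as the following untested
sketch (the path from AggState down to the tuplehash table is from memory
and may well be off; double-check against nodeAgg.c / execGrouping.c):

/*
 * Untested sketch: pull bucket and entry counts out of the simplehash
 * table behind the grouping hash table.  Ignores grouping sets, which
 * keep one hash table per hashed set (aggstate->num_hashes of them).
 */
tuplehash_hash *hashtab = aggstate->perhash[0].hashtable->hashtab;

if (es->format == EXPLAIN_FORMAT_TEXT)
{
    appendStringInfoSpaces(es->str, es->indent * 2);
    appendStringInfo(es->str,
                     "Buckets: " UINT64_FORMAT "  Entries: %u\n",
                     hashtab->size, hashtab->members);
}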

- Andres
