Duplicate Workers entries in some EXPLAIN plans

Started by Maciek Sakrejda, over 6 years ago, 5 messages, bugs
#1 Maciek Sakrejda
maciek@pganalyze.com

Hello,

I ran into an odd behavior with some EXPLAIN results in Postgres 11.5. I
noticed this with JSON format first, but similar issues exist with the
other formats as well for this query. I think I can follow up with the
query and full plan if needed, but essentially, the issue is that the Sort
node has two different entries for the "Workers" key (something that
technically JSON does allow, but such JSON structures are very difficult to
work with, and JSON library support for them is poor). The node looks like
this (some details elided):

{
  "Node Type": "Sort",
  ...
  "Workers": [
    {
      "Worker Number": 0,
      "Sort Method": "external merge",
      "Sort Space Used": 20128,
      "Sort Space Type": "Disk"
    },
    {
      "Worker Number": 1,
      "Sort Method": "external merge",
      "Sort Space Used": 20128,
      "Sort Space Type": "Disk"
    }
  ],
  ...
  "Workers": [
    {
      "Worker Number": 0,
      "Actual Startup Time": 309.726,
      "Actual Total Time": 310.179,
      "Actual Rows": 4128,
      "Actual Loops": 1,
      "Shared Hit Blocks": 2872,
      "Shared Read Blocks": 7584,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 490,
      "Temp Written Blocks": 2529
    },
    {
      "Worker Number": 1,
      "Actual Startup Time": 306.523,
      "Actual Total Time": 307.001,
      "Actual Rows": 4128,
      "Actual Loops": 1,
      "Shared Hit Blocks": 3356,
      "Shared Read Blocks": 7100,
      "Shared Dirtied Blocks": 0,
      "Shared Written Blocks": 0,
      "Local Hit Blocks": 0,
      "Local Read Blocks": 0,
      "Local Dirtied Blocks": 0,
      "Local Written Blocks": 0,
      "Temp Read Blocks": 490,
      "Temp Written Blocks": 2529
    }
  ],
  "Plans": ...
}
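
To make the parsing problem concrete, here is a minimal Python sketch of what a typical parser does with such a node. The `plan` string is a trimmed-down stand-in for the Sort node above (not the full plan), and `record_duplicates` is a hypothetical helper name. By default `json.loads` keeps only the last value for a repeated key; passing an `object_pairs_hook` is one way to at least detect the repetition.

```python
import json

# Trimmed-down stand-in for the Sort node in the report (not the full plan).
plan = '''
{
  "Node Type": "Sort",
  "Workers": [{"Worker Number": 0, "Sort Method": "external merge"}],
  "Workers": [{"Worker Number": 0, "Actual Rows": 4128}]
}
'''

duplicates = []

def record_duplicates(pairs):
    """object_pairs_hook: build the dict, noting any key seen more than once."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            duplicates.append(key)
        obj[key] = value
    return obj

# Default behaviour: the later "Workers" array silently replaces the earlier one.
node = json.loads(plan)
print(node["Workers"])

# With the hook, the repeated key is at least reported.
json.loads(plan, object_pairs_hook=record_duplicates)
print(duplicates)
```

The sort details from the first "Workers" array are lost in the default parse, which is why this output is awkward for tools that consume EXPLAIN JSON.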

YAML and XML formats both have parallel issues. TEXT format is a little
different but also seems odd, with multiple lines in the plan node for each
worker:

  Sort Method: external merge Disk: 4920kB
  Worker 0: Sort Method: external merge Disk: 5880kB
  Worker 1: Sort Method: external merge Disk: 5920kB
  Buffers: shared hit=682 read=10188, temp read=1415 written=2101
  Worker 0: actual time=130.058..130.324 rows=1324 loops=1
    Buffers: shared hit=337 read=3489, temp read=505 written=739
  Worker 1: actual time=130.273..130.512 rows=1297 loops=1
    Buffers: shared hit=345 read=3507, temp read=505 written=744

Is this a bug?

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Maciek Sakrejda (#1)
Re: Duplicate Workers entries in some EXPLAIN plans

Maciek Sakrejda <maciek@pganalyze.com> writes:

> I ran into an odd behavior with some EXPLAIN results in Postgres 11.5. I
> noticed this with JSON format first, but similar issues exist with the
> other formats as well for this query. I think I can follow up with the
> query and full plan if needed, but essentially, the issue is that the Sort
> node has two different entries for the "Workers" key (something that
> technically JSON does allow, but such JSON structures are very difficult to
> work with, and JSON library support for them is poor).

Yeah, this was already complained of here:

/messages/by-id/41ee53a5-a36e-cc8f-1bee-63f6565bb1ee@dalibo.com

I think the text-mode output is intentional, but the other formats
need more work. We also need to think about whether we can change
this without big backwards-compatibility problems.

regards, tom lane

#3 Maciek Sakrejda
maciek@pganalyze.com
In reply to: Tom Lane (#2)
Re: Duplicate Workers entries in some EXPLAIN plans

Thanks, I searched for previous reports of this, but I did not see that
one. In that thread, Andrew Dunstan suggested:

> Maybe a simpler fix would be to rename one set of nodes to "Sort-Workers"
> or some such.

Is that feasible? Maybe as "Workers (Sort)"?

> We also need to think about whether we can change
> this without big backwards-compatibility problems.

As in, due to users relying on this idiosyncratic output and working around
parsing issues (ruby, python, and node's built-in parsers all seem to just
keep the last entry when keys repeat by default), or because merging the
nodes would introduce new entries in the Workers nodes that users may not
expect?
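
For readers who need to consume the current output in the meantime, the "merging" alternative can be approximated on the client side. The following is only a sketch under stated assumptions: `merge_workers` is a hypothetical helper, the sample `plan` is abbreviated from the node in the first message, and it assumes every worker entry carries a "Worker Number" field. It does not reflect any behavior agreed on in this thread.

```python
import json

def merge_workers(pairs):
    """object_pairs_hook that merges repeated "Workers" arrays by "Worker Number"."""
    obj = {}
    for key, value in pairs:
        if key == "Workers" and key in obj:
            # Index the entries seen so far, then fold the new fields into them.
            by_number = {w["Worker Number"]: dict(w) for w in obj[key]}
            for worker in value:
                by_number.setdefault(worker["Worker Number"], {}).update(worker)
            obj[key] = [by_number[n] for n in sorted(by_number)]
        else:
            obj[key] = value
    return obj

# Abbreviated from the Sort node in the first message.
plan = '''
{
  "Node Type": "Sort",
  "Workers": [
    {"Worker Number": 0, "Sort Method": "external merge", "Sort Space Used": 20128},
    {"Worker Number": 1, "Sort Method": "external merge", "Sort Space Used": 20128}
  ],
  "Workers": [
    {"Worker Number": 0, "Actual Rows": 4128},
    {"Worker Number": 1, "Actual Rows": 4128}
  ]
}
'''

node = json.loads(plan, object_pairs_hook=merge_workers)
for worker in node["Workers"]:
    print(worker)
```

Each worker then appears once, with both its sort details and its timing or row counts, which is roughly what a merged "Workers" group would look like to a consumer.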

#4 Maciek Sakrejda
maciek@pganalyze.com
In reply to: Maciek Sakrejda (#3)
Re: Duplicate Workers entries in some EXPLAIN plans

Should I move this to a pgsql-hackers discussion? I noticed that jsonb also
appears to keep the last JSON entry in the face of multiple keys, so it'd
be nice to have something more usable. I'm not much of a C programmer, but
I think I see how to rename the second fields to Sort Workers if this
solution is acceptable. Looking at the code in explain.c, there do not
appear to be any other EXPLAIN node fields in a similar situation (I
grepped for ExplainOpenGroup and "Workers" is the only one that occurs
twice).
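
The grep described above can be sketched as a small script. The `source` text below is an illustrative stand-in, not the actual contents of explain.c, and the pattern assumes the group label is the second string-literal argument of `ExplainOpenGroup`; calls that pass a non-literal label would not be counted.

```python
import re
from collections import Counter

# Illustrative stand-in for C source; NOT the actual contents of explain.c.
source = '''
ExplainOpenGroup("Plans", "Plans", false, es);
ExplainOpenGroup("Workers", "Workers", false, es);
ExplainOpenGroup("Triggers", "Triggers", false, es);
ExplainOpenGroup("Workers", "Workers", false, es);
'''

# Capture the second string literal (assumed to be the label) of each call.
pattern = re.compile(r'ExplainOpenGroup\(\s*"[^"]*"\s*,\s*"([^"]*)"')
counts = Counter(pattern.findall(source))

# Labels opened more than once are candidates for duplicate keys in the output.
repeated = sorted(label for label, n in counts.items() if n > 1)
print(repeated)
```

A label appearing in two calls does not by itself prove both groups are emitted for the same node, so any hit still needs a manual look at the surrounding code.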

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Maciek Sakrejda (#4)
Re: Duplicate Workers entries in some EXPLAIN plans

Maciek Sakrejda <maciek@pganalyze.com> writes:

> Should I move this to a pgsql-hackers discussion? I noticed that jsonb also
> appears to keep the last JSON entry in the face of multiple keys, so it'd
> be nice to have something more usable. I'm not much of a C programmer, but
> I think I see how to rename the second fields to Sort Workers if this
> solution is acceptable.

Yeah, the actual code change should be pretty trivial --- the hard part
here is to get consensus on what behavior change we want. It's not
unreasonable to decide that on pgsql-bugs ... but since there hasn't
been much commentary yet, maybe moving to -hackers is what to do
to seek consensus.

regards, tom lane