explain output infelicity in psql
I have just noticed while checking the EXPLAIN YAML patch that the
non-text explain formats are output as a single line with embedded line
feeds, while the text format is delivered as a set of text records, one
per line. The practical effect of this is that psql decorates the
non-text format output with continuation characters:
andrew=# explain select count(*) from pg_class where relname ~ 'pg_';
QUERY PLAN
----------------------------------------------------------------
Aggregate (cost=9.67..9.68 rows=1 width=0)
-> Seq Scan on pg_class (cost=0.00..9.16 rows=204 width=0)
Filter: (relname ~ 'pg_'::text)
(3 rows)
Time: 5.813 ms
andrew=# explain (format yaml) select count(*) from pg_class where
relname ~ 'pg_';
QUERY PLAN
-----------------------------------------
- Plan: +
Node Type: Aggregate +
Strategy: Plain +
Startup Cost: 9.67 +
Total Cost: 9.68 +
Plan Rows: 1 +
Plan Width: 0 +
Plans: +
- Node Type: Seq Scan +
Parent Relationship: Outer +
Relation Name: pg_class +
Alias: pg_class +
Startup Cost: 0.00 +
Total Cost: 9.16 +
Plan Rows: 204 +
Plan Width: 0 +
Filter: (relname ~ 'pg_'::text)
(1 row)
Those + chars at the end of the line are ugly, to say the least, and
they make the supposedly machine-readable formats not so machine
readable if anyone wanted to c&p the output into a parser. (I'm mildly
surprised this hasn't been noticed before).
Maybe we need to teach psql not to do this formatting for EXPLAIN output?
cheers
andrew
On Wed, Dec 9, 2009 at 2:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
I have just noticed while checking the EXPLAIN YAML patch that the non-text
explain formats are output as a single line with embedded line feeds, while
the text format is delivered as a set of text records, one per line. The
practical effect of this is that psql decorates the non-text format output
with continuation characters:andrew=# explain select count(*) from pg_class where relname ~ 'pg_';
QUERY PLAN
----------------------------------------------------------------
Aggregate (cost=9.67..9.68 rows=1 width=0)
-> Seq Scan on pg_class (cost=0.00..9.16 rows=204 width=0)
Filter: (relname ~ 'pg_'::text)
(3 rows)Time: 5.813 ms
andrew=# explain (format yaml) select count(*) from pg_class where
relname ~ 'pg_';
QUERY PLAN
-----------------------------------------
- Plan: +
Node Type: Aggregate +
Strategy: Plain +
Startup Cost: 9.67 +
Total Cost: 9.68 +
Plan Rows: 1 +
Plan Width: 0 +
Plans: +
- Node Type: Seq Scan +
Parent Relationship: Outer +
Relation Name: pg_class +
Alias: pg_class +
Startup Cost: 0.00 +
Total Cost: 9.16 +
Plan Rows: 204 +
Plan Width: 0 +
Filter: (relname ~ 'pg_'::text)
(1 row)Those + chars at the end of the line are ugly, to say the least, and they
make the supposedly machine-readable formats not so machine readable if
anyone wanted to c&p the output into a parser. (I'm mildly surprised this
hasn't been noticed before).Maybe we need to teach psql not to do this formatting for EXPLAIN output?
Oh, dear. I think that line continuation syntax was recently added -
subsequent to the machine-readable EXPLAIN patch. The reason why it's
coded to emit everything as a single row is because that will be most
convenient for programs that are sucking down this data
programatically. Otherwise, they'll have to concatenate all the lines
that are returned.
And in fact for XML format, it's even worse: the data is returned as
type xml, but that obviously won't fly if we return each line as a
separate tuple.
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.
...Robert
Robert Haas wrote:
On Wed, Dec 9, 2009 at 2:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
I have just noticed while checking the EXPLAIN YAML patch that the non-text
explain formats are output as a single line with embedded line feeds, while
the text format is delivered as a set of text records, one per line. The
practical effect of this is that psql decorates the non-text format output
with continuation characters:
[snip]
Those + chars at the end of the line are ugly, to say the least, and they
make the supposedly machine-readable formats not so machine readable if
anyone wanted to c&p the output into a parser. (I'm mildly surprised this
hasn't been noticed before).Maybe we need to teach psql not to do this formatting for EXPLAIN output?
Oh, dear. I think that line continuation syntax was recently added -
subsequent to the machine-readable EXPLAIN patch. The reason why it's
coded to emit everything as a single row is because that will be most
convenient for programs that are sucking down this data
programatically. Otherwise, they'll have to concatenate all the lines
that are returned.And in fact for XML format, it's even worse: the data is returned as
type xml, but that obviously won't fly if we return each line as a
separate tuple.On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.
I certainly agree we don't want to break up the non-text formats.
A simple if ugly hack would make psql use old-ascii print style (which
doesn't use these contionuation chars) if the first attribute in the
resultset was named 'QUERY PLAN'
If someone has a better suggestion I'm all ears.
cheers
andrew
2009/12/9 Robert Haas <robertmhaas@gmail.com>:
On Wed, Dec 9, 2009 at 2:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
I have just noticed while checking the EXPLAIN YAML patch that the non-text
explain formats are output as a single line with embedded line feeds, while
the text format is delivered as a set of text records, one per line. The
practical effect of this is that psql decorates the non-text format output
with continuation characters:andrew=# explain select count(*) from pg_class where relname ~ 'pg_';
QUERY PLAN
----------------------------------------------------------------
Aggregate (cost=9.67..9.68 rows=1 width=0)
-> Seq Scan on pg_class (cost=0.00..9.16 rows=204 width=0)
Filter: (relname ~ 'pg_'::text)
(3 rows)Time: 5.813 ms
andrew=# explain (format yaml) select count(*) from pg_class where
relname ~ 'pg_';
QUERY PLAN
-----------------------------------------
- Plan: +
Node Type: Aggregate +
Strategy: Plain +
Startup Cost: 9.67 +
Total Cost: 9.68 +
Plan Rows: 1 +
Plan Width: 0 +
Plans: +
- Node Type: Seq Scan +
Parent Relationship: Outer +
Relation Name: pg_class +
Alias: pg_class +
Startup Cost: 0.00 +
Total Cost: 9.16 +
Plan Rows: 204 +
Plan Width: 0 +
Filter: (relname ~ 'pg_'::text)
(1 row)Those + chars at the end of the line are ugly, to say the least, and they
make the supposedly machine-readable formats not so machine readable if
anyone wanted to c&p the output into a parser. (I'm mildly surprised this
hasn't been noticed before).Maybe we need to teach psql not to do this formatting for EXPLAIN output?
Oh, dear. I think that line continuation syntax was recently added -
subsequent to the machine-readable EXPLAIN patch. The reason why it's
coded to emit everything as a single row is because that will be most
convenient for programs that are sucking down this data
programatically. Otherwise, they'll have to concatenate all the lines
that are returned.And in fact for XML format, it's even worse: the data is returned as
type xml, but that obviously won't fly if we return each line as a
separate tuple.On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.
+1
Pavel
Show quoted text
...Robert
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas escribi�:
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.
-1
EXPLAIN is a special case. IMHO it should be dealt with accordingly.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Thu, Dec 10, 2009 at 10:12:32AM -0300, Alvaro Herrera wrote:
Robert Haas escribió:
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.-1
EXPLAIN is a special case. IMHO it should be dealt with accordingly.
If the formatting code can be taught that it's outputting for explain,
we can skip the wrap/newline markup easily. I don't think we
necessarily need to fall back to the old-ascii format, we just
conditionally disable that specific part.
Alternatively, would it make more sense just to add a boolean pset
parameter to enable/disable the use of wrap/newline marks? It may
be that people may wish to disable it for use cases in addition
to EXPLAIN.
Regards,
Roger
--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
Roger Leigh wrote:
On Thu, Dec 10, 2009 at 10:12:32AM -0300, Alvaro Herrera wrote:
Robert Haas escribi�:
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.-1
EXPLAIN is a special case. IMHO it should be dealt with accordingly.
If the formatting code can be taught that it's outputting for explain,
we can skip the wrap/newline markup easily. I don't think we
necessarily need to fall back to the old-ascii format, we just
conditionally disable that specific part.Alternatively, would it make more sense just to add a boolean pset
parameter to enable/disable the use of wrap/newline marks? It may
be that people may wish to disable it for use cases in addition
to EXPLAIN.
We have already added a lot of knobs to twist, and I don't want to add
any more.
cheers
andrew
On Wed, Dec 9, 2009 at 6:41 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
Oh, dear. I think that line continuation syntax was recently added -
subsequent to the machine-readable EXPLAIN patch. The reason why it's
coded to emit everything as a single row is because that will be most
convenient for programs that are sucking down this data
programatically. Otherwise, they'll have to concatenate all the lines
that are returned.And in fact for XML format, it's even worse: the data is returned as
type xml, but that obviously won't fly if we return each line as a
separate tuple.On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.I certainly agree we don't want to break up the non-text formats.
A simple if ugly hack would make psql use old-ascii print style (which
doesn't use these contionuation chars) if the first attribute in the
resultset was named 'QUERY PLAN'If someone has a better suggestion I'm all ears.
I don't believe that machine-readable EXPLAIN output is the only
multi-line output value that anyone would ever wish to cut and paste
into an editor without picking up a lot of stray garbage, so I don't
think this is a solution.
...Robert
Robert Haas wrote:
I don't believe that machine-readable EXPLAIN output is the only
multi-line output value that anyone would ever wish to cut and paste
into an editor without picking up a lot of stray garbage, so I don't
think this is a solution.
Well, yes, another example that has just occurred to me is
pg_proc.prosrc. So maybe this really is a misfeature.
cheers
andrew
Alvaro Herrera wrote:
Robert Haas escribi�:
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.EXPLAIN is a special case. IMHO it should be dealt with accordingly.
Is it?
Wouldn't it affect anyone who stuck XML in a txt column and wanted
to copy and paste it into a parser?
Perhaps single column output usually won't want the + signs (because
it's copy&pasteable) but multi-column output could?
Ron Mayer wrote:
Alvaro Herrera wrote:
Robert Haas escribi�:
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.EXPLAIN is a special case. IMHO it should be dealt with accordingly.
Is it?
Wouldn't it affect anyone who stuck XML in a txt column and wanted
to copy and paste it into a parser?Perhaps single column output usually won't want the + signs (because
it's copy&pasteable) but multi-column output could?
Yeah, I'm thinking we should probably suppress output of format.nl_right
(no matter what the format) where there is only one column. (This is
even uglier with unicode linestyle, btw). That's a sane rule and it's
not an ugly hack.
cheers
andrew
On Thu, Dec 10, 2009 at 11:44 AM, Ron Mayer
<rm_pg@cheapcomplexdevices.com> wrote:
Alvaro Herrera wrote:
Robert Haas escribió:
On first blush, I'm inclined to suggest that the addition of + signs
to mark continuation lines is a misfeature.EXPLAIN is a special case. IMHO it should be dealt with accordingly.
Is it?
Wouldn't it affect anyone who stuck XML in a txt column and wanted
to copy and paste it into a parser?Perhaps single column output usually won't want the + signs (because
it's copy&pasteable) but multi-column output could?
I don't think inconsistency is a good thing here. Apparently the old
format is available via \pset linestyle old-ascii. We can either
decide that people should use that format if they want that behavior,
or we can decide that changing the default was a mistake and revert
it. I don't think a half-way-in-between solution is a good option.
...Robert
Robert Haas <robertmhaas@gmail.com> wrote:
Andrew Dunstan <andrew@dunslane.net> wrote:
A simple if ugly hack would make psql use old-ascii print style
(which doesn't use these contionuation chars) if the first
attribute in the resultset was named 'QUERY PLAN'
I don't believe that machine-readable EXPLAIN output is the only
multi-line output value that anyone would ever wish to cut and
paste into an editor without picking up a lot of stray garbage, so
I don't think this is a solution.
Agreed.
This would be a significant annoyance for me on a regular basis. If
I can't turn it off, it would probably cause me to create my own
locally patched version of psql. Another alternative would be to
use some other tool to run queries where I wanted long values
without this, but psql has so many nice features that I'd be
switching back and forth, so the patch would probably be easier.
-Kevin
Andrew Dunstan <andrew@dunslane.net> writes:
Yeah, I'm thinking we should probably suppress output of format.nl_right
(no matter what the format) where there is only one column. (This is
even uglier with unicode linestyle, btw). That's a sane rule and it's
not an ugly hack.
Yes it is. The real problem here is expecting the tabular format to be
copy and pasteable, which is not a design goal it's ever had, and not
one that we can start imposing on it at this late date. Why don't you
just do "\pset format unaligned" (or "\a" if you're lazy)?
regards, tom lane
Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
Yeah, I'm thinking we should probably suppress output of format.nl_right
(no matter what the format) where there is only one column. (This is
even uglier with unicode linestyle, btw). That's a sane rule and it's
not an ugly hack.Yes it is.
I don't see much virtue of having these characters when there is only
one column.
The real problem here is expecting the tabular format to be
copy and pasteable, which is not a design goal it's ever had, and not
one that we can start imposing on it at this late date. Why don't you
just do "\pset format unaligned" (or "\a" if you're lazy)?
Well, I haven't had to up to now. I'm not sure what we have is exactly
an advance. I guess I can set the linestyle to old-ascii in my .psqlrc,
but having to do that all over the place is annoying.
And clearly I'm not the only person who doesn't like this behaviour.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
I don't see much virtue of having these characters when there is only
one column.
So you can tell a newline in the data from a wrap due to line length.
The need to be able to do that is not dependent on how many columns
there are.
And clearly I'm not the only person who doesn't like this behaviour.
It's just our usual negative reaction to any change whatsoever ;-).
I was unimpressed with Leigh's changes too at the start, but I can see
that there is value in it.
I think that changing the behavior depending on how many columns there
are is an incredibly ugly hack, and your assertions to the contrary
aren't going to change my mind. If we think that this is such a bad
idea it should be reverted, then let's revert it altogether.
Another possibility, which I don't understand why it was dismissed so
cavalierly, is to have EXPLAIN put out one row per logical line instead
of using embedded newlines.
regards, tom lane
On Thu, Dec 10, 2009 at 12:43 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
Yeah, I'm thinking we should probably suppress output of format.nl_right
(no matter what the format) where there is only one column. (This is even
uglier with unicode linestyle, btw). That's a sane rule and it's not an ugly
hack.Yes it is.
I don't see much virtue of having these characters when there is only one
column.
I don't see much virtue in having these characters, period. The reason
for having them is presumably to avoid confusion between two rows and
one row with an embedded newline. If anything, this confusion is more
likely with a single column than it is with multiple columns, since
you don't have the context of the surrounding output.
...Robert
Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
I don't see much virtue of having these characters when there is only
one column.So you can tell a newline in the data from a wrap due to line length.
The need to be able to do that is not dependent on how many columns
there are.
If that's really what we want then I think we're doing a terrible job of
it. Have a look at the output of:
select
E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
as a, 1 as b;
How do we know from that where the linefeeds are, exactly?
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
Tom Lane wrote:
So you can tell a newline in the data from a wrap due to line length.
The need to be able to do that is not dependent on how many columns
there are.
If that's really what we want then I think we're doing a terrible job of
it. Have a look at the output of:
select
E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
as a, 1 as b;
How do we know from that where the linefeeds are, exactly?
It works quite nicely for me ... in HEAD that is:
regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
as a, 1 as b;
a | b
------------------------------------------------------+---
xxxxxxx +| 1
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +|
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
(1 row)
regression=#
The point here is exactly that previous versions didn't show the
distinction well.
regards, tom lane
Tom Lane wrote:
Why don't you
just do "\pset format unaligned" (or "\a" if you're lazy)?
That's fair. Now that I see it, I guess I should have been
doing that for copy&paste work anyway.
Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote:
Tom Lane wrote:
Why don't you
just do "\pset format unaligned" (or "\a" if you're lazy)?That's fair. Now that I see it, I guess I should have been
doing that for copy&paste work anyway.
That'll cover my use cases.
-Kevin
Tom Lane wrote:
regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
as a, 1 as b;
a | b
------------------------------------------------------+---
xxxxxxx +| 1
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +|
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
(1 row)regression=#
The point here is exactly that previous versions didn't show the
distinction well.
If we really want to make linefeeds visible, I think we should place the
indicators immediately after the character preceding the line feed, not
next to the column separator.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
If we really want to make linefeeds visible, I think we should place the
indicators immediately after the character preceding the line feed, not
next to the column separator.
Then it's hard to tell it apart from an instance of that character in
the data.
regards, tom lane
Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
If we really want to make linefeeds visible, I think we should place the
indicators immediately after the character preceding the line feed, not
next to the column separator.Then it's hard to tell it apart from an instance of that character in
the data.
Yeah, I just thought of that. Oh, well, old-ascii for me ;-)
cheers
andrew
Andrew Dunstan wrote:
Tom Lane wrote:
regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
as a, 1 as b;
a | b
------------------------------------------------------+---
xxxxxxx +| 1
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +|
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
(1 row)regression=#
The point here is exactly that previous versions didn't show the
distinction well.If we really want to make linefeeds visible, I think we should place the
indicators immediately after the character preceding the line feed, not
next to the column separator.
One idea would be to change the column _separator_ for newlines --- that
way, they don't show up for single-column output.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
On Thu, Dec 10, 2009 at 8:11 PM, Bruce Momjian <bruce@momjian.us> wrote:
Andrew Dunstan wrote:
Tom Lane wrote:
regression=# select E'xxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\nxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
as a, 1 as b;
a | b
------------------------------------------------------+---
xxxxxxx +| 1
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx +|
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx |
(1 row)regression=#
The point here is exactly that previous versions didn't show the
distinction well.If we really want to make linefeeds visible, I think we should place the
indicators immediately after the character preceding the line feed, not
next to the column separator.One idea would be to change the column _separator_ for newlines --- that
way, they don't show up for single-column output.
Hilariously enough, that's exactly what we used to do. I am leaning
toward the view that we should revert all the ASCII output format
changes vs. 8.4 and let people use the new unicode mode if they want
all the spiffy bells and whistles.
...Robert
Robert Haas <robertmhaas@gmail.com> writes:
Hilariously enough, that's exactly what we used to do. I am leaning
toward the view that we should revert all the ASCII output format
changes vs. 8.4 and let people use the new unicode mode if they want
all the spiffy bells and whistles.
There are clearly use-cases for this feature; I think arguing to revert
it is an extreme overreaction. We could reconsider which behavior ought
to be default, perhaps.
regards, tom lane
On tor, 2009-12-10 at 22:02 -0500, Robert Haas wrote:
On Thu, Dec 10, 2009 at 8:11 PM, Bruce Momjian <bruce@momjian.us> wrote:
One idea would be to change the column _separator_ for newlines --- that
way, they don't show up for single-column output.Hilariously enough, that's exactly what we used to do.
Well, the only reason why that "worked" for this case is that it didn't
work at all when the column in question was the first one.