outfuncs.c utility statement support

Started by Peter Eisentrautalmost 9 years ago11 messageshackers
Jump to latest
#1Peter Eisentraut
peter_e@gmx.net

While debugging some other stuff, I was wondering to what extent node
types supporting utility statements should be supported in outfuncs.c.
Running with --debug-print-parse=on, executing

create table test1 (a int, b text);

gives output that is truncated somewhere in the middle (possibly a null
byte), and

drop table test1;

shows (among other things)

:utilityStmt ?

Is there a way to check that everything that needs to be there is there
and that the stuff that is there works correctly or is reachable at all?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: outfuncs.c utility statement support

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

While debugging some other stuff, I was wondering to what extent node
types supporting utility statements should be supported in outfuncs.c.

We've largely not tried too hard in that department. From a debugging
standpoint it'd be useful to have more complete coverage, but I don't
know how much additional code and maintenance effort would be required
to get to full coverage.

(Hmm .. I think copyfuncs.c does have complete coverage, and it's about
5500 lines vs. 4200 lines for outfuncs.c. So perhaps this would mean
about 30% growth of outfuncs.c and another 20KB or so in the executable.)

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: outfuncs.c utility statement support

On 6/13/17 11:25, Peter Eisentraut wrote:

Running with --debug-print-parse=on, executing

create table test1 (a int, b text);

gives output that is truncated somewhere in the middle (possibly a null
byte)

So this seems to be a pretty basic bug. Some node fields of type char
may be zero, and so printing them as a zero byte just truncates the
whole output string. This could be fixed by printing chars like strings
with the full escaping mechanism. See attached patch.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

0001-Fix-output-of-char-node-fields.patchtext/plain; charset=UTF-8; name=0001-Fix-output-of-char-node-fields.patch; x-mac-creator=0; x-mac-type=0Download+19-2
#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Peter Eisentraut (#3)
Re: outfuncs.c utility statement support

On 2017/06/14 12:49, Peter Eisentraut wrote:

On 6/13/17 11:25, Peter Eisentraut wrote:

Running with --debug-print-parse=on, executing

create table test1 (a int, b text);

gives output that is truncated somewhere in the middle (possibly a null
byte)

So this seems to be a pretty basic bug. Some node fields of type char
may be zero, and so printing them as a zero byte just truncates the
whole output string. This could be fixed by printing chars like strings
with the full escaping mechanism. See attached patch.

+1. I've been meaning to report about
zero-byte-truncating-the-whole-output-string thing for some time now.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#4)
Re: outfuncs.c utility statement support

On Tue, Jun 13, 2017 at 11:59 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:

So this seems to be a pretty basic bug. Some node fields of type char
may be zero, and so printing them as a zero byte just truncates the
whole output string. This could be fixed by printing chars like strings
with the full escaping mechanism. See attached patch.

+1. I've been meaning to report about
zero-byte-truncating-the-whole-output-string thing for some time now.

I've been bitten by this, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#3)
Re: outfuncs.c utility statement support

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

So this seems to be a pretty basic bug. Some node fields of type char
may be zero, and so printing them as a zero byte just truncates the
whole output string. This could be fixed by printing chars like strings
with the full escaping mechanism. See attached patch.

+1 for fixing this, but you have not handled the read side correctly.
pg_strtok doesn't promise to return a null-terminated string, so without
changes, readfuncs.c would not successfully decode a zero-byte char field.
Also it would do the wrong thing for any character code that outToken had
decided to prefix with a backslash. I think you could fix both problems
like this:

 /* Read a char field (ie, one ascii character) */
 #define READ_CHAR_FIELD(fldname) \
 	token = pg_strtok(&length);		/* skip :fldname */ \
 	token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = token[0]
+	local_node->fldname = debackslash(token, length)[0]

although that's annoyingly expensive for the normal case where no
special processing is needed. Maybe better

+ local_node->fldname = (length == 0) ? '\0' : (token[0] == '\\') ? token[1] : token[0]

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: outfuncs.c utility statement support

On 6/14/17 12:05, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

So this seems to be a pretty basic bug. Some node fields of type char
may be zero, and so printing them as a zero byte just truncates the
whole output string. This could be fixed by printing chars like strings
with the full escaping mechanism. See attached patch.

+1 for fixing this, but you have not handled the read side correctly.
pg_strtok doesn't promise to return a null-terminated string, so without
changes, readfuncs.c would not successfully decode a zero-byte char field.
Also it would do the wrong thing for any character code that outToken had
decided to prefix with a backslash. I think you could fix both problems
like this:

/* Read a char field (ie, one ascii character) */
#define READ_CHAR_FIELD(fldname) \
token = pg_strtok(&length);		/* skip :fldname */ \
token = pg_strtok(&length);		/* get field value */ \
-	local_node->fldname = token[0]
+	local_node->fldname = debackslash(token, length)[0]

An empty token produces "<>", so just debackslashing wouldn't work. Maybe

local_node->fldname = length ? token[0] : '\0';

?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#7)
Re: outfuncs.c utility statement support

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

An empty token produces "<>", so just debackslashing wouldn't work.

pg_strtok recognizes "<>" and returns length = 0, so debackslash()
would produce the right answer AFAICS (admittedly, I haven't tested).
But I don't really want to do it like that because of the wasted
palloc space and cycles.

Maybe
local_node->fldname = length ? token[0] : '\0';
?

Doesn't cope with backslash-quoted characters. If we're going to bother
to do anything here, I think we ought to make it reversible for all
possible characters.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#8)
Re: outfuncs.c utility statement support

On 6/18/17 10:14, Tom Lane wrote:

pg_strtok recognizes "<>" and returns length = 0, so debackslash()
would produce the right answer AFAICS (admittedly, I haven't tested).
But I don't really want to do it like that because of the wasted
palloc space and cycles.

Maybe
local_node->fldname = length ? token[0] : '\0';
?

Doesn't cope with backslash-quoted characters. If we're going to bother
to do anything here, I think we ought to make it reversible for all
possible characters.

Makes sense. Updated patch attached.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:

v2-0001-Fix-output-of-char-node-fields.patchtext/plain; charset=UTF-8; name=v2-0001-Fix-output-of-char-node-fields.patch; x-mac-creator=0; x-mac-type=0Download+21-3
#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#9)
Re: outfuncs.c utility statement support

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 6/18/17 10:14, Tom Lane wrote:

Doesn't cope with backslash-quoted characters. If we're going to bother
to do anything here, I think we ought to make it reversible for all
possible characters.

Makes sense. Updated patch attached.

That looks sane to me, though I've still not actually tested any
interesting cases.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#10)
Re: outfuncs.c utility statement support

On 6/21/17 23:40, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 6/18/17 10:14, Tom Lane wrote:

Doesn't cope with backslash-quoted characters. If we're going to bother
to do anything here, I think we ought to make it reversible for all
possible characters.

Makes sense. Updated patch attached.

That looks sane to me, though I've still not actually tested any
interesting cases.

I have committed this. I had to build a bunch more stuff around it to
be able to test it, which I will tidy up and submit at some later point.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers