pg_dump

Started by Thomas G. Lockhartover 27 years ago5 messages
#1Thomas G. Lockhart
lockhart@alumni.caltech.edu

We need to think about whether to surround all identifiers with double
quotes all the time in pg_dump output.

The reason is that Postgres allows reserved keywords to be specified as
table and column names if they are surrounded by the double quotes, but
pg_dump doesn't know whether an identifier also happens to be a reserved
keyword. afaik it's now only using DQs if there is upper case or funny
characters in the identifier. Instead it should probably surround the
fields with DQs all the time. It could perhaps have a command-line
switch to turn off that feature if necessary.

The alternative to always using the DQs is to have pg_dump use the
keywords.c routine available in the backend. But I don't think that is
as reliable since it isn't guaranteed to be in sync with the backend
version and since there are non-reserved keywords in that file which
test the same as the reserved ones.

This should probably be a "must do" for v6.4...

Comments?

- Tom

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas G. Lockhart (#1)
Re: [HACKERS] pg_dump

"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:

We need to think about whether to surround all identifiers with double
quotes all the time in pg_dump output.

Is there any downside to that? Offhand I don't see one.
Unless you do, let's just do it...

regards, tom lane

#3Matthew C. Aycock
matt@mathcs.emory.edu
In reply to: Tom Lane (#2)
Re: [HACKERS] pg_dump

Well,

After looking at the code for pg_dump, it appears that all identifiers
go through fmtId which does as you thougt, it looks to see if a character
is non-lower case, a digit, or an '_' then it double quotes it.
I am going to change this behavior to always double quote it and see
if it breaks anything.

I also noted a possible memory leak that I introduced when I added
the code to dump the ACLs for tables and such. I will take care of
this as well.

My concern is that I do not have a database with enough stuff to
really test this. I tried to dump the regression and template1
databases, but template1 had nothing in it and regression failed
not being able to find a trigger.

Does anyone have a good test database that they could send me the
pg_dump of? If it is really big, I can either ftp it from you, or
you could ftp it to me.

Thanks,

Matt

We need to think about whether to surround all identifiers with double
quotes all the time in pg_dump output.

The reason is that Postgres allows reserved keywords to be specified as
table and column names if they are surrounded by the double quotes, but
pg_dump doesn't know whether an identifier also happens to be a reserved
keyword. afaik it's now only using DQs if there is upper case or funny
characters in the identifier. Instead it should probably surround the
fields with DQs all the time. It could perhaps have a command-line
switch to turn off that feature if necessary.

The alternative to always using the DQs is to have pg_dump use the
keywords.c routine available in the backend. But I don't think that is
as reliable since it isn't guaranteed to be in sync with the backend
version and since there are non-reserved keywords in that file which
test the same as the reserved ones.

This should probably be a "must do" for v6.4...

Comments?

- Tom

----------
Matthew C. Aycock
Operating Systems Analyst/Admin, Senior
Dept Math/CS
Emory University, Atlanta, GA
Internet: matt@mathcs.emory.edu

#4Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Matthew C. Aycock (#3)
Re: [HACKERS] pg_dump

After looking at the code for pg_dump, it appears that all identifiers
go through fmtId which does as you thougt, it looks to see if a character
is non-lower case, a digit, or an '_' then it double quotes it.
I am going to change this behavior to always double quote it and see
if it breaks anything.

I've already got patches to do the double-quote thing, with a command
line option (-n) to recover the old behavior. Will commit that, so you
will only need to do the memory leak stuff. Or would you prefer that I
send you my patches and you can integrate them?

- Tom

#5Matthew C. Aycock
matt@mathcs.emory.edu
In reply to: Thomas G. Lockhart (#4)
Re: [HACKERS] pg_dump

Thomas,

Go ahead and commit your patches for the double quote stuff.
I need to resync with CVS. I inadverently deleted the original
files that I was using. I will create new patches after the
commits from you and Jan.

Matt

After looking at the code for pg_dump, it appears that all identifiers
go through fmtId which does as you thougt, it looks to see if a character
is non-lower case, a digit, or an '_' then it double quotes it.
I am going to change this behavior to always double quote it and see
if it breaks anything.

I've already got patches to do the double-quote thing, with a command
line option (-n) to recover the old behavior. Will commit that, so you
will only need to do the memory leak stuff. Or would you prefer that I
send you my patches and you can integrate them?

- Tom

----------
Matthew C. Aycock
Operating Systems Analyst/Admin, Senior
Dept Math/CS
Emory University, Atlanta, GA
Internet: matt@mathcs.emory.edu