pg_dump --pretty-print-views
Hi,
At the company I work for, we've been splitting dumps into separate
files and diffing them for a while now. By far the biggest problem we
had was with views: pg_dump by default dumps views on one line, in a
format which maximizes compatibility. Now this has several problems for
our use case:
1) The one-line equivalent of a 200-line view is completely impossible
to read.
2) If there's a difference between the two dumped view definitions,
it takes a long time to find where and what exactly it is.
3) For some reason some expressions are dumped differently depending
on how exactly they are written, cluttering the diff with false
positives.
While we can do the actual splitting of objects from a -Fc dump
relatively easily, we can't fix the view definitions after they've been
dumped. So I'm proposing a --pretty-print-views setting to pg_dump
(patch attached).
Any feedback is welcome.
Regards,
Marko Tiikkaja
Attachments:
pretty_print_views.patchtext/plain; charset=UTF-8; name=pretty_print_views.patch; x-mac-creator=0; x-mac-type=0Download+32-9
On Thu, Jan 10, 2013 at 01:23:10PM +0100, Marko Tiikkaja wrote:
Hi,
At the company I work for, we've been splitting dumps into separate
files and diffing them for a while now. By far the biggest problem
we had was with views: pg_dump by default dumps views on one line,
in a format which maximizes compatibility. Now this has several
problems for our use case:1) The one-line equivalent of a 200-line view is completely impossible
to read.
2) If there's a difference between the two dumped view definitions,
it takes a long time to find where and what exactly it is.
3) For some reason some expressions are dumped differently depending
on how exactly they are written, cluttering the diff with false
positives.While we can do the actual splitting of objects from a -Fc dump
relatively easily, we can't fix the view definitions after they've
been dumped. So I'm proposing a --pretty-print-views setting to
pg_dump (patch attached).Any feedback is welcome.
Why not make this the new default? That way, new users will have the
benefit, and people with tools or processes that depend on the old
behavior can still have it.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* David Fetter (david@fetter.org) wrote:
On Thu, Jan 10, 2013 at 01:23:10PM +0100, Marko Tiikkaja wrote:
Any feedback is welcome.
Why not make this the new default? That way, new users will have the
benefit, and people with tools or processes that depend on the old
behavior can still have it.
I tend to agree with this, I've never really liked having the view def
all on one massive line.
Well, I'll caveat that with this- being able to grep -C2 and pull the
views and the view definitions has been nice on occation, but you can
pull independent objects out with pg_restore from a -Fc dump and, in
general, I think the advantage of having the view definition look
reasonable in the dump is more than being able to do tricks on views
(but not tables, etc anyway..) with grep.
Thanks,
Stephen
On 01/10/2013 07:23 AM, Marko Tiikkaja wrote:
Hi,
At the company I work for, we've been splitting dumps into separate
files and diffing them for a while now. By far the biggest problem we
had was with views: pg_dump by default dumps views on one line, in a
format which maximizes compatibility. Now this has several problems
for our use case:1) The one-line equivalent of a 200-line view is completely impossible
to read.
2) If there's a difference between the two dumped view definitions,
it takes a long time to find where and what exactly it is.
3) For some reason some expressions are dumped differently depending
on how exactly they are written, cluttering the diff with false
positives.While we can do the actual splitting of objects from a -Fc dump
relatively easily, we can't fix the view definitions after they've
been dumped. So I'm proposing a --pretty-print-views setting to
pg_dump (patch attached).
For versions >= 9.2 it would be better to allow passing in a
pretty-print value, like 80 or 0, instead of just passing 'true'. The
new line-wrapping that the integer argument triggers is much more
readable than the supposedly pretty value that 'true' provides.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 1/10/13 3:22 PM, Andrew Dunstan wrote:
For versions >= 9.2 it would be better to allow passing in a
pretty-print value, like 80 or 0, instead of just passing 'true'. The
new line-wrapping that the integer argument triggers is much more
readable than the supposedly pretty value that 'true' provides.
Ooh, I had no idea we exposed that in SQL. I like this idea.
Regards,
Marko Tiikkaja
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Marko Tiikkaja <pgmail@joh.to> writes:
While we can do the actual splitting of objects from a -Fc dump
relatively easily, we can't fix the view definitions after they've been
dumped. So I'm proposing a --pretty-print-views setting to pg_dump
(patch attached).
-1. The reason that pg_dump does not pretty-print things is that
it's unsafe; there is no real guarantee that the view will reload as
intended, because it's under-parenthesized. (Even if we were sure
it would reload safely into current code, which I'm not, what of
future versions that could have different operator precedences?)
I don't think we should offer a foot-gun option like this at all,
and as for making it the default, not bloody likely.
I think your schema-diffing needs would be better served by a tool
specifically directed at that problem; which pg_dump is not, but
I believe there are some out there.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jan 10, 2013 at 11:21:13AM -0500, Tom Lane wrote:
Marko Tiikkaja <pgmail@joh.to> writes:
While we can do the actual splitting of objects from a -Fc dump
relatively easily, we can't fix the view definitions after they've
been dumped. So I'm proposing a --pretty-print-views setting to
pg_dump (patch attached).-1. The reason that pg_dump does not pretty-print things is that
it's unsafe; there is no real guarantee that the view will reload as
intended, because it's under-parenthesized. (Even if we were sure
it would reload safely into current code, which I'm not, what of
future versions that could have different operator precedences?)
Under what circumstances do pretty-printed views not reload? It seems
to me that such circumstances would be pretty_print() bugs by
definition.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Fetter <david@fetter.org> writes:
On Thu, Jan 10, 2013 at 11:21:13AM -0500, Tom Lane wrote:
-1. The reason that pg_dump does not pretty-print things is that
it's unsafe; there is no real guarantee that the view will reload as
intended, because it's under-parenthesized. (Even if we were sure
it would reload safely into current code, which I'm not, what of
future versions that could have different operator precedences?)
Under what circumstances do pretty-printed views not reload? It seems
to me that such circumstances would be pretty_print() bugs by
definition.
It would not be a bug, particularly not in the case of a subsequent
release with different operator precedence. pg_dump's charter is to be
safe. Pretty-printing's charter is to look nice. These goals are not
compatible. If they were, we'd never have made a separate pretty
printing mode at all.
Now, we could consider changing the "safe" mode so that it tries to
provide nice whitespace/line breaks while not risking removal of
parentheses. But that would be a totally different patch, and I'm
not sure how much it would address Marko's desires anyway.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 01/10/2013 12:09 PM, Tom Lane wrote:
Now, we could consider changing the "safe" mode so that it tries to
provide nice whitespace/line breaks while not risking removal of
parentheses. But that would be a totally different patch, and I'm
not sure how much it would address Marko's desires anyway.
I think there's a very good case for breaking the nexus between
PRETTYFLAG_PAREN and PRETTYFLAG_INDENT+line wrapping for views. Only
PRETTYFLAG_PAREN affects the safety issue. The others are just about
white space in safe places.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
I think there's a very good case for breaking the nexus between
PRETTYFLAG_PAREN and PRETTYFLAG_INDENT+line wrapping for views. Only
PRETTYFLAG_PAREN affects the safety issue. The others are just about
white space in safe places.
What I was actually thinking about was turning on indent/linewrapping
all the time (ie, no change on pg_dump's side, just hack ruleutils).
If we believe it's safe, why not just do it? It's the paren-removal
that pg_dump can't tolerate.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 01/10/2013 12:35 PM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
I think there's a very good case for breaking the nexus between
PRETTYFLAG_PAREN and PRETTYFLAG_INDENT+line wrapping for views. Only
PRETTYFLAG_PAREN affects the safety issue. The others are just about
white space in safe places.What I was actually thinking about was turning on indent/linewrapping
all the time (ie, no change on pg_dump's side, just hack ruleutils).
If we believe it's safe, why not just do it? It's the paren-removal
that pg_dump can't tolerate.
Works for me.
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Jan 10, 2013 at 11:07 PM, Andrew Dunstan <andrew@dunslane.net>wrote:
On 01/10/2013 12:35 PM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
I think there's a very good case for breaking the nexus between
PRETTYFLAG_PAREN and PRETTYFLAG_INDENT+line wrapping for views. Only
PRETTYFLAG_PAREN affects the safety issue. The others are just about
white space in safe places.What I was actually thinking about was turning on indent/linewrapping
all the time (ie, no change on pg_dump's side, just hack ruleutils).
If we believe it's safe, why not just do it? It's the paren-removal
that pg_dump can't tolerate.Works for me.
Nice idea.
Marko, just hack ruleutils some thing like this:
diff --git a/src/backend/utils/adt/ruleutils.c
b/src/backend/utils/adt/ruleutils.c
index 266cec5..a46f588 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -511,8 +511,9 @@ pg_get_viewdef(PG_FUNCTION_ARGS)
{
/* By OID */
Oid viewoid = PG_GETARG_OID(0);
+ int prettyFlags = PRETTYFLAG_INDENT;
- PG_RETURN_TEXT_P(string_to_text(pg_get_viewdef_worker(viewoid, 0, -1)));
+ PG_RETURN_TEXT_P(string_to_text(pg_get_viewdef_worker(viewoid,
prettyFlags, WRAP_COLUMN_DEFAULT)));
}
Thanks
cheers
andrew
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/**mailpref/pgsql-hackers<http://www.postgresql.org/mailpref/pgsql-hackers>
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.
On Fri, 18 Jan 2013 07:46:21 +0100, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
Nice idea.
Marko, just hack ruleutils some thing like this:
Here's a patch attempting to do just that.
The actual code changes were trivial, the patch is mostly just regression
tests.
As for the docs, I wasn't entirely happy with what they say about
pg_get_viewdef(), but it didn't look like they needed to be changed by
this either.
Regards,
Marko Tiikkaja
Attachments:
prettyprint_v2.patchapplication/octet-stream; name=prettyprint_v2.patchDownload+1648-1584
Hi Marko,
I could not apply the patch with git apply, but able to apply it by patch
-p1 command.
However, will you please justify the changes done in "xml.out" ? I guess
they are not needed.
You might need to configure your sources with libxml.
Also, I am not sure about putting "WRAP_COLUMN_DEFAULT" by default. I will
keep that in code committors plate. Rest of the code changes looks good to
me.
Thanks
On Sun, Jan 27, 2013 at 6:23 PM, Marko Tiikkaja <pgmail@joh.to> wrote:
On Fri, 18 Jan 2013 07:46:21 +0100, Jeevan Chalke <
jeevan.chalke@enterprisedb.**com <jeevan.chalke@enterprisedb.com>> wrote:Nice idea.
Marko, just hack ruleutils some thing like this:
Here's a patch attempting to do just that.
The actual code changes were trivial, the patch is mostly just regression
tests.As for the docs, I wasn't entirely happy with what they say about
pg_get_viewdef(), but it didn't look like they needed to be changed by this
either.Regards,
Marko Tiikkaja
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.
On 1/28/13 12:14 PM, Jeevan Chalke wrote:
I could not apply the patch with git apply, but able to apply it by patch
-p1 command.
IME that's normal for patches that went through filterdiff. I do: git
diff |filterdiff --format=context to re-format the patches to the
context diff preferred on the mailing list. Maybe if I somehow told git
to produce context diff it would work..
However, will you please justify the changes done in "xml.out" ? I guess
they are not needed.
You might need to configure your sources with libxml.
If you look very carefully, the pretty-printing version adds one space
at the very beginning of the output. :-)
Also, I am not sure about putting "WRAP_COLUMN_DEFAULT" by default. I will
keep that in code committors plate. Rest of the code changes looks good to
me.
Thanks for reviewing!
Regards,
Marko Tiikkaja
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 28/01/13 12:31, Marko Tiikkaja wrote:
On 1/28/13 12:14 PM, Jeevan Chalke wrote:
I could not apply the patch with git apply, but able to apply it by patch
-p1 command.IME that's normal for patches that went through filterdiff. I do: git
diff |filterdiff --format=context to re-format the patches to the
context diff preferred on the mailing list. Maybe if I somehow told git
to produce context diff it would work..
Try this, worked for me:
http://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git
Cheers,
Jan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Marko,
On Mon, Jan 28, 2013 at 5:01 PM, Marko Tiikkaja <pgmail@joh.to> wrote:
On 1/28/13 12:14 PM, Jeevan Chalke wrote:
I could not apply the patch with git apply, but able to apply it by patch
-p1 command.IME that's normal for patches that went through filterdiff. I do: git
diff |filterdiff --format=context to re-format the patches to the context
diff preferred on the mailing list. Maybe if I somehow told git to produce
context diff it would work..However, will you please justify the changes done in "xml.out" ? I guess
they are not needed.
You might need to configure your sources with libxml.If you look very carefully, the pretty-printing version adds one space at
the very beginning of the output. :-)
That's fine. I am not at all pointing that to you. Have a look at this:
*** a/src/test/regress/expected/xml.out
--- b/src/test/regress/expected/xml.out
***************
*** 3,82 **** CREATE TABLE xmltest (
data xml
);
INSERT INTO xmltest VALUES (1, '<value>one</value>');
INSERT INTO xmltest VALUES (2, '<value>two</value>');
INSERT INTO xmltest VALUES (3, '<wrong');
! ERROR: invalid XML content
LINE 1: INSERT INTO xmltest VALUES (3, '<wrong');
^
! DETAIL: line 1: Couldn't find end of Start Tag wrong line 1
! <wrong
! ^
.
.
.
--- 3,84 ----
data xml
);
INSERT INTO xmltest VALUES (1, '<value>one</value>');
+ ERROR: unsupported XML feature
+ LINE 1: INSERT INTO xmltest VALUES (1, '<value>one</value>');
+ ^
+ DETAIL: This functionality requires the server to be built with libxml
support.
+ HINT: You need to rebuild PostgreSQL using --with-libxml.
INSERT INTO xmltest VALUES (2, '<value>two</value>');
+ ERROR: unsupported XML feature
+ LINE 1: INSERT INTO xmltest VALUES (2, '<value>two</value>');
+ ^
+ DETAIL: This functionality requires the server to be built with libxml
support.
+ HINT: You need to rebuild PostgreSQL using --with-libxml.
These changes are not at all required.
Follow the hint.
In other way, if I apply your patch and run make check I get regression
failure for xml.out.
Please check.
Thanks
Also, I am not sure about putting "WRAP_COLUMN_DEFAULT" by default. I will
keep that in code committors plate. Rest of the code changes looks good to
me.Thanks for reviewing!
Regards,
Marko Tiikkaja
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.
On 1/29/13 10:18 AM, Jeevan Chalke wrote:
That's fine. I am not at all pointing that to you. Have a look at this:
Ugh.. I'm sorry, I don't understand how this happened. I manually
looked through all the changes, but somehow this slipped through. Will
have a look this evening.
Regards,
Marko Tiikkaja
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, 29 Jan 2013 10:18:51 +0100, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
That's fine. I am not at all pointing that to you. Have a look at this:
Here's the third version of this patch, hopefully this time without any
problems. I looked through the patch and it looked OK, but I did that
last time too so I wouldn't trust myself on that one.
Regards,
Marko Tiikkaja
Attachments:
prettyprint_v3.patchapplication/octet-stream; name=prettyprint_v3.patchDownload+539-538
Hi Marko,
On Wed, Jan 30, 2013 at 2:07 AM, Marko Tiikkaja <pgmail@joh.to> wrote:
On Tue, 29 Jan 2013 10:18:51 +0100, Jeevan Chalke <
jeevan.chalke@enterprisedb.**com <jeevan.chalke@enterprisedb.com>> wrote:That's fine. I am not at all pointing that to you. Have a look at this:
Here's the third version of this patch, hopefully this time without any
problems. I looked through the patch and it looked OK, but I did that last
time too so I wouldn't trust myself on that one.
Looks good this time.
Will mark "Ready for committor"
However, I am not sure about putting "WRAP_COLUMN_DEFAULT" by default. In
my opinion we should put that by default but other people may object so I
will keep that in code committors plate.
Thanks
Regards,
Marko Tiikkaja
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the
individual or entity to whom it is addressed. This message contains
information from EnterpriseDB Corporation that may be privileged,
confidential, or exempt from disclosure under applicable law. If you are
not the intended recipient or authorized to receive this for the intended
recipient, any use, dissemination, distribution, retention, archiving, or
copying of this communication is strictly prohibited. If you have received
this e-mail in error, please notify the sender immediately by reply e-mail
and delete this message.