WIP: pg_pretty_query

Started by Pavel Stehuleover 13 years ago13 messages

Pavel Stehule

pavel.stehule@gmail.com

over 13 years ago

1 attachment(s)

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well

postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
pg_pretty_query
----------------------------------------------------------
SELECT x.a, z.a +
FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
WHERE foo.a = z.a))
(1 row)

Regards

Pavel

Attachments:

pg_pretty_query.patchapplication/octet-stream; name=pg_pretty_query.patchDownload

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 412dfe6..dfc91d1 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -39,6 +39,7 @@
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/tlist.h"
+#include "parser/analyze.h"
 #include "parser/keywords.h"
 #include "parser/parse_func.h"
 #include "parser/parse_oper.h"
@@ -47,6 +48,7 @@
 #include "rewrite/rewriteHandler.h"
 #include "rewrite/rewriteManip.h"
 #include "rewrite/rewriteSupport.h"
+#include "tcop/tcopprot.h"
 #include "utils/array.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
@@ -435,6 +437,39 @@ pg_get_viewdef_name_ext(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(string_to_text(pg_get_viewdef_worker(viewoid, prettyFlags)));
 }
 
+
+Datum
+pg_pretty_query(PG_FUNCTION_ARGS)
+{
+	text	*qstring = PG_GETARG_TEXT_P(0);
+	bool	indent = PG_GETARG_BOOL(1);
+	bool	paren = PG_GETARG_BOOL(2);
+	int		prettyFlags;
+	Query *query;
+	StringInfoData  buf;
+	List			*parsetree;
+	const char *query_string;
+
+	query_string = text_to_cstring(qstring);
+	prettyFlags = ( indent ? PRETTYFLAG_INDENT : 0 ) | ( ! paren ? PRETTYFLAG_PAREN : 0 );
+
+	/*
+	 * Do this first so that string is alloc'd in outer context not SPI's.
+	 */
+	initStringInfo(&buf);
+
+	parsetree = pg_parse_query(query_string);
+  
+	query = parse_analyze((Node *) linitial(parsetree), query_string, NULL, 0);
+
+	//query->commandType = CMD_SELECT;
+
+	get_query_def(query, &buf, NIL, NULL, prettyFlags, 0);
+
+	PG_RETURN_TEXT_P(string_to_text(buf.data));
+}
+
+
 /*
  * Common code for by-OID and by-name variants of pg_get_viewdef
  */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 665918f..5101683 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -1959,6 +1959,9 @@ DESCR("identity argument list of a function");
 DATA(insert OID = 2165 (  pg_get_function_result	   PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 25 "26" _null_ _null_ _null_ _null_ pg_get_function_result _null_ _null_ _null_ ));
 DESCR("result type of a function");
 
+DATA(insert OID = 3839 (  pg_pretty_query	   PGNSP PGUID 12 1 0 0 0 f f f f t f s 3 0 25 "25 16 16" _null_ _null_ _null_ _null_ pg_pretty_query _null_ _null_ _null_ ));
+DESCR("query pretty format");
+
 DATA(insert OID = 1686 (  pg_get_keywords		PGNSP PGUID 12 10 400 0 0 f f f f t t s 0 0 2249 "" "{25,18,25}" "{o,o,o}" "{word,catcode,catdesc}" _null_ pg_get_keywords _null_ _null_ _null_ ));
 DESCR("list of SQL keywords");

Thom Brown

thom@linux.com

over 13 years ago

In reply to: Pavel Stehule (#1)

Re: WIP: pg_pretty_query

On 7 August 2012 15:14, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well

postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
pg_pretty_query
----------------------------------------------------------
SELECT x.a, z.a +
FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
WHERE foo.a = z.a))
(1 row)

This looks odd:

postgres=# SELECT pg_pretty_query('SELECT 1, (SELECT max(a.x) +
greatest(2,3) FROM generate_series(4,10,2) a(x)) FROM
generate_series(1,100) GROUP BY 1 ORDER BY 1, 2 USING < NULLS FIRST',
true, false);
pg_pretty_query
------------------------------------------------------------------
SELECT 1, +
( SELECT max(a.x) + GREATEST(2, 3) +
FROM generate_series(4, 10, 2) a(x)) +
FROM generate_series(1, 100) generate_series(generate_series)+
GROUP BY 1::integer +
ORDER BY 1::integer, ( SELECT max(a.x) + GREATEST(2, 3) +
FROM generate_series(4, 10, 2) a(x)) NULLS FIRST
(1 row)

USING < is removed completely (or if I used DESC, NULLS FIRST is then
removed instead), "2" in the order by is expanded to its full query,
and generate_series when used in FROM is repeated with its own name as
a parameter. I'm also not sure about the spacing before each line.
SELECT, FROM and GROUP BY all appear out of alignment from one
another.

Plus it would be nice if we could support something like the following style:

SELECT
field_one,
field_two + field_three
FROM
my_table
INNER JOIN
another_table
ON
my_table.field_one = another_table.another_field
AND
another_table.valid = true
WHERE
field_one > 3
AND
field_two < 10;

But that's just a nice-to-have.
--
Thom

Pavel Stehule

pavel.stehule@gmail.com

over 13 years ago

In reply to: Thom Brown (#2)

Re: WIP: pg_pretty_query

2012/8/7 Thom Brown <thom@linux.com>:

On 7 August 2012 15:14, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well

postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
pg_pretty_query
----------------------------------------------------------
SELECT x.a, z.a +
FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
WHERE foo.a = z.a))
(1 row)

This looks odd:

postgres=# SELECT pg_pretty_query('SELECT 1, (SELECT max(a.x) +
greatest(2,3) FROM generate_series(4,10,2) a(x)) FROM
generate_series(1,100) GROUP BY 1 ORDER BY 1, 2 USING < NULLS FIRST',
true, false);
pg_pretty_query
------------------------------------------------------------------
SELECT 1, +
( SELECT max(a.x) + GREATEST(2, 3) +
FROM generate_series(4, 10, 2) a(x)) +
FROM generate_series(1, 100) generate_series(generate_series)+
GROUP BY 1::integer +
ORDER BY 1::integer, ( SELECT max(a.x) + GREATEST(2, 3) +
FROM generate_series(4, 10, 2) a(x)) NULLS FIRST
(1 row)

USING < is removed completely (or if I used DESC, NULLS FIRST is then
removed instead), "2" in the order by is expanded to its full query,
and generate_series when used in FROM is repeated with its own name as
a parameter. I'm also not sure about the spacing before each line.
SELECT, FROM and GROUP BY all appear out of alignment from one
another.

it is issue - probably we can start deserialization just from parser
stage, not from rewriter stage - but then code will be significantly
longer and we cannot reuse current code for pretty print view.

Plus it would be nice if we could support something like the following style:

SELECT
field_one,
field_two + field_three
FROM
my_table
INNER JOIN
another_table
ON
my_table.field_one = another_table.another_field
AND
another_table.valid = true
WHERE
field_one > 3
AND
field_two < 10;

it is second issue - probably there are more lovely styles - CELKO,
your and other. I am not sure if we can support more styles in core
(contrib should be better maybe).

Regards

Pavel

Show quoted text

But that's just a nice-to-have.
--
Thom

Bruce Momjian

bruce@momjian.us

over 13 years ago

In reply to: Pavel Stehule (#1)

Re: WIP: pg_pretty_query

On Tue, Aug 7, 2012 at 04:14:34PM +0200, Pavel Stehule wrote:

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well

postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
pg_pretty_query
----------------------------------------------------------
SELECT x.a, z.a +
FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
WHERE foo.a = z.a))
(1 row)

I can see this as very useful for people reporting badly-formatted
queries to our email lists. Great!

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Andrew Dunstan

andrew@dunslane.net

over 13 years ago

In reply to: Pavel Stehule (#1)

Re: WIP: pg_pretty_query

On 08/07/2012 10:14 AM, Pavel Stehule wrote:

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well

postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
pg_pretty_query
----------------------------------------------------------
SELECT x.a, z.a +
FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
WHERE foo.a = z.a))
(1 row)

Good stuff. That's one less item on my TODO list :-)

I think we should have a version that lets you specify the wrap column,
like pg_get_viewdef does. Possibly for this case we should even default
it to 0 (wrap after each item) instead of 79 which it is for views.

cheers

andrew

David Fetter

david@fetter.org

over 13 years ago

In reply to: Pavel Stehule (#3)

Re: WIP: pg_pretty_query

On Tue, Aug 07, 2012 at 04:54:12PM +0200, Pavel Stehule wrote:

2012/8/7 Thom Brown <thom@linux.com>:

On 7 August 2012 15:14, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Hello

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

this patch is really short - it is nice. But - it works only with
known database objects (probably we would it) and it doesn't format
subqueries well

postgres=# select pg_pretty_query('select x.*, z.* from foo, foo x, x
z where x.a = 10 and x.a = 30 and EXISTS(SELECT * FROM foo WHERE a =
z.a)', true, false);
pg_pretty_query
----------------------------------------------------------
SELECT x.a, z.a +
FROM foo, foo x, x z +
WHERE x.a = 10 AND x.a = 30 AND (EXISTS ( SELECT foo.a+
FROM foo +
WHERE foo.a = z.a))
(1 row)

This looks odd:

postgres=# SELECT pg_pretty_query('SELECT 1, (SELECT max(a.x) +
greatest(2,3) FROM generate_series(4,10,2) a(x)) FROM
generate_series(1,100) GROUP BY 1 ORDER BY 1, 2 USING < NULLS FIRST',
true, false);
pg_pretty_query
------------------------------------------------------------------
SELECT 1, +
( SELECT max(a.x) + GREATEST(2, 3) +
FROM generate_series(4, 10, 2) a(x)) +
FROM generate_series(1, 100) generate_series(generate_series)+
GROUP BY 1::integer +
ORDER BY 1::integer, ( SELECT max(a.x) + GREATEST(2, 3) +
FROM generate_series(4, 10, 2) a(x)) NULLS FIRST
(1 row)

USING < is removed completely (or if I used DESC, NULLS FIRST is then
removed instead), "2" in the order by is expanded to its full query,
and generate_series when used in FROM is repeated with its own name as
a parameter. I'm also not sure about the spacing before each line.
SELECT, FROM and GROUP BY all appear out of alignment from one
another.

it is issue - probably we can start deserialization just from parser
stage, not from rewriter stage - but then code will be significantly
longer and we cannot reuse current code for pretty print view.

Plus it would be nice if we could support something like the following style:

SELECT
field_one,
field_two + field_three
FROM
my_table
INNER JOIN
another_table
ON
my_table.field_one = another_table.another_field
AND
another_table.valid = true
WHERE
field_one > 3
AND
field_two < 10;

it is second issue - probably there are more lovely styles - CELKO,
your and other. I am not sure if we can support more styles in core
(contrib should be better maybe).

Would it be better to have output plugins and not privilege one?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Tom Lane

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Bruce Momjian (#4)

Re: WIP: pg_pretty_query

Bruce Momjian <bruce@momjian.us> writes:

On Tue, Aug 7, 2012 at 04:14:34PM +0200, Pavel Stehule wrote:

last year we are spoke about reusing pretty print view code for some queries.

Here is patch:

I can see this as very useful for people reporting badly-formatted
queries to our email lists. Great!

Allow me to express a contrary opinion: I think this is a bad idea.

* First off, packaging it as a SQL function that takes and returns text
seems rather awkward to use. A lot of places where you might wish for
a SQL pretty-printer aren't going to have a database connection handy
(think "emacs SQL mode").

* The functionality provided is not merely a pretty-printer but sort
of a query validator as well: the function will fail if the query refers
to any tables, columns, functions, etc you don't have in your database.
For some applications that's fine, but others not so much --- in
particular I suspect it's nigh useless for the use-case you mention of
quickly turning an emailed query into something legible. And there's
no way to separate out the reformatting functionality from that.

* As per some of the complaints already registered in this thread,
ruleutils.c is not designed with the goal of being a pretty-printer.
Its primary charter is to support pg_dump by regurgitating rules/views
in an unambiguous form, which does not necessarily look very similar to
what was entered. An example of a transformation that probably nobody
would want in a typical pretty-printing context is expansion of
"SELECT *" lists. But again, there is really no way to turn that off.
Another aspect that seems pretty undesirable for pretty-printing is
loss of any comments embedded in the query text.

I'm very much not in favor of trying to make ruleutils serve two
masters, but that's the game we will be playing if we accept this patch.

In short, the only redeeming value of this patch is that it's short.
The functionality it provides is not something that anyone would come
up with in a green-field design for a pretty-printer, and if we take
it we are going to be faced with a whole lot of redesign requests that
will be painful to implement and will carry heavy risks of breaking
pg_dump and/or EXPLAIN.

regards, tom lane

Andrew Dunstan

andrew@dunslane.net

over 13 years ago

In reply to: Tom Lane (#7)

Re: WIP: pg_pretty_query

On 08/07/2012 02:14 PM, Tom Lane wrote:

* As per some of the complaints already registered in this thread,
ruleutils.c is not designed with the goal of being a pretty-printer.
Its primary charter is to support pg_dump by regurgitating rules/views
in an unambiguous form, which does not necessarily look very similar to
what was entered. An example of a transformation that probably nobody
would want in a typical pretty-printing context is expansion of
"SELECT *" lists. But again, there is really no way to turn that off.
Another aspect that seems pretty undesirable for pretty-printing is
loss of any comments embedded in the query text.

I'm very much not in favor of trying to make ruleutils serve two
masters, but that's the game we will be playing if we accept this patch.

I think this horse has probably bolted. If you wanted to segregate off
this functionality we shouldn't have used things like pg_get_viewdef in
psql, ISTM.

In short, the only redeeming value of this patch is that it's short.
The functionality it provides is not something that anyone would come
up with in a green-field design for a pretty-printer, and if we take
it we are going to be faced with a whole lot of redesign requests that
will be painful to implement and will carry heavy risks of breaking
pg_dump and/or EXPLAIN.

One of the challenges is to have a pretty printer that is kept in sync
with the dialect that's supported. Anything that doesn't use the
backend's parser seems to me to be guaranteed to get out of sync very
quickly.

cheers

andrew

Tom Lane

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Andrew Dunstan (#8)

Re: WIP: pg_pretty_query

Andrew Dunstan <andrew@dunslane.net> writes:

On 08/07/2012 02:14 PM, Tom Lane wrote:

In short, the only redeeming value of this patch is that it's short.

One of the challenges is to have a pretty printer that is kept in sync
with the dialect that's supported. Anything that doesn't use the
backend's parser seems to me to be guaranteed to get out of sync very
quickly.

Sure. I think if we wanted an actually engineered solution, rather than
a half-baked one, ecpg provides a good source of inspiration. One could
imagine a standalone program that reads a query on stdin and emits a
pretty-printed version to stdout, using a parser that is automatically
generated from the backend's grammar with much the same technology used
in recent ecpg releases. I think that would address most of the
complaints I raised: it would be relatively painless to make use of from
contexts that don't have a live database connection, it wouldn't impose
any constraints related to having suitable database content available,
it wouldn't apply any of the multitude of implementation-dependent
transformations that the backend's parser does, and it could be built
(I think) to do something more with comments than just throw them away.

regards, tom lane

#10

Peter Geoghegan

peter@2ndquadrant.com

over 13 years ago

In reply to: Andrew Dunstan (#8)

Re: WIP: pg_pretty_query

On 7 August 2012 20:01, Andrew Dunstan <andrew@dunslane.net> wrote:

On 08/07/2012 02:14 PM, Tom Lane wrote:

* As per some of the complaints already registered in this thread,
ruleutils.c is not designed with the goal of being a pretty-printer.
Its primary charter is to support pg_dump by regurgitating rules/views
in an unambiguous form, which does not necessarily look very similar to
what was entered. An example of a transformation that probably nobody
would want in a typical pretty-printing context is expansion of
"SELECT *" lists. But again, there is really no way to turn that off.
Another aspect that seems pretty undesirable for pretty-printing is
loss of any comments embedded in the query text.

I'm very much not in favor of trying to make ruleutils serve two
masters, but that's the game we will be playing if we accept this patch.

+1. A pretty-printer that makes the query to be cleaned-up actually
undergo parse-analysis seems misconceived to me. This is a tool that
begs to be written in something like Python, as a satellite project,
with much greater flexibility in the format of the SQL that it
outputs.

One of the challenges is to have a pretty printer that is kept in sync with
the dialect that's supported. Anything that doesn't use the backend's parser
seems to me to be guaranteed to get out of sync very quickly.

I'm not convinced of that. Consider the example of cscope, a popular
tool for browsing C code. Its parser was written to be "fuzzy", so
it's actually perfectly usable for C++ and Java, even though that
isn't actually supported, IIRC. Now, I'll grant you that that isn't a
perfectly analogous situation, but it is similar in some ways. If we
add a new clause to select and that bit is generically pretty-printed,
is that really so bad? I have a hard time believing that a well
written fuzzy pretty-printer for Postgres wouldn't deliver the vast
majority of the benefits of an in-core approach, at a small fraction
of the effort. You'd also be able to pretty-print plpgsql function
definitions (a particularly compelling case for this kind of tool),
which any approach based on the backends grammar will never be able to
do (except, perhaps, if you were to do something even more
complicated).

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

#11

Tom Lane

tgl@sss.pgh.pa.us

over 13 years ago

In reply to: Peter Geoghegan (#10)

Re: WIP: pg_pretty_query

Peter Geoghegan <peter@2ndquadrant.com> writes:

On 7 August 2012 20:01, Andrew Dunstan <andrew@dunslane.net> wrote:

One of the challenges is to have a pretty printer that is kept in sync with
the dialect that's supported. Anything that doesn't use the backend's parser
seems to me to be guaranteed to get out of sync very quickly.

I'm not convinced of that. Consider the example of cscope, a popular
tool for browsing C code. Its parser was written to be "fuzzy", so
it's actually perfectly usable for C++ and Java, even though that
isn't actually supported, IIRC. Now, I'll grant you that that isn't a
perfectly analogous situation, but it is similar in some ways.

Yeah. A related question here is whether you want a pretty printer that
is entirely unforgiving of (what it thinks are) syntax errors in the
input. It might be a lot more useful if it didn't spit up on that, but
just did the best it could.

regards, tom lane

#12

Pavel Stehule

pavel.stehule@gmail.com

over 13 years ago

In reply to: Tom Lane (#9)

Re: WIP: pg_pretty_query

2012/8/7 Tom Lane <tgl@sss.pgh.pa.us>:

Andrew Dunstan <andrew@dunslane.net> writes:

On 08/07/2012 02:14 PM, Tom Lane wrote:

In short, the only redeeming value of this patch is that it's short.

One of the challenges is to have a pretty printer that is kept in sync
with the dialect that's supported. Anything that doesn't use the
backend's parser seems to me to be guaranteed to get out of sync very
quickly.

Sure. I think if we wanted an actually engineered solution, rather than
a half-baked one, ecpg provides a good source of inspiration. One could
imagine a standalone program that reads a query on stdin and emits a
pretty-printed version to stdout, using a parser that is automatically
generated from the backend's grammar with much the same technology used
in recent ecpg releases. I think that would address most of the
complaints I raised: it would be relatively painless to make use of from
contexts that don't have a live database connection, it wouldn't impose
any constraints related to having suitable database content available,
it wouldn't apply any of the multitude of implementation-dependent
transformations that the backend's parser does, and it could be built
(I think) to do something more with comments than just throw them away.

it is better, and it is allow more space for possible styling

Pavel

Show quoted text

regards, tom lane

#13

Pavel Stehule

pavel.stehule@gmail.com

over 13 years ago

In reply to: Peter Geoghegan (#10)

Re: WIP: pg_pretty_query

2012/8/7 Peter Geoghegan <peter@2ndquadrant.com>:

On 7 August 2012 20:01, Andrew Dunstan <andrew@dunslane.net> wrote:

On 08/07/2012 02:14 PM, Tom Lane wrote:

* As per some of the complaints already registered in this thread,
ruleutils.c is not designed with the goal of being a pretty-printer.
Its primary charter is to support pg_dump by regurgitating rules/views
in an unambiguous form, which does not necessarily look very similar to
what was entered. An example of a transformation that probably nobody
would want in a typical pretty-printing context is expansion of
"SELECT *" lists. But again, there is really no way to turn that off.
Another aspect that seems pretty undesirable for pretty-printing is
loss of any comments embedded in the query text.

I'm very much not in favor of trying to make ruleutils serve two
masters, but that's the game we will be playing if we accept this patch.

+1. A pretty-printer that makes the query to be cleaned-up actually
undergo parse-analysis seems misconceived to me. This is a tool that
begs to be written in something like Python, as a satellite project,
with much greater flexibility in the format of the SQL that it
outputs.

One of the challenges is to have a pretty printer that is kept in sync with
the dialect that's supported. Anything that doesn't use the backend's parser
seems to me to be guaranteed to get out of sync very quickly.

I'm not convinced of that. Consider the example of cscope, a popular
tool for browsing C code. Its parser was written to be "fuzzy", so
it's actually perfectly usable for C++ and Java, even though that
isn't actually supported, IIRC. Now, I'll grant you that that isn't a
perfectly analogous situation, but it is similar in some ways. If we
add a new clause to select and that bit is generically pretty-printed,
is that really so bad? I have a hard time believing that a well
written fuzzy pretty-printer for Postgres wouldn't deliver the vast
majority of the benefits of an in-core approach, at a small fraction
of the effort. You'd also be able to pretty-print plpgsql function
definitions (a particularly compelling case for this kind of tool),
which any approach based on the backends grammar will never be able to
do (except, perhaps, if you were to do something even more
complicated).

I disagree - simply pretty printer based on technique that we know
from ecpg can be very cheep and it cannot be "fuzzy" because it
integrate PostgreSQL parser.

Pavel

Show quoted text

--
Peter Geoghegan http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services