Parser Hook

Started by Jim Mlodgenskialmost 5 years ago13 messages

jimmy76@gmail.com

almost 5 years ago

2 attachment(s)

As Jan mentioned in his thread about a pluggable wire protocol [0]/messages/by-id/CAGBW59d5SjLyJLt-jwNv+oP6esbD8SCB===11WVe5=dOHLQ5wQ@mail.gmail.com, AWS is
working on a set of extensions for Babelfish. The intention is to not
necessarily have it as a single monolithic extension, but be possible for
people to use pieces of it as they need when they are migrating to
PostgreSQL. Some may just need the functions or data types. Others may need
the stored procedure language. Many times when enterprises are migrating
databases, they have satellite applications that they may not be able to
change or they are on a different schedules than the main application so
the database still needs to support some of the old syntax. A common need
in these situations is the parser.

Attached is a patch to place a hook at the top of the parser to allow for a
pluggable parser. It is modeled after the planner_hook [1]/messages/by-id/27516.1180053940@sss.pgh.pa.us. To test the
hook, I have also attached a simple proof of concept that wraps the parser
in a TRY/CATCH block to catch any parse errors. That could potentially help
a class of users who are sensitive to parse errors ending up in the logs
and leaking PII data or passwords.

-- Jim
-- Amazon Web Services

[0]: /messages/by-id/CAGBW59d5SjLyJLt-jwNv+oP6esbD8SCB===11WVe5=dOHLQ5wQ@mail.gmail.com
/messages/by-id/CAGBW59d5SjLyJLt-jwNv+oP6esbD8SCB===11WVe5=dOHLQ5wQ@mail.gmail.com
[1]: /messages/by-id/27516.1180053940@sss.pgh.pa.us
/messages/by-id/27516.1180053940@sss.pgh.pa.us

Attachments:

parser_hook.patchapplication/octet-stream; name=parser_hook.patchDownload

diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 875de7ba28..0dabbcecc5 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -26,6 +26,9 @@
 #include "parser/parser.h"
 #include "parser/scansup.h"
 
+/* Hook for plugins to get control in raw_parser() */
+parser_hook_type parser_hook = NULL;
+
 static bool check_uescapechar(unsigned char escape);
 static char *str_udeescape(const char *str, char escape,
 						   int position, core_yyscan_t yyscanner);
@@ -40,6 +43,18 @@ static char *str_udeescape(const char *str, char escape,
  */
 List *
 raw_parser(const char *str, RawParseMode mode)
+{
+	List *result;
+
+	if (parser_hook)
+		result = (*parser_hook) (str, mode);
+	else
+		result = standard_raw_parser(str, mode);
+	return result;
+}
+
+List *
+standard_raw_parser(const char *str, RawParseMode mode)
 {
 	core_yyscan_t yyscanner;
 	base_yy_extra_type yyextra;
diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h
index 853b0f1606..ad772e4849 100644
--- a/src/include/parser/parser.h
+++ b/src/include/parser/parser.h
@@ -57,9 +57,14 @@ extern int	backslash_quote;
 extern bool escape_string_warning;
 extern PGDLLIMPORT bool standard_conforming_strings;
 
+/* Hook for plugins to get control in raw_parser() */
+typedef List *(*parser_hook_type) (const char *str, RawParseMode mode);;
+
+extern PGDLLIMPORT parser_hook_type parser_hook;
 
 /* Primary entry point for the raw parsing functions */
 extern List *raw_parser(const char *str, RawParseMode mode);
+extern List *standard_raw_parser(const char *str, RawParseMode mode);
 
 /* Utility functions exported by gram.y (perhaps these should be elsewhere) */
 extern List *SystemFuncName(char *name);

quiet_parser.capplication/octet-stream; name=quiet_parser.cDownload

Andres Freund

andres@anarazel.de

almost 5 years ago

In reply to: Jim Mlodgenski (#1)

Re: Parser Hook

Hi,

On 2021-02-22 11:20:54 -0500, Jim Mlodgenski wrote:

As Jan mentioned in his thread about a pluggable wire protocol [0], AWS is
working on a set of extensions for Babelfish. The intention is to not
necessarily have it as a single monolithic extension, but be possible for
people to use pieces of it as they need when they are migrating to
PostgreSQL. Some may just need the functions or data types. Others may need
the stored procedure language. Many times when enterprises are migrating
databases, they have satellite applications that they may not be able to
change or they are on a different schedules than the main application so
the database still needs to support some of the old syntax. A common need
in these situations is the parser.

Attached is a patch to place a hook at the top of the parser to allow for a
pluggable parser. It is modeled after the planner_hook [1]. To test the
hook, I have also attached a simple proof of concept that wraps the parser
in a TRY/CATCH block to catch any parse errors. That could potentially help
a class of users who are sensitive to parse errors ending up in the logs
and leaking PII data or passwords.

I don't think these are really comparable. In case of the planner hook
you can reuse the normal planner pieces, and just deal with the one part
you need to extend. But we have pretty much no infrastructure to use the
parser in a piecemeal fashion (there's a tiny bit for plpgsql).

Which in turn means that to effectively use the proposed hook to
*extend* what postgres accepts, you need to copy the existing parser,
and hack in your extensions. Which in turn invariably will lead to
complaints about parser changes / breakages the community will get
complaints about in minor releases etc.

I think the cost incurred for providing a hook that only allows
extensions to replace the parser with a modified copy of ours will be
higher than the gain. Note that I'm not saying that I'm against
extending the parser, or hooks - just that I don't think just adding the
hook is a step worth doing on its own.

Imo a functional approach would really need to do the work to allow to
extend & reuse the parser in a piecemeal fashion and *then* add a hook.

Greetings,

Andres Freund

Jim Mlodgenski

jimmy76@gmail.com

almost 5 years ago

In reply to: Andres Freund (#2)

2 attachment(s)

Re: Parser Hook

On Mon, Feb 22, 2021 at 3:52 PM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2021-02-22 11:20:54 -0500, Jim Mlodgenski wrote:

As Jan mentioned in his thread about a pluggable wire protocol [0], AWS

is

working on a set of extensions for Babelfish. The intention is to not
necessarily have it as a single monolithic extension, but be possible for
people to use pieces of it as they need when they are migrating to
PostgreSQL. Some may just need the functions or data types. Others may

need

the stored procedure language. Many times when enterprises are migrating
databases, they have satellite applications that they may not be able to
change or they are on a different schedules than the main application so
the database still needs to support some of the old syntax. A common need
in these situations is the parser.

Attached is a patch to place a hook at the top of the parser to allow

for a

pluggable parser. It is modeled after the planner_hook [1]. To test the
hook, I have also attached a simple proof of concept that wraps the

parser

in a TRY/CATCH block to catch any parse errors. That could potentially

help

a class of users who are sensitive to parse errors ending up in the logs
and leaking PII data or passwords.

I don't think these are really comparable. In case of the planner hook
you can reuse the normal planner pieces, and just deal with the one part
you need to extend. But we have pretty much no infrastructure to use the
parser in a piecemeal fashion (there's a tiny bit for plpgsql).

Which in turn means that to effectively use the proposed hook to
*extend* what postgres accepts, you need to copy the existing parser,
and hack in your extensions. Which in turn invariably will lead to
complaints about parser changes / breakages the community will get
complaints about in minor releases etc.

Going deeper on this, I created another POC as an example. Yes, having a
hook at the top of the parser does mean an extension needs to copy the
existing grammar and modify it. Without a total redesign of how the grammar
is handled, I'm not seeing how else this could be accomplished. The example
I have is adding a CREATE JOB command that a scheduler may use. The amount
of effort needed for an extension maintainer doesn't appear to be that
onerous. Its not ideal having to copy and patch gram.y, but certainly
doable for someone wanting to extend the parser. I also extended the patch
to add another hook in parse_expr.c to see what we would need to add
another keyword and have it call a function like SYSDATE. That appears to
be a lot of work to get all of the potentail hook points that an extension
may want to add and there may not be that many usecases worth the effort.

I think the cost incurred for providing a hook that only allows
extensions to replace the parser with a modified copy of ours will be
higher than the gain. Note that I'm not saying that I'm against
extending the parser, or hooks - just that I don't think just adding the
hook is a step worth doing on its own.

However we would want to modify the parser to allow it to be more plugable
in the future, we would very likely need to have a hook at the top of the
parser to intiailize things like keywords. Having a hook at the top of the
parser along with the existing ProcessUtility_hook allows extension to add
their own utility commands if they wish. I would image that covers many
existing use cases for extensions today.

Show quoted text

Imo a functional approach would really need to do the work to allow to
extend & reuse the parser in a piecemeal fashion and *then* add a hook.

Greetings,

Andres Freund

Attachments:

poc_extended_parser.tar.gzapplication/x-gzip; name=poc_extended_parser.tar.gzDownload

poc_parser_hook.patchapplication/octet-stream; name=poc_parser_hook.patchDownload

diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f869e159d6..bbe0dee8c6 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -41,6 +41,8 @@
 /* GUC parameters */
 bool		Transform_null_equals = false;
 
+/* Hook for plugins to get control in transformExprRecurse() */
+parse_expr_hook_type parse_expr_hook = NULL;
 
 static Node *transformExprRecurse(ParseState *pstate, Node *expr);
 static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
@@ -307,9 +309,14 @@ transformExprRecurse(ParseState *pstate, Node *expr)
 			}
 
 		default:
-			/* should not reach here */
-			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
-			result = NULL;		/* keep compiler quiet */
+			if (parse_expr_hook)
+				result = (*parse_expr_hook) (pstate, expr);
+			else
+			{
+				/* should not reach here */
+				elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
+				result = NULL;		/* keep compiler quiet */
+			}
 			break;
 	}
 
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index 875de7ba28..0dabbcecc5 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -26,6 +26,9 @@
 #include "parser/parser.h"
 #include "parser/scansup.h"
 
+/* Hook for plugins to get control in raw_parser() */
+parser_hook_type parser_hook = NULL;
+
 static bool check_uescapechar(unsigned char escape);
 static char *str_udeescape(const char *str, char escape,
 						   int position, core_yyscan_t yyscanner);
@@ -40,6 +43,18 @@ static char *str_udeescape(const char *str, char escape,
  */
 List *
 raw_parser(const char *str, RawParseMode mode)
+{
+	List *result;
+
+	if (parser_hook)
+		result = (*parser_hook) (str, mode);
+	else
+		result = standard_raw_parser(str, mode);
+	return result;
+}
+
+List *
+standard_raw_parser(const char *str, RawParseMode mode)
 {
 	core_yyscan_t yyscanner;
 	base_yy_extra_type yyextra;
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 05bb698cf4..6a69599db6 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -3101,7 +3101,9 @@ CreateCommandTag(Node *parsetree)
 				}
 			}
 			break;
-
+		case T_ExtensibleNode:
+			tag = CMDTAG_EXTENDED_COMMAND;
+			break;
 		default:
 			elog(WARNING, "unrecognized node type: %d",
 				 (int) nodeTag(parsetree));
diff --git a/src/include/parser/parse_expr.h b/src/include/parser/parse_expr.h
index 8ac4a0a369..6465a8d03e 100644
--- a/src/include/parser/parse_expr.h
+++ b/src/include/parser/parse_expr.h
@@ -15,6 +15,10 @@
 
 #include "parser/parse_node.h"
 
+/* Hook for plugins to get control in transformExprRecurse() */
+typedef Node *(*parse_expr_hook_type) (ParseState *pstate, Node *expr);
+extern PGDLLIMPORT parse_expr_hook_type parse_expr_hook;
+
 /* GUC parameters */
 extern bool Transform_null_equals;
 
diff --git a/src/include/parser/parser.h b/src/include/parser/parser.h
index 853b0f1606..8b1b915410 100644
--- a/src/include/parser/parser.h
+++ b/src/include/parser/parser.h
@@ -57,9 +57,13 @@ extern int	backslash_quote;
 extern bool escape_string_warning;
 extern PGDLLIMPORT bool standard_conforming_strings;
 
+/* Hook for plugins to get control in raw_parser() */
+typedef List *(*parser_hook_type) (const char *str, RawParseMode mode);;
+extern PGDLLIMPORT parser_hook_type parser_hook;
 
 /* Primary entry point for the raw parsing functions */
 extern List *raw_parser(const char *str, RawParseMode mode);
+extern List *standard_raw_parser(const char *str, RawParseMode mode);
 
 /* Utility functions exported by gram.y (perhaps these should be elsewhere) */
 extern List *SystemFuncName(char *name);
diff --git a/src/include/tcop/cmdtaglist.h b/src/include/tcop/cmdtaglist.h
index 9ba24d4ca9..c0d3427db5 100644
--- a/src/include/tcop/cmdtaglist.h
+++ b/src/include/tcop/cmdtaglist.h
@@ -178,6 +178,7 @@ PG_CMDTAG(CMDTAG_DROP_USER_MAPPING, "DROP USER MAPPING", true, false, false)
 PG_CMDTAG(CMDTAG_DROP_VIEW, "DROP VIEW", true, false, false)
 PG_CMDTAG(CMDTAG_EXECUTE, "EXECUTE", false, false, false)
 PG_CMDTAG(CMDTAG_EXPLAIN, "EXPLAIN", false, false, false)
+PG_CMDTAG(CMDTAG_EXTENDED_COMMAND, "EXTENDED COMMAND", false, false, false)
 PG_CMDTAG(CMDTAG_FETCH, "FETCH", false, false, true)
 PG_CMDTAG(CMDTAG_GRANT, "GRANT", true, false, false)
 PG_CMDTAG(CMDTAG_GRANT_ROLE, "GRANT ROLE", false, false, false)

Julien Rouhaud

rjuju123@gmail.com

almost 5 years ago

In reply to: Jim Mlodgenski (#3)

Re: Parser Hook

On Mon, Mar 15, 2021 at 11:48:58AM -0400, Jim Mlodgenski wrote:

Going deeper on this, I created another POC as an example. Yes, having a
hook at the top of the parser does mean an extension needs to copy the
existing grammar and modify it. Without a total redesign of how the grammar
is handled, I'm not seeing how else this could be accomplished. The example
I have is adding a CREATE JOB command that a scheduler may use. The amount
of effort needed for an extension maintainer doesn't appear to be that
onerous. Its not ideal having to copy and patch gram.y, but certainly
doable for someone wanting to extend the parser.

AFAIK nothing in bison prevents you from silently ignoring unhandled grammar
rather than erroring out. So you could have a parser hook called first, and
if no valid command was recognized fall back on the original parser. I'm not
saying that it's a good idea or will be performant (although the added grammar
will likely be very small, so it may not be that bad), but you could definitely
avoid the need to duplicate the whole grammar in each and every extension, and
allow multiple extensions extending the grammar.

That won't reduce the difficulty of producing a correct parse tree if you want
to implement some syntactic sugar for already handled DML though.

However we would want to modify the parser to allow it to be more plugable
in the future, we would very likely need to have a hook at the top of the
parser to intiailize things like keywords. Having a hook at the top of the
parser along with the existing ProcessUtility_hook allows extension to add
their own utility commands if they wish. I would image that covers many
existing use cases for extensions today.

What happens if multiple extensions want to add their own new grammar? There
will at least be possible conflicts with the additional node tags.

Also, I'm not sure that many extensions would really benefit from custom
utility command, as you can already do pretty much anything you want using SQL
functions. For instance it would be nice for hypopg to be able to support

CREATE HYPOTHETICAL INDEX ...

rather than

SELECT hypopg_create_index('CREATE INDEX...')

But really the only benefit would be autocompletion, which still wouldn't be
possible as psql autocompletion won't be extended. And even if it somehow was,
I wouldn't expect all psql clients to be setup as needed.

Joel Jacobson

joel@compiler.org

almost 5 years ago

In reply to: Jim Mlodgenski (#3)

Re: Parser Hook

On Mon, Mar 15, 2021, at 16:48, Jim Mlodgenski wrote:

The example I have is adding a CREATE JOB command that a scheduler may use.

This CREATE JOB thing sounds interesting.

Are you working on adding the ability to schedule SQL-commands to run in the background,
similar to cronjob and/or adding ampersand ("&") to a command in the terminal?

I couldn't figure it out by reading the patch.
I noted the "insert into extended_parser.jobs" query,
which to me sounds like the job would be some kind of parsing job,
but that seems strange.

/Joel

Jim Mlodgenski

jimmy76@gmail.com

almost 5 years ago

In reply to: Julien Rouhaud (#4)

Re: Parser Hook

On Mon, Mar 15, 2021 at 12:43 PM Julien Rouhaud <rjuju123@gmail.com> wrote:

On Mon, Mar 15, 2021 at 11:48:58AM -0400, Jim Mlodgenski wrote:

Going deeper on this, I created another POC as an example. Yes, having a
hook at the top of the parser does mean an extension needs to copy the
existing grammar and modify it. Without a total redesign of how the

grammar

is handled, I'm not seeing how else this could be accomplished. The

example

I have is adding a CREATE JOB command that a scheduler may use. The

amount

of effort needed for an extension maintainer doesn't appear to be that
onerous. Its not ideal having to copy and patch gram.y, but certainly
doable for someone wanting to extend the parser.

AFAIK nothing in bison prevents you from silently ignoring unhandled
grammar
rather than erroring out. So you could have a parser hook called first,
and
if no valid command was recognized fall back on the original parser. I'm
not
saying that it's a good idea or will be performant (although the added
grammar
will likely be very small, so it may not be that bad), but you could
definitely
avoid the need to duplicate the whole grammar in each and every extension,
and
allow multiple extensions extending the grammar.

That's a good point. That does simplify it

That won't reduce the difficulty of producing a correct parse tree if you
want
to implement some syntactic sugar for already handled DML though.

Complex DML like Oracle's outer join syntax is tricky no matter which way
you slice it.

However we would want to modify the parser to allow it to be more

plugable

in the future, we would very likely need to have a hook at the top of the
parser to intiailize things like keywords. Having a hook at the top of

the

parser along with the existing ProcessUtility_hook allows extension to

add

their own utility commands if they wish. I would image that covers many
existing use cases for extensions today.

What happens if multiple extensions want to add their own new grammar?
There
will at least be possible conflicts with the additional node tags.

The extensions would need to play nice with one another like they do with
other hooks and properly call the previous hook.

Also, I'm not sure that many extensions would really benefit from custom
utility command, as you can already do pretty much anything you want using
SQL
functions. For instance it would be nice for hypopg to be able to support

CREATE HYPOTHETICAL INDEX ...

rather than

SELECT hypopg_create_index('CREATE INDEX...')

But really the only benefit would be autocompletion, which still wouldn't
be
possible as psql autocompletion won't be extended. And even if it somehow
was,
I wouldn't expect all psql clients to be setup as needed.

Having the functionality exposed through DDL gives it a more native feel to
it for users and for some more likely use the exentions.

Jim Mlodgenski

jimmy76@gmail.com

almost 5 years ago

In reply to: Joel Jacobson (#5)

Re: Parser Hook

On Mon, Mar 15, 2021 at 12:58 PM Joel Jacobson <joel@compiler.org> wrote:

On Mon, Mar 15, 2021, at 16:48, Jim Mlodgenski wrote:

The example I have is adding a CREATE JOB command that a scheduler may
use.

This CREATE JOB thing sounds interesting.

Are you working on adding the ability to schedule SQL-commands to run in
the background,
similar to cronjob and/or adding ampersand ("&") to a command in the
terminal?

No, it was just a sample of how the parser could be extended to all an
extension like pg_cron can use CREATE JOB instead of calling a function
like SELECT cron.schedule(...)

Pavel Stehule

pavel.stehule@gmail.com

almost 5 years ago

In reply to: Julien Rouhaud (#4)

Re: Parser Hook

Also, I'm not sure that many extensions would really benefit from custom
utility command, as you can already do pretty much anything you want using
SQL
functions. For instance it would be nice for hypopg to be able to support

CREATE HYPOTHETICAL INDEX ...

rather than

SELECT hypopg_create_index('CREATE INDEX...')

But really the only benefit would be autocompletion, which still wouldn't
be
possible as psql autocompletion won't be extended. And even if it somehow
was,
I wouldn't expect all psql clients to be setup as needed.

The extending parser can be interesting for two cases

a) compatibility with other databases

b) experimental supports of some features standard (current or future)

c) some experiments - using

CREATE PIPE xxx(xxx), or CREATE HYPERCUBE xxx (xxx) is more readable more
SQLish (more natural syntax) than

SELECT create_pipe('name', 'a1', 'int', ...) or SELECT ext('CREATE PIPE ...)

Possibility to work with a parser is one main reason for forking postgres.
Lot of interestings projects fail on the cost of maintaining their own fork.

Maybe a good enough possibility is the possibility to inject an own parser
called before Postgres parser. Then it can do a transformation from "CREATE
PIPE ..." to "SELECT extparse("CREATE PIPE()". There can be a switch if
returned content is string for reparsing or already prepared AST.

It can be very interesting feature.

Pavel

Julien Rouhaud

rjuju123@gmail.com

almost 5 years ago

In reply to: Pavel Stehule (#8)

Re: Parser Hook

On Mon, Mar 15, 2021 at 06:05:52PM +0100, Pavel Stehule wrote:

Possibility to work with a parser is one main reason for forking postgres.
Lot of interestings projects fail on the cost of maintaining their own fork.

Maybe a good enough possibility is the possibility to inject an own parser
called before Postgres parser. Then it can do a transformation from "CREATE
PIPE ..." to "SELECT extparse("CREATE PIPE()". There can be a switch if
returned content is string for reparsing or already prepared AST.

Having a hook that returns a reformatted query string would definitely be
easier to write compared to generating an AST, but the overhead of parsing the
query twice plus deparsing it will probably make that approach way too
expensive in many usecases, so we shouldn't go that way.

#10

Pavel Stehule

pavel.stehule@gmail.com

almost 5 years ago

In reply to: Julien Rouhaud (#9)

Re: Parser Hook

po 15. 3. 2021 v 18:18 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Mon, Mar 15, 2021 at 06:05:52PM +0100, Pavel Stehule wrote:

Possibility to work with a parser is one main reason for forking

postgres.

Lot of interestings projects fail on the cost of maintaining their own

fork.

Maybe a good enough possibility is the possibility to inject an own

parser

called before Postgres parser. Then it can do a transformation from

"CREATE

PIPE ..." to "SELECT extparse("CREATE PIPE()". There can be a switch if
returned content is string for reparsing or already prepared AST.

Having a hook that returns a reformatted query string would definitely be
easier to write compared to generating an AST, but the overhead of parsing
the
query twice plus deparsing it will probably make that approach way too
expensive in many usecases, so we shouldn't go that way.

yes - so it can be nice to have more possibilities.

parsing is expensive - but on today computers, the cost of parsing is low -
the optimization is significantly more expensive.

I wrote some patches in this area (all rejected by Tom :)), and a lot of
work can be done after parser and before the analysis stage. Probably, the
parser hook is not good enough, there should be an analysis stage hook too.

#11

Julien Rouhaud

rjuju123@gmail.com

almost 5 years ago

In reply to: Pavel Stehule (#10)

Re: Parser Hook

On Mon, Mar 15, 2021 at 06:41:36PM +0100, Pavel Stehule wrote:

po 15. 3. 2021 v 18:18 odesï¿½latel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Mon, Mar 15, 2021 at 06:05:52PM +0100, Pavel Stehule wrote:

Possibility to work with a parser is one main reason for forking

postgres.

Lot of interestings projects fail on the cost of maintaining their own

fork.

Maybe a good enough possibility is the possibility to inject an own

parser

called before Postgres parser. Then it can do a transformation from

"CREATE

PIPE ..." to "SELECT extparse("CREATE PIPE()". There can be a switch if
returned content is string for reparsing or already prepared AST.

Having a hook that returns a reformatted query string would definitely be
easier to write compared to generating an AST, but the overhead of parsing
the
query twice plus deparsing it will probably make that approach way too
expensive in many usecases, so we shouldn't go that way.

yes - so it can be nice to have more possibilities.

parsing is expensive - but on today computers, the cost of parsing is low -
the optimization is significantly more expensive.

I wrote some patches in this area (all rejected by Tom :)), and a lot of
work can be done after parser and before the analysis stage. Probably, the
parser hook is not good enough, there should be an analysis stage hook too.

If you need an parse/analyse hook, it means that you're extending the AST, so
you probably also need executor support for that right? Or is it only to
support syntactic sugar in the analysis rather than parsing?

#12

Pavel Stehule

pavel.stehule@gmail.com

almost 5 years ago

In reply to: Julien Rouhaud (#11)

Re: Parser Hook

po 15. 3. 2021 v 18:54 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Mon, Mar 15, 2021 at 06:41:36PM +0100, Pavel Stehule wrote:

po 15. 3. 2021 v 18:18 odesílatel Julien Rouhaud <rjuju123@gmail.com>
napsal:

On Mon, Mar 15, 2021 at 06:05:52PM +0100, Pavel Stehule wrote:

Possibility to work with a parser is one main reason for forking

postgres.

Lot of interestings projects fail on the cost of maintaining their

own

fork.

Maybe a good enough possibility is the possibility to inject an own

parser

called before Postgres parser. Then it can do a transformation from

"CREATE

PIPE ..." to "SELECT extparse("CREATE PIPE()". There can be a switch

if

returned content is string for reparsing or already prepared AST.

Having a hook that returns a reformatted query string would definitely

be

easier to write compared to generating an AST, but the overhead of

parsing

the
query twice plus deparsing it will probably make that approach way too
expensive in many usecases, so we shouldn't go that way.

yes - so it can be nice to have more possibilities.

parsing is expensive - but on today computers, the cost of parsing is

low -

the optimization is significantly more expensive.

I wrote some patches in this area (all rejected by Tom :)), and a lot of
work can be done after parser and before the analysis stage. Probably,

the

parser hook is not good enough, there should be an analysis stage hook

too.

If you need an parse/analyse hook, it means that you're extending the AST,
so
you probably also need executor support for that right? Or is it only to
support syntactic sugar in the analysis rather than parsing?

I think all necessary executor's hooks are available already. On the
executor level, I was able to do all what I wanted.

I miss just preparsing, and postparsing hooks.

Regards

Pavel

#13

Joshua Drake

jd@commandprompt.com

almost 5 years ago

In reply to: Julien Rouhaud (#4)

Re: Parser Hook

Also, I'm not sure that many extensions would really benefit from custom
utility command, as you can already do pretty much anything you want using
SQL
functions. For instance it would be nice for hypopg to be able to support

CREATE HYPOTHETICAL INDEX ...

rather than

SELECT hypopg_create_index('CREATE INDEX...')

But really the only benefit would be autocompletion, which still wouldn't
be
possible as psql autocompletion won't be extended. And even if it somehow
was,
I wouldn't expect all psql clients to be setup as needed.

"technically" speaking you are correct, usability speaking you are not. We
ran into this discussion previously when dealing with replication. There is
certainly a history to calling functions to do what the grammar (from a
usability perspective) should do and that is not really a good history. It
is just what we are all used to. Looking at what you wrote above as a DBA
or even an average developer: CREATE HYPOTHETICAL INDEX makes much more
sense than the SELECT execution.

P.S. I had to write HYPOTHETICAL 4 times, I kept typing HYPOTECHNICAL :/