WIP: hooking parser

Started by Pavel Stehulealmost 17 years ago33 messages
#1Pavel Stehule
pavel.stehule@gmail.com
2 attachment(s)

Hello

some years ago there was some plans about parser's extensibility. I am
able write bison extensions, but I thing, so lot of work should be
done via hooking of transform stage.

I did small example - real implementation of Oracle's decode function.
It's based on hooking transformExpr function.

It works. And I thing, this should to solve lot of special task
related to increase compatibility with Oracle, Informix, or it could
be usefull for some others (json support).

postgres=# load 'decode';
LOAD
postgres=# select
decode(null::integer,2,'ahoj',3,'Petr',1,'Pavel',null, 'jaja');
decode
--------
jaja
(1 row)

postgres=# select decode(3,2,'ahoj',3,'Petr',1,'Pavel',null, 'jaja');
decode
--------
Petr
(1 row)

postgres=# select decode(6,2,'ahoj',3,'Petr',1,'Pavel',null, 'jaja');
decode
--------

(1 row)

postgres=# select decode(6,2,'ahoj',3,'Petr',1,'Pavel',null, 'jaja', 'Milos');
decode
--------
Milos
(1 row)

Any ideas, notes?

regards
Pavel Stehule

Attachments:

decode.ctext/plain; charset=US-ASCII; name=decode.cDownload
parse_expr.difftext/x-patch; charset=US-ASCII; name=parse_expr.diffDownload
*** ./parse_expr.c.orig	2009-02-10 22:50:47.000000000 +0100
--- ./parse_expr.c	2009-02-11 06:45:37.000000000 +0100
***************
*** 36,41 ****
--- 36,44 ----
  
  bool		Transform_null_equals = false;
  
+ /* Hook for plugins to get control in transformExpr */
+ ParseExprTransform_hook_type ParseExprTransform_hook = NULL;
+ 
  static Node *transformParamRef(ParseState *pstate, ParamRef *pref);
  static Node *transformAExprOp(ParseState *pstate, A_Expr *a);
  static Node *transformAExprAnd(ParseState *pstate, A_Expr *a);
***************
*** 97,105 ****
--- 100,118 ----
   * a Const.  More care is needed for node types that are used as both
   * input and output of transformExpr; see SubLink for example.
   */
+  
  Node *
  transformExpr(ParseState *pstate, Node *expr)
  {
+ 	if (ParseExprTransform_hook)
+ 		return (*ParseExprTransform_hook) (pstate, expr);
+ 	else
+ 		return standard_transformExpr(pstate, expr);
+ }
+  
+ Node *
+ standard_transformExpr(ParseState *pstate, Node *expr)
+ {
  	Node	   *result = NULL;
  
  	if (expr == NULL)
***************
*** 107,113 ****
  
  	/* Guard against stack overflow due to overly complex expressions */
  	check_stack_depth();
! 
  	switch (nodeTag(expr))
  	{
  		case T_ColumnRef:
--- 120,126 ----
  
  	/* Guard against stack overflow due to overly complex expressions */
  	check_stack_depth();
! 	
  	switch (nodeTag(expr))
  	{
  		case T_ColumnRef:
***************
*** 313,319 ****
  			/* should not reach here */
  			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
  			break;
! 	}
  
  	return result;
  }
--- 326,332 ----
  			/* should not reach here */
  			elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
  			break;
! 	}	
  
  	return result;
  }
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#1)
Re: WIP: hooking parser

Pavel Stehule <pavel.stehule@gmail.com> writes:

some years ago there was some plans about parser's extensibility. I am
able write bison extensions, but I thing, so lot of work should be
done via hooking of transform stage.

This strikes me as next door to useless, because it can only handle
things that look like valid expressions to the existing grammar.
So pretty much all you can do is weird sorts of functions, which are
already accommodated at less effort with existing features such as
function overloading.

A hook check in that particular place is not going to have negligible
performance impact, since it's going to be hit tens or hundreds or
thousands of times per query rather than just once. So it's going to
require more than a marginal use case to persuade me we ought to have
it.

regards, tom lane

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#2)
Re: WIP: hooking parser

2009/2/11 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

some years ago there was some plans about parser's extensibility. I am
able write bison extensions, but I thing, so lot of work should be
done via hooking of transform stage.

This strikes me as next door to useless, because it can only handle
things that look like valid expressions to the existing grammar.
So pretty much all you can do is weird sorts of functions, which are
already accommodated at less effort with existing features such as
function overloading.

Usually we don't need change syntax. But we need to control of
coercion stage. I afraid so function overloading is bad when there lot
of combination, and polymorphic functions are not enough.

for some cases we need more polymorphic types - anyelement1,
anyelement2, anyarray1, ...

A hook check in that particular place is not going to have negligible
performance impact, since it's going to be hit tens or hundreds or
thousands of times per query rather than just once. So it's going to
require more than a marginal use case to persuade me we ought to have
it.

Because this stage isn't repeated (I don't expect bigger performance
impact), it's similar to other's hooks. But, sure, wrong hook should
do strange things. It's risk.

+ argument - it increase customisability and allows gentle syntax
tuning. Function decode is first sample from today morning.

regards
Pavel Stehule

Show quoted text

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#3)
Re: WIP: hooking parser

Pavel Stehule <pavel.stehule@gmail.com> writes:

2009/2/11 Tom Lane <tgl@sss.pgh.pa.us>:

This strikes me as next door to useless, because it can only handle
things that look like valid expressions to the existing grammar.
So pretty much all you can do is weird sorts of functions, which are
already accommodated at less effort with existing features such as
function overloading.

Usually we don't need change syntax. But we need to control of
coercion stage. I afraid so function overloading is bad when there lot
of combination, and polymorphic functions are not enough.
for some cases we need more polymorphic types - anyelement1,
anyelement2, anyarray1, ...

Well, then we should go fix those things.

A hook function whose purpose is to fundamentally change query semantics
strikes me as a very dangerous thing anyway, because your queries either
stop working or suddenly do something completely different if the hook
happens not to be loaded. The hooks we've accepted to date are intended
for either monitoring or experimentation with planner behavior, neither
of which will change query semantics.

regards, tom lane

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#4)
Re: WIP: hooking parser

2009/2/11 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2009/2/11 Tom Lane <tgl@sss.pgh.pa.us>:

This strikes me as next door to useless, because it can only handle
things that look like valid expressions to the existing grammar.
So pretty much all you can do is weird sorts of functions, which are
already accommodated at less effort with existing features such as
function overloading.

Usually we don't need change syntax. But we need to control of
coercion stage. I afraid so function overloading is bad when there lot
of combination, and polymorphic functions are not enough.
for some cases we need more polymorphic types - anyelement1,
anyelement2, anyarray1, ...

Well, then we should go fix those things.

I am for it, and I doing on it.

A hook function whose purpose is to fundamentally change query semantics
strikes me as a very dangerous thing anyway, because your queries either
stop working or suddenly do something completely different if the hook
happens not to be loaded. The hooks we've accepted to date are intended
for either monitoring or experimentation with planner behavior, neither
of which will change query semantics.

I agree, and I understand well this risk. But still it is better and
wide used than custom patching. Look on executor hook. There are only
three cases - useful cases. It is some corner, that is far for general
using (integrating into core) and too sugar for ignore it for ever.
It's possibility, nothing less, nothing more.

regards
Pavel Stehule

Show quoted text

regards, tom lane

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Pavel Stehule (#1)
Re: WIP: hooking parser

On Wednesday 11 February 2009 12:05:03 Pavel Stehule wrote:

It works. And I thing, this should to solve lot of special task
related to increase compatibility with Oracle, Informix, or it could
be usefull for some others (json support).

postgres=# load 'decode';
LOAD
postgres=# select
decode(null::integer,2,'ahoj',3,'Petr',1,'Pavel',null, 'jaja');
decode
--------
jaja
(1 row)

I think what you want here is some way to define a function that takes an
arbitrary number of arguments of arbitrary type and let the function figure
everything out. I see no reason why this can't be a variant on CREATE
FUNCTION, except that of course you need to figure out some API and function
resolution details. But it doesn't have to be a completely different concept
like a binary plugin.

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#6)
Re: WIP: hooking parser

2009/2/12 Peter Eisentraut <peter_e@gmx.net>:

On Wednesday 11 February 2009 12:05:03 Pavel Stehule wrote:

It works. And I thing, this should to solve lot of special task
related to increase compatibility with Oracle, Informix, or it could
be usefull for some others (json support).

postgres=# load 'decode';
LOAD
postgres=# select
decode(null::integer,2,'ahoj',3,'Petr',1,'Pavel',null, 'jaja');
decode
--------
jaja
(1 row)

I think what you want here is some way to define a function that takes an
arbitrary number of arguments of arbitrary type and let the function figure
everything out. I see no reason why this can't be a variant on CREATE
FUNCTION, except that of course you need to figure out some API and function
resolution details. But it doesn't have to be a completely different concept
like a binary plugin.

Actually I need add some metada to parameter list, Question is, what
is more simple and more readable - descriptive or procedural solution.
And what we are able to implement.

example DECODE(any1, any2, (asany1, asany2).,(asany2)+)

Actually I thing so with some hook of parser transform stage we should
to this task more simply.

I found next sample, that should be solved via hook - emulation of
Oracle behave '' is null.

Regards
Pavel

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#6)
Re: WIP: hooking parser

Peter Eisentraut <peter_e@gmx.net> writes:

I think what you want here is some way to define a function that takes an
arbitrary number of arguments of arbitrary type and let the function figure
everything out. I see no reason why this can't be a variant on CREATE
FUNCTION, except that of course you need to figure out some API and function
resolution details.

We've already got "variadic any" functions --- the problem is to tell
the parser what the function's result type will be, given a particular
parameter list. I agree that hooking transformExpr is not exactly the
most ideal way to attack that from a performance or complexity
standpoint.

regards, tom lane

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#8)
Re: WIP: hooking parser

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I think what you want here is some way to define a function that takes an
arbitrary number of arguments of arbitrary type and let the function figure
everything out. I see no reason why this can't be a variant on CREATE
FUNCTION, except that of course you need to figure out some API and function
resolution details.

We've already got "variadic any" functions --- the problem is to tell
the parser what the function's result type will be, given a particular
parameter list. I agree that hooking transformExpr is not exactly the
most ideal way to attack that from a performance or complexity
standpoint.

What is the defined return type logic for the decode() function anyway?
If you want the full CASE-like resolution logic, it might be very hard
to fit that into a general system.

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Peter Eisentraut (#9)
Re: WIP: hooking parser

Peter Eisentraut wrote:

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I think what you want here is some way to define a function that
takes an arbitrary number of arguments of arbitrary type and let the
function figure everything out. I see no reason why this can't be a
variant on CREATE FUNCTION, except that of course you need to figure
out some API and function resolution details.

We've already got "variadic any" functions --- the problem is to tell
the parser what the function's result type will be, given a particular
parameter list. I agree that hooking transformExpr is not exactly the
most ideal way to attack that from a performance or complexity
standpoint.

What is the defined return type logic for the decode() function anyway?
If you want the full CASE-like resolution logic, it might be very hard
to fit that into a general system.

And on top of that, decode() is supposed to do short-circuit evaluation
of the arguments.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#11Pavel Stehule
pavel.stehule@gmail.com
In reply to: Heikki Linnakangas (#10)
Re: WIP: hooking parser

2009/2/13 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:

Peter Eisentraut wrote:

Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I think what you want here is some way to define a function that takes
an arbitrary number of arguments of arbitrary type and let the function
figure everything out. I see no reason why this can't be a variant on
CREATE FUNCTION, except that of course you need to figure out some API and
function resolution details.

We've already got "variadic any" functions --- the problem is to tell
the parser what the function's result type will be, given a particular
parameter list. I agree that hooking transformExpr is not exactly the
most ideal way to attack that from a performance or complexity
standpoint.

What is the defined return type logic for the decode() function anyway?
If you want the full CASE-like resolution logic, it might be very hard to
fit that into a general system.

And on top of that, decode() is supposed to do short-circuit evaluation of
the arguments.

yes, you should to look so this work do transform hook very vell

regards
Pavel

Show quoted text

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#12Peter Eisentraut
peter_e@gmx.net
In reply to: Heikki Linnakangas (#10)
Re: WIP: hooking parser

Heikki Linnakangas wrote:

And on top of that, decode() is supposed to do short-circuit evaluation
of the arguments.

Then the only solution is to hack it right into the parser.

There is an existing decode() function however ...

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#12)
1 attachment(s)
Re: WIP: hooking parser

Next sample of parser hook using:

attachment contains module that transform every empty string to null.
I am not sure, if this behave is exactly compatible with Oracle, but
for first iteration it is good.

postgres=# select length('') is null;
?column?
----------
t
(1 row)

I thing, so this should be used for emulation of some constructors too.

Regards
Pavel Stehule

2009/2/13 Peter Eisentraut <peter_e@gmx.net>:

Show quoted text

Heikki Linnakangas wrote:

And on top of that, decode() is supposed to do short-circuit evaluation of
the arguments.

Then the only solution is to hack it right into the parser.

There is an existing decode() function however ...

Attachments:

null.ctext/plain; charset=US-ASCII; name=null.cDownload
#14Sam Mason
sam@samason.me.uk
In reply to: Pavel Stehule (#13)
Re: WIP: hooking parser

On Mon, Feb 16, 2009 at 02:35:54PM +0100, Pavel Stehule wrote:

attachment contains module that transform every empty string to null.

Why would anyone ever want to do this? This would appear to break all
sorts of things in very non-obvious ways:

SELECT CASE s WHEN '' THEN 'empty string' ELSE s END FROM foo;
UPDATE foo SET s = NULL WHERE s = '';

would no longer do the expected thing. It would only do the expected
thing (in my eyes) when strings of zero length were actually being
inserted into the database. Like:

INSERT INTO foo (s) VALUES ('');
UPDATE foo SET s = '' WHERE s = 'empty string';

Or am I missing something obvious?

--
Sam http://samason.me.uk/

#15Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sam Mason (#14)
Re: WIP: hooking parser

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 02:35:54PM +0100, Pavel Stehule wrote:

attachment contains module that transform every empty string to null.

Why would anyone ever want to do this? This would appear to break all
sorts of things in very non-obvious ways:

I agree, so this behave is strange - but Oracle does it.

so normal query in Oracle for empty value looks like

select * from people where surname is null;

and some application expect transformation from '' to null.

http://www.thunderguy.com/semicolon/2003/04/26/oracle-empty-string-null/

so these modules (decode, oraemptystr) decrease differences between
PostgreSQL and Oracle.

Regards
Pavel Stehule

p.s. I am not Oracle expert, I expect so here are more qualified men.

Show quoted text

SELECT CASE s WHEN '' THEN 'empty string' ELSE s END FROM foo;
UPDATE foo SET s = NULL WHERE s = '';

would no longer do the expected thing. It would only do the expected
thing (in my eyes) when strings of zero length were actually being
inserted into the database. Like:

INSERT INTO foo (s) VALUES ('');
UPDATE foo SET s = '' WHERE s = 'empty string';

Or am I missing something obvious?

--
Sam http://samason.me.uk/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Sam Mason
sam@samason.me.uk
In reply to: Pavel Stehule (#15)
Re: WIP: hooking parser

On Mon, Feb 16, 2009 at 03:21:12PM +0100, Pavel Stehule wrote:

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 02:35:54PM +0100, Pavel Stehule wrote:

attachment contains module that transform every empty string to null.

Why would anyone ever want to do this? This would appear to break all
sorts of things in very non-obvious ways:

I agree, so this behave is strange - but Oracle does it.

so normal query in Oracle for empty value looks like

select * from people where surname is null;

and some application expect transformation from '' to null.

http://www.thunderguy.com/semicolon/2003/04/26/oracle-empty-string-null/

that's pretty grim!

<rant>
I'd agree with the comment saying "A string variable that can't be
set empty is like a number variable that can't be set zero".

Oracle have just thrown out (or, at best, rewritten) the inductive
base case for strings. For numbers you (logically) start with a Zero
and a Succ (successor function) and model numbers as an arbitrary
number of applications of Succ to Zero ("2" is (Succ (Succ Zero))).
For strings, you start with an empty string and an append function
("hi" being (Append (Append '' \h) \i)).
</rant>

so these modules (decode, oraemptystr) decrease differences between
PostgreSQL and Oracle.

wouldn't it be better/easier to extend something like pgpool to
transform Oracle style SQL code to PG style code? You'd certainly
be able to get it more complete in reasonable amounts of time, but
performance would suffer when you went to look up table definitions to
check the types of various things.

--
Sam http://samason.me.uk/

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#13)
Re: WIP: hooking parser

Pavel Stehule <pavel.stehule@gmail.com> writes:

Next sample of parser hook using:
attachment contains module that transform every empty string to null.
I am not sure, if this behave is exactly compatible with Oracle,

Surely a parser hook like this would have nothing whatsoever to do
with Oracle's behavior.

regards, tom lane

#18Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sam Mason (#16)
Re: WIP: hooking parser

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 03:21:12PM +0100, Pavel Stehule wrote:

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 02:35:54PM +0100, Pavel Stehule wrote:

attachment contains module that transform every empty string to null.

Why would anyone ever want to do this? This would appear to break all
sorts of things in very non-obvious ways:

I agree, so this behave is strange - but Oracle does it.

so normal query in Oracle for empty value looks like

select * from people where surname is null;

and some application expect transformation from '' to null.

http://www.thunderguy.com/semicolon/2003/04/26/oracle-empty-string-null/

that's pretty grim!

<rant>
I'd agree with the comment saying "A string variable that can't be
set empty is like a number variable that can't be set zero".

Oracle have just thrown out (or, at best, rewritten) the inductive
base case for strings. For numbers you (logically) start with a Zero
and a Succ (successor function) and model numbers as an arbitrary
number of applications of Succ to Zero ("2" is (Succ (Succ Zero))).
For strings, you start with an empty string and an append function
("hi" being (Append (Append '' \h) \i)).
</rant>

so these modules (decode, oraemptystr) decrease differences between
PostgreSQL and Oracle.

wouldn't it be better/easier to extend something like pgpool to
transform Oracle style SQL code to PG style code? You'd certainly
be able to get it more complete in reasonable amounts of time, but
performance would suffer when you went to look up table definitions to
check the types of various things.

then you should to rewrite complete PostgreSQL parser :) and
performance will be worse (you have to parse query string two times).
For this transformation you need query's semantic tree and access to
dictionary (some caches) . Lot of things should by done via
extensibility features of PostgreSQL. Sure - you can do this things
difficult outside of PostgreSQL or simply via parser's hook.

These samples are only for Oracle. But I am sure, so this technique
should be used for different databases too. Example. Informix uses
convention for named params like paramname = value. PostgreSQL 8.5
will use syntax paramname AS value. So you need change app. code. With
hook I am able transform transparently Informix syntax to PostgreSQL
syntax without significant increase of load or complexity.

regards
Pavel Stehule

Show quoted text

--
Sam http://samason.me.uk/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Sam Mason
sam@samason.me.uk
In reply to: Pavel Stehule (#18)
Re: WIP: hooking parser

On Mon, Feb 16, 2009 at 04:40:23PM +0100, Pavel Stehule wrote:

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 03:21:12PM +0100, Pavel Stehule wrote:

so these modules (decode, oraemptystr) decrease differences between
PostgreSQL and Oracle.

wouldn't it be better/easier to extend something like pgpool to
transform Oracle style SQL code to PG style code? You'd certainly
be able to get it more complete in reasonable amounts of time, but
performance would suffer when you went to look up table definitions to
check the types of various things.

then you should to rewrite complete PostgreSQL parser :) and
performance will be worse (you have to parse query string two times).

Yes, there'd be a few thousand lines of code to write.

Note that you only need to parse things twice, planning only needs to be
done by PG, so it shouldn't be too bad. It'll add maybe a millisecond
or so to query execution times, with most of that time spent going off
to find table and function definitions from the real database.

For this transformation you need query's semantic tree and access to
dictionary (some caches) . Lot of things should by done via
extensibility features of PostgreSQL. Sure - you can do this things
difficult outside of PostgreSQL or simply via parser's hook.

But to do it properly inside PG would be difficult; how would your hooks
know to transform:

SELECT s FROM foo WHERE s IS NULL;

into:

SELECT s FROM foo WHERE (s = '' OR s IS NULL);

that all looks a bit tricky to me. Hum... actually it's not. All you
need to do is to rewrite any string reference "s" into NULLIF(s,'').
That would tank performance as indexes wouldn't be used most of the
time, but never mind.

These samples are only for Oracle. But I am sure, so this technique
should be used for different databases too. Example. Informix uses
convention for named params like paramname = value. PostgreSQL 8.5
will use syntax paramname AS value. So you need change app. code. With
hook I am able transform transparently Informix syntax to PostgreSQL
syntax without significant increase of load or complexity.

That would be a *much* bigger change; you're actually changing PG's
parser there and not just modifying the parse tree. If it was done
externally it would be a much easier thing to do.

--
Sam http://samason.me.uk/

#20Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sam Mason (#19)
Re: WIP: hooking parser

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 04:40:23PM +0100, Pavel Stehule wrote:

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 03:21:12PM +0100, Pavel Stehule wrote:

so these modules (decode, oraemptystr) decrease differences between
PostgreSQL and Oracle.

wouldn't it be better/easier to extend something like pgpool to
transform Oracle style SQL code to PG style code? You'd certainly
be able to get it more complete in reasonable amounts of time, but
performance would suffer when you went to look up table definitions to
check the types of various things.

then you should to rewrite complete PostgreSQL parser :) and
performance will be worse (you have to parse query string two times).

Yes, there'd be a few thousand lines of code to write.

Note that you only need to parse things twice, planning only needs to be
done by PG, so it shouldn't be too bad. It'll add maybe a millisecond
or so to query execution times, with most of that time spent going off
to find table and function definitions from the real database.

Hello

But to do it properly inside PG would be difficult; how would your hooks
know to transform:

SELECT s FROM foo WHERE s IS NULL;

into:

SELECT s FROM foo WHERE (s = '' OR s IS NULL);

I don't need it. Oracle store NULL without ''. So expression some IS
NULL is stable.

that all looks a bit tricky to me. Hum... actually it's not. All you
need to do is to rewrite any string reference "s" into NULLIF(s,'').
That would tank performance as indexes wouldn't be used most of the
time, but never mind.

look to source what I do. It' just simple. But you have to emulate
Oracle behave everywhere. Then all is simple, because Oracle doesn't
know ''.

That would be a *much* bigger change; you're actually changing PG's
parser there and not just modifying the parse tree. If it was done
externally it would be a much easier thing to do.

No I don't do it. Loadable modules are really external. I need only
hook inside parser.

Regards
Pavel Stehule

Show quoted text

--
Sam http://samason.me.uk/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Sam Mason
sam@samason.me.uk
In reply to: Pavel Stehule (#20)
Re: WIP: hooking parser

On Mon, Feb 16, 2009 at 08:03:42PM +0100, Pavel Stehule wrote:

2009/2/16 Sam Mason <sam@samason.me.uk>:

But to do it properly inside PG would be difficult; how would your hooks
know to transform:

SELECT s FROM foo WHERE s IS NULL;

into:

SELECT s FROM foo WHERE (s = '' OR s IS NULL);

I don't need it. Oracle store NULL without ''. So expression some IS
NULL is stable.

OK, I was under the impression that you wanted general Oracle
compatibility from PG. Apparently this isn't the case.

that all looks a bit tricky to me. Hum... actually it's not. All you
need to do is to rewrite any string reference "s" into NULLIF(s,'').
That would tank performance as indexes wouldn't be used most of the
time, but never mind.

look to source what I do. It' just simple. But you have to emulate
Oracle behave everywhere. Then all is simple, because Oracle doesn't
know ''.

Yes, I read your code. You'll still get zero length strings back from
things like substring('hello world',1,0) and not a NULL as I expect
you'd get back from Oracle.

[ context removed by Pavel; but the example was supporting Informix
style named parameters by PG ]

That would be a *much* bigger change; you're actually changing PG's
parser there and not just modifying the parse tree. If it was done
externally it would be a much easier thing to do.

No I don't do it. Loadable modules are really external. I need only
hook inside parser.

As far as I checked, your code gets passed some subset of the parse
tree. For the parser to have a chance of getting the code to your hook
it would need to be considered valid syntax. Informix style named
parameters isn't considered valid by PG's parser and hence the user will
get an error before the hook would get a chance to rewrite the parse
tree and make it valid. This is basically what Tom was alluding to
here:

http://archives.postgresql.org/pgsql-hackers/2009-02/msg00574.php

--
Sam http://samason.me.uk/

#22Pavel Stehule
pavel.stehule@gmail.com
In reply to: Sam Mason (#21)
Re: WIP: hooking parser

2009/2/16 Sam Mason <sam@samason.me.uk>:

On Mon, Feb 16, 2009 at 08:03:42PM +0100, Pavel Stehule wrote:

2009/2/16 Sam Mason <sam@samason.me.uk>:

But to do it properly inside PG would be difficult; how would your hooks
know to transform:

SELECT s FROM foo WHERE s IS NULL;

into:

SELECT s FROM foo WHERE (s = '' OR s IS NULL);

I don't need it. Oracle store NULL without ''. So expression some IS
NULL is stable.

OK, I was under the impression that you wanted general Oracle
compatibility from PG. Apparently this isn't the case.

that all looks a bit tricky to me. Hum... actually it's not. All you
need to do is to rewrite any string reference "s" into NULLIF(s,'').
That would tank performance as indexes wouldn't be used most of the
time, but never mind.

look to source what I do. It' just simple. But you have to emulate
Oracle behave everywhere. Then all is simple, because Oracle doesn't
know ''.

Yes, I read your code. You'll still get zero length strings back from
things like substring('hello world',1,0) and not a NULL as I expect
you'd get back from Oracle.

my sample is very simple - full emulation needs maybe 100 lines more,
but it is possible. After finishing transformation is possible to get
rusult type and I can do some really easy alchemy and wrap funccall
for some text functions and replace simple string with NULL. Similar
game is playing now when you use variadic function or function with
defaults arguments.

[ context removed by Pavel; but the example was supporting Informix
style named parameters by PG ]

That would be a *much* bigger change; you're actually changing PG's
parser there and not just modifying the parse tree. If it was done
externally it would be a much easier thing to do.

No I don't do it. Loadable modules are really external. I need only
hook inside parser.

As far as I checked, your code gets passed some subset of the parse
tree. For the parser to have a chance of getting the code to your hook
it would need to be considered valid syntax. Informix style named
parameters isn't considered valid by PG's parser and hence the user will
get an error before the hook would get a chance to rewrite the parse
tree and make it valid. This is basically what Tom was alluding to
here:

Sure. I need some basic functionality, PostgreSQL have to support
named params. But for example, Informix style named params are valid
now (for bison stage).

There are two etaps - Bison parsing - and transformation. And with
wrapping transformation I am able do it. Is paradox so I am able to do
it with Oracle or Informix syntax and not with planned PostgreSQL
syntax now (in this moment). When I find some functionality, that I
can use, then module is really simple - like decode implementation. It
is only transformation to specific CASE statement (specific, because I
have to use IS NOT DISTINCT operator). But without this base
functionality, I should to use C functions. It is only some code more.

This solution isn't absolutely general - It's not able emulate full
SQL/XML syntax - but current func_call rules are very simple. On
second hand It can support smart functions, that knows their source -
like some SQL/XML functions does.

Show quoted text

http://archives.postgresql.org/pgsql-hackers/2009-02/msg00574.php

--
Sam http://samason.me.uk/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#17)
Re: WIP: hooking parser

2009/2/16 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Next sample of parser hook using:
attachment contains module that transform every empty string to null.
I am not sure, if this behave is exactly compatible with Oracle,

Surely a parser hook like this would have nothing whatsoever to do
with Oracle's behavior.

regards, tom lane

it's maybe too much simple :). It is sample that have to show possibility.

regards
Pavel Stehule

#24Peter Eisentraut
peter_e@gmx.net
In reply to: Pavel Stehule (#23)
Re: WIP: hooking parser

Pavel Stehule wrote:

2009/2/16 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Next sample of parser hook using:
attachment contains module that transform every empty string to null.
I am not sure, if this behave is exactly compatible with Oracle,

Surely a parser hook like this would have nothing whatsoever to do
with Oracle's behavior.

regards, tom lane

it's maybe too much simple :). It is sample that have to show possibility.

I'd be quite interested to support some kind of hook to deal with this
Oracle null issue. It would be a great help for porting projects.

However, doing this properly is probably more complex and needs further
thought. I'd suggest writing a type of regression test first for Oracle
null behavior and then evaluating any kind of hook or hack against that.

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#24)
Re: WIP: hooking parser

Peter Eisentraut <peter_e@gmx.net> writes:

I'd be quite interested to support some kind of hook to deal with this
Oracle null issue. It would be a great help for porting projects.

However, doing this properly is probably more complex and needs further
thought. I'd suggest writing a type of regression test first for Oracle
null behavior and then evaluating any kind of hook or hack against that.

AFAIK, the Oracle behavior is just about entirely unrelated to the
parser --- it's a matter of runtime comparison behavior. It is
certainly *not* restricted to literal NULL/'' constants, which is the
only case that a parser hack can deal with.

There's some interesting comments here:
http://stackoverflow.com/questions/203493/why-does-oracle-9i-treat-an-empty-string-as-null

regards, tom lane

#26Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#24)
Re: WIP: hooking parser

2009/2/18 Peter Eisentraut <peter_e@gmx.net>:

Pavel Stehule wrote:

2009/2/16 Tom Lane <tgl@sss.pgh.pa.us>:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Next sample of parser hook using:
attachment contains module that transform every empty string to null.
I am not sure, if this behave is exactly compatible with Oracle,

Surely a parser hook like this would have nothing whatsoever to do
with Oracle's behavior.

regards, tom lane

it's maybe too much simple :). It is sample that have to show possibility.

I'd be quite interested to support some kind of hook to deal with this
Oracle null issue. It would be a great help for porting projects.

However, doing this properly is probably more complex and needs further
thought. I'd suggest writing a type of regression test first for Oracle
null behavior and then evaluating any kind of hook or hack against that

+ 1
regards
Pavel

#27Sam Mason
sam@samason.me.uk
In reply to: Tom Lane (#25)
Re: WIP: hooking parser

On Wed, Feb 18, 2009 at 10:30:12AM -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I'd be quite interested to support some kind of hook to deal with this
Oracle null issue. It would be a great help for porting projects.

However, doing this properly is probably more complex and needs further
thought. I'd suggest writing a type of regression test first for Oracle
null behavior and then evaluating any kind of hook or hack against that.

AFAIK, the Oracle behavior is just about entirely unrelated to the
parser --- it's a matter of runtime comparison behavior. It is
certainly *not* restricted to literal NULL/'' constants, which is the
only case that a parser hack can deal with.

How about introducing a "varchar2" type as in Oracle? It would be a bit
of a fiddle going through all the operators and functions making sure
that versions existed to cast things back again but seems possible.

Not sure how fragile user code would be with it though, I'm mainly
worried about it trying to convert things back to TEXT automatically and
the resulting change in semantics. Any ideas about good ways to go?

--
Sam http://samason.me.uk/

In reply to: Sam Mason (#27)
Re: WIP: hooking parser

On Thu, Feb 19, 2009 at 06:29:25PM +0000, Sam Mason wrote:

On Wed, Feb 18, 2009 at 10:30:12AM -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I'd be quite interested to support some kind of hook to deal with this
Oracle null issue. It would be a great help for porting projects.

However, doing this properly is probably more complex and needs further
thought. I'd suggest writing a type of regression test first for Oracle
null behavior and then evaluating any kind of hook or hack against that.

AFAIK, the Oracle behavior is just about entirely unrelated to the
parser --- it's a matter of runtime comparison behavior. It is
certainly *not* restricted to literal NULL/'' constants, which is the
only case that a parser hack can deal with.

How about introducing a "varchar2" type as in Oracle? It would be a bit
of a fiddle going through all the operators and functions making sure
that versions existed to cast things back again but seems possible.

Not sure how fragile user code would be with it though, I'm mainly
worried about it trying to convert things back to TEXT automatically and
the resulting change in semantics. Any ideas about good ways to go?

Could you define a type/domain for varchar2 mapping it to varchar.
There does not seem to be anything else that needs to be done.

Cheers,
Ken

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sam Mason (#27)
Re: WIP: hooking parser

Sam Mason <sam@samason.me.uk> writes:

On Wed, Feb 18, 2009 at 10:30:12AM -0500, Tom Lane wrote:

AFAIK, the Oracle behavior is just about entirely unrelated to the
parser --- it's a matter of runtime comparison behavior. It is
certainly *not* restricted to literal NULL/'' constants, which is the
only case that a parser hack can deal with.

How about introducing a "varchar2" type as in Oracle?

Maybe. I think right now we don't allow input functions to decide
that a non-null input string should be converted to a NULL, but
that might be fixable. It'd still be an ugly mess though, since
I suspect you'd have to introduce a whole structure of varchar2
functions/operators paralleling text. For example, what is Oracle's
handling of || ? AFAICS they can't be standards compliant there,
which means you need a varchar2-specific nonstrict implementation
of ||, and then to make that work the way Oracle users would expect,
varchar2-ness rather than text-ness would have to propagate through
anything else that might be done to a column before it reaches the ||.

regards, tom lane

#30Sam Mason
sam@samason.me.uk
In reply to: Tom Lane (#29)
Re: WIP: hooking parser

On Thu, Feb 19, 2009 at 02:02:06PM -0500, Tom Lane wrote:

Sam Mason <sam@samason.me.uk> writes:

On Wed, Feb 18, 2009 at 10:30:12AM -0500, Tom Lane wrote:

AFAIK, the Oracle behavior is just about entirely unrelated to the
parser --- it's a matter of runtime comparison behavior. It is
certainly *not* restricted to literal NULL/'' constants, which is the
only case that a parser hack can deal with.

How about introducing a "varchar2" type as in Oracle?

Maybe. I think right now we don't allow input functions to decide
that a non-null input string should be converted to a NULL, but
that might be fixable.

It seems like the most horrible failure of encapsulation. I don't
know the code well enough to comment, but I've already realized that I
misinterpreted the docs. They say that the type's input_function is
called for NULL values, but because it's strict this obviously doesn't
normally affect things. I was hence assuming that it was OK for the
function to return NULL for arbitrary inputs, ah well.

It'd still be an ugly mess though, since
I suspect you'd have to introduce a whole structure of varchar2
functions/operators paralleling text.

[and later]

to make that work the way Oracle users would expect,
varchar2-ness rather than text-ness would have to propagate through
anything else that might be done to a column before it reaches the ||.

Yes, I'm somewhat prone to understatement and that's what my "fiddle"
comment was about. The only way I could see it working was to keep it
as varchar2 for a long as possible, which is why I was wondering if PG
would ever have a tendency to auto-magically convert it back to a TEXT
breaking things for the user.

For example, what is Oracle's
handling of || ? AFAICS they can't be standards compliant there,
which means you need a varchar2-specific nonstrict implementation
of ||

Didn't think about the non-strict append operator though, that's
'orrible!

--
Sam http://samason.me.uk/

#31Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#29)
Re: WIP: hooking parser

Tom Lane wrote:

How about introducing a "varchar2" type as in Oracle?

Maybe. I think right now we don't allow input functions to decide
that a non-null input string should be converted to a NULL, but
that might be fixable. It'd still be an ugly mess though, since
I suspect you'd have to introduce a whole structure of varchar2
functions/operators paralleling text. For example, what is Oracle's
handling of || ? AFAICS they can't be standards compliant there,
which means you need a varchar2-specific nonstrict implementation
of ||, and then to make that work the way Oracle users would expect,
varchar2-ness rather than text-ness would have to propagate through
anything else that might be done to a column before it reaches the ||.

Curiously enough, Oracle has it so that || of null arguments treats the
arguments as empty string.

It's beyond comprehension.

But yeah, a varchar2 type with a full set of functions and operators
could work. If you choose not to bother with supporting the char type.

#32Pavel Stehule
pavel.stehule@gmail.com
In reply to: Peter Eisentraut (#31)
Re: WIP: hooking parser

Curiously enough, Oracle has it so that || of null arguments treats the
arguments as empty string.

It's beyond comprehension.

what is result of '' || '' ?

Pavel

Show quoted text

But yeah, a varchar2 type with a full set of functions and operators could
work. If you choose not to bother with supporting the char type.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#33Gregory Stark
stark@enterprisedb.com
In reply to: Pavel Stehule (#32)
Re: WIP: hooking parser

Pavel Stehule <pavel.stehule@gmail.com> writes:

Curiously enough, Oracle has it so that || of null arguments treats the
arguments as empty string.

It's beyond comprehension.

what is result of '' || '' ?

Well the result of this is NULL of course (which is the same as '')

What's more puzzling is what the answer to 'foo' || NULL is...

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!