proposal: new contrib module - plpgsql's embedded SQL validator
Hello all,
Lazy deep SQL validation inside plpgsql functions is an interesting
property. It allows working with temporary tables, but it makes testing
and debugging harder, because many errors in embedded queries are
detected too late. I wrote a simple module that can help a little.
It is based on the plpgsql plugin API, and it ensures a deep
validation of embedded SQL early - right after the start of execution.
I think this plugin is really useful, and it is also an example of a
plpgsql plugin - something that is missing in contrib.
Example:
A buggy function - it raises an error when a > 10:
CREATE OR REPLACE FUNCTION public.kuku(a integer)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
if (a > 10) then
return b + 1;
else
return a + 1;
end if;
end;
$function$
but it works for a <= 10:
postgres=# select kuku(1);
kuku
------
2
(1 row)
postgres=# load 'plpgsql';
LOAD
postgres=# load 'plpgsql_esql_checker';
LOAD
postgres=# select kuku(1);
ERROR: column "b" does not exist
LINE 1: SELECT b + 1
^
QUERY: SELECT b + 1
CONTEXT: PL/pgSQL function "kuku" line 3 at RETURN
With the esql checker, this bug is identified regardless of the
parameter value used.
What do you think about this idea?
The code contains a plpgsql statement tree walker - it should be moved
to core and used generally (statistics, coverage tests, ...).
Regards
Pavel Stehule
On Jul 7, 2011, at 11:31 PM, Pavel Stehule wrote:
I think this should at least be a contrib module; it seems very useful.
On a somewhat related note, I'd also really like to have the ability to parse things like .sql files externally, to do things like LINT checking.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
Excerpts from Jim Nasby's message of Sun Jul 17 16:31:45 -0400 2011:
On a somewhat related note, I'd also really like to have the ability to parse things like .sql files externally, to do things like LINT checking.
We talked about this during PGCon. The idea that seemed to have
consensus there was to export the parser similarly to how we build the
ecpg parser, that is, a set of Perl scripts which take our gram.y as
input and modify it to emit something different. What ecpg does with it
is emit a different grammar, but it doesn't seem impossible to me to
have it emit some sort of (lint) checker.
I admit I haven't looked at the new Perl scripts we use in ecpg (they
were rewritten in the 9.1 era).
--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Jul 18, 2011 at 02:05:42PM -0400, Alvaro Herrera wrote:
We talked about this during PGCon. The idea that seemed to have
consensus there was to export the parser similarly to how we build
the ecpg parser, that is, a set of Perl scripts which take our
gram.y as input and modify it to emit something different.
That solves the non-customized part of the problem, which is worth
solving. A complete parser for a given DB would need catalog access,
i.e. a live connection to that DB.
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
We talked about this during PGCon. The idea that seemed to have
consensus there was to export the parser similarly to how we build
the ecpg parser, that is, a set of Perl scripts which take our
gram.y as input and modify it to emit something different.

That solves the non-customized part of the problem, which is worth
solving.
In addition to this, parsing SQL may need tree walker functions and
some support functions (e.g. memory management). These are the source
files from pgpool-II's parser directory:
copyfuncs.c kwlist.h nodes.c pg_list.h pool_string.h value.c
gram.c kwlookup.c nodes.h pg_wchar.h primnodes.h value.h
gram.h list.c outfuncs.c pool_memory.c scan.c wchar.c
gramparse.h makefuncs.c parsenodes.h pool_memory.h scanner.h
keywords.c makefuncs.h parser.c pool_parser.h scansup.c
keywords.h memnodes.h parser.h pool_string.c scansup.h
A complete parser for a given DB would need catalog access,
i.e. a live connection to that DB.
Yes, pgpool-II does some of this (for example, to know whether a
particular table is a temporary table, to expand "*" to real column
lists, and so on).
Just FYI.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
On 07/17/2011 10:31 PM, Jim Nasby wrote:
I think this should at least be a contrib module; it seems very useful.
Yes, I agree this should be part of the pg distribution.
But I think we should add a validation hook to the plpgsql plugin
structure so that you don't have to actually execute the function to
check it - currently there are only execution hooks, which is why the
plugin only works when you run the function (not good for automation).
--
Petr Jelinek
On 19 July 2011 21:15, Petr Jelínek <pjmodos@pjmodos.net> wrote:
On 07/17/2011 10:31 PM, Jim Nasby wrote:
I think this should at least be a contrib module; it seems very useful.
Yes, I agree this should be part of the pg distribution.
But I think we should add a validation hook to the plpgsql plugin
structure so that you don't have to actually execute the function to
check it - currently there are only execution hooks, which is why the
plugin only works when you run the function (not good for automation).
That would be great, but there are still a few limitations at compile time:
* polymorphic parameters
* triggers - there is no information about the relation at compile time
We could adapt an #option keyword for use by some plpgsql plugins,
for example to supply the additional information that is necessary to
run the lint at compile time:
CREATE OR REPLACE FUNCTION foo ()
RETURNS ... AS $$
#option trigger_relation some_table_name
#option replace_anyelement integer
...
With this additional info and some compile-time hook, it would be possible.
Regards
Pavel
2011/7/20 Tom Lane <tgl@sss.pgh.pa.us>:
Petr Jelínek <pjmodos@pjmodos.net> writes:
But I think we should add a validation hook to the plpgsql plugin
structure so that you don't have to actually execute the function to
check it - currently there are only execution hooks, which is why the
plugin only works when you run the function (not good for automation).

If you mean that such checks would be done automatically, no, they
shouldn't be. Consider a function that creates a table and then uses
it, or even just depends on using a table that doesn't yet exist when
you do CREATE FUNCTION.
Yes, a deep check is not possible for functions that use temporary
tables. plpgsql_lint is not a silver bullet - for these cases it is
necessary to disable the lint.
I can't speak generally - I have no idea what percentage of functions
access temporary tables - in my last project I used zero temp tables
in circa 300 KB of plpgsql code.
The more terrible problem is a new dependency between functions. I use
a workaround - something like headers:
CREATE FUNCTION foo(define interface here) RETURNS ... AS $$ BEGIN
RETURN; END; $$ LANGUAGE plpgsql;
...
-- real implementation of foo
CREATE OR REPLACE FUNCTION foo(...)
RETURNS ...
AS ..
It works because I write my plpgsql scripts by hand - I don't use a
dump for plpgsql - but it is not a solution for production servers. On
the other hand, plpgsql_lint or something similar (builtin or external)
should not be active on production servers. Planning only the queries
that are really executed is a necessary optimization as long as we
don't have a global plan cache.
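A concrete sketch of this header pattern (all names here are
hypothetical, and the stub returns NULL rather than a bare RETURN):

```sql
-- "Header": declare the interface of bar() up front with a stub body,
-- so that foo() below validates even though the real bar() comes
-- later in the same script. (foo and bar are hypothetical names.)
CREATE FUNCTION bar(x integer) RETURNS integer AS $$
BEGIN RETURN NULL; END; $$ LANGUAGE plpgsql;

CREATE FUNCTION foo(a integer) RETURNS integer AS $$
BEGIN
  RETURN bar(a) + 1;
END; $$ LANGUAGE plpgsql;

-- Real implementation of bar, replacing the stub
CREATE OR REPLACE FUNCTION bar(x integer) RETURNS integer AS $$
BEGIN
  RETURN x * 2;
END; $$ LANGUAGE plpgsql;
```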
Regards
Pavel
On 07/20/2011 05:24 AM, Tom Lane wrote:
If you mean that such checks would be done automatically, no, they
shouldn't be. Consider a function that creates a table and then uses
it, or even just depends on using a table that doesn't yet exist when
you do CREATE FUNCTION.
No, certainly not by default, I would just like to be able to do it
automatically without having to call the function.
--
Petr Jelinek
On Jul 19, 2011, at 10:51 PM, Pavel Stehule wrote:
A deep check is not possible for functions that use temporary tables.
The more terrible problem is a new dependency between functions. I use
a workaround - something like headers.
You can work around temp table issues the same way: just define the temp table before you create the function.
In practice, if I have a function that depends on a temp table it either creates it itself if it doesn't already exist or I have a separate function to create the table; that way you have a single place that has the temp table definition, and that is in the database itself.
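A minimal sketch of that approach, with hypothetical names: the temp
table is defined before the function, and the function re-creates it
when a new session lacks it (CREATE TEMP TABLE IF NOT EXISTS requires
PostgreSQL 9.1 or later):

```sql
-- The temp table exists before the function is created, so a deep
-- check of the function body can resolve it. (work_queue and
-- process_queue are hypothetical names.)
CREATE TEMP TABLE IF NOT EXISTS work_queue(id integer, payload text);

CREATE OR REPLACE FUNCTION process_queue() RETURNS bigint AS $$
BEGIN
  -- Re-create the table when a new session calls the function first
  CREATE TEMP TABLE IF NOT EXISTS work_queue(id integer, payload text);
  RETURN (SELECT count(*) FROM work_queue);
END; $$ LANGUAGE plpgsql;
```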
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
2011/7/22 Jim Nasby <jim@nasby.net>:
You can work around temp table issues the same way: just define the
temp table before you create the function.
There is another trick: create a persistent table with the same name
beforehand. At runtime the temporary table is earlier in the
search_path, so all executed SQL will refer to the temp table.
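A sketch of that trick, using hypothetical names: the permanent table
exists only so that the embedded SQL validates; at runtime a temp
table with the same name shadows it, because pg_temp is searched
before the persistent schemas:

```sql
-- Permanent stub table with the same shape; it exists only so the
-- function validates. (work_queue and queue_size are hypothetical.)
CREATE TABLE work_queue(id integer, payload text);

CREATE OR REPLACE FUNCTION queue_size() RETURNS bigint AS $$
BEGIN
  RETURN (SELECT count(*) FROM work_queue);
END; $$ LANGUAGE plpgsql;

-- At runtime, a temp table of the same name shadows the stub:
-- pg_temp precedes the persistent schemas in the effective
-- search_path, so the function now reads the session's temp table.
CREATE TEMP TABLE work_queue(id integer, payload text);
```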
Pavel