proposal: new contrib module plpgsql's embeded sql validator

Started by Pavel Stehuleover 14 years ago11 messages
#1Pavel Stehule
pavel.stehule@gmail.com
1 attachment(s)

Hello all,

a lazy deep SQL validation inside plpgsq functions is interesting
attribute. It allows to work with temporary tables and it make testing
and debugging harder, because lot of errors in embedded queries are
detected too late. I wrote a simple module that can to help little
bit. It is based on plpgsql plugin API and it ensures a deep
validation of embedded sql early - after start of execution. I am
thinking, so this plugin is really useful and it is example of plpgsql
pluging - that is missing in contrib.

Example:

buggy function - raise error when par > 10

CREATE OR REPLACE FUNCTION public.kuku(a integer)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
if (a > 10) then
return b + 1;
else
return a + 1;
end if;
end;
$function$

but it is works for par <= 10

postgres=# select kuku(1);
kuku
------
2
(1 row)

postgres=# load 'plpgsql';
LOAD
postgres=# load 'plpgsql_esql_checker';
LOAD
postgres=# select kuku(1);
ERROR: column "b" does not exist
LINE 1: SELECT b + 1
^
QUERY: SELECT b + 1
CONTEXT: PL/pgSQL function "kuku" line 3 at RETURN

with esql checker this bug is identified without dependency on used
parameter's value

What do you think about this idea?

The code contains a plpgsql_statement_tree walker - it should be moved
to core and used generally - statistic, coverage tests, ...

Regards

Pavel Stehule

Attachments:

plpgsql-checker-9.0.tgzapplication/x-gzip; name=plpgsql-checker-9.0.tgzDownload
#2Jim Nasby
jim@nasby.net
In reply to: Pavel Stehule (#1)
Re: proposal: new contrib module plpgsql's embeded sql validator

On Jul 7, 2011, at 11:31 PM, Pavel Stehule wrote:

a lazy deep SQL validation inside plpgsq functions is interesting
attribute. It allows to work with temporary tables and it make testing
and debugging harder, because lot of errors in embedded queries are
detected too late. I wrote a simple module that can to help little
bit. It is based on plpgsql plugin API and it ensures a deep
validation of embedded sql early - after start of execution. I am
thinking, so this plugin is really useful and it is example of plpgsql
pluging - that is missing in contrib.

Example:

buggy function - raise error when par > 10

CREATE OR REPLACE FUNCTION public.kuku(a integer)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
if (a > 10) then
return b + 1;
else
return a + 1;
end if;
end;
$function$

but it is works for par <= 10

postgres=# select kuku(1);
kuku
------
2
(1 row)

postgres=# load 'plpgsql';
LOAD
postgres=# load 'plpgsql_esql_checker';
LOAD
postgres=# select kuku(1);
ERROR: column "b" does not exist
LINE 1: SELECT b + 1
^
QUERY: SELECT b + 1
CONTEXT: PL/pgSQL function "kuku" line 3 at RETURN

with esql checker this bug is identified without dependency on used
parameter's value

What do you think about this idea?

The code contains a plpgsql_statement_tree walker - it should be moved
to core and used generally - statistic, coverage tests, ...

I think this should at least be a contrib module; it seems very useful.

On a somewhat related note, I'd also really like to have the ability to parse things like .sql files externally, to do things like LINT checking.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#3Alvaro Herrera
alvherre@commandprompt.com
In reply to: Jim Nasby (#2)
Re: proposal: new contrib module plpgsql's embeded sql validator

Excerpts from Jim Nasby's message of dom jul 17 16:31:45 -0400 2011:

On a somewhat related note, I'd also really like to have the ability to parse things like .sql files externally, to do things like LINT checking.

We talked about this during PGCon. The idea that seemed to have
consensues there was to export the parser similarly to how we build the
ecpg parser, that is, a set of perl scripts which take our gram.y as
input and modify it to emit something different. What ecpg does with it
is emit a different grammar, but it doesn't seem impossible to me to
have it emit some sort of (lint) checker.

I admit I haven't looked at the new Perl scripts we use in ecpg (they
were rewritten in the 9.1 era).

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#4David Fetter
david@fetter.org
In reply to: Alvaro Herrera (#3)
Re: proposal: new contrib module plpgsql's embeded sql validator

On Mon, Jul 18, 2011 at 02:05:42PM -0400, Alvaro Herrera wrote:

Excerpts from Jim Nasby's message of dom jul 17 16:31:45 -0400 2011:

On a somewhat related note, I'd also really like to have the
ability to parse things like .sql files externally, to do things
like LINT checking.

We talked about this during PGCon. The idea that seemed to have
consensues there was to export the parser similarly to how we build
the ecpg parser, that is, a set of perl scripts which take our
gram.y as input and modify it to emit something different.

That solves the non-customized part of the problem, which is worth
solving. A complete parser for a given DB would need catalog access,
i.e. a live connection to that DB.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#5Tatsuo Ishii
ishii@postgresql.org
In reply to: David Fetter (#4)
Re: proposal: new contrib module plpgsql's embeded sql validator

We talked about this during PGCon. The idea that seemed to have
consensues there was to export the parser similarly to how we build
the ecpg parser, that is, a set of perl scripts which take our
gram.y as input and modify it to emit something different.

That solves the non-customized part of the problem, which is worth
solving.

In addition to this, parsing SQL may need tree walker functions and
some support functions(e.g. memory management). These are source files
from pgpool-II's parser directory.

copyfuncs.c kwlist.h nodes.c pg_list.h pool_string.h value.c
gram.c kwlookup.c nodes.h pg_wchar.h primnodes.h value.h
gram.h list.c outfuncs.c pool_memory.c scan.c wchar.c
gramparse.h makefuncs.c parsenodes.h pool_memory.h scanner.h
keywords.c makefuncs.h parser.c pool_parser.h scansup.c
keywords.h memnodes.h parser.h pool_string.c scansup.h

A complete parser for a given DB would need catalog access,
i.e. a live connection to that DB.

Yes, pgpool-II does some of this. (for example, to know if particular
table is a tempory table, to expand "*" to real column lists and so on).

Just F.Y.I.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

#6Petr Jelínek
pjmodos@pjmodos.net
In reply to: Jim Nasby (#2)
Re: proposal: new contrib module plpgsql's embeded sql validator

On 07/17/2011 10:31 PM, Jim Nasby wrote:

On Jul 7, 2011, at 11:31 PM, Pavel Stehule wrote:

a lazy deep SQL validation inside plpgsq functions is interesting
attribute. It allows to work with temporary tables and it make testing
and debugging harder, because lot of errors in embedded queries are
detected too late. I wrote a simple module that can to help little
bit. It is based on plpgsql plugin API and it ensures a deep
validation of embedded sql early - after start of execution. I am
thinking, so this plugin is really useful and it is example of plpgsql
pluging - that is missing in contrib.

I think this should at least be a contrib module; it seems very useful.

Yes I agree this should be part of pg distribution.

But, I think we should add valitation hook to plpgsql plugin structure
so that you don't have to actually execute the function to check it -
curretly there are only executing hooks which is why the plugin only
works when you the func (not good for automation).

--
Petr Jelinek

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Petr Jelínek (#6)
Re: proposal: new contrib module plpgsql's embeded sql validator

Dne 19. července 2011 21:15 Petr Jelínek <pjmodos@pjmodos.net> napsal(a):

On 07/17/2011 10:31 PM, Jim Nasby wrote:

On Jul 7, 2011, at 11:31 PM, Pavel Stehule wrote:

a lazy deep SQL validation inside plpgsq functions is interesting
attribute. It allows to work with temporary tables and it make testing
and debugging harder, because lot of errors in embedded queries are
detected too late. I wrote a simple module that can to help little
bit. It is based on plpgsql plugin API and it ensures a deep
validation of embedded sql early - after start of execution. I am
thinking, so this plugin is really useful and it is example of plpgsql
pluging - that is missing in contrib.

I think this should at least be a contrib module; it seems very useful.

Yes I agree this should be part of pg distribution.

But, I think we should add valitation hook to plpgsql plugin structure so
that you don't have to actually execute the function to check it - curretly
there are only executing hooks which is why the plugin only works when you
the func (not good for automation).

should be great, but there are still few limits in compile time

* polymorphic parameters
* triggers - there are no a info about relation in compile time

we can adapt a #option keyword for using in some plpgsql plugins

for example - for addition information that are necessary for usage of
lint in compilation time

CREATE OR REPLACE FUNCTION foo ()
RETURNS ... AS $$

#option trigger_relation some_table_name
#option replace_anyelement integer

...

with this addition info it and some compile hook it is possible

Regards

Pavel

Show quoted text

--
Petr Jelinek

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#1)
Re: proposal: new contrib module plpgsql's embeded sql validator

2011/7/20 Tom Lane <tgl@sss.pgh.pa.us>:

=?ISO-8859-1?Q?Petr_Jel=EDnek?= <pjmodos@pjmodos.net> writes:

But, I think we should add valitation hook to plpgsql plugin structure
so that you don't have to actually execute the function to check it -
curretly there are only executing hooks which is why the plugin only
works when you the func (not good for automation).

If you mean that such checks would be done automatically, no, they
shouldn't be.  Consider a function that creates a table and then uses
it, or even just depends on using a table that doesn't yet exist when
you do CREATE FUNCTION.

yes, any deep check is not possible for function that uses a temporary tables.

A plpgsql_lint is not silver bullet - for these cases is necessary to
disable lint.

. I can't to speak generally - I have no idea, how much percent of
functions are functions with access to temporary tables - in my last
project I use 0 temp tables on cca 300 KB of plpgsql code.

The more terrible problem is a new dependency between functions. I use
a workaround - some like headers

CREATE FUNCTIONS foo(define interface here) RETURNS ... AS $$ BEGIN
RETURN; END; $$ LANGUAGE plpgsql;

....

...

--real implementation of foo
CREATE OR REPLACE FUNCTIONS foo(...)
RETURNS ...
AS ..

It works because I write a plpgsql script in hand - I don't use a dump
for plpgsql, but it is not solution for production servers. On second
hand - plpgsql_lint or some similar (and builtin or external) should
not be active on production servers. A planning only really processed
queries is necessary optimization if we have not a global plan cache.

Regards

Pavel

Show quoted text

                       regards, tom lane

#9Petr Jelínek
pjmodos@pjmodos.net
In reply to: Pavel Stehule (#1)
Re: proposal: new contrib module plpgsql's embeded sql validator

On 07/20/2011 05:24 AM, Tom Lane wrote:

=?ISO-8859-1?Q?Petr_Jel=EDnek?=<pjmodos@pjmodos.net> writes:

But, I think we should add valitation hook to plpgsql plugin structure
so that you don't have to actually execute the function to check it -
curretly there are only executing hooks which is why the plugin only
works when you the func (not good for automation).

If you mean that such checks would be done automatically, no, they
shouldn't be. Consider a function that creates a table and then uses
it, or even just depends on using a table that doesn't yet exist when
you do CREATE FUNCTION.

No, certainly not by default, I would just like to be able to do it
automatically without having to call the function.

--
Petr Jelinek

#10Jim Nasby
jim@nasby.net
In reply to: Pavel Stehule (#8)
Re: proposal: new contrib module plpgsql's embeded sql validator

On Jul 19, 2011, at 10:51 PM, Pavel Stehule wrote:

If you mean that such checks would be done automatically, no, they
shouldn't be. Consider a function that creates a table and then uses
it, or even just depends on using a table that doesn't yet exist when
you do CREATE FUNCTION.

yes, any deep check is not possible for function that uses a temporary tables.

A plpgsql_lint is not silver bullet - for these cases is necessary to
disable lint.

. I can't to speak generally - I have no idea, how much percent of
functions are functions with access to temporary tables - in my last
project I use 0 temp tables on cca 300 KB of plpgsql code.

The more terrible problem is a new dependency between functions. I use
a workaround - some like headers

You can work around temp table issues the same way: just define the temp table before you create the function.

In practice, if I have a function that depends on a temp table it either creates it itself if it doesn't already exist or I have a separate function to create the table; that way you have a single place that has the temp table definition, and that is in the database itself.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#11Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Nasby (#10)
Re: proposal: new contrib module plpgsql's embeded sql validator

2011/7/22 Jim Nasby <jim@nasby.net>:

On Jul 19, 2011, at 10:51 PM, Pavel Stehule wrote:

If you mean that such checks would be done automatically, no, they
shouldn't be.  Consider a function that creates a table and then uses
it, or even just depends on using a table that doesn't yet exist when
you do CREATE FUNCTION.

yes, any deep check is not possible for function that uses a temporary tables.

A plpgsql_lint is not silver bullet - for these cases is necessary to
disable lint.

. I can't to speak generally - I have no idea, how much percent of
functions are functions with access to temporary tables - in my last
project I use 0 temp tables on cca 300 KB of plpgsql code.

The more terrible problem is a new dependency between functions. I use
a workaround - some like headers

You can work around temp table issues the same way: just define the temp table before you create the function.

In practice, if I have a function that depends on a temp table it either creates it itself if it doesn't already exist or I have a separate function to create the table; that way you have a single place that has the temp table definition, and that is in the database itself.

there is other trick - use a persistent table with same name before.
Runtime temporary table is near in search_path, so all executed SQL
will be related to this temp table.

Pavel

Show quoted text

--
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net