SQL-Invoked Procedures for 8.1

Started by Gavin Sherryover 21 years ago84 messageshackers
Jump to latest
#1Gavin Sherry
swm@linuxworld.com.au

Hi all,

Following is a proposal to implement what SQL2003 calls 'SQL-Invoked
Procedures' and what most people refer to as stored procedures. Fujitsu
will be funding Neil Conway and I to work on this feature.

Procedures
----------

Procedures are nearly identical to functions. From my reading of
SQL2003 (see sections 4.27 and 11.50) procedures are different from
functions in three ways:

1) Unlike functions, procedures cannot return values.

2) Procedures support parameter modes which specify whether an argument is
IN, OUT or IN OUT

3) Procedures can be run in the same savepoint level as the caller when
OLD SAVEPOINT LEVEL is specified at creation time. According to SQL2003,
functions must be run on a new savepoint level. From my understanding, we
do not do this currently.

Work will focus on 1 and 2 until we have the concept of savepoint levels
with functions. Its possible that we will implement this too if there is
demand.

This makes the difference between procedures and functions quite
superficial: procedures are functions which return void and have parameter
modes. As such, I think we can largely wrap around the existing function
creation code using this rule. That is, CREATE PROCEDURE could and I think
should be syntactic sugar.

If we go down the route of saying that procedures are a type of function,
we have the option of allowing users access to OUT and INOUT in functions.
This would make procedures simply a subset of functions. What do people think?

Grammar modifications
---------------------

The grammar for creating a procedure, which is a subset of the SQL 2003
grammar, is as follows:

CREATE [ OR REPLACE ] PROCEDURE <name> ( [ [ IN | OUT | IN OUT ] [ name ] type, ... ] )
<routine characteristics>
<routine body>

<routine characteristics> are LANGUAGE, null-call and security clauses,
all of which are already implemented. <routine body> is the body, same as
with the existing functions implementation.

DROP PROCEDURE and CALL will also need to be added.

System catalog changes
-----------------------

There will be cases when we need to identify whether a routine is a
function or a procedure. This could be done two ways. We could say that
any proc in pg_proc which returns void is a procedure or we could store this
in some 'protype' column. Thoughts?

To handle parameter modes, we will also need to add an array of "char"s so
that we can determine the mode of an element in proargnames[].

Invocation
----------

Since procedures do not return a value they are invoked differently than
functions are (at least according to SQL2003). This is because if they were
used in a SELECT, UPDATE, DELETE or INSERT clause, their effect would be
ambiguous at best. ie,

SELECT * FROM foo WHERE bar();

Doesn't make much sense.

In SQL2003, procedures are invoked using:

CALL foo();

One of our original goals was to allow easier invocation of stored
procedures from PL/PgSQL, by allowing "foo()" rather than "CALL foo()"
or "PERFORM foo()". Neil submitted some preliminary patches
implementing this; following discussion with Tom, it was agreed to
make CALL optional at the SQL level, which provides the same effect:

http://archives.postgresql.org/pgsql-hackers/2004-09/msg00412.php

In this sense, invoking a procedure follows the same rules as invoking a
function which returns void.

Procedure arguments
-------------------

From what I can tell, there are only 3 ways to pass data to a

procedure:

1) As a literal value:

CALL foo('Then one day....');
CALL bar(1, 2, 3);
CALL baz(current_timestamp)

Note that this only works for parameters which are of mode IN (see
more below).

2) As a variable

Other databases, and SQL2003, support a few different implementations what
could be called variables. In SQL2003, there are a few types of
'variables':

i) Host parameters
ii) Column references to transition variables (OLD and NEW in the case of
triggers and rules)
iii) Dynamic parameters (ie, ? in the case of prepare)
iv) Embedded variables

(i) and (iv) are basically bindings to the client side variables. They
provide a mechanism for interface designers to allow users to associate
variables with a procedure, execute that procedure, and have the new value
of the variables available to the language once the execution call
returns. JDBC 3.0 focuses fairly heavily on this kind of thing, for
example.

I think we only need implement one of these and I discuss it below under
'Host Variables'.

3) A transition variable

These are the NEW and OLD variables in trigger and rule specifications.

If we are calling a procedure, we will need to make sure that every
argument is one of these.

Only (2) and (3) can be passed as parameters which are marked OUT or
INOUT. SQL2003 makes the following distinction between parameter modes:

IN - A 'read only' argument. It must be initialised.
OUT - An uninitialised argument which, from memory, SQL2003 says *must* be
initialised by the function itself.
INOUT - A combination of the two above. That is, an initialised argument
which can be written to.

Host Variables
--------------

Host variables are required to make INOUT and OUT parameters useful,
so this is an important part of the implementation of procedures.

The parser will need to be modified to recognise the host parameter syntax,
which is ':<varname>'. We could restrict this to something like CALL or
expand it so that normal queries could use variables too. I cannot
think of any reason to restrict use but I'm open to what people think.

We'll need a host variable node just like we have Const, etc, so that
the variable can be propagated through the system. The variable should be
substituted within the executor. This ensures that EXPLAIN wont break.
Then again, if we don't allow host parameters in normal queries, its not
a problem.

If we do, I think we could get away with piggy backing on the PREPARE/EXECUTE
code with the possibility of allowing the planner to take a look at the current
value of a given host variable so that it can generate a better plan.
Then again, using variables in standard queries may just be a very bad
idea.

As for the creation and modification of host parameters, we have two options:
an SQL language construct, which MySQL uses for example, or a protocol level
mechanism which Oracle appears to use.

As far as I can understand from SQL2003, host parameters are predefine there
at procedure creation time. I don't really get this and I haven't seen it in
another database. If someone could explain it and thinks it would serve us
better than that covered below, please let me know :-).

If we go down the grammar route, I'd imagine that we do something like:

SET VARIABLE <varname> = <val>
SHOW VARIABLE <varname>

I've used SET and SHOW because they're intuitive. Obviously these are used
by GUC and I'm not proposing we use GUC for host parameters as we want a
session-local implementation. It would be useful if we could also add some
type information to this so make procedure candidate selection easier. Ideas?

The other option is that we do it at the protocol level and modify libpq
(and psql) to add support. This would allow us something like:

template1=# \vc x int -- create a variable 'x' of type int
template1=# \vs x 2 -- set it to '2'
template1=# CALL proc_with_inout_param(:x);
CALL
template1=# \vp x -- print 'x'
120

That might need some massaging but you get the idea. The commands to psql
translate to lower level protocol level commands which are: create
variable, set (ie, modify the value of) variable and describe variable. Neil
thinks we should have create and set in one step. This is great, since it
covers most cases, but I'm not sure how we have uninitialised values. Perhaps
we set to NULL?

I think interacting with variables as above is actually quite ugly as well
but that has nothing to do with implementation at the protocol level rather
my own lack of UI skills :-).

So, what does doing this at the protocol level give us? Well, it keeps it out
of the grammar and decreases overhead for interfaces. Still, those reasons
aren't great when we're look at a protocol modification. I'd love to hear
other people's ideas here: maybe I've overlooked a much more attractive
alternative.

Getting back to the host variables themselves.

Most implementations I've seen only support a very limited range of types,
such as varchar, numeric and int. I don't see any reason why we can't
support the full range of types available in the system.

The visibility of variables is restricted to the session which created
them. That is, they're just like our implementation of temporary tables.

The only other question (that I can think of now) with respect to
variables is how they are affected by transactions. My gut feel is
that they should have transactional semantics. For example:

template1=# \vc x int
template1=# \vs x 2
template1=# BEGIN;
BEGIN
template1=# CALL proc_which_sets_arg_to_100(:x);
CALL
template1=# ABORT;
template1=# \vp x

What should 'x' be? Following transactional semantics, it should be 2.

The only problem is, implementing the transaction semantics might be
painful. Bruce made a suggestion to me a while ago that we could just put
the variable in a temp table. But that limits the scope for types and it
also may slow performance in a long running transaction which modifies a
variable a lot. That is, its a lot of overhead.

The other option is a hash (or some dynamic structure) in which we
attach some transaction information to each variable. We will need to
process this from CommitTransaction()/AbortTransaction(). We'll also
need to be subtransaction aware. Because we don't need to manage
concurrent access to variables, the logic needed to implement
transactional semantics by hand shouldn't be too difficult.

PL/PgSQL
--------

Its in PL/PgSQL that procedures will be most intensively used, I
believe. Since variables are already supported in the language we
will need to modify that system to make OUT and INOUT parameters
work. Currently, we copy each the argument before we call a function
so that any (potential) write to it will not be reflected in the
caller. We're also protected here by the executor which doesn't allow
us to update arguments since they are marked as constants. We'll also
need to tweak this.

The other thing we will need to look out for if we're executing a
procedure is if the user attempts to RETURN a non-void value. This would be an
error, of course. Simply using RETURN to return control to the caller shouldn't
be a problem in my opinion, however. Neil has also pointed out to me that
this is also the exact behaviour required of functions returning void.

Permissions
-----------

We can use the same permissions as for functions.

----

I will be away for the next few days (until Monday). I hope to have Internet
access but I'm not sure. Neil and I have worked through this together so
he can also answer any questions too.

Gavin

#2Grant Finnemore
grantf@guruhut.co.za
In reply to: Gavin Sherry (#1)
Re: SQL-Invoked Procedures for 8.1

Hi Gavin,

Although I have not read the SQL 2003 spec, my recollection of other database
products' stored procs differed from your description in one significant way,
namely that they could return multiple (and varied) sets of rows.

For example, a stored proc could do a SELECT over foo and then a SELECT over
bar and return the tuples of both foo and bar. (each having different column
counts, types, etc)

The JDBC interfaces would appear to illustrate this point.
(In CallableStatement)

"A CallableStatement can return one ResultSet object or multiple ResultSet
objects. Multiple ResultSet objects are handled using operations inherited
from Statement."

Regards,
Grant

Gavin Sherry wrote:

Hi all,

Following is a proposal to implement what SQL2003 calls 'SQL-Invoked
Procedures' and what most people refer to as stored procedures. Fujitsu
will be funding Neil Conway and I to work on this feature.

[lots of interesting detail]

#3Gavin Sherry
swm@linuxworld.com.au
In reply to: Grant Finnemore (#2)
Re: SQL-Invoked Procedures for 8.1

On Thu, 23 Sep 2004, Grant Finnemore wrote:

Hi Gavin,

Although I have not read the SQL 2003 spec, my recollection of other database
products' stored procs differed from your description in one significant way,
namely that they could return multiple (and varied) sets of rows.

For example, a stored proc could do a SELECT over foo and then a SELECT over
bar and return the tuples of both foo and bar. (each having different column
counts, types, etc)

The JDBC interfaces would appear to illustrate this point.
(In CallableStatement)

"A CallableStatement can return one ResultSet object or multiple ResultSet
objects. Multiple ResultSet objects are handled using operations inherited
from Statement."

I read the JDBC 3.0 spec and didn't see this. Like with all specs, some
details are hard to find. However, from what I've seen in the spec, I
think they have functions in mind here. That being said, I can't think how
SQL2003 would allow such behaviour. If you could show us an example,
that'd be great.

Thanks,

Gavin

#4Grant Finnemore
grantf@guruhut.co.za
In reply to: Gavin Sherry (#3)
Re: SQL-Invoked Procedures for 8.1

Quoth the JDBC spec:

public interface CallableStatement
extends PreparedStatement

The interface used to execute SQL stored procedures. The JDBC API provides a
stored procedure SQL escape syntax that allows stored procedures to be called
in a standard way for all RDBMSs. This escape syntax has one form that includes
a result parameter and one that does not. If used, the result parameter must be
registered as an OUT parameter. The other parameters can be used for input,
output or both. Parameters are referred to sequentially, by number, with the
first parameter being 1.

{?= call <procedure-name>[<arg1>,<arg2>, ...]}
{call <procedure-name>[<arg1>,<arg2>, ...]}

IN parameter values are set using the set methods inherited from
PreparedStatement. The type of all OUT parameters must be registered prior to
executing the stored procedure; their values are retrieved after execution via
the get methods provided here.

A CallableStatement can return one ResultSet object or multiple ResultSet
objects. Multiple ResultSet objects are handled using operations inherited from
Statement.

For maximum portability, a call's ResultSet objects and update counts should be
processed prior to getting the values of output parameters.

Regards,
Grant

Gavin Sherry wrote:

Show quoted text

On Thu, 23 Sep 2004, Grant Finnemore wrote:

Hi Gavin,

Although I have not read the SQL 2003 spec, my recollection of other database
products' stored procs differed from your description in one significant way,
namely that they could return multiple (and varied) sets of rows.

For example, a stored proc could do a SELECT over foo and then a SELECT over
bar and return the tuples of both foo and bar. (each having different column
counts, types, etc)

The JDBC interfaces would appear to illustrate this point.
(In CallableStatement)

"A CallableStatement can return one ResultSet object or multiple ResultSet
objects. Multiple ResultSet objects are handled using operations inherited
from Statement."

I read the JDBC 3.0 spec and didn't see this. Like with all specs, some
details are hard to find. However, from what I've seen in the spec, I
think they have functions in mind here. That being said, I can't think how
SQL2003 would allow such behaviour. If you could show us an example,
that'd be great.

Thanks,

Gavin

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

#5Gavin Sherry
swm@linuxworld.com.au
In reply to: Grant Finnemore (#4)
Re: SQL-Invoked Procedures for 8.1

On Thu, 23 Sep 2004, Grant Finnemore wrote:

Quoth the JDBC spec:

public interface CallableStatement
extends PreparedStatement

The interface used to execute SQL stored procedures. The JDBC API provides a
stored procedure SQL escape syntax that allows stored procedures to be called
in a standard way for all RDBMSs. This escape syntax has one form that includes
a result parameter and one that does not. If used, the result parameter must be
registered as an OUT parameter. The other parameters can be used for input,
output or both. Parameters are referred to sequentially, by number, with the
first parameter being 1.

{?= call <procedure-name>[<arg1>,<arg2>, ...]}
{call <procedure-name>[<arg1>,<arg2>, ...]}

I didn't see this in my copy of the spec, which is version 3.0 FR (final
release). Still, I think we're fine. As I said before, what I think the
spec had in mind was allowing functions to be called from the
callablestatement stuff and have their output put in the first OUT
variable.

This is... reasonable. Compare the stuff Neil's been working on with
"bare" function calls in PL/PgSQL and Tom (I think) saying that it might
be reasonable to just issue func(); as an SQL query: no CALL, no SELECT.

IN parameter values are set using the set methods inherited from
PreparedStatement. The type of all OUT parameters must be registered prior to
executing the stored procedure; their values are retrieved after execution via
the get methods provided here.

A CallableStatement can return one ResultSet object or multiple ResultSet
objects. Multiple ResultSet objects are handled using operations inherited from
Statement.

I don't get this multiple ResultSet stuff. All I can think of is that the
spec has this in mind:

CallableStatement cstmt = conn.prepareCall("{call foo(); call bar();}");

or

CallableStatement cstmt = conn.prepareCall("{call foo()} {call bar();}");

or some other permutation.

I see plenty of references to multiple ResultSets but I cannot find an
example or information on how to generate one.

Gavin

#6Magnus Hagander
magnus@hagander.net
In reply to: Gavin Sherry (#5)
Re: SQL-Invoked Procedures for 8.1

IN parameter values are set using the set methods inherited from
PreparedStatement. The type of all OUT parameters must be

registered

prior to executing the stored procedure; their values are retrieved
after execution via the get methods provided here.

A CallableStatement can return one ResultSet object or multiple
ResultSet objects. Multiple ResultSet objects are handled using
operations inherited from Statement.

I don't get this multiple ResultSet stuff. All I can think of
is that the spec has this in mind:

CallableStatement cstmt = conn.prepareCall("{call foo(); call
bar();}");

or

CallableStatement cstmt = conn.prepareCall("{call foo()}
{call bar();}");

or some other permutation.

I see plenty of references to multiple ResultSets but I
cannot find an example or information on how to generate one.

Not a user of JDBC, but this is fairly common in the ADO/ADO.NET world
with MS SQL Server as well (not sure about other dbs and .NET - I'ev
only used it with mssql and pgsql)... As for an example, something along
the line of (though in my cases usually with a *lot* more parameters):

--
CREATE PROCEDURE get_info_for_user(@userid varchar(16)) AS
SELECT something FROM contentstable WHERE userid=@userid

SELECT whatever,somethingelse FROM anothertable WHERE
something=anything
--

You get the point :-)
Then in my .net code I'd do a simple:
SqlDataReader rdr = cmd.ExecuteReader();
... process first result ...
rdr.NextResult();
... process second result...

This is the very simple case. In this case, the only thing you gain is
less server roundtrips and less parsing steps.

In an example of a more complex case, the first part of the stored
procedure will do some complex (and expensive) work to get to a
resulting variable. This variable is then applied to several different
queries after each other, and their respective resultsets are returned
to the client. In this case, you save having to run that complex logic
more than once. (You could cache the result at the client, but if you're
going down that path then you don't need stored procs at all.. It is
usually necessary to keep it in the db to maintain abstraction)

//Magnus

#7Oliver Jowett
oliver@opencloud.com
In reply to: Gavin Sherry (#5)
Re: SQL-Invoked Procedures for 8.1

Gavin Sherry wrote:

I don't get this multiple ResultSet stuff. All I can think of is that the
spec has this in mind:

CallableStatement cstmt = conn.prepareCall("{call foo(); call bar();}");

or

CallableStatement cstmt = conn.prepareCall("{call foo()} {call bar();}");

or some other permutation.

It's not specific to CallableStatement; you can generate multiple
resultsets from a plain Statement, and CallableStatement is just
inheriting that functionality.

The common way of generating multiple resultsets is, indeed, a
multi-statement query. For example:

Statement stmt = conn.createStatement();
stmt.execute("SELECT * FROM foo; SELECT * FROM bar");

ResultSet rs1 = stmt.getResultSet();
// process rs1
rs1.close();

boolean moreResults = stmt.getMoreResults();
assert moreResults;

ResultSet rs2 = stmt.getResultSet();
// process rs2
rs2.close();

stmt.close();

AFAIK the multiple-resultset stuff is not *required* functionality in
JDBC, it's just there to support it if it does happen. The postgresql
JDBC driver didn't actually support multiple resultsets at all until
recently.

For function/procedure calls, I'd expect it to look like:

CallableStatement cstmt = conn.prepareCall("{call foo()}; {call bar()}");

and for the driver to turn that into two separate SELECT/CALL/whatever
queries at the protocol level, and manage the multiple resultsets
itself. The current driver doesn't handle multiple call escapes in one
query at all, but that's really just a limitation of the reasonably dumb
call-escape parser it currently has.

I wouldn't worry about this case unless there's some other reason that a
*single* function/procedure call needs to return more than one set of
results.

I see plenty of references to multiple ResultSets but I cannot find an
example or information on how to generate one.

That's because there's no standard way to generate them :)

-O

#8Grant Finnemore
grantf@guruhut.co.za
In reply to: Magnus Hagander (#6)
Re: SQL-Invoked Procedures for 8.1

Hi Magnus,

Yes, this is the situation that I have been thinking about. Specifically
when a single stored procedure returns many recordsets.

Perhaps I should also clarify that the "spec" I have been using is
the JDK javadoc documentation.

Using java with Magnus' procedure:
CallableStatement cs = connection.prepareCall("call get_info_for_user ?");
cs.setString(1, "test");
if(cs.execute()) {
ResultSet rs = cs.getResultSet();
while(rs != null) {
// Process rs
}
}

Regards,
Grant

Magnus Hagander wrote:
[snip]

Show quoted text

Not a user of JDBC, but this is fairly common in the ADO/ADO.NET world
with MS SQL Server as well (not sure about other dbs and .NET - I'ev
only used it with mssql and pgsql)... As for an example, something along
the line of (though in my cases usually with a *lot* more parameters):

--
CREATE PROCEDURE get_info_for_user(@userid varchar(16)) AS
SELECT something FROM contentstable WHERE userid=@userid

SELECT whatever,somethingelse FROM anothertable WHERE
something=anything
--

You get the point :-)
Then in my .net code I'd do a simple:
SqlDataReader rdr = cmd.ExecuteReader();
... process first result ...
rdr.NextResult();
... process second result...

#9Gavin Sherry
swm@linuxworld.com.au
In reply to: Oliver Jowett (#7)
Re: SQL-Invoked Procedures for 8.1

On Thu, 23 Sep 2004, Oliver Jowett wrote:

Gavin Sherry wrote:

I don't get this multiple ResultSet stuff. All I can think of is that the
spec has this in mind:

CallableStatement cstmt = conn.prepareCall("{call foo(); call bar();}");

or

CallableStatement cstmt = conn.prepareCall("{call foo()} {call bar();}");

or some other permutation.

It's not specific to CallableStatement; you can generate multiple
resultsets from a plain Statement, and CallableStatement is just
inheriting that functionality.

The common way of generating multiple resultsets is, indeed, a
multi-statement query. For example:

Statement stmt = conn.createStatement();
stmt.execute("SELECT * FROM foo; SELECT * FROM bar");

ResultSet rs1 = stmt.getResultSet();
// process rs1
rs1.close();

boolean moreResults = stmt.getMoreResults();
assert moreResults;

ResultSet rs2 = stmt.getResultSet();
// process rs2
rs2.close();

stmt.close();

AFAIK the multiple-resultset stuff is not *required* functionality in
JDBC, it's just there to support it if it does happen. The postgresql
JDBC driver didn't actually support multiple resultsets at all until
recently.

For function/procedure calls, I'd expect it to look like:

CallableStatement cstmt = conn.prepareCall("{call foo()}; {call bar()}");

and for the driver to turn that into two separate SELECT/CALL/whatever
queries at the protocol level, and manage the multiple resultsets
itself. The current driver doesn't handle multiple call escapes in one
query at all, but that's really just a limitation of the reasonably dumb
call-escape parser it currently has.

I wouldn't worry about this case unless there's some other reason that a
*single* function/procedure call needs to return more than one set of
results.

I see plenty of references to multiple ResultSets but I cannot find an
example or information on how to generate one.

That's because there's no standard way to generate them :)

Okay. So, its something that can be handled in the driver. That's what I
thought.

Do you have any idea about databases returning result sets from SQL
procedures (ie, not functions).

Thanks,

Gavin

#10Hannu Krosing
hannu@tm.ee
In reply to: Grant Finnemore (#8)
Re: SQL-Invoked Procedures for 8.1

On N, 2004-09-23 at 11:34, Grant Finnemore wrote:

Hi Magnus,

Yes, this is the situation that I have been thinking about. Specifically
when a single stored procedure returns many recordsets.

IIRC support for returning multiple recordsets was removed from
postgresql fe-be protocol years ago as "nobody ever needs it" ;)

---------------
Hannu

#11Peter T Mount
peter@retep.org.uk
In reply to: Hannu Krosing (#10)
Re: SQL-Invoked Procedures for 8.1

Hannu Krosing wrote:

On N, 2004-09-23 at 11:34, Grant Finnemore wrote:

Hi Magnus,

Yes, this is the situation that I have been thinking about. Specifically
when a single stored procedure returns many recordsets.

IIRC support for returning multiple recordsets was removed from
postgresql fe-be protocol years ago as "nobody ever needs it" ;)

Until recently I would have said the same thing.

A couple of months ago I started a new job where they are MS-SQL based,
and they do have several procedures that actually do return multiple
recordsets from a single procedure. However this is the only time that
I've ever seen any procedure actually require it.

Peter

#12Hannu Krosing
hannu@tm.ee
In reply to: Peter T Mount (#11)
Re: SQL-Invoked Procedures for 8.1

On N, 2004-09-23 at 15:22, Peter Mount wrote:

Hannu Krosing wrote:

On N, 2004-09-23 at 11:34, Grant Finnemore wrote:

Hi Magnus,

Yes, this is the situation that I have been thinking about. Specifically
when a single stored procedure returns many recordsets.

IIRC support for returning multiple recordsets was removed from
postgresql fe-be protocol years ago as "nobody ever needs it" ;)

Until recently I would have said the same thing.

A couple of months ago I started a new job where they are MS-SQL based,
and they do have several procedures that actually do return multiple
recordsets from a single procedure. However this is the only time that
I've ever seen any procedure actually require it.

Actually the original support was inhetited from Postgres4.2 where
PostQuel had native support for it.

It was probably decided that SQL (which replaced PostQuel) would not
generate such things.

------------
Hannu

#13Peter T Mount
peter@retep.org.uk
In reply to: Hannu Krosing (#12)
Re: SQL-Invoked Procedures for 8.1

Hannu Krosing wrote:

On N, 2004-09-23 at 15:22, Peter Mount wrote:

Hannu Krosing wrote:

On N, 2004-09-23 at 11:34, Grant Finnemore wrote:

Hi Magnus,

Yes, this is the situation that I have been thinking about. Specifically
when a single stored procedure returns many recordsets.

IIRC support for returning multiple recordsets was removed from
postgresql fe-be protocol years ago as "nobody ever needs it" ;)

Until recently I would have said the same thing.

A couple of months ago I started a new job where they are MS-SQL based,
and they do have several procedures that actually do return multiple
recordsets from a single procedure. However this is the only time that
I've ever seen any procedure actually require it.

Actually the original support was inhetited from Postgres4.2 where
PostQuel had native support for it.

It was probably decided that SQL (which replaced PostQuel) would not
generate such things.

I'm not so sure as both JDBC & ODBC have implicit support for it and
they are younger than SQL.

ie: with Prepared/Callable statements in JDBC you are supposed to check
for the existence of any other ResultSets when reading the results...
this was what I had to do with CallableStatement last week with MSSQL.

Peter

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Hannu Krosing (#10)
Re: SQL-Invoked Procedures for 8.1

Hannu Krosing <hannu@tm.ee> writes:

IIRC support for returning multiple recordsets was removed from
postgresql fe-be protocol years ago as "nobody ever needs it" ;)

The protocol can still do it, and so can the backend, but it will
certainly break most if not all clients. Here's an example:

regression=# create table surprise(f1 text);
CREATE TABLE
regression=# create rule r1 as on insert to surprise do
regression-# ( select 'hello' ; select 'how are you' );
CREATE RULE
regression=# insert into surprise values ('boo');
?column?
-------------
how are you
(1 row)

regression=#

The 'hello' result was in fact computed and sent by the backend, but it
was discarded in libpq (see the documentation about PQexec: only the
last resultset returned by a querystring is returned to the caller).
psql could have printed both results, but it would need to use
PQsendQuery/PQgetResult instead of PQexec.

regards, tom lane

#15Joe Conway
mail@joeconway.com
In reply to: Gavin Sherry (#9)
Re: SQL-Invoked Procedures for 8.1

Gavin Sherry wrote:

Do you have any idea about databases returning result sets from SQL
procedures (ie, not functions).

As other's have pointed out, this is very common in the MS SQL Server
world (and I believe Sysbase also supports it). It works like:

<begin proc def>

select * from something
...
select * from somethingelse
...

<end proc def>

We get requests for this kind of functionality at least a couple of
times a month, and although it's been a few years since I mucked with
MSSQL, I found it to be very useful in a number of different circumstances.

It is only workable because stored procedures cannot participate in
normal SELECT statements. In MSSQL you would do something like:

exec sp_my_multiresultset_proc
GO
-- or --
sp_my_multiresultset_proc
GO

so the analogy to your stored procedure proposal holds:

call sp_my_multiresultset_proc();
-- or --
sp_my_multiresultset_proc();

I had always envisioned implementing this by projecting tuples directly
the way that SHOW ALL or EXPLAIN ANALYZE do. See, e.g.
ShowAllGUCConfig() in guc.c.

Joe

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#1)
Re: SQL-Invoked Procedures for 8.1

Gavin Sherry <swm@linuxworld.com.au> writes:

3) Procedures can be run in the same savepoint level as the caller when
OLD SAVEPOINT LEVEL is specified at creation time. According to SQL2003,
functions must be run on a new savepoint level. From my understanding, we
do not do this currently.

It's irrelevant since we don't allow functions to call SAVEPOINT/RELEASE/
ROLLBACK TO explicitly, and probably won't do so anytime soon. The only
thing we can really manage for a function is constrained use of
subtransactions a la plpgsql exceptions. This doesn't require the
savepoints to be named at all, so savepoint levels need not enter into it.

This makes the difference between procedures and functions quite
superficial: procedures are functions which return void and have parameter
modes.

If you implement it that way I think it'll be very largely a waste of
effort :-(. What you're talking about seems mere syntactic sugar and
not a fundamental advance in capability.

What I'd like to see is a "procedure" capability which is somehow
outside the transaction system and can therefore invoke BEGIN, COMMIT,
SAVEPOINT, etc. I have no immediate ideas about how to do this, but
I think that's what people are really after when they ask for
server-side procedures. They want to be able, for example, to have
a procedure encapsulate an abort-and-retry loop around a serializable
transaction. (It'd be great if we could do that in a function, but
I haven't thought of a way to make it work.)

I concur with Grant Finnemore's objection as well: people expect
procedures to be able to return resultsets, ie SETOF something,
not only scalar values. Whether this is what SQL2003 says is not
really the issue -- we have to look at what's out there in competing
products.

regards, tom lane

#17Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#16)
Re: SQL-Invoked Procedures for 8.1

Gavin, Neil,

Following is a proposal to implement what SQL2003 calls 'SQL-Invoked
Procedures' and what most people refer to as stored procedures. Fujitsu
will be funding Neil Conway and I to work on this feature.

Which, by the way, is way keen.

My comments are based on having professionally written several hundred
thousand lines of procedural code for PostgreSQL, SQL Server, and Oracle. I
believe that your interpretation of the spec is correct but that there are
several things not covered by the spec, but implemented by other RDBMSes,
which make stored procedures *useful* which have been omitted. I feel
strongly that these things will make a large difference to people thinking of
migrating to PostgreSQL from other DBMSes, and want to make sure that Neil's
implementation does not make them harder, instead of easier, to to implement
later.

Procedures are nearly identical to functions.

IMHO, this is largely because the spec regards a great deal of SP
functionality to be "implementation-defined", and is thus kept as vague as
possible. In practice, other DBMSes which have both SPs and Functions treat
them *very* differently.

3) Procedures can be run in the same savepoint level as the caller when
OLD SAVEPOINT LEVEL is specified at creation time. According to SQL2003,
functions must be run on a new savepoint level. From my understanding, we
do not do this currently.

Work will focus on 1 and 2 until we have the concept of savepoint levels
with functions. Its possible that we will implement this too if there is
demand.

One of the things which differentiates SPs on other DBs from PostgreSQL
Functions is transactionality. In SQL Server and Oracle, SPs are not
automatically a transaction; instead, they contain transactions within them.
This is vitally important to DBAs who want to use SPs to automate database
maintenance, loads, transformations, and other activities which require
checkpointing within the course of a program.

For example, I run a nightly data transformation for one client which requires
16 steps with VACUUMs, ANALYZEs and error-handling between them.
Currently, the only way I can implement this for PostgreSQL is to have an
external program (Perl, in my case) manage this and call each step as a
separate function. It would be far easier to manage if I could put all of
the steps, including the vaccums inside one long-running SP, but the required
transaction container prevents this.

If we go down the route of saying that procedures are a type of function,
we have the option of allowing users access to OUT and INOUT in functions.
This would make procedures simply a subset of functions. What do people
think?

Well, to be frank, my first thought is, why bother? If you're just going to
implement some syntatic sugar on top of the current Function feature, why
bother at all?

Given the opportunity, I would far prefer to set us on a road that would allow
us to replicate -- and exceed -- the functionality of Oracle's PL/SQL. This
proposal does not do that; in fact, if someone were to start implementing
such functionality later they might find this code a stumbling block.

There will be cases when we need to identify whether a routine is a
function or a procedure. This could be done two ways. We could say that
any proc in pg_proc which returns void is a procedure or we could store
this in some 'protype' column. Thoughts?

Well, see my thoughts above on differentiating SPs from Functions. I
certainly don't think we should be using the same table.

A second point, which I brought up with you on IRC, is to eliminate
overloading and allow named parameter calls on SPs. This is extremely
useful functionality in T-SQL and PL/SQL; in fact, I'd say that it's
essential for any operation that wants to create an SP-centric middleware as
only named parameter calls allow developers to add parameters to existing
procedures without breaking existing calls.

For anyone who doesn't know what I'm talking about, it's this form:

CREATE PROCEDURE do_some_work ( alpha INT, beta INT, gamma TEXT )
etc.

Where you can:
CALL do_some_work( alpha = 5, beta = 7 )
.. and then gamma picks up its default, if any, or even:
CALL do_some_work( gamma = 'Lord Fontleroy', beta = 7, alpha = 1 )

The idea being that for SPs, schema.name is unique regardless of the
parameters. Even if implementing named parameter calls is beyond the
current spec, I will argue strongly in favor of eliminating overloading for
SPs. Overloading serves no purpose for them and prohibits the whole concept
of default values.

Other databases, and SQL2003, support a few different implementations what
could be called variables. In SQL2003, there are a few types of
'variables':

I completely follow your thinking about variables, and find it makes sense.

SET VARIABLE <varname> = <val>
SHOW VARIABLE <varname>

The problem with this is name collisions with the GUC -- especially as it now
allows add-ins to create their own GUC variables. However intuitive SET and
SHOW are, they will lead to problems. Maybe SETVAR and SHOWVAR? Or using
the PL/pgsql assignment operator for the first, and DISPLAY for the second,
e.g.:

num_logos := 917;
DISPLAY num_logos;

Also, you do realize that this is going to lead to requests for SELECT ....
INTO on the SQL command line, don't you?

The other option is that we do it at the protocol level and modify libpq
(and psql) to add support. This would allow us something like:

This may be a better approach. I've personally never been comfortable with
the use of variables outside of SPs and packages; it seems orthagonal to the
declaritive nature of SQL. However, this is a aesthic thing and not really
based on practical considerations.

The more practical consideration is, where will OUT and INOUT parameters be
used? Do we want them returned to the SQL session or directly to the
calling client? I would think that practicality would argue in favor of the
latter; I can't see needing variables in SQL except for testing, and having
them in psql will allow me that.

That might need some massaging but you get the idea. The commands to psql
translate to lower level protocol level commands which are: create
variable, set (ie, modify the value of) variable and describe variable.
Neil thinks we should have create and set in one step. This is great, since
it covers most cases, but I'm not sure how we have uninitialised values.
Perhaps we set to NULL?

Yes.

The only other question (that I can think of now) with respect to
variables is how they are affected by transactions. My gut feel is
that they should have transactional semantics.  For example:

I agree strongly with this, especially since we'll be using Savepoints inside
the SPs. Having one's data mods roll back but not the variable values would
be confusing and lead to a *lot* of debugging.

We can use the same permissions as for functions.

Agreed.

--
Josh Berkus
Aglio Database Solutions
San Francisco

#18Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Tom Lane (#16)
Re: SQL-Invoked Procedures for 8.1

On Thu, Sep 23, 2004 at 12:40:41PM -0400, Tom Lane wrote:

What I'd like to see is a "procedure" capability which is somehow
outside the transaction system and can therefore invoke BEGIN, COMMIT,
SAVEPOINT, etc. I have no immediate ideas about how to do this, but
I think that's what people are really after when they ask for
server-side procedures. They want to be able, for example, to have
a procedure encapsulate an abort-and-retry loop around a serializable
transaction. (It'd be great if we could do that in a function, but
I haven't thought of a way to make it work.)

I don't think we can do that in a standard function, at least not
without a lot of work. If we think of procedures as different from
functions, however, it seems doable.

What's needed for this is to isolate the transaction-initiating code
from the main query-processing loop. So for CALL statements it wouldn't
be invoked, and the procedure would be able to use its own explicit
transaction blocks and savepoints.

This part is not hard to do at all. It can be handled from the parser,
I think.

What's harder is handling the execution code. If the procedure uses
SPI, we need a mechanism to keep its SPI state, outside the normal
transaction-bound SPI stack.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"And as an added bonus, now my computer goes to the toilet for me, leaving me
free to spend time on more useful activities! yay slug codefests!" (C. Parker)

#19Oliver Jowett
oliver@opencloud.com
In reply to: Tom Lane (#14)
Re: SQL-Invoked Procedures for 8.1

Tom Lane wrote:

regression=# create rule r1 as on insert to surprise do
regression-# ( select 'hello' ; select 'how are you' );
CREATE RULE

The 'hello' result was in fact computed and sent by the backend, but it
was discarded in libpq (see the documentation about PQexec: only the
last resultset returned by a querystring is returned to the caller).
psql could have printed both results, but it would need to use
PQsendQuery/PQgetResult instead of PQexec.

Yikes. I thought this couldn't happen under the V3 extended query protocol.

The JDBC driver currently sends Describe/Execute and expects exactly one
of RowDescription/NoData followed by zero or more DataRows followed by
one of CommandComplete/EmptyQueryResponse/PortalSuspended. This seems
wrong if there could be multiple resultsets from a single Execute.

How can clients distinguish multiple resultsets if they're using the
extended query protocol?

-O

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oliver Jowett (#19)
Re: SQL-Invoked Procedures for 8.1

Oliver Jowett <oliver@opencloud.com> writes:

The JDBC driver currently sends Describe/Execute and expects exactly one
of RowDescription/NoData followed by zero or more DataRows followed by
one of CommandComplete/EmptyQueryResponse/PortalSuspended. This seems
wrong if there could be multiple resultsets from a single Execute.

This is okay if you know the query was a SELECT, since we don't allow
ON SELECT rules that aren't view-like (ie, DO INSTEAD SELECT something-else).
Non-SELECT queries can return multiple result sets, though, as I
illustrated. It's probably reasonable for the driver to throw these
away if it's not looking for a resultset at all. Or you could follow
PQexec's lead and discard all but the last.

How can clients distinguish multiple resultsets if they're using the
extended query protocol?

You'll get multiple repetitions of RowDescription/DataRows.

regards, tom lane

#21Oliver Jowett
oliver@opencloud.com
In reply to: Tom Lane (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#18)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oliver Jowett (#21)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#22)
#25Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#25)
#27Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Josh Berkus (#17)
#28Oliver Jowett
oliver@opencloud.com
In reply to: Tom Lane (#23)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Oliver Jowett (#28)
#30Maarten Boekhold
boekhold@emirates.net.ae
In reply to: Joe Conway (#15)
#31Peter T Mount
peter@retep.org.uk
In reply to: Tom Lane (#26)
#32Peter T Mount
peter@retep.org.uk
In reply to: Maarten Boekhold (#30)
#33Neil Conway
neilc@samurai.com
In reply to: Josh Berkus (#17)
#34Neil Conway
neilc@samurai.com
In reply to: Joe Conway (#15)
#35Neil Conway
neilc@samurai.com
In reply to: Alvaro Herrera (#18)
#36Neil Conway
neilc@samurai.com
In reply to: Tom Lane (#16)
#37Maarten Boekhold
boekhold@emirates.net.ae
In reply to: Neil Conway (#34)
#38Bruce Momjian
bruce@momjian.us
In reply to: Peter T Mount (#31)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Conway (#36)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Neil Conway (#33)
#41Joe Conway
mail@joeconway.com
In reply to: Neil Conway (#34)
#42Joshua D. Drake
jd@commandprompt.com
In reply to: Joe Conway (#41)
#43Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#39)
#44Neil Conway
neilc@samurai.com
In reply to: Neil Conway (#35)
#45Gavin Sherry
swm@linuxworld.com.au
In reply to: Josh Berkus (#17)
#46Gavin Sherry
swm@linuxworld.com.au
In reply to: Tom Lane (#16)
#47Gavin Sherry
swm@linuxworld.com.au
In reply to: Jim Nasby (#27)
#48Gavin Sherry
swm@linuxworld.com.au
In reply to: Joe Conway (#15)
#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#46)
#50Joe Conway
mail@joeconway.com
In reply to: Gavin Sherry (#48)
#51Josh Berkus
josh@agliodbs.com
In reply to: Gavin Sherry (#47)
#52Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Neil Conway (#35)
#53Gavin Sherry
swm@linuxworld.com.au
In reply to: Tom Lane (#49)
#54Gavin Sherry
swm@linuxworld.com.au
In reply to: Joe Conway (#50)
#55Gavin Sherry
swm@linuxworld.com.au
In reply to: Josh Berkus (#51)
#56Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Gavin Sherry (#1)
#57Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Jim Nasby (#56)
#58Andrew Dunstan
andrew@dunslane.net
In reply to: Jim Nasby (#56)
#59Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Alvaro Herrera (#57)
#60Alvaro Herrera
alvherre@dcc.uchile.cl
In reply to: Jim Nasby (#59)
#61Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#60)
#62Gavin Sherry
swm@linuxworld.com.au
In reply to: Josh Berkus (#17)
#63Josh Berkus
josh@agliodbs.com
In reply to: Gavin Sherry (#62)
#64Gavin Sherry
swm@linuxworld.com.au
In reply to: Josh Berkus (#63)
#65Peter Eisentraut
peter_e@gmx.net
In reply to: Gavin Sherry (#64)
#66Tom Lane
tgl@sss.pgh.pa.us
In reply to: Gavin Sherry (#64)
#67Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#65)
#68Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#66)
#69Gaetano Mendola
mendola@bigfoot.com
In reply to: Gavin Sherry (#64)
#70Gaetano Mendola
mendola@bigfoot.com
In reply to: Tom Lane (#66)
#71Gavin Sherry
swm@linuxworld.com.au
In reply to: Gaetano Mendola (#69)
#72Gavin Sherry
swm@linuxworld.com.au
In reply to: Tom Lane (#66)
#73Gavin Sherry
swm@linuxworld.com.au
In reply to: Peter Eisentraut (#65)
#74Gaetano Mendola
gmendola@mbigroup.it
In reply to: Gavin Sherry (#71)
#75Gavin Sherry
swm@linuxworld.com.au
In reply to: Gavin Sherry (#72)
#76Josh Berkus
josh@agliodbs.com
In reply to: Gavin Sherry (#72)
#77Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#76)
#78Gavin Sherry
swm@linuxworld.com.au
In reply to: Tom Lane (#77)
#79Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#76)
#80Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#66)
#81Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#80)
#82Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#81)
#83Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Josh Berkus (#63)
#84Josh Berkus
josh@agliodbs.com
In reply to: Jim Nasby (#83)