c-function returning multiple rows
Hello, I ran into the following problem (PostgreSQL 7.1 on Solaris i386)
when compiling a C function that returns multiple rows. Here is a transcript.
+++
+++
postgres@beta:~$ cd lib/
postgres@beta:~/lib$ cat <<! >myrand.c
#include <stdlib.h>
#include "postgres.h"
#include "fmgr.h"
#include "nodes/execnodes.h"

PG_FUNCTION_INFO_V1(myrand);

Datum
myrand(PG_FUNCTION_ARGS)
{
    if ( 100*rand() > RAND_MAX )
    {
        fcinfo->resultinfo->isDone = ExprMultipleResult;
        PG_RETURN_INT32( PG_GETARG_INT32(0)*rand()/RAND_MAX );
    }
    else
    {
        fcinfo->resultinfo->isDone = ExprEndResult;
        PG_RETURN_NULL();
    }
}
!
postgres@beta:~/lib$ gcc -I /usr/local/include/pgsql -fpic -c myrand.c
myrand.c: In function `triple':
myrand.c:13: structure has no member named `isDone'
myrand.c:18: structure has no member named `isDone'
+++
+++
I dug into the sources and guessed that line 61 in fmgr.h should be 'struct ReturnSetInfo *resultinfo;'
instead of 'struct Node *resultinfo;', but I'm not sure that is correct.
After I changed this line in fmgr.h, the function compiled and worked. Here is a transcript.
+++
+++
postgres@beta:~/lib$ gcc -I /usr/local/include/pgsql -fpic -c myrand.c
postgres@beta:~/lib$ gcc -G -o myrand.so myrand.o
postgres@beta:~/lib$ psql
Welcome to psql, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help on internal slash commands
\g or terminate with semicolon to execute query
\q to quit
postgres=# CREATE FUNCTION myrand(int4) RETURNS SETOF int4 AS '/var/local/lib/pgsql/lib/myrand.so' LANGUAGE 'C';
CREATE
postgres=# SELECT myrand(50);
?column?
----------
26
46
30
29
40
8
22
38
23
18
2
43
24
44
22
46
48
15
(18 rows)
+++
+++
--
WBR, Alexey Nalbat
Alexey Nalbat <alexey@price.ru> writes:
I dug into the sources and guessed that line 61 in fmgr.h should be
'struct ReturnSetInfo *resultinfo;' instead of 'struct Node
*resultinfo;', but I'm not sure that is correct.
No, it isn't. fmgr.h is correct as given, because the resultinfo
field might point at various different kinds of Nodes. You need to
do an IsA test and then a cast, instead. See the code in
src/backend/executor/functions.c for an example.
regards, tom lane
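The "IsA test and then a cast" advice can be illustrated with a small standalone sketch. It does not use the PostgreSQL headers: every Demo* type, the DEMO_IS_A macro and demo_set_done() are hypothetical stand-ins that only mimic the shape of a tagged node, so the file compiles on its own. In a real function the tag check and the ReturnSetInfo type would come from PostgreSQL itself, and functions.c remains the reference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a tagged node hierarchy (not PostgreSQL's headers). */
typedef enum DemoNodeTag
{
    T_DemoInvalid = 0,
    T_DemoReturnSetInfo,
    T_DemoOther
} DemoNodeTag;

typedef struct DemoNode
{
    DemoNodeTag type;           /* says what the node really is */
} DemoNode;

typedef enum DemoExprDoneCond
{
    DemoSingleResult,
    DemoMultipleResult,
    DemoEndResult
} DemoExprDoneCond;

typedef struct DemoReturnSetInfo
{
    DemoNode         node;      /* first member, so &rsi->node == (DemoNode *) rsi */
    DemoExprDoneCond isDone;
} DemoReturnSetInfo;

/* Tag test in the spirit of IsA(): true if ptr points at a node with the given tag. */
#define DEMO_IS_A(ptr, tag) (((const DemoNode *) (ptr))->type == T_##tag)

/*
 * Set isDone only if resultinfo really is a DemoReturnSetInfo.
 * Returns 1 on success, 0 if the pointer is NULL or is some other node kind.
 */
int
demo_set_done(DemoNode *resultinfo, DemoExprDoneCond cond)
{
    if (resultinfo == NULL || !DEMO_IS_A(resultinfo, DemoReturnSetInfo))
        return 0;               /* caller does not accept a set result */

    ((DemoReturnSetInfo *) resultinfo)->isDone = cond;
    return 1;
}
```

The point of the pattern is that the generic pointer is never dereferenced as the specific type until its tag has been checked, which is why the field can stay declared as a plain node pointer.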
Tom, thank you very much.
I succeeded in constructing such a test function, and now I have another question.
I wrote a function myarr(foo) which returns exactly 10 rows of random values in the range [0,foo).
But I also want this function to work correctly when used in a query with a limit clause, like
"select myarr(100) limit 6;". After some experiments I concluded that while executing
this query postgres calls myarr() seven times (not six!). Maybe on the seventh call
fcinfo has some_flag set to "stop_return", and after checking it myarr() should do the same
as when returning the 11th row: PG_RETURN_NULL, set "isDone" to "ExprEndResult",
and reset its variables. Is that so? What are some_flag and "stop_return"?
Thanks in advance.
P.S.:
Currently myarr() does not reset its variables when "interrupted by limit", and because of this it returns:
+++
+++
pl=# select myarr(100) limit 6;
?column?
----------
87
42
35
38
4
16
(6 rows)
pl=# select myarr(100) limit 6;
?column?
----------
69
9
40
(3 rows)
+++
+++
Here is myarr() code:
+++
+++
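#include <stdlib.h>
#include "postgres.h"
#include "fmgr.h"
#include "nodes/execnodes.h"

#define N 10

/* Cross-call state kept in static variables (shared by all queries in the backend) */
int a_c[N];     /* the N random values of the current result set */
int n_c=0;      /* argument the values were generated for; 0 = no set in progress */
int i_c=0;      /* index of the next row to return */

PG_FUNCTION_INFO_V1(myarr);

Datum
myarr(PG_FUNCTION_ARGS)
{
    int n=PG_GETARG_INT32(0);

    /* New argument value: generate a fresh set of N values */
    if ( n_c!=n )
    {
        int j;
        n_c=n;
        i_c=0;
        for ( j=0 ; j<N ; j++ )
        {
            a_c[j]=n_c*rand()/RAND_MAX;
        }
    }

    if ( i_c<N )
    {
        /* Return the next row and signal that more may follow */
        i_c++;
        ((ReturnSetInfo*)fcinfo->resultinfo)->isDone=ExprMultipleResult;
        PG_RETURN_INT32(a_c[i_c-1]);
    }
    else
    {
        /* All N rows returned: reset the state and signal end of set */
        n_c=0;
        i_c=0;
        ((ReturnSetInfo*)fcinfo->resultinfo)->isDone=ExprEndResult;
        PG_RETURN_NULL();
    }
}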
+++
+++
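A side note on the expression n_c*rand()/RAND_MAX used above: it yields n_c itself when rand() returns RAND_MAX, which falls outside [0,foo), and on platforms where RAND_MAX is close to INT_MAX the int multiplication can overflow. The following is a minimal sketch of one way around both issues; scale_random() and random_below() are hypothetical helper names, not part of the posted function.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: map r, drawn from [0, r_max], onto [0, n).
 * The arithmetic is done in long long so n * r cannot overflow int,
 * and the divisor is r_max + 1 so that r == r_max still maps below n.
 */
int
scale_random(int n, int r, int r_max)
{
    assert(n > 0 && r >= 0 && r <= r_max);
    return (int) ((long long) n * r / ((long long) r_max + 1));
}

/* Convenience wrapper around rand(); assumes n > 0. */
int
random_below(int n)
{
    return scale_random(n, rand(), RAND_MAX);
}
```

With this helper the fill loop would assign random_below(n_c) to each element instead of the inline expression.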
--
WBR, Alexey Nalbat
Alexey Nalbat <alexey@price.ru> writes:
But I also want this function to work correctly when used in a query
with a limit clause, like "select myarr(100) limit 6;". After some
experiments I concluded that while executing this query postgres calls
myarr() seven times (not six!).
Indeed. Observe the comments in nodeLimit.c:
* If we have reached the subplan EOF or the limit, just quit.
*
* NOTE: when scanning forwards, we must fetch one tuple beyond the
* COUNT limit before we can return NULL, else the subplan won't
* be properly positioned to start going backwards. Hence test
* here is for position > netlimit not position >= netlimit.
*
* Similarly, when scanning backwards, we must re-fetch the last
* tuple in the offset region before we can return NULL.
* Otherwise we won't be correctly aligned to start going forward
* again. So, although you might think we can quit when position
* equals offset + 1, we have to fetch a subplan tuple first, and
* then exit when position = offset.
Relying on static state as you are having your function do is hopelessly
unreliable anyway --- what happens if the query is aborted partway
through by some error? You'll be messed up when a new query is issued,
that's what.
I would suggest storing the cross-call state you need in a memory
block that you allocate on first call and save a pointer to in
fcinfo->flinfo->fn_extra. Strictly speaking this is an abuse of the
fn_extra feature, since the caller is not required to preserve that
across successive calls in one query, but in practice it will work.
Don't forget to do the allocation in the proper context, viz
ptr = MemoryContextAlloc(fcinfo->flinfo->fn_mcxt, sizewanted);
In this way, the state automatically goes away at end of query,
and you'll always see a NULL fcinfo->flinfo->fn_extra at first
call in a new query.
regards, tom lane
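The fn_extra suggestion can be sketched outside the server as follows. This is only an illustration under stated assumptions: DemoFmgrInfo stands in for fcinfo->flinfo, calloc()/free() stand in for MemoryContextAlloc() in fn_mcxt and for the automatic release at end of query, and demo_myarr() and demo_end_query() are hypothetical names. It shows why state hung off the per-query record survives an interrupted scan (such as a LIMIT) without leaking into the next query.

```c
#include <assert.h>
#include <stdlib.h>

#define N_ROWS 10

/* Stand-in for the per-call-site record; plays the role of fcinfo->flinfo. */
typedef struct DemoFmgrInfo
{
    void *fn_extra;             /* NULL on the first call of every query */
} DemoFmgrInfo;

/* Cross-call state, allocated on first call instead of living in statics. */
typedef struct DemoState
{
    int next;                   /* index of the next row to return */
    int values[N_ROWS];
} DemoState;

/* Returns 1 and stores a row in *out, or returns 0 at end of set. Assumes n > 0. */
int
demo_myarr(DemoFmgrInfo *flinfo, int n, int *out)
{
    DemoState *st = (DemoState *) flinfo->fn_extra;

    if (st == NULL)
    {
        int j;

        /* First call: a real function would allocate in fn_mcxt instead. */
        st = (DemoState *) calloc(1, sizeof(DemoState));
        if (st == NULL)
            abort();
        for (j = 0; j < N_ROWS; j++)
            st->values[j] = (int) ((long long) n * rand() / ((long long) RAND_MAX + 1));
        flinfo->fn_extra = st;
    }

    if (st->next < N_ROWS)
    {
        *out = st->values[st->next++];
        return 1;               /* corresponds to ExprMultipleResult */
    }

    /* Set exhausted: drop the state so a later call starts a new set. */
    free(st);
    flinfo->fn_extra = NULL;
    return 0;                   /* corresponds to ExprEndResult */
}

/* Stand-in for end of query, where the memory context would be released. */
void
demo_end_query(DemoFmgrInfo *flinfo)
{
    free(flinfo->fn_extra);
    flinfo->fn_extra = NULL;
}
```

In this sketch, a query that stops after seven calls leaves nothing behind once demo_end_query() has run, so the next query begins with a NULL fn_extra and returns a full set of ten rows.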