C function migration from 9.2 to 9.5
Hi,
We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.
A simplified, minimal example of the stuff that seems to get us into
trouble:
SQL definitions:
----
CREATE TYPE sessionData AS ( a int4, b int4, c int4 );
CREATE OR REPLACE FUNCTION
__WCI_SCHEMA__.getSessionData
()
RETURNS sessionData AS
'db_libdir/db_lib', 'session_get'
LANGUAGE 'C' STABLE;
----
Function definition:
----
PG_FUNCTION_INFO_V1 (session_get);
Datum session_get(PG_FUNCTION_ARGS) {
Datum ret = packSessionData( 0, 0, 0, fcinfo);
return ret;
}
Datum packSessionData( int a, int b, int c, FunctionCallInfo fcinfo )
{
TupleDesc td;
if ( get_call_result_type( fcinfo, NULL, & td ) != TYPEFUNC_COMPOSITE )
{
ereport( ERROR,
( errcode( ERRCODE_DATA_EXCEPTION ),
errmsg( "\'packSessionData\': Function returning record
called in context that cannot accept type record" ) ) );
}
td = BlessTupleDesc( td );
Datum * ret = ( Datum * ) palloc( 3 * sizeof( Datum ) );
bool isNull[ 3 ] = {false, false, false};
ret[ 0 ] = Int32GetDatum( a );
ret[ 1 ] = Int32GetDatum( b );
ret[ 2 ] = Int32GetDatum( c );
HeapTuple ht = ( HeapTuple ) heap_form_tuple( td, ret, isNull );
return HeapTupleGetDatum( ht );
}
----
Running "select getSessionData()" with this code crashes the database.
Pretty much identical code worked fine in 9.2; it's only from 9.3+ we run
into trouble. I'm sure we're overseeing something completely basic that has
changed in 9.3, but having trouble seeing what it is. A little help would
be appreciated.
Regards,
Michael A.
Hi
2016-03-03 15:12 GMT+01:00 Michael Omotayo Akinde <michaeloa@met.no>:
Hi,
We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.A simplified, minimal example of the stuff that seems to get us into
trouble:
It is strange. I tested your code, and it is working without any problems
compiled by gcc on Linux 64bit
Regards
Pavel
Show quoted text
SQL definitions:
----CREATE TYPE sessionData AS ( a int4, b int4, c int4 );
CREATE OR REPLACE FUNCTION
__WCI_SCHEMA__.getSessionData
()
RETURNS sessionData AS
'db_libdir/db_lib', 'session_get'
LANGUAGE 'C' STABLE;----
Function definition:
----PG_FUNCTION_INFO_V1 (session_get);
Datum session_get(PG_FUNCTION_ARGS) {
Datum ret = packSessionData( 0, 0, 0, fcinfo);
return ret;
}Datum packSessionData( int a, int b, int c, FunctionCallInfo fcinfo )
{
TupleDesc td;
if ( get_call_result_type( fcinfo, NULL, & td ) != TYPEFUNC_COMPOSITE )
{
ereport( ERROR,
( errcode( ERRCODE_DATA_EXCEPTION ),
errmsg( "\'packSessionData\': Function returning record
called in context that cannot accept type record" ) ) );
}
td = BlessTupleDesc( td );Datum * ret = ( Datum * ) palloc( 3 * sizeof( Datum ) );
bool isNull[ 3 ] = {false, false, false};ret[ 0 ] = Int32GetDatum( a );
ret[ 1 ] = Int32GetDatum( b );
ret[ 2 ] = Int32GetDatum( c );HeapTuple ht = ( HeapTuple ) heap_form_tuple( td, ret, isNull );
return HeapTupleGetDatum( ht );
}----
Running "select getSessionData()" with this code crashes the database.
Pretty much identical code worked fine in 9.2; it's only from 9.3+ we run
into trouble. I'm sure we're overseeing something completely basic that has
changed in 9.3, but having trouble seeing what it is. A little help would
be appreciated.Regards,
Michael A.
2016-03-03 16:06 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi
2016-03-03 15:12 GMT+01:00 Michael Omotayo Akinde <michaeloa@met.no>:
Hi,
We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.A simplified, minimal example of the stuff that seems to get us into
trouble:It is strange. I tested your code, and it is working without any problems
try to recheck your development environment, maybe you mix new and old
libraries, caches, ...
Show quoted text
compiled by gcc on Linux 64bit
Regards
Pavel
Michael Omotayo Akinde <michaeloa@met.no> writes:
We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.
Like Pavel, I can't see anything wrong with that code --- it's not quite
according to PG project style, but it certainly looks like it does what
it needs to. I think he's right to suspect some inconsistency in your
coding environment. One concrete idea worth considering is that maybe
you are compiling with headers that postdate commit 3f8c8e3c6 et al and
trying to use the code in a database that predates that. That'd result
in successfully compiling a call to a nonexistent core function, which
might end up as a crash depending on what your dynamic linker does
about it.
What exactly does the crash look like --- anything interesting in the
postmaster log? (If your logging setup fails to capture postmaster
stderr, now would be a good time to fix that.) Have you tried to
get a back-trace with gdb?
regards, tom lane
PS: for reference, this is the patch I'm wondering about:
Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_4_BR [3f8c8e3c6] 2014-05-01 15:19:06 -0400
Branch: REL9_3_STABLE Release: REL9_3_5 [b72e90bc3] 2014-05-01 15:19:10 -0400
Branch: REL9_2_STABLE Release: REL9_2_9 [8c43980a1] 2014-05-01 15:19:14 -0400
Branch: REL9_1_STABLE Release: REL9_1_14 [db1fdc945] 2014-05-01 15:19:17 -0400
Branch: REL9_0_STABLE Release: REL9_0_18 [7a4f114f3] 2014-05-01 15:19:20 -0400
Branch: REL8_4_STABLE Release: REL8_4_22 [70debcf09] 2014-05-01 15:19:23 -0400
Fix failure to detoast fields in composite elements of structured types.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Thanks for the responses.
I currently have Postgres installed through apt-get, so I don't think it is
old libraries/header files. Just to double check, I spun up a "blank" VM
with Ubuntu 14.04 (Trusty), which is what we're currently running in our
dev environment, and apt-get installed postgresql-9.3 and
postgresql-server-dev-9.3 (v 9.3.11). I've tried this with 9.5 (from the
PGDG repos) as well, with the same result.
Copied over the two attached files (test.c and test.sql) and basically ran:
createdb test
gcc -I/usr/include/postgresql/9.5/server -fPIC -c test.c
This throws a compiler warning on the cast from heap_form_tuple to
HeapTuple, but IIRC it's always done that so not an error?
gcc -shared -o test.so test.o
psql -d test -f test.sql
CREATE TYPE
CREATE FUNCTION
psql:test.sql:9: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:test.sql:9: connection to server was lost
The logs don't seem all that useful:
2016-03-03 23:03:13 UTC STATEMENT: CREATE TYPE sessionType AS ( a int4, b
int4, c int4);
2016-03-03 23:03:13 UTC LOG: server process (PID 9988) was terminated by
signal 11: Segmentation fault
2016-03-03 23:03:13 UTC DETAIL: Failed process was running: SELECT
sessionData();
2016-03-03 23:03:13 UTC LOG: terminating any other active server processes
2016-03-03 23:03:13 UTC WARNING: terminating connection because of crash
of another server process
2016-03-03 23:03:13 UTC DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2016-03-03 23:03:13 UTC HINT: In a moment you should be able to reconnect
to the database and repeat your command.
2016-03-03 23:03:13 UTC LOG: all server processes terminated;
reinitializing
I haven't attempted to run the stack backtrace with gdb yet (requires a bit
more setup), but that will be the next step if you guys can't see any
obvious mistakes that I've made.
Regards,
Michael A.
On Thu, Mar 3, 2016 at 5:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Michael Omotayo Akinde <michaeloa@met.no> writes:
We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.Like Pavel, I can't see anything wrong with that code --- it's not quite
according to PG project style, but it certainly looks like it does what
it needs to. I think he's right to suspect some inconsistency in your
coding environment. One concrete idea worth considering is that maybe
you are compiling with headers that postdate commit 3f8c8e3c6 et al and
trying to use the code in a database that predates that. That'd result
in successfully compiling a call to a nonexistent core function, which
might end up as a crash depending on what your dynamic linker does
about it.What exactly does the crash look like --- anything interesting in the
postmaster log? (If your logging setup fails to capture postmaster
stderr, now would be a good time to fix that.) Have you tried to
get a back-trace with gdb?regards, tom lane
PS: for reference, this is the patch I'm wondering about:
Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_4_BR [3f8c8e3c6] 2014-05-01 15:19:06 -0400
Branch: REL9_3_STABLE Release: REL9_3_5 [b72e90bc3] 2014-05-01 15:19:10
-0400
Branch: REL9_2_STABLE Release: REL9_2_9 [8c43980a1] 2014-05-01 15:19:14
-0400
Branch: REL9_1_STABLE Release: REL9_1_14 [db1fdc945] 2014-05-01 15:19:17
-0400
Branch: REL9_0_STABLE Release: REL9_0_18 [7a4f114f3] 2014-05-01 15:19:20
-0400
Branch: REL8_4_STABLE Release: REL8_4_22 [70debcf09] 2014-05-01 15:19:23
-0400Fix failure to detoast fields in composite elements of structured
types.
Michael Omotayo Akinde <michaeloa@met.no> writes:
This throws a compiler warning on the cast from heap_form_tuple to
HeapTuple, but IIRC it's always done that so not an error?
Uh, that's *awfully* fishy, because heap_form_tuple certainly returns
HeapTuple. I wondered why you had that cast there; it should not be
necessary. If the compiler is warning about it, that sets off all kinds
of alarm bells.
I notice your test program fails to #include "access/htup_details.h",
which is where heap_form_tuple() is declared these days. I wonder if
your problem boils down to "no visible declaration of function, so
compiler thinks it returns int"? In that case the real difference
from what worked to what didn't probably had less to do with any PG
version change and more to do with moving from 32-bit to 64-bit.
(Or I guess we might've relocated the declaration of heap_form_tuple
somewhere along the line.)
A general tip for getting C code to work is to turn on as many
compiler warnings as you can, and never ignore any of them.
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Nailed it. Ensuring heap_form_tuple was properly defined resolved the issue.
Thanks a lot. Without your help, I would probably have spent more time
looking at the 32 <=> 64 bit thing first, since I was pretty certain that
we had this warning in the old compile. I suspect I am wrong about that,
though, and that the actual problem is the heap_form_tuple declaration
being relocated, as you suggest. In either case, going through the code and
ensuring that we had that include everywhere in the code seems to have
cleared up the problems (all tests running OK).
Regards,
Michael A.
On Fri, Mar 4, 2016 at 1:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Show quoted text
Michael Omotayo Akinde <michaeloa@met.no> writes:
This throws a compiler warning on the cast from heap_form_tuple to
HeapTuple, but IIRC it's always done that so not an error?Uh, that's *awfully* fishy, because heap_form_tuple certainly returns
HeapTuple. I wondered why you had that cast there; it should not be
necessary. If the compiler is warning about it, that sets off all kinds
of alarm bells.I notice your test program fails to #include "access/htup_details.h",
which is where heap_form_tuple() is declared these days. I wonder if
your problem boils down to "no visible declaration of function, so
compiler thinks it returns int"? In that case the real difference
from what worked to what didn't probably had less to do with any PG
version change and more to do with moving from 32-bit to 64-bit.
(Or I guess we might've relocated the declaration of heap_form_tuple
somewhere along the line.)A general tip for getting C code to work is to turn on as many
compiler warnings as you can, and never ignore any of them.regards, tom lane