C function migration from 9.2 to 9.5

Started by Michael Omotayo Akindeabout 10 years ago7 messagesgeneral
Jump to latest
#1Michael Omotayo Akinde
michaeloa@met.no

Hi,

We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.

A simplified, minimal example of the stuff that seems to get us into
trouble:

SQL definitions:
----

CREATE TYPE sessionData AS ( a int4, b int4, c int4 );

CREATE OR REPLACE FUNCTION
__WCI_SCHEMA__.getSessionData
()
RETURNS sessionData AS
'db_libdir/db_lib', 'session_get'
LANGUAGE 'C' STABLE;

----
Function definition:
----

PG_FUNCTION_INFO_V1 (session_get);
Datum session_get(PG_FUNCTION_ARGS) {
Datum ret = packSessionData( 0, 0, 0, fcinfo);
return ret;
}

Datum packSessionData( int a, int b, int c, FunctionCallInfo fcinfo )
{
TupleDesc td;
if ( get_call_result_type( fcinfo, NULL, & td ) != TYPEFUNC_COMPOSITE )
{
ereport( ERROR,
( errcode( ERRCODE_DATA_EXCEPTION ),
errmsg( "\'packSessionData\': Function returning record
called in context that cannot accept type record" ) ) );
}
td = BlessTupleDesc( td );

Datum * ret = ( Datum * ) palloc( 3 * sizeof( Datum ) );
bool isNull[ 3 ] = {false, false, false};

ret[ 0 ] = Int32GetDatum( a );
ret[ 1 ] = Int32GetDatum( b );
ret[ 2 ] = Int32GetDatum( c );

HeapTuple ht = ( HeapTuple ) heap_form_tuple( td, ret, isNull );
return HeapTupleGetDatum( ht );
}

----

Running "select getSessionData()" with this code crashes the database.
Pretty much identical code worked fine in 9.2; it's only from 9.3+ we run
into trouble. I'm sure we're overseeing something completely basic that has
changed in 9.3, but having trouble seeing what it is. A little help would
be appreciated.

Regards,

Michael A.

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: Michael Omotayo Akinde (#1)
Re: C function migration from 9.2 to 9.5

Hi

2016-03-03 15:12 GMT+01:00 Michael Omotayo Akinde <michaeloa@met.no>:

Hi,

We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.

A simplified, minimal example of the stuff that seems to get us into
trouble:

It is strange. I tested your code, and it is working without any problems

compiled by gcc on Linux 64bit

Regards

Pavel

Show quoted text

SQL definitions:
----

CREATE TYPE sessionData AS ( a int4, b int4, c int4 );

CREATE OR REPLACE FUNCTION
__WCI_SCHEMA__.getSessionData
()
RETURNS sessionData AS
'db_libdir/db_lib', 'session_get'
LANGUAGE 'C' STABLE;

----
Function definition:
----

PG_FUNCTION_INFO_V1 (session_get);
Datum session_get(PG_FUNCTION_ARGS) {
Datum ret = packSessionData( 0, 0, 0, fcinfo);
return ret;
}

Datum packSessionData( int a, int b, int c, FunctionCallInfo fcinfo )
{
TupleDesc td;
if ( get_call_result_type( fcinfo, NULL, & td ) != TYPEFUNC_COMPOSITE )
{
ereport( ERROR,
( errcode( ERRCODE_DATA_EXCEPTION ),
errmsg( "\'packSessionData\': Function returning record
called in context that cannot accept type record" ) ) );
}
td = BlessTupleDesc( td );

Datum * ret = ( Datum * ) palloc( 3 * sizeof( Datum ) );
bool isNull[ 3 ] = {false, false, false};

ret[ 0 ] = Int32GetDatum( a );
ret[ 1 ] = Int32GetDatum( b );
ret[ 2 ] = Int32GetDatum( c );

HeapTuple ht = ( HeapTuple ) heap_form_tuple( td, ret, isNull );
return HeapTupleGetDatum( ht );
}

----

Running "select getSessionData()" with this code crashes the database.
Pretty much identical code worked fine in 9.2; it's only from 9.3+ we run
into trouble. I'm sure we're overseeing something completely basic that has
changed in 9.3, but having trouble seeing what it is. A little help would
be appreciated.

Regards,

Michael A.

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#2)
Re: C function migration from 9.2 to 9.5

2016-03-03 16:06 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

2016-03-03 15:12 GMT+01:00 Michael Omotayo Akinde <michaeloa@met.no>:

Hi,

We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.

A simplified, minimal example of the stuff that seems to get us into
trouble:

It is strange. I tested your code, and it is working without any problems

try to recheck your development environment, maybe you mix new and old
libraries, caches, ...

Show quoted text

compiled by gcc on Linux 64bit

Regards

Pavel

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Omotayo Akinde (#1)
Re: C function migration from 9.2 to 9.5

Michael Omotayo Akinde <michaeloa@met.no> writes:

We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.

Like Pavel, I can't see anything wrong with that code --- it's not quite
according to PG project style, but it certainly looks like it does what
it needs to. I think he's right to suspect some inconsistency in your
coding environment. One concrete idea worth considering is that maybe
you are compiling with headers that postdate commit 3f8c8e3c6 et al and
trying to use the code in a database that predates that. That'd result
in successfully compiling a call to a nonexistent core function, which
might end up as a crash depending on what your dynamic linker does
about it.

What exactly does the crash look like --- anything interesting in the
postmaster log? (If your logging setup fails to capture postmaster
stderr, now would be a good time to fix that.) Have you tried to
get a back-trace with gdb?

regards, tom lane

PS: for reference, this is the patch I'm wondering about:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_4_BR [3f8c8e3c6] 2014-05-01 15:19:06 -0400
Branch: REL9_3_STABLE Release: REL9_3_5 [b72e90bc3] 2014-05-01 15:19:10 -0400
Branch: REL9_2_STABLE Release: REL9_2_9 [8c43980a1] 2014-05-01 15:19:14 -0400
Branch: REL9_1_STABLE Release: REL9_1_14 [db1fdc945] 2014-05-01 15:19:17 -0400
Branch: REL9_0_STABLE Release: REL9_0_18 [7a4f114f3] 2014-05-01 15:19:20 -0400
Branch: REL8_4_STABLE Release: REL8_4_22 [70debcf09] 2014-05-01 15:19:23 -0400

Fix failure to detoast fields in composite elements of structured types.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Michael Omotayo Akinde
michaeloa@met.no
In reply to: Tom Lane (#4)
Re: C function migration from 9.2 to 9.5

Thanks for the responses.

I currently have Postgres installed through apt-get, so I don't think it is
old libraries/header files. Just to double check, I spun up a "blank" VM
with Ubuntu 14.04 (Trusty), which is what we're currently running in our
dev environment, and apt-get installed postgresql-9.3 and
postgresql-server-dev-9.3 (v 9.3.11). I've tried this with 9.5 (from the
PGDG repos) as well, with the same result.

Copied over the two attached files (test.c and test.sql) and basically ran:

createdb test

gcc -I/usr/include/postgresql/9.5/server -fPIC -c test.c

This throws a compiler warning on the cast from heap_form_tuple to
HeapTuple, but IIRC it's always done that so not an error?

gcc -shared -o test.so test.o

psql -d test -f test.sql

CREATE TYPE
CREATE FUNCTION
psql:test.sql:9: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
psql:test.sql:9: connection to server was lost

The logs don't seem all that useful:
2016-03-03 23:03:13 UTC STATEMENT: CREATE TYPE sessionType AS ( a int4, b
int4, c int4);
2016-03-03 23:03:13 UTC LOG: server process (PID 9988) was terminated by
signal 11: Segmentation fault
2016-03-03 23:03:13 UTC DETAIL: Failed process was running: SELECT
sessionData();
2016-03-03 23:03:13 UTC LOG: terminating any other active server processes
2016-03-03 23:03:13 UTC WARNING: terminating connection because of crash
of another server process
2016-03-03 23:03:13 UTC DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.
2016-03-03 23:03:13 UTC HINT: In a moment you should be able to reconnect
to the database and repeat your command.
2016-03-03 23:03:13 UTC LOG: all server processes terminated;
reinitializing

I haven't attempted to run the stack backtrace with gdb yet (requires a bit
more setup), but that will be the next step if you guys can't see any
obvious mistakes that I've made.

Regards,

Michael A.

On Thu, Mar 3, 2016 at 5:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Michael Omotayo Akinde <michaeloa@met.no> writes:

We've been having a Postgresql database with some custom C functionality
happily running for many years now. It's been running on 9.2, and we wish
to upgrade this to the latest version. However, we're seeing some issues
with the database process crashing each time.

Like Pavel, I can't see anything wrong with that code --- it's not quite
according to PG project style, but it certainly looks like it does what
it needs to. I think he's right to suspect some inconsistency in your
coding environment. One concrete idea worth considering is that maybe
you are compiling with headers that postdate commit 3f8c8e3c6 et al and
trying to use the code in a database that predates that. That'd result
in successfully compiling a call to a nonexistent core function, which
might end up as a crash depending on what your dynamic linker does
about it.

What exactly does the crash look like --- anything interesting in the
postmaster log? (If your logging setup fails to capture postmaster
stderr, now would be a good time to fix that.) Have you tried to
get a back-trace with gdb?

regards, tom lane

PS: for reference, this is the patch I'm wondering about:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_4_BR [3f8c8e3c6] 2014-05-01 15:19:06 -0400
Branch: REL9_3_STABLE Release: REL9_3_5 [b72e90bc3] 2014-05-01 15:19:10
-0400
Branch: REL9_2_STABLE Release: REL9_2_9 [8c43980a1] 2014-05-01 15:19:14
-0400
Branch: REL9_1_STABLE Release: REL9_1_14 [db1fdc945] 2014-05-01 15:19:17
-0400
Branch: REL9_0_STABLE Release: REL9_0_18 [7a4f114f3] 2014-05-01 15:19:20
-0400
Branch: REL8_4_STABLE Release: REL8_4_22 [70debcf09] 2014-05-01 15:19:23
-0400

Fix failure to detoast fields in composite elements of structured
types.

Attachments:

test.ctext/x-csrc; charset=US-ASCII; name=test.cDownload
test.sqlapplication/sql; name=test.sqlDownload
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Omotayo Akinde (#5)
Re: C function migration from 9.2 to 9.5

Michael Omotayo Akinde <michaeloa@met.no> writes:

This throws a compiler warning on the cast from heap_form_tuple to
HeapTuple, but IIRC it's always done that so not an error?

Uh, that's *awfully* fishy, because heap_form_tuple certainly returns
HeapTuple. I wondered why you had that cast there; it should not be
necessary. If the compiler is warning about it, that sets off all kinds
of alarm bells.

I notice your test program fails to #include "access/htup_details.h",
which is where heap_form_tuple() is declared these days. I wonder if
your problem boils down to "no visible declaration of function, so
compiler thinks it returns int"? In that case the real difference
from what worked to what didn't probably had less to do with any PG
version change and more to do with moving from 32-bit to 64-bit.
(Or I guess we might've relocated the declaration of heap_form_tuple
somewhere along the line.)

A general tip for getting C code to work is to turn on as many
compiler warnings as you can, and never ignore any of them.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Michael Omotayo Akinde
michaeloa@met.no
In reply to: Tom Lane (#6)
Re: C function migration from 9.2 to 9.5

Nailed it. Ensuring heap_form_tuple was properly defined resolved the issue.

Thanks a lot. Without your help, I would probably have spent more time
looking at the 32 <=> 64 bit thing first, since I was pretty certain that
we had this warning in the old compile. I suspect I am wrong about that,
though, and that the actual problem is the heap_form_tuple declaration
being relocated, as you suggest. In either case, going through the code and
ensuring that we had that include everywhere in the code seems to have
cleared up the problems (all tests running OK).

Regards,

Michael A.

On Fri, Mar 4, 2016 at 1:52 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Michael Omotayo Akinde <michaeloa@met.no> writes:

This throws a compiler warning on the cast from heap_form_tuple to
HeapTuple, but IIRC it's always done that so not an error?

Uh, that's *awfully* fishy, because heap_form_tuple certainly returns
HeapTuple. I wondered why you had that cast there; it should not be
necessary. If the compiler is warning about it, that sets off all kinds
of alarm bells.

I notice your test program fails to #include "access/htup_details.h",
which is where heap_form_tuple() is declared these days. I wonder if
your problem boils down to "no visible declaration of function, so
compiler thinks it returns int"? In that case the real difference
from what worked to what didn't probably had less to do with any PG
version change and more to do with moving from 32-bit to 64-bit.
(Or I guess we might've relocated the declaration of heap_form_tuple
somewhere along the line.)

A general tip for getting C code to work is to turn on as many
compiler warnings as you can, and never ignore any of them.

regards, tom lane