server-side extension in c++

Started by Igor Shevchenko, about 16 years ago, 29 messages, general
#1 Igor Shevchenko
igor@carcass.ath.cx

Hi All,

Is there an easy way to add c++ files to my simple pgsql module ? My Makefile
is as follows -

===
MODULES = pg_uservars
DATA_built = pg_uservars.sql
PGXS := $(shell pg_config --pgxs)
include $(PGXS)
===

I've got "pg_uservars.c" and "hv.cc" and I'd like to compile hv.cc via g++.
I'm aware of c++ name [de]mangling, just looking if there's a standard way of
using C++ when it comes to pgxs.

--
Best Regards,
Igor Shevchenko

#2 Craig Ringer
craig@2ndquadrant.com
In reply to: Igor Shevchenko (#1)
Re: server-side extension in c++

Igor wrote:

Hi All,

Is there an easy way to add c++ files to my simple pgsql module ? My Makefile
is as follows -

===
MODULES = pg_uservars
DATA_built = pg_uservars.sql
PGXS := $(shell pg_config --pgxs)
include $(PGXS)
===

I've got "pg_uservars.c" and "hv.cc" and I'd like to compile hv.cc via g++.
I'm aware of c++ name [de]mangling, just looking if there's a standard way of
using C++ when it comes to pgxs.

It should "just work". Simply make sure to follow the usual rules for
calling into C++ code from C and vice versa:

- Use extern "C" linkage for all functions that must be accessible by
dlopen(), and preferably also for any functions that you might take
a function pointer to and pass to C code

- Never return new()'d memory that might be free()'d by the C code; use
malloc()

- Never delete() memory that was malloc()'d by the C code; use free()

- Never let an exception propagate into the C code; use a catch-all
block at the top level of all extern "C" functions

... and probably other things I've missed.
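
The first and last rules in the list above can be sketched as follows. This is only an illustration under stated assumptions: the names hv_status, parse_positive and hv_parse_positive are hypothetical and do not come from the poster's module, and no PostgreSQL headers are involved.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>

// Hypothetical status codes returned across the C boundary.
enum hv_status { HV_OK = 0, HV_NOMEM = 1, HV_ERROR = 2 };

// Ordinary C++ helper; std::stoi throws on malformed or out-of-range input.
static int parse_positive(const std::string &text) {
    int value = std::stoi(text);
    if (value <= 0) throw std::domain_error("value must be positive");
    return value;
}

// C-callable entry point: every exception is converted to a status code,
// so nothing can propagate into the C caller.
extern "C" int hv_parse_positive(const char *text, int *out) {
    assert(out != nullptr);  // a null output pointer is a programming error
    try {
        if (text == nullptr) return HV_ERROR;
        *out = parse_positive(text);  // may throw, including std::bad_alloc
        return HV_OK;
    } catch (const std::bad_alloc &) {
        return HV_NOMEM;
    } catch (...) {
        return HV_ERROR;  // catch-all at the top level of the extern "C" function
    }
}
```

The C side would see only the declaration `int hv_parse_positive(const char *text, int *out);` and check the returned code.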

--
Craig Ringer

#3 Bruce Momjian
bruce@momjian.us
In reply to: Craig Ringer (#2)
Re: server-side extension in c++

Craig Ringer wrote:

Igor wrote:

Hi All,

Is there an easy way to add c++ files to my simple pgsql module ? My Makefile
is as follows -

===
MODULES = pg_uservars
DATA_built = pg_uservars.sql
PGXS := $(shell pg_config --pgxs)
include $(PGXS)
===

I've got "pg_uservars.c" and "hv.cc" and I'd like to compile hv.cc via g++.
I'm aware of c++ name [de]mangling, just looking if there's a standard way of
using C++ when it comes to pgxs.

It should "just work". Simply make sure to follow the usual rules for
calling into C++ code from C and vice versa:

- Use extern "C" linkage for all functions that must be accessible by
dlopen(), and preferably also for any functions that you might take
a function pointer to and pass to C code

- Never return new()'d memory that might be free()'d by the C code; use
malloc()

- Never delete() memory that was malloc()'d by the C code; use free()

- Never let an exception propagate into the C code; use a catch-all
block at the top level of all extern "C" functions

... and probably other things I've missed.

That is great new information. I have created a new documentation
section called "Using C++ for Extensibility", and listed you as the
author in the CVS commit; patch attached. Thanks.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Attachments:

/rtmp/diff (text/x-diff, +46 -0)

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#3)
Re: server-side extension in c++

Bruce Momjian <bruce@momjian.us> writes:

That is great new information. I have created a new documentation
section called "Using C++ for Extensibility", and listed you as the
author in the CVS commit; patch attached. Thanks.

Too bad two out of the four pieces of advice are wrong (how many pieces
of memory managed by the backend are allocated directly with malloc?).
The other two are not wrong as far as they go, but they're certainly
woefully inadequate, because no interesting backend extension is going
to be able to get along without calling back into the core code.

Personally I would reduce this section to

<para>
Don't.
</para>

I don't think it is worth our time to try to support people who run into
the inevitable memory management and error handling incompatibilities.
Nor are they likely to be happy at the end of the experience, if we
blithely tell them up front that it'll work.

regards, tom lane

#5 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#4)
Re: server-side extension in c++

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

That is great new information. I have created a new documentation
section called "Using C++ for Extensibility", and listed you as the
author in the CVS commit; patch attached. Thanks.

Too bad two out of the four pieces of advice are wrong (how many pieces
of memory managed by the backend are allocated directly with malloc?).
The other two are not wrong as far as they go, but they're certainly
woefully inadequate, because no interesting backend extension is going
to be able to get along without calling back into the core code.

Good point. I assumed others would chime in to improve this.

Personally I would reduce this section to

<para>
Don't.
</para>

I don't think it is worth our time to try to support people who run into
the inevitable memory management and error handling incompatibilities.
Nor are they likely to be happy at the end of the experience, if we
blithely tell them up front that it'll work.

Well, I would have avoided this mine-trap except we have this 9.0
release note item:

Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

I figure if we don't provide some guidance, things will be even worse.

I have updated the docs to mention palloc/pfree instead; applied patch
attached.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Attachments:

/rtmp/diff (text/x-diff, +4 -4)

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: server-side extension in c++

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Personally I would reduce this section to
Don't.

Well, I would have avoided this mine-trap except we have this 9.0
release note item:
Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.
As is, this is so incomplete that my opinion is it's worse than useless.
It gives people the impression that writing an extension in C++ will
be easy. When they find out it isn't, we'll get the blame.

regards, tom lane

#7 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#6)
Re: server-side extension in c++

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Personally I would reduce this section to
Don't.

Well, I would have avoided this mine-trap except we have this 9.0
release note item:
Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.
As is, this is so incomplete that my opinion is it's worse than useless.
It gives people the impression that writing an extension in C++ will
be easy. When they find out it isn't, we'll get the blame.

So should I just comment it out and then when someone gets serious we
can use it as a starting point for them?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#7)
Re: server-side extension in c++

Bruce Momjian <bruce@momjian.us> writes:

Well, I would have avoided this mine-trap except we have this 9.0
release note item:
Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

So should I just comment it out and then when someone gets serious we
can use it as a starting point for them?

Sure. While you're at it, tone down the release-note item. It should
read more like "Take some steps towards allowing use ...", because C++
keywords in the header files surely were not the only stumbling block.

regards, tom lane

#9 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: server-side extension in c++

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Well, I would have avoided this mine-trap except we have this 9.0
release note item:
Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

So should I just comment it out and then when someone gets serious we
can use it as a starting point for them?

Sure. While you're at it, tone down the release-note item. It should
read more like "Take some steps towards allowing use ...", because C++
keywords in the header files surely were not the only stumbling block.

OK, done with attached, applied patch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Attachments:

/rtmp/diff (text/x-diff, +13 -10)

#10 Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: server-side extension in c++

On 01/06/10 10:48, Tom Lane wrote:

Too bad two out of the four pieces of advice are wrong (how many pieces
of memory managed by the backend are allocated directly with malloc?).
The other two are not wrong as far as they go, but they're certainly
woefully inadequate, because no interesting backend extension is going
to be able to get along without calling back into the core code.

It's a lot like mixing C++ with Symbian's longjmp-based error handling.
It's possible, just ugly, and requires error-handling boundaries to be
carefully thought out.

Rather than saying "don't mix new/delete and malloc/free" I should've
said "always be sure to release memory with the matching function to
that which allocated it", thus covering palloc too. Not that you
generally need to worry too much about palloc'd memory.
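
One way to make "release with the matching function" hard to get wrong is to keep both the allocation and the release behind the C interface, so the C caller never frees anything itself. A rough sketch, assuming a hypothetical opaque handle named hv_store (none of these names exist in PostgreSQL or in the original module):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical opaque handle; C callers only ever hold a pointer to it.
struct hv_store {
    std::map<std::string, std::string> vars;
};

// Allocates with new; returns a null pointer instead of throwing.
extern "C" hv_store *hv_store_create(void) {
    try {
        return new hv_store;
    } catch (...) {
        return nullptr;
    }
}

// Inserts or overwrites a variable. Returns 0 on success, -1 on failure.
extern "C" int hv_store_set(hv_store *store, const char *key, const char *value) {
    if (store == nullptr || key == nullptr || value == nullptr) return -1;
    try {
        store->vars[key] = value;
        assert(store->vars.count(key) == 1);  // key is present exactly once
        return 0;
    } catch (...) {
        return -1;
    }
}

extern "C" std::size_t hv_store_count(const hv_store *store) {
    return store != nullptr ? store->vars.size() : 0;
}

// Releases with delete, matching the new in hv_store_create.
// Deleting a null pointer is a no-op.
extern "C" void hv_store_destroy(hv_store *store) {
    delete store;
}
```

Because the handle is created and destroyed by the same module, the question of free() versus delete versus pfree() never reaches the caller.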

Personally I would reduce this section to

<para>
Don't.
</para>

Sometimes you need or want to expose capabilities of a C++ library. So
long as you do so with proper encapsulation of the C++ functionality, so
that the only interfaces Pg sees are C, there's really no problem.

Nor are they likely to be happy at the end of the experience, if we
blithely tell them up front that it'll work.

I've had no issues using C++ libraries in Pg server-side code. It *does*
work. You just need to be careful where your error-handling and memory
management style boundaries lie.

--
Craig Ringer

#11 Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: server-side extension in c++

On 01/06/10 11:05, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Personally I would reduce this section to
Don't.

Well, I would have avoided this mine-trap except we have this 9.0
release note item:
Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.

I can't speak up there - my own C++/Pg backend stuff has been fairly
trivial, and has been where I can maintain a fairly clean separation of
the C++-exposed and the Pg-backend-exposed parts. I was able to keep
things separate enough that my C++ compilation units didn't include the
Pg backend headers; they just exposed a pure C public interface. The Pg
backend-using compilation units were written in C, and talked to the C++
part over its exposed pure C interfaces.

This was very much pain-free, but I certainly wouldn't want to try to
use C++ code tightly intermixed with Pg backend-using code. It'd be a
nightmare.
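
The separation described above might look roughly like this. The file names hv.h and hv.cc and the function hv_count_words are made up for the illustration; the point is only that the header contains nothing but C declarations and the C++ file includes no backend headers.

```cpp
#include <cassert>
#include <sstream>
#include <string>

/* ---- hv.h (hypothetical): plain C declarations, included by both sides ---- */
#ifdef __cplusplus
extern "C" {
#endif

/* Returns the number of whitespace-separated words, or -1 on any failure. */
int hv_count_words(const char *text);

#ifdef __cplusplus
}
#endif

/* ---- hv.cc (hypothetical): C++ only, includes no backend headers ---- */
/* The definition picks up C linkage from the declaration above. */
int hv_count_words(const char *text) {
    if (text == nullptr) return -1;
    try {
        std::istringstream in(text);
        std::string word;
        int count = 0;
        while (in >> word) ++count;
        assert(in.eof());  // string extraction only stops at end of input
        return count;
    } catch (...) {
        return -1;  // nothing escapes into the C caller
    }
}
```

The C compilation unit that talks to the backend would include hv.h and call hv_count_words() like any other C function.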

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

#12 David Fetter
david@fetter.org
In reply to: Craig Ringer (#11)
Re: server-side extension in c++

On Tue, Jun 01, 2010 at 02:13:02PM +0800, Craig Ringer wrote:

On 01/06/10 11:05, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Personally I would reduce this section to
Don't.

Well, I would have avoided this mine-trap except we have this 9.0
release note item:
Allow use of <productname>C++</> functions in backend code (Kurt
Harriman, Peter Eisentraut)

I'd be interested to see a section like this written by someone
who'd actually done a nontrivial C++ extension and lived to tell
the tale.

I can't speak up there - my own C++/Pg backend stuff has been fairly
trivial, and has been where I can maintain a fairly clean separation
of the C++-exposed and the Pg-backend-exposed parts. I was able to
keep things separate enough that my C++ compilation units didn't
include the Pg backend headers; they just exposed a pure C public
interface. The Pg backend-using compilation units were written in C,
and talked to the C++ part over its exposed pure C interfaces.

This was very much pain-free, but I certainly wouldn't want to try
to use C++ code tightly intermixed with Pg backend-using code. It'd
be a nightmare.

These two paragraphs, suitably changed to be more like the rest of the
docs, would be a great start for people interested in using C++.

Would some short bits of sample code help?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#11)
Re: server-side extension in c++

Craig Ringer <craig@postnewspapers.com.au> writes:

On 01/06/10 11:05, Tom Lane wrote:

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.

I can't speak up there - my own C++/Pg backend stuff has been fairly
trivial, and has been where I can maintain a fairly clean separation of
the C++-exposed and the Pg-backend-exposed parts. I was able to keep
things separate enough that my C++ compilation units didn't include the
Pg backend headers; they just exposed a pure C public interface. The Pg
backend-using compilation units were written in C, and talked to the C++
part over its exposed pure C interfaces.

Yeah, if you can design your code so that C++ never has to call back
into the core backend, that eliminates a large chunk of the pain.
Should we be documenting design ideas like this one?

regards, tom lane

#14 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#13)
Re: server-side extension in c++

Tom Lane wrote:

Craig Ringer <craig@postnewspapers.com.au> writes:

On 01/06/10 11:05, Tom Lane wrote:

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.

I can't speak up there - my own C++/Pg backend stuff has been fairly
trivial, and has been where I can maintain a fairly clean separation of
the C++-exposed and the Pg-backend-exposed parts. I was able to keep
things separate enough that my C++ compilation units didn't include the
Pg backend headers; they just exposed a pure C public interface. The Pg
backend-using compilation units were written in C, and talked to the C++
part over its exposed pure C interfaces.

Yeah, if you can design your code so that C++ never has to call back
into the core backend, that eliminates a large chunk of the pain.
Should we be documenting design ideas like this one?

I have incorporated the new ideas into the C++ documentation section,
and removed the comment block in the attached patch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Attachments:

/pgpatches/cpp (text/x-diff, +28 -31)

#15 Craig Ringer
craig@2ndquadrant.com
In reply to: Bruce Momjian (#14)
Re: server-side extension in c++

On 02/06/10 09:23, Bruce Momjian wrote:

Tom Lane wrote:

Craig Ringer <craig@postnewspapers.com.au> writes:

On 01/06/10 11:05, Tom Lane wrote:

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.

I can't speak up there - my own C++/Pg backend stuff has been fairly
trivial, and has been where I can maintain a fairly clean separation of
the C++-exposed and the Pg-backend-exposed parts. I was able to keep
things separate enough that my C++ compilation units didn't include the
Pg backend headers; they just exposed a pure C public interface. The Pg
backend-using compilation units were written in C, and talked to the C++
part over its exposed pure C interfaces.

Yeah, if you can design your code so that C++ never has to call back
into the core backend, that eliminates a large chunk of the pain.
Should we be documenting design ideas like this one?

I have incorporated the new ideas into the C++ documentation section,
and removed the comment block in the attached patch.

If you're going to include that much, I'd still really want to warn
people about exception/error handling too. It's important. I made brief
mention of it before, but perhaps some more detail would help if people
really want to do this.

( BTW, all in all, I agree with Tom Lane - the best answer is "don't".
Sometimes you need to access functionality from C++ libraries, but
unless that's your reason I wouldn't ever consider doing it. )

Here's a rough outline of the rules I follow when mixing C/C++ code,
plus some info on the longjmp error handling related complexities added
by Pg:

Letting an exception thrown from C++ code cross into C code will be
EXTREMELY ugly. The C++-to-C boundaries *must* have unconditional catch
blocks to convert thrown exceptions into appropriate error codes, even
if the C++ code in question never knowingly throws an exception. C++ may
throw std::bad_alloc on failure of operator new(), among other things,
so the user must _always_ have an unconditional catch. Letting an
exception propagate out to the C-based Pg backend is rather likely to
result in a backend crash.

If the C++ libraries you are using will put up with it, compile your C++
code with -fno-exceptions to make your life much, much easier, as you
can avoid worrying about this entirely. OTOH, you must then check for
NULL return from operator new().
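
In standard C++, plain operator new reports failure by throwing std::bad_alloc; the std::nothrow form is the one documented to return a null pointer. The sketch below uses that form and makes no assumption about how a particular toolchain behaves under -fno-exceptions. The names hv_copy_string and hv_free_string are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Returns a heap copy of src, or a null pointer if src is null or memory
// is exhausted. The nothrow form of new returns null instead of throwing.
extern "C" char *hv_copy_string(const char *src) {
    if (src == nullptr) return nullptr;
    std::size_t size = std::strlen(src) + 1;  // include the terminator
    char *copy = new (std::nothrow) char[size];
    if (copy == nullptr) return nullptr;      // allocation failed
    std::memcpy(copy, src, size);
    assert(copy[size - 1] == '\0');
    return copy;
}

// Release with the form that matches the allocation above.
extern "C" void hv_free_string(char *copy) {
    delete[] copy;
}
```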

If you can't do that: My usual rule is that any extern "C" function
*must* have an unconditional catch. I also require that any function
that may be passed as a function pointer to C code must be extern "C"
and thus must obey the previous rule, so that covers function pointers
and dlopen()ed access to functions.

Similarly, calling Pg code that may use Pg's error handling from within
C++ is unsafe. It should be OK if you know for absolute certain that the
C++ call tree in question only has plain-old-data (POD) structs and
simple variables on the stack, but even then it requires caution. C++
code that uses Pg calls can't do anything it couldn't do if you were
using 'goto' and labels in each involved function, but additionally has
to worry about returning and passing non-POD objects between functions
in a call chain by value, as a longjmp may result in dtors not being
properly called.

The best way to get around this issue is not to call into the Pg backend
from C++ code at all, instead encapsulating your C++ functionality into
cleanly separated modules with pure C interfaces. If you don't #include
any Pg backend headers into any compilation units compiled with the C++
compiler, that should do the trick.

If you must mix Pg calls and C++, restrict your C++ objects to the heap
(i.e. use pointers to them, managed with new and delete) and limit your
stack to POD variables (simple structs and built-in types). Note that
this means you can't use std::auto_ptr, std::tr1::shared_ptr, RAII lock
management, etc. in C++ code that may call into the Pg backend.
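
A rough sketch of the "only POD on the stack" rule: all non-POD work happens inside a try block that has finished before the call that might longjmp is made. Here report_fn is a hypothetical stand-in for a backend routine that may not return, and record_report is a harmless substitute used only for illustration; neither is a PostgreSQL API.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

// Stand-in for a backend call that might longjmp instead of returning.
typedef void (*report_fn)(const char *message);

// Illustration-only reporter that just remembers the last message.
static char last_report[128] = "";
static void record_report(const char *message) {
    std::snprintf(last_report, sizeof last_report, "%s", message);
}

extern "C" int hv_validate_name(const char *name, report_fn report) {
    assert(report != nullptr);
    char message[128] = "";  // POD: safe to abandon if report() never returns
    bool ok = false;
    try {
        std::string s(name != nullptr ? name : "");
        if (s.empty()) throw std::invalid_argument("variable name is empty");
        if (s.size() > 63) throw std::length_error("variable name is too long");
        ok = true;
    } catch (const std::exception &e) {
        std::snprintf(message, sizeof message, "%s", e.what());
    } catch (...) {
        std::snprintf(message, sizeof message, "unknown C++ error");
    }
    // The std::string and any exception object are already destroyed here,
    // so only POD variables are live when the reporting call is made.
    if (!ok) {
        report(message);
        return -1;
    }
    return 0;
}
```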

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

#16 Bruce Momjian
bruce@momjian.us
In reply to: Craig Ringer (#15)
Re: server-side extension in c++

Craig Ringer wrote:

( BTW, all in all, I agree with Tom Lane - the best answer is "don't".
Sometimes you need to access functionality from C++ libraries, but
unless that's your reason I wouldn't ever consider doing it. )

Here's a rough outline of the rules I follow when mixing C/C++ code,
plus some info on the longjmp error handling related complexities added
by Pg:

This was very helpful. I have condensed your ideas into the attached
patch that contains the potential C++ documentation section.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

Attachments:

/pgpatches/cpp (text/plain, +46 -44)

#17 Peter Geoghegan
In reply to: Craig Ringer (#15)
Re: server-side extension in c++

Letting an exception thrown from C++ code cross into C code will be
EXTREMELY ugly. The C++-to-C boundaries *must* have unconditional catch
blocks to convert thrown exceptions into appropriate error codes, even
if the C++ code in question never knowingly throws an exception. C++ may
throw std::bad_alloc on failure of operator new(), among other things,
so the user must _always_ have an unconditional catch. Letting an
exception propagate out to the C-based Pg backend is rather likely to
result in a backend crash.

Right, but I don't think that this differs from the general C++ case.
Allowing exceptions to propagate across module boundaries was always a
bad idea, as was managing memory across module boundaries. Aside from
being very messy, they had to have exactly compatible runtimes and so
on.

If the C++ libraries you are using will put up with it, compile your C++
code with -fno-exceptions to make your life much, much easier, as you
can avoid worrying about this entirely. OTOH, you must then check for
NULL return from operator new().

That's the pre-standard behaviour of operator new(), and operator
new() continues to behave that way on some platforms, typically
embedded systems.

If you can't do that: My usual rule is that any extern "C" function
*must* have an unconditional catch. I also require that any function
that may be passed as a function pointer to C code must be extern "C"
and thus must obey the previous rule, so that covers function pointers
and dlopen()ed access to functions.

Seems reasonable, and not overly difficult.

Similarly, calling Pg code that may use Pg's error handling from within
C++ is unsafe. It should be OK if you know for absolute certain that the
C++ call tree in question only has plain-old-data (POD) structs and
simple variables on the stack, but even then it requires caution. C++
code that uses Pg calls can't do anything it couldn't do if you were
using 'goto' and labels in each involved function, but additionally has
to worry about returning and passing non-POD objects between functions
in a call chain by value, as a longjmp may result in dtors not being
properly called.

Really? That seems like an *incredibly* arduous requirement.
Intuitively, I find it difficult to believe. After all, even though
using longjmp in C++ code is a fast track to undefined behaviour, I
would have imagined that doing so in an isolated C module with a well
defined interface, called from C++ would be safe. I would have
imagined that ultimately, the call to the Pg C function must return,
and therefore cannot affect stack unwinding within the C++ part of the
program.

To invoke a reductio ad absurdum argument, if this were the case,
calling C functions from C++ would be widely considered a dangerous
thing to do, which it is not. After all, setjmp()/longjmp() are part
of the C standard library...in general, it's difficult to know whether
or not a third party module may use them. Have you ever seen a C
library marked as C++ safe or C++ unsafe? No, me neither.

Perhaps I'm missing something though...does the error handling portion
of the pg code potentially need a hook into the C++ code, from where
the longjmp() must be performed? I don't know what you mean by "a call
chain by value".

The bottom line is that *I think* you're fine as long as you don't do
setjmp()/longjmp() from within C++. You may even be okay if you just
do setjmp() from within C++. The longjmp() will hopefully only affect
stack unwinding before we get down to the C++ part of the stack, where
that matters.

--
Regards,
Peter Geoghegan

#18 Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Geoghegan (#17)
Re: server-side extension in c++

On 02/06/10 19:17, Peter Geoghegan wrote:

Similarly, calling Pg code that may use Pg's error handling from within
C++ is unsafe. It should be OK if you know for absolute certain that the
C++ call tree in question only has plain-old-data (POD) structs and
simple variables on the stack, but even then it requires caution. C++
code that uses Pg calls can't do anything it couldn't do if you were
using 'goto' and labels in each involved function, but additionally has
to worry about returning and passing non-POD objects between functions
in a call chain by value, as a longjmp may result in dtors not being
properly called.

Really? That seems like an *incredibly* arduous requirement.
Intuitively, I find it difficult to believe. After all, even though
using longjmp in C++ code is a fast track to undefined behaviour, I
would have imagined that doing so in an isolated C module with a well
defined interface, called from C++ would be safe.

Not necessarily. It's only safe if setjmp/longjmp calls occur only
within the C code without "breaking" call paths involving C++.

This is ok:

[ C ]
entrypoint()
callIntoCppCode()
[ C++ ]
someCalls()
callIntoCCode()
[ C ]
setjmp()
doSomeStuff()
longjmp()

This is really, really not:

[ C ]
entrypoint()
setjmp() <----
callIntoCppCode()
[ C++ ]
someCalls()
callIntoCCode()
[ C ]
doSomeStuff()
longjmp()

See the attached demo (pop all files in the same directory then run "make").
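
Separately from the attached demo, the first ("ok") layout can be sketched in a single file. The setjmp and the longjmp both sit below the C++ caller, so the jump never crosses a frame that holds a non-POD object. All names here are hypothetical.

```cpp
#include <cassert>
#include <csetjmp>
#include <string>

// Recovery point used only inside the C-style code below.
static std::jmp_buf recover_point;

// C-style helper: bails out with longjmp on a zero divisor.
static int checked_divide(int a, int b) {
    if (b == 0) std::longjmp(recover_point, 1);
    assert(b != 0);
    return a / b;
}

// C-style entry point: the setjmp lives here, so every longjmp lands back
// in this function and it always returns normally to its caller.
extern "C" int c_divide(int a, int b, int *out) {
    if (setjmp(recover_point) != 0) return -1;  // arrived via longjmp
    *out = checked_divide(a, b);
    return 0;
}

// C++ caller: its std::string stays alive across the call, which is safe
// because the jump above never unwinds past this frame.
std::string divide_to_text(int a, int b) {
    std::string text = "error";
    int result = 0;
    if (c_divide(a, b, &result) == 0) text = std::to_string(result);
    return text;
}
```

The second layout would place the setjmp above divide_to_text and the longjmp below it, which skips the std::string destructor; that case is deliberately not shown as runnable code because its behaviour is undefined in C++.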

I would have
imagined that ultimately, the call to the Pg C function must return,
and therefore cannot affect stack unwinding within the C++ part of the
program.

That's the whole point; a longjmp breaks the call chain, and the
guarantee that eventually the stack will unwind as functions return.

It's OK if you setjmp(a), do some work, setjmp(b), longjmp(a), do some
work, longjmp(b), return.

My understanding, which is likely imperfect, is that Pg's error handling
does NOT guarantee that, ie it's quite possible that a function may call
longjmp() without preparing any jmp_env to "jump back to" and therefore
will never return.

To invoke a reductio ad absurdum argument, if this were the case,
calling C functions from C++ would be widely considered a dangerous
thing to do, which it is not.

If those C functions use setjmp/longjmp, it *is* a dangerous thing to
do. Most libraries that use setjmp/longjmp in ways that may affect
calling code DO document this, and it's expected that the user of the
library will know what that entails.

If the library uses setjmp/longjmp entirely internally, so that it never

http://stackoverflow.com/questions/1376085/c-safe-to-use-longjmp-and-setjmp

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

Attachments:

Makefile (text/plain)
cppmodule.cpp (text/x-c++src)
main.c (text/x-csrc)

#19 David Fetter
david@fetter.org
In reply to: Craig Ringer (#15)
Re: server-side extension in c++

On Wed, Jun 02, 2010 at 10:11:37AM +0800, Craig Ringer wrote:

On 02/06/10 09:23, Bruce Momjian wrote:

Tom Lane wrote:

Craig Ringer <craig@postnewspapers.com.au> writes:

On 01/06/10 11:05, Tom Lane wrote:

I'd be interested to see a section like this written by someone who'd
actually done a nontrivial C++ extension and lived to tell the tale.

I can't speak up there - my own C++/Pg backend stuff has been fairly
trivial, and has been where I can maintain a fairly clean separation of
the C++-exposed and the Pg-backend-exposed parts. I was able to keep
things separate enough that my C++ compilation units didn't include the
Pg backend headers; they just exposed a pure C public interface. The Pg
backend-using compilation units were written in C, and talked to the C++
part over its exposed pure C interfaces.

Yeah, if you can design your code so that C++ never has to call back
into the core backend, that eliminates a large chunk of the pain.
Should we be documenting design ideas like this one?

I have incorporated the new ideas into the C++ documentation section,
and removed the comment block in the attached patch.

If you're going to include that much, I'd still really want to warn
people about exception/error handling too. It's important. I made brief
mention of it before, but perhaps some more detail would help if people
really want to do this.

( BTW, all in all, I agree with Tom Lane - the best answer is "don't".
Sometimes you need to access functionality from C++ libraries, but
unless that's your reason I wouldn't ever consider doing it. )

Here's a rough outline of the rules I follow when mixing C/C++ code,
plus some info on the longjmp error handling related complexities added
by Pg:

Letting an exception thrown from C++ code cross into C code will be
EXTREMELY ugly. The C++-to-C boundaries *must* have unconditional catch
blocks to convert thrown exceptions into appropriate error codes, even
if the C++ code in question never knowingly throws an exception. C++ may
throw std::bad_alloc on failure of operator new(), among other things,
so the user must _always_ have an unconditional catch. Letting an
exception propagate out to the C-based Pg backend is rather likely to
result in a backend crash.

If the C++ libraries you are using will put up with it, compile your C++
code with -fno-exceptions to make your life much, much easier, as you
can avoid worrying about this entirely. OTOH, you must then check for
NULL return from operator new().

If you can't do that: My usual rule is that any extern "C" function
*must* have an unconditional catch. I also require that any function
that may be passed as a function pointer to C code must be extern "C"
and thus must obey the previous rule, so that covers function pointers
and dlopen()ed access to functions.

Similarly, calling Pg code that may use Pg's error handling from within
C++ is unsafe. It should be OK if you know for absolute certain that the
C++ call tree in question only has plain-old-data (POD) structs and
simple variables on the stack, but even then it requires caution. C++
code that uses Pg calls can't do anything it couldn't do if you were
using 'goto' and labels in each involved function, but additionally has
to worry about returning and passing non-POD objects between functions
in a call chain by value, as a longjmp may result in dtors not being
properly called.

The best way to get around this issue is not to call into the Pg backend
from C++ code at all, instead encapsulating your C++ functionality into
cleanly separated modules with pure C interfaces. If you don't #include
any Pg backend headers into any compilation units compiled with the C++
compiler, that should do the trick.

If you must mix Pg calls and C++, restrict your C++ objects to the heap
(i.e. use pointers to them, managed with new and delete) and limit your
stack to POD variables (simple structs and built-in types). Note that
this means you can't use std::auto_ptr, std::tr1::shared_ptr, RAII lock
management, etc. in C++ code that may call into the Pg backend.

Is PostGIS following these guidelines?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#20 Bruce Momjian
bruce@momjian.us
In reply to: Craig Ringer (#18)
Re: server-side extension in c++

Craig Ringer wrote:

See the attached demo (pop all files in the same directory then run "make").

I would have
imagined that ultimately, the call to the Pg C function must return,
and therefore cannot affect stack unwinding within the C++ part of the
program.

That's the whole point; a longjmp breaks the call chain, and the
guarantee that eventually the stack will unwind as functions return.

It's OK if you setjmp(a), do some work, setjmp(b), longjmp(a), do some
work, longjmp(b), return.

My understanding, which is likely imperfect, is that Pg's error handling
does NOT guarantee that, ie it's quite possible that a function may call
longjmp() without preparing any jmp_env to "jump back to" and therefore
will never return.

You are correct that a longjmp() jumps back to the query entry loop,
hopping over any user-defined C or C++ functions in the call stack, and
you are right that if we were just using longjmp() without unwinding
C++ calls, we would be OK using non-POD structures.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

#21 Peter Geoghegan
In reply to: Craig Ringer (#18)
#22 Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Geoghegan (#21)
#23 Mark Cave-Ayland
mark.cave-ayland@siriusit.co.uk
In reply to: David Fetter (#19)
#24 David Fetter
david@fetter.org
In reply to: Mark Cave-Ayland (#23)
#25 Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#21)
In reply to: Craig Ringer (#22)
In reply to: David Fetter (#24)
#28 Mark Cave-Ayland
mark.cave-ayland@siriusit.co.uk
In reply to: David Fetter (#24)
#29 Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#25)