libxml incompatibility

Started by Alvaro Herrera, almost 17 years ago, 16 messages
#1Alvaro Herrera
alvherre@commandprompt.com

Hi,

It seems that if you load libxml into a backend for whatever reason (say
you create a table with a column of type xml) and then create a plperlu
function that "use XML::LibXML", we get a segmentation fault.

This sequence reproduces the problem for me in 8.3:

create table xmlcrash (a xml);
insert into xmlcrash values ('<a />');
create function xmlcrash() returns void language plperlu as $$ use XML::LibXML; $$;

The problem is reported as

TRAP: BadArgument(«!(((context) != ((void *)0) && (((((Node*)((context)))->type) == T_AllocSetContext))))», Archivo: «/pgsql/source/83_rel/src/backend/utils/mmgr/mcxt.c», Línea: 507)

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#2Kenneth Marshall
ktm@rice.edu
In reply to: Alvaro Herrera (#1)
Re: libxml incompatibility

This looks like a problem caused by two different libxml versions:
the one used for the perl XML::LibXML wrappers and the one used to
build PostgreSQL. They really need to be the same. Does it still
segfault if they are identical?

Regards,
Ken


#3Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#1)
Re: libxml incompatibility

Alvaro Herrera wrote:

Hi,

It seems that if you load libxml into a backend for whatever reason (say
you create a table with a column of type xml) and then create a plperlu
function that "use XML::LibXML", we get a segmentation fault.

Yes, I discovered this a few weeks ago. It looks like libxml is not
reentrant, so for perl you need to use some other XML library. Very
annoying.

cheers

andrew

#4Alvaro Herrera
alvherre@commandprompt.com
In reply to: Kenneth Marshall (#2)
Re: libxml incompatibility

Kenneth Marshall wrote:

This looks like a problem caused by two different libxml versions:
the one used for the perl XML::LibXML wrappers and the one used to
build PostgreSQL. They really need to be the same. Does it still
segfault if they are identical?

Unlikely, because AFAICT there's a single libxml installed on my system.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Kenneth Marshall
ktm@rice.edu
In reply to: Andrew Dunstan (#3)
Re: libxml incompatibility

On Fri, Mar 06, 2009 at 02:58:30PM -0500, Andrew Dunstan wrote:

Alvaro Herrera wrote:

Hi,

It seems that if you load libxml into a backend for whatever reason (say
you create a table with a column of type xml) and then create a plperlu
function that "use XML::LibXML", we get a segmentation fault.

Yes, I discovered this a few weeks ago. It looks like libxml is not
reentrant, so for perl you need to use some other XML library. Very
annoying.

cheers

andrew

Ugh! That is worse than a simple library link incompatibility.

Ken

#6Alvaro Herrera
alvherre@commandprompt.com
In reply to: Alvaro Herrera (#1)
Re: libxml incompatibility

Kenneth Marshall wrote:

On Fri, Mar 06, 2009 at 05:23:45PM -0300, Alvaro Herrera wrote:

Kenneth Marshall wrote:

This looks like a problem caused by two different libxml versions:
the one used for the perl XML::LibXML wrappers and the one used to
build PostgreSQL. They really need to be the same. Does it still
segfault if they are identical?

Unlikely, because AFAICT there's a single libxml installed on my system.

Yes, I saw Andrew's comment and I have had that problem myself with
Apache/PHP and perl with libxml. A simple library mismatch would at
least be easy to resolve. :)

Agreed :-(

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#7Kenneth Marshall
ktm@rice.edu
In reply to: Alvaro Herrera (#4)
Re: libxml incompatibility

On Fri, Mar 06, 2009 at 05:23:45PM -0300, Alvaro Herrera wrote:

Kenneth Marshall wrote:

This looks like a problem caused by two different libxml versions:
the one used for the perl XML::LibXML wrappers and the one used to
build PostgreSQL. They really need to be the same. Does it still
segfault if they are identical?

Unlikely, because AFAICT there's a single libxml installed on my system.

Yes, I saw Andrew's comment and I have had that problem myself with
Apache/PHP and perl with libxml. A simple library mismatch would at
least be easy to resolve. :)

Regards,
Ken

#8Holger Hoffstaette
holger@wizards.de
In reply to: Alvaro Herrera (#1)
Re: libxml incompatibility

On Fri, 06 Mar 2009 14:32:25 -0600, Kenneth Marshall wrote:

On Fri, Mar 06, 2009 at 02:58:30PM -0500, Andrew Dunstan wrote:

Yes, I discovered this a few weeks ago. It looks like libxml is not
reentrant, so for perl you need to use some other XML library. Very
annoying.

Ugh! That is worse than a simple library link incompatibility.

http://www.nabble.com/New-libxml-which-is-reentrant---to18329452.html

Seems to me that Perl (?) is calling functions it is not supposed to call
- I'm guessing due to assumptions about mismatching lifecycles. The
parsing functions themselves are supposedly reentrant.

-h

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Holger Hoffstaette (#8)
Re: libxml incompatibility

Holger Hoffstaette wrote:

On Fri, 06 Mar 2009 14:32:25 -0600, Kenneth Marshall wrote:

On Fri, Mar 06, 2009 at 02:58:30PM -0500, Andrew Dunstan wrote:

Yes, I discovered this a few weeks ago. It looks like libxml is not
reentrant, so for perl you need to use some other XML library. Very
annoying.

Ugh! That is worse than a simple library link incompatibility.

http://www.nabble.com/New-libxml-which-is-reentrant---to18329452.html

Seems to me that Perl (?) is calling functions it is not supposed to call
- I'm guessing due to assumptions about mismatching lifecycles. The
parsing functions themselves are supposedly reentrant.

Maybe someone can trace the libxml calls ... not sure how exactly ...
given Alvaro's example, it doesn't seem likely to me that this is due to
a call to xmlCleanupParser(), but maybe the perl code invoked by simply
doing "use XML::LibXML;" calls that for some perverse reason.

My interest wasn't so high that I wanted to spend a lot of time on it.
If it didn't work I was just going to move on.

cheers

andrew

#10Alvaro Herrera
alvherre@commandprompt.com
In reply to: Andrew Dunstan (#9)
Re: libxml incompatibility

Andrew Dunstan wrote:

Holger Hoffstaette wrote:

http://www.nabble.com/New-libxml-which-is-reentrant---to18329452.html

Seems to me that Perl (?) is calling functions it is not supposed to call
- I'm guessing due to assumptions about mismatching lifecycles. The
parsing functions themselves are supposedly reentrant.

Maybe someone can trace the libxml calls ... not sure how exactly ...
given Alvaro's example, it doesn't seem likely to me that this is due to
a call to xmlCleanupParser(), but maybe the perl code invoked by simply
doing "use XML::LibXML;" calls that for some perverse reason.

Something that came to my mind was that maybe the change of memory
management (to make it use palloc) could be confusing libxml somehow.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#10)
Re: libxml incompatibility

Alvaro Herrera wrote:

Andrew Dunstan wrote:

Holger Hoffstaette wrote:

http://www.nabble.com/New-libxml-which-is-reentrant---to18329452.html

Seems to me that Perl (?) is calling functions it is not supposed to call
- I'm guessing due to assumptions about mismatching lifecycles. The
parsing functions themselves are supposedly reentrant.

Maybe someone can trace the libxml calls ... not sure how exactly ...
given Alvaro's example, it doesn't seem likely to me that this is due to
a call to xmlCleanupParser(), but maybe the perl code invoked by simply
doing "use XML::LibXML;" calls that for some perverse reason.

Something that came to my mind was that maybe the change of memory
management (to make it use palloc) could be confusing libxml somehow.

Seems very possible. But what would perl be doing just as a result of
loading the module, not even doing anything, that would cause a segfault
because of that?

cheers

andrew

#12David Lee Lambert
davidl@lmert.com
In reply to: Alvaro Herrera (#1)
Re: libxml incompatibility

On 6 mar, 22:44, and...@dunslane.net (Andrew Dunstan) wrote:

Holger Hoffstaette wrote:

On Fri, 06 Mar 2009 14:32:25 -0600, Kenneth Marshall wrote:

On Fri, Mar 06, 2009 at 02:58:30PM -0500, Andrew Dunstan wrote:

Yes, I discovered this a few weeks ago. [...]

Maybe someone can trace the libxml calls ... not sure how exactly ...
given Alvaro's example, it doesn't seem likely to me that this is due to
a call to xmlCleanupParser(), but maybe the perl code invoked by simply
doing "use XML::LibXML;" calls that for some perverse reason.

I'm able to duplicate this on Postgres 8.4 (Debian Etch, XML::LibXML
from CPAN). Here's the backtrace from the crash:

#0 0x082f3cf1 in MemoryContextAlloc ()
#1 0x082c3f8a in xml_palloc ()
#2 0xb7dfa548 in xmlInitCharEncodingHandlers () from /usr/lib/libxml2.so.2
#3 0xb7e0195e in xmlInitParser () from /usr/lib/libxml2.so.2
#4 0xb7dff2ef in xmlCheckVersion () from /usr/lib/libxml2.so.2
#5 0xb573af2e in boot_XML__LibXML () from /usr/local/lib/perl/5.8.8/auto/XML/LibXML/LibXML.so
#6 0xb587981b in Perl_pp_entersub () from /usr/lib/libperl.so.5.8
#7 0xb5877f19 in Perl_runops_standard () from /usr/lib/libperl.so.5.8
#8 0xb5819b6e in Perl_magicname () from /usr/lib/libperl.so.5.8
#9 0xb581a844 in Perl_call_sv () from /usr/lib/libperl.so.5.8
...

Is it supposed to be OK to call xmlCheckVersion() more than once?

--
DLL
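
The backtrace appears consistent with the earlier guess about the palloc-based memory handlers: libxml2's initialization, triggered from the Perl module's boot code, allocates through xml_palloc, which in turn calls MemoryContextAlloc with a context that fails the validity check. A minimal Python model of that failure mode follows. It assumes only what the trace shows (a process-wide allocation hook that outlives the context it allocates from); every class and function name in it is hypothetical, and it is not PostgreSQL or libxml2 code.

```python
# Toy model only: hypothetical names, not PostgreSQL or libxml2 code.


class MemoryContext:
    """Stand-in for a backend memory context that can be destroyed."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.chunks = []

    def alloc(self, size):
        if not self.alive:
            # Plays the role of the failed context check in the TRAP message.
            raise RuntimeError(f"allocation in dead context {self.name!r}")
        chunk = bytearray(size)
        self.chunks.append(chunk)
        return chunk

    def destroy(self):
        self.chunks.clear()
        self.alive = False


class ToyLibrary:
    """Stand-in for a shared library with one process-wide malloc hook."""

    def __init__(self):
        self.malloc_hook = bytearray  # the library's "native" allocator
        self.tables = None

    def set_malloc(self, hook):
        self.malloc_hook = hook

    def init_parser(self):
        # Initialization allocates through whatever hook is installed now.
        self.tables = self.malloc_hook(64)

    def cleanup(self):
        self.tables = None


def reproduce():
    lib = ToyLibrary()
    ctx = MemoryContext("LibxmlContext")

    # First client (the backend's xml code in this model) installs its hook.
    lib.set_malloc(ctx.alloc)
    lib.init_parser()

    # The context goes away, but the hook stays installed in the library.
    lib.cleanup()
    ctx.destroy()

    # Second client (the Perl module in this model) initializes the library.
    try:
        lib.init_parser()
    except RuntimeError as exc:
        return str(exc)
    return "no error"
```

In this model the second `init_parser()` call fails because the hook, not the caller, decides where memory comes from. Whether a second initialization is allowed at all is a separate question.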

#13Andrew Dunstan
andrew@dunslane.net
In reply to: David Lee Lambert (#12)
Re: libxml incompatibility

David Lee Lambert wrote:

On 6 mar, 22:44, and...@dunslane.net (Andrew Dunstan) wrote:

Holger Hoffstaette wrote:

On Fri, 06 Mar 2009 14:32:25 -0600, Kenneth Marshall wrote:

On Fri, Mar 06, 2009 at 02:58:30PM -0500, Andrew Dunstan wrote:

Yes, I discovered this a few weeks ago. [...]

Maybe someone can trace the libxml calls ... not sure how exactly ...
given Alvaro's example, it doesn't seem likely to me that this is due to
a call to xmlCleanupParser(), but maybe the perl code invoked by simply
doing "use XML::LibXML;" calls that for some perverse reason.

I'm able to duplicate this on Postgres 8.4 (Debian Etch, XML::LibXML
from CPAN). Here's the backtrace from the crash:

#0 0x082f3cf1 in MemoryContextAlloc ()
#1 0x082c3f8a in xml_palloc ()
#2 0xb7dfa548 in xmlInitCharEncodingHandlers () from /usr/lib/libxml2.so.2
#3 0xb7e0195e in xmlInitParser () from /usr/lib/libxml2.so.2
#4 0xb7dff2ef in xmlCheckVersion () from /usr/lib/libxml2.so.2
#5 0xb573af2e in boot_XML__LibXML () from /usr/local/lib/perl/5.8.8/auto/XML/LibXML/LibXML.so
#6 0xb587981b in Perl_pp_entersub () from /usr/lib/libperl.so.5.8
#7 0xb5877f19 in Perl_runops_standard () from /usr/lib/libperl.so.5.8
#8 0xb5819b6e in Perl_magicname () from /usr/lib/libperl.so.5.8
#9 0xb581a844 in Perl_call_sv () from /usr/lib/libperl.so.5.8
...

Is it supposed to be OK to call xmlCheckVersion() more than once?

You are certainly not supposed to call xmlInitParser more than once -
see <http://xmlsoft.org/html/libxml-parser.html#xmlInitParser>

Since this is being called by xmlCheckVersion(), that looks like a bug
in libxml2.

Even if this were fixed, however, I'm still not convinced that we'll be
able to call libxml2 from perl after we've installed our memory handler
(xml_palloc).

cheers

andrew

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#13)
Re: libxml incompatibility

Andrew Dunstan <andrew@dunslane.net> writes:

David Lee Lambert wrote:

Is it supposed to be OK to call xmlCheckVersion() more than once?

You are certainly not supposed to call xmlInitParser more than once -
see <http://xmlsoft.org/html/libxml-parser.html#xmlInitParser>

No, what that says is that it can't be called concurrently by more
than one thread. If there were such a restriction then our own code
wouldn't work at all, because we call it every time through xml_parse()
or xpath().

Even if this were fixed, however, I'm still not convinced that we'll be
able to call libxml2 from perl after we've installed our memory handler
(xml_palloc).

Yeah, I'm wondering about that too. It certainly wouldn't have the
behavior that perl is expecting.

We could possibly use xmlMemGet() to fetch the prior settings and then
restore them after we are done, but making sure that happens after an
error would be a bit tricky.

regards, tom lane
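
The save-and-restore idea, including the tricky error path, can be sketched in Python as follows. This is a toy model under stated assumptions: `ToyLibrary`, `mem_get`, `mem_setup` and `borrowed_allocator` are hypothetical names that only mimic the shape of fetching the prior handlers and putting them back, and `try`/`finally` stands in for whatever error-recovery mechanism the backend would actually need.

```python
# Toy model only: hypothetical names, not the libxml2 API.
from contextlib import contextmanager


class ToyLibrary:
    """Stand-in for a library with a single process-wide malloc hook."""

    def __init__(self):
        self._malloc = bytearray  # the library's "native" allocator

    def mem_get(self):
        return self._malloc

    def mem_setup(self, malloc):
        self._malloc = malloc

    def parse(self, text):
        buf = self._malloc(len(text))  # allocate via the current hook
        if "<" not in text:
            raise ValueError("not XML")
        return buf


@contextmanager
def borrowed_allocator(lib, malloc):
    """Install `malloc` for the duration of a block, then restore the old hook."""
    saved = lib.mem_get()
    lib.mem_setup(malloc)
    try:
        yield
    finally:
        # Runs on normal exit and on error, so the prior hook always returns.
        lib.mem_setup(saved)


def demo():
    lib = ToyLibrary()
    native = lib.mem_get()
    sizes = []

    def tracking_malloc(size):
        sizes.append(size)
        return bytearray(size)

    try:
        with borrowed_allocator(lib, tracking_malloc):
            lib.parse("oops")  # raises after allocating through our hook
    except ValueError:
        pass

    # True if the native allocator is back in place despite the error.
    return lib.mem_get() is native, sizes
```

The point of the sketch is the `finally` clause: without an equivalent that fires during error recovery, a failed parse would leave the borrowed hook installed for the next caller.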

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#14)
Re: libxml incompatibility

I wrote:

We could possibly use xmlMemGet() to fetch the prior settings and then
restore them after we are done, but making sure that happens after an
error would be a bit tricky.

I experimented with this a bit, and came up with the attached patch.
Basically what it does is revert libxml to its native memory management
methods anytime LibxmlContext doesn't exist. It fixes Alvaro's original
test case and some variants that I stumbled across, but I can't say that
I have a lot of faith in it. I see at least a couple of risk factors:

* it doesn't scale to the case where some other code is doing the same
kind of thing --- the pointers we saved during xml_init might or might
not still be appropriate to restore at end of transaction.

* suppose that a plperl function does some Perlish XML stuff, then calls
a SQL function that calls something in xml.c. When we start up use of
LibxmlContext we'll wipe the internal state of libxml (which we *have*
to do; this still crashes trivially without the added xmlCleanupParser
call). Can this break anything that the perl XML code is expecting to
still be valid when control gets back to it?

If this doesn't work then I'm afraid we'll need some radical rethinking
of the way we handle libxml memory management...

Please test. I'm not much with either Perl or XML and have little
idea of how to stress this.

regards, tom lane
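
The first risk factor above can be illustrated with a small Python model. It assumes two independent clients that each save the current hook when they start and restore their saved value when they finish, without coordinating. The names `ToyLibrary`, `Client`, `begin` and `end` are hypothetical, and the model says nothing about what the attached patch actually does.

```python
# Toy model only: hypothetical names, not PostgreSQL or libxml2 code.


class ToyLibrary:
    """Stand-in for a library with one process-wide allocation hook."""

    def __init__(self):
        self.hook = "native"


class Client:
    """A user of the library that saves the hook on entry, restores on exit."""

    def __init__(self, name, lib):
        self.name = name
        self.lib = lib
        self.saved = None

    def begin(self):
        self.saved = self.lib.hook  # remember whatever was installed
        self.lib.hook = self.name   # install our own hook

    def end(self):
        self.lib.hook = self.saved  # restore what we saw at begin()


def interleave():
    lib = ToyLibrary()
    backend = Client("backend", lib)
    other = Client("other", lib)
    trace = []

    backend.begin()
    trace.append(lib.hook)  # backend's hook is active
    other.begin()
    trace.append(lib.hook)  # other's hook is active, having saved "backend"
    backend.end()
    trace.append(lib.hook)  # backend restores "native" while other is still active
    other.end()
    trace.append(lib.hook)  # other restores the stale "backend" hook
    return trace
```

With non-nested lifetimes, the final state in this model is the backend's hook rather than the native one, even though both clients have finished. Saved pointers are only safe to restore if every client starts and finishes in strict last-in, first-out order.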

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#1)
Re: libxml incompatibility

Alvaro Herrera <alvherre@commandprompt.com> writes:

It seems that if you load libxml into a backend for whatever reason (say
you create a table with a column of type xml) and then create a plperlu
function that "use XML::LibXML", we get a segmentation fault.

I've applied a patch for this in HEAD. It fixes the reported case,
but since I'm not a big user of either Perl or XML, it would be good
to get some more testing done ...

regards, tom lane