libxml2 author overwhelmed with security requests

Started by Bruce Momjian, 7 months ago, 11 messages
#1 Bruce Momjian
bruce@momjian.us

This blog post explains the serious problems the single libxml2 author
is having in maintaining the library:

https://socket.dev/blog/libxml2-maintainer-ends-embargoed-vulnerability-reports

There are a few learnings from this:

* libxml2 is even less production-ready than we thought
* many projects don't have the resources we do

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#2 Álvaro Herrera
alvherre@kurilemu.de
In reply to: Bruce Momjian (#1)
Re: libxml2 author overwhelmed with security requests

On 2025-Jun-18, Bruce Momjian wrote:

This blog post explains the serious problems the single libxml2 author
is having in maintaining the library:

https://socket.dev/blog/libxml2-maintainer-ends-embargoed-vulnerability-reports

There are a few learnings from this:

* libxml2 is even less production-ready than we thought
* many projects don't have the resources we do

Maybe some of the companies doing business with Postgres can chime in to
let Nick Wellnhofer (the aforementioned maintainer) spend more time on
libxml2 maintenance:
https://opencollective.com/libxml2

Currently, looking at the OpenCollective reports, it seems USD 50 comes
in monthly from Airbnb for libxml2's Wellnhofer. That's unlikely to pay
very many bills.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Once again, thank you and all of the developers for your hard work on
PostgreSQL. This is by far the most pleasant management experience of
any database I've worked on." (Dan Harris)
http://archives.postgresql.org/pgsql-performance/2006-04/msg00247.php

#3 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Álvaro Herrera (#2)
Re: libxml2 author overwhelmed with security requests

On Thu, Jun 19, 2025 at 11:00, Álvaro Herrera <alvherre@kurilemu.de>
wrote:

On 2025-Jun-18, Bruce Momjian wrote:

This blog post explains the serious problems the single libxml2 author
is having in maintaining the library:

https://socket.dev/blog/libxml2-maintainer-ends-embargoed-vulnerability-reports

There are a few learnings from this:

* libxml2 is even less production-ready than we thought
* many projects don't have the resources we do

Maybe some of the companies doing business with Postgres can chime in to
let Nick Wellnhofer (the aforementioned maintainer) spend more time on
libxml2 maintenance:
https://opencollective.com/libxml2

Currently, looking at the OpenCollective reports, it seems USD 50 comes
in monthly from Airbnb for libxml2's Wellnhofer. That's unlikely to pay
very many bills.

Plus, there is no free alternative for C.

Regards

Pavel

#4 Jim Jones
jim.jones@uni-muenster.de
In reply to: Bruce Momjian (#1)
Re: libxml2 author overwhelmed with security requests

On 19.06.25 03:41, Bruce Momjian wrote:

This blog post explains the serious problems the single libxml2 author
is having in maintaining the library:

https://socket.dev/blog/libxml2-maintainer-ends-embargoed-vulnerability-reports

There are a few learnings from this:

* libxml2 is even less production-ready than we thought
* many projects don't have the resources we do

That's even worse than I thought. Especially this proposed disclaimer:

“This is open-source software written by hobbyists, maintained by a
single volunteer, badly tested, written in a memory-unsafe language and
full of security bugs. It is foolish to use this software to process
untrusted data.”

No wonder other major databases opt for writing their own XML processing
engines. Sadly, despite these issues, there doesn't seem to be a decent
alternative to libxml2 :(

--
Jim

#5 Bruce Momjian
bruce@momjian.us
In reply to: Jim Jones (#4)
Re: libxml2 author overwhelmed with security requests

On Thu, Jun 19, 2025 at 09:24:32PM +0200, Jim Jones wrote:

On 19.06.25 03:41, Bruce Momjian wrote:

This blog post explains the serious problems the single libxml2 author
is having in maintaining the library:

https://socket.dev/blog/libxml2-maintainer-ends-embargoed-vulnerability-reports

There are a few learnings from this:

* libxml2 is even less production-ready than we thought
* many projects don't have the resources we do

That's even worse than I thought. Especially this proposed disclaimer:

“This is open-source software written by hobbyists, maintained by a
single volunteer, badly tested, written in a memory-unsafe language and
full of security bugs. It is foolish to use this software to process
untrusted data.”

No wonder other major databases opt for writing their own XML processing
engines. Sadly, despite these issues, there doesn't seem to be a decent
alternative to libxml2 :(

I think our solution to making Postgres more secure would be to just
remove XML support --- we already make libxml inclusion optional at
configure time. I don't think there is community support for
developing an XML library --- some Postgres companies might feel
differently, but that is not the community's concern.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#6 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Bruce Momjian (#5)
Re: libxml2 author overwhelmed with security requests

On Thu, Jun 19, 2025 at 22:09, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Jun 19, 2025 at 09:24:32PM +0200, Jim Jones wrote:

On 19.06.25 03:41, Bruce Momjian wrote:

This blog post explains the serious problems the single libxml2 author
is having in maintaining the library:

https://socket.dev/blog/libxml2-maintainer-ends-embargoed-vulnerability-reports

There are a few learnings from this:

* libxml2 is even less production-ready than we thought
* many projects don't have the resources we do

That's even worse than I thought. Especially this proposed disclaimer:

“This is open-source software written by hobbyists, maintained by a
single volunteer, badly tested, written in a memory-unsafe language and
full of security bugs. It is foolish to use this software to process
untrusted data.”

No wonder other major databases opt for writing their own XML processing
engines. Sadly, despite these issues, there doesn't seem to be a decent
alternative to libxml2 :(

I think our solution to making Postgres more secure would be to just
remove XML support --- we already make libxml inclusion optional at
configure time. I don't think there is community support for
developing an XML library --- some Postgres companies might feel
differently, but that is not the community's concern.

Our own implementation of SQL/XML generating functions like XMLFOREST or
XMLELEMENT should not be too difficult. A significantly more difficult
problem is parsing XML (more so with namespaces), although some basic
support for XMLTABLE should not be too hard either.

libxml2 is very complex because it supports a lot of APIs, many of them
redundant: SAX, DOM, DTD, ...
But we use only a few percent of libxml2's functionality.

Isn't it possible to call Rust code from C? Then maybe there are some
possibilities from the Rust world:

https://github.com/ballsteve/xrust

Regards

Pavel

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#6)
Re: libxml2 author overwhelmed with security requests

Pavel Stehule <pavel.stehule@gmail.com> writes:

Our own implementation of SQL/XML generating functions like XMLFOREST or
XMLELEMENT should not be too difficult. A significantly more difficult
problem is parsing XML (more so with namespaces), although some basic
support for XMLTABLE should not be too hard either.

I don't think anybody really wants to roll our own XML parser.

Isn't it possible to call Rust code from C? Then maybe there are some
possibilities from the Rust world:
https://github.com/ballsteve/xrust

Maybe. I think the fundamental problem here, similar to what we've
run into elsewhere, is that we chose a library to depend on without
thinking hard enough about whether it would be well-supported in the
long run. I see little reason to think that that risk would be less
for some random not-written-in-C implementation. If we want to
jump ship away from libxml2, we had better ask hard questions about
the new choice.

regards, tom lane

#8 Sandeep Thakkar
sandeep.thakkar@enterprisedb.com
In reply to: Tom Lane (#7)
Re: libxml2 author overwhelmed with security requests

On Fri, Jun 20, 2025 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Our own implementation of SQL/XML generating functions like XMLFOREST or
XMLELEMENT should not be too difficult. A significantly more difficult
problem is parsing XML (more so with namespaces), although some basic
support for XMLTABLE should not be too hard either.

I don't think anybody really wants to roll our own XML parser.

Isn't it possible to call Rust code from C? Then maybe there are some
possibilities from the Rust world:
https://github.com/ballsteve/xrust

Maybe. I think the fundamental problem here, similar to what we've
run into elsewhere, is that we chose a library to depend on without
thinking hard enough about whether it would be well-supported in the
long run. I see little reason to think that that risk would be less
for some random not-written-in-C implementation. If we want to
jump ship away from libxml2, we had better ask hard questions about
the new choice.

Also, libxslt depends on libxml2, and it has no maintainer now that
recent commits have removed the existing ones:
https://gitlab.gnome.org/GNOME/libxslt/-/commit/c8b1ea4b89a9b81fa611f32c80f47df0c3b3b004
https://gitlab.gnome.org/GNOME/libxslt/-/commit/923903c59d668af42e3144bc623c9190a0f65988

--
Sandeep Thakkar

#9 Bruce Momjian
bruce@momjian.us
In reply to: Sandeep Thakkar (#8)
Re: libxml2 author overwhelmed with security requests

On Mon, Jul 21, 2025 at 12:46:03PM +0530, Sandeep Thakkar wrote:

On Fri, Jun 20, 2025 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Our own implementation of SQL/XML generating functions like XMLFOREST or
XMLELEMENT should not be too difficult. A significantly more difficult
problem is parsing XML (more so with namespaces), although some basic
support for XMLTABLE should not be too hard either.

I don't think anybody really wants to roll our own XML parser.

Isn't it possible to call Rust code from C? Then maybe there are some
possibilities from the Rust world:
https://github.com/ballsteve/xrust

Maybe.  I think the fundamental problem here, similar to what we've
run into elsewhere, is that we chose a library to depend on without
thinking hard enough about whether it would be well-supported in the
long run.  I see little reason to think that that risk would be less
for some random not-written-in-C implementation.  If we want to
jump ship away from libxml2, we had better ask hard questions about
the new choice.

Also, libxslt depends on libxml2, and it has no maintainer now that
recent commits have removed the existing ones:
https://gitlab.gnome.org/GNOME/libxslt/-/commit/c8b1ea4b89a9b81fa611f32c80f47df0c3b3b004
https://gitlab.gnome.org/GNOME/libxslt/-/commit/923903c59d668af42e3144bc623c9190a0f65988

Where do we think our use of libxml2 is heading? Do you suspect
security scanners will start flagging the use of libxml2?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: libxml2 author overwhelmed with security requests

Bruce Momjian <bruce@momjian.us> writes:

Where do we think our use of libxml2 is heading? Do you suspect
security scanners will start flagging the use of libxml2?

There's at least one distro that's already stopped building with
--with-libxml out of security concerns. (I forget who exactly,
but it's been mentioned on the PG lists.)

regards, tom lane

#11 Iván Chavero
ichavero@chavero.com.mx
In reply to: Sandeep Thakkar (#8)
Re: libxml2 author overwhelmed with security requests

On 21/07/25 at 1:16 a.m., Sandeep Thakkar wrote:

On Fri, Jun 20, 2025 at 2:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

Our own implementation of SQL/XML generating functions like XMLFOREST or
XMLELEMENT should not be too difficult. A significantly more difficult
problem is parsing XML (more so with namespaces), although some basic
support for XMLTABLE should not be too hard either.

I don't think anybody really wants to roll our own XML parser.

Isn't it possible to call Rust code from C? Then maybe there are some
possibilities from the Rust world:
https://github.com/ballsteve/xrust

Maybe.  I think the fundamental problem here, similar to what we've
run into elsewhere, is that we chose a library to depend on without
thinking hard enough about whether it would be well-supported in the
long run.  I see little reason to think that that risk would be less
for some random not-written-in-C implementation.  If we want to
jump ship away from libxml2, we had better ask hard questions about
the new choice.

Also, libxslt depends on libxml2, and it has no maintainer now that
recent commits have removed the existing ones:
https://gitlab.gnome.org/GNOME/libxslt/-/commit/c8b1ea4b89a9b81fa611f32c80f47df0c3b3b004
https://gitlab.gnome.org/GNOME/libxslt/-/commit/923903c59d668af42e3144bc623c9190a0f65988

After reading this thread I've stepped in to maintain libxslt, and other
Mexican developers and I are going to be on top of libxml2. We use these
libraries and their Rust bindings because we're writing libraries for
handling Mexican taxes, and they are wrapped in XML.

So at least another developer and I will be helping with these libraries,
and we will make our best effort to keep them up to date in both security
and functionality (e.g. XSLT 2.0 support).

Cheers,

Iván