[PATCH] Fix libxml leaks in contrib/xml2 XPath functions

Started by Andrey Chernyy23 days ago5 messageshackers
Jump to latest
#1Andrey Chernyy
andrey.cherny@tantorlabs.com

Hi,

While reviewing contrib/xml2 I found two successful-path libxml leaks in
XPath functions.

The attached series is:

0001 Fix libxml string leak in contrib/xml2 xpath_list
0002 Fix libxml leaks in contrib/xml2 xpath_table

In xpath_list(), the plain separator path passes the result of
xmlXPathCastNodeToString() directly to xmlBufferWriteCHAR().
xmlXPathCastNodeToString() returns a libxml-allocated xmlChar * that
has to be released with xmlFree(), while xmlBufferWriteCHAR() copies
the string rather than taking ownership.

In xpath_table(), xmlXPathCompiledEval() returns an xmlXPathObjectPtr
that has to be released with xmlXPathFreeObject(). The function also
stores libxml-allocated xmlChar strings in the values array before
calling BuildTupleFromCStrings(). BuildTupleFromCStrings() consumes
those strings by converting/copying them into the result tuple, so the
temporary libxml strings can be released after the tuple is built.

The second patch frees the XPath result object after each evaluation,
frees per-column string values after BuildTupleFromCStrings() has
consumed them, and tracks the current libxml allocations across the
existing PG_TRY block so they are also released on error.

The attached manual repro scripts exercise the two paths separately.
They are Linux-specific because they sample VmRSS from
/proc/<backend-pid>/status using pg_read_file(), so they should be run
as a superuser or a role allowed to read server files.

On unpatched origin/master, selected NOTICE lines from the repro scripts
show steady backend RSS growth:

postgres=# \i ./xml2-xpath-list-leak-repro.sql
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=1,
total_kb=38616, diff_kb=17220 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=2, total_kb=40732, diff_kb=2104
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=3,
total_kb=42720, diff_kb=1988 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=4, total_kb=44644, diff_kb=1924
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=5,
total_kb=46440, diff_kb=1796 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=6, total_kb=47968, diff_kb=1528
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=7,
total_kb=50892, diff_kb=2924 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=8, total_kb=52712, diff_kb=1820
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=9,
total_kb=54276, diff_kb=1564 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=10, total_kb=55328, diff_kb=1052

postgres=# \i ./xml2-xpath-table-leak-repro.sql
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=1,
total_kb=26452, diff_kb=5136 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=2, total_kb=27772, diff_kb=1312
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=3,
total_kb=29068, diff_kb=1296 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=4, total_kb=30368, diff_kb=1300
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=5,
total_kb=31668, diff_kb=1300 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=6, total_kb=32968, diff_kb=1300
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=7,
total_kb=34260, diff_kb=1292 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=8, total_kb=35568, diff_kb=1308
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=9,
total_kb=36876, diff_kb=1308 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=10, total_kb=38168, diff_kb=1292

With both patches applied, the same scripts plateau after the initial
allocations:

postgres=# \i ./xml2-xpath-list-leak-repro.sql
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=1,
total_kb=24480, diff_kb=3020 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=2, total_kb=24532, diff_kb=44
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=3,
total_kb=24580, diff_kb=48 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=4, total_kb=24580, diff_kb=0
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=5,
total_kb=24624, diff_kb=44 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=6, total_kb=24580, diff_kb=-44
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=7,
total_kb=24580, diff_kb=0 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=8, total_kb=24580, diff_kb=0
psql:xml2-xpath-list-leak-repro.sql:38: NOTICE: xpath_list i=9,
total_kb=24580, diff_kb=0 psql:xml2-xpath-list-leak-repro.sql:38:
NOTICE: xpath_list i=10, total_kb=24580, diff_kb=0

postgres=# \i ./xml2-xpath-table-leak-repro.sql
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=1,
total_kb=26572, diff_kb=188 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=2, total_kb=26600, diff_kb=28
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=3,
total_kb=26600, diff_kb=0 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=4, total_kb=26600, diff_kb=0
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=5,
total_kb=26600, diff_kb=0 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=6, total_kb=26600, diff_kb=0
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=7,
total_kb=26600, diff_kb=0 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=8, total_kb=26600, diff_kb=0
psql:xml2-xpath-table-leak-repro.sql:40: NOTICE: xpath_table i=9,
total_kb=26600, diff_kb=0 psql:xml2-xpath-table-leak-repro.sql:40:
NOTICE: xpath_table i=10, total_kb=26600, diff_kb=0

contrib/xml2 regression tests still pass:

make -C contrib/xml2 check
# All 1 tests passed.

I also checked that the two patches apply cleanly to current
origin/master with git am.

This is related to, but not fixed by, the recent xml2/libxml
error-handling work from BUG #18943 / commit 732061150b0. That patch
improved cleanup and OOM handling around libxml calls, while these
cases are successful execution path leaks.

These are long-standing leaks and look like candidates for
back-patching to all supported branches.

--
Andrey Chernyy

Attachments:

xml2-xpath-list-leak-repro.sqlapplication/sqlDownload
xml2-xpath-table-leak-repro.sqlapplication/sqlDownload
0001-Fix-libxml-string-leak-in-contrib-xml2-xpath_list.patchtext/x-patchDownload+11-3
0002-Fix-libxml-leaks-in-contrib-xml2-xpath_table.patchtext/x-patchDownload+42-6
#2Michael Paquier
michael@paquier.xyz
In reply to: Andrey Chernyy (#1)
Re: [PATCH] Fix libxml leaks in contrib/xml2 XPath functions

On Mon, Jun 01, 2026 at 01:01:24AM +0300, Andrey Chernyy wrote:

This is related to, but not fixed by, the recent xml2/libxml
error-handling work from BUG #18943 / commit 732061150b0. That patch
improved cleanup and OOM handling around libxml calls, while these
cases are successful execution path leaks.

These are long-standing leaks and look like candidates for
back-patching to all supported branches.

Right. I have missed that xmlXPathCastNodeToString() does an
allocation of the result, and it is documented in the upstream code
that the caller is responsible for freeing what's been returned.

+   xmlXPathContextPtr volatile ctxt = NULL;
+   xmlXPathObjectPtr volatile res = NULL;
+   xmlXPathCompExprPtr volatile comppath = NULL;
+   xmlChar    *volatile resstr = NULL;

For these fout, ctxt is reachable once xmlXPathNewContext() does not
return NULL. For res, that's xmlXPathCompiledEval, and caller has to
free the result (documented). comppath is limited to the case where
the call of xmlXPathCtxtCompile() allocates a result, returns non-NULL
and has an error, which is a short window, still reachable. resstr
can be reached under when pg_xml_error_occurred() fails with an
allocation done.

I can see two more problematic patterns, that happen in error-only
paths still are worth addressing:
- pgxmlNodeSetToText() is missing a xmlFree() for "result".
- xmlXPathCtxtCompile() in pgxml_xpath(), where pg_xml_error_occurred
is reached and an allocation is done.

732061150b0 was only done on HEAD because these leaks were considered
as not worth the trouble in the back branches. This could qualify as
an open item, even if the leaks you are pointing at here are much
older than that.

Attached is a patch for the code paths I have grabbed on the way.
I'll take care of that, merging everything together. Thanks for the
report!
--
Michael

Attachments:

v2-0001-Fix-libxml-string-leak-in-contrib-xml2-xpath_list.patchtext/plain; charset=us-asciiDownload+11-3
v2-0002-Fix-libxml-leaks-in-contrib-xml2-xpath_table.patchtext/plain; charset=us-asciiDownload+42-6
v2-0003-xml2-Fix-two-more-leaks.patchtext/plain; charset=us-asciiDownload+6-4
#3Andrey Chernyy
andrey.cherny@tantorlabs.com
In reply to: Michael Paquier (#2)
Re: [PATCH] Fix libxml leaks in contrib/xml2 XPath functions

Hi Michael,

I checked v2, and the two additional error-path cases make sense to me.

While looking at pgxml_xpath(), I noticed one more nearby cleanup gap.
pgxml_xpath() builds an xpath_workspace before returning it to callers.
The callers already wrap pgxml_xpath() with PG_TRY/PG_CATCH, but if an
ERROR is thrown before pgxml_xpath() returns, the assignment of the
workspace pointer in the caller has not completed yet. The caller-side
PG_CATCH blocks therefore cannot call cleanup_workspace() for the
partially-built workspace.

Attached is an incremental patch on top of your v2-0001..v2-0003. It
adds local cleanup in pgxml_xpath(), tracks comppath for the error path,
and checks xmlXPathNewContext() before dereferencing the returned
context. I kept it separate for review, but it can of course be folded
if that makes the final series cleaner.

I also used the attached manual repro script, which catches repeated
XPath syntax errors in a single backend and samples VmRSS from /proc.
On my checkout without this pgxml_xpath() cleanup it kept growing after
each round, for example:

postgres=# \i xml2-pgxml-xpath-error-leak-repro.sql
NOTICE: error i=1, failures=20, total_kb=109748, diff_kb=88008
NOTICE: error i=2, failures=20, total_kb=193016, diff_kb=83268
NOTICE: error i=3, failures=20, total_kb=276220, diff_kb=83204
NOTICE: error i=4, failures=20, total_kb=359428, diff_kb=83208
NOTICE: error i=5, failures=20, total_kb=442632, diff_kb=83204
NOTICE: error i=6, failures=20, total_kb=525840, diff_kb=83208
NOTICE: error i=7, failures=20, total_kb=609044, diff_kb=83204
NOTICE: error i=8, failures=20, total_kb=692252, diff_kb=83208
NOTICE: error i=9, failures=20, total_kb=775456, diff_kb=83204
NOTICE: error i=10, failures=20, total_kb=858664, diff_kb=83208

With this patch applied, it plateaued after the initial warmup:

postgres=# \i xml2-pgxml-xpath-error-leak-repro.sql
NOTICE: error i=1, failures=20, total_kb=22672, diff_kb=972
NOTICE: error i=2, failures=20, total_kb=22692, diff_kb=20
NOTICE: error i=3, failures=20, total_kb=22704, diff_kb=8
NOTICE: error i=4, failures=20, total_kb=22700, diff_kb=-8
NOTICE: error i=5, failures=20, total_kb=22704, diff_kb=4
NOTICE: error i=6, failures=20, total_kb=22704, diff_kb=0
NOTICE: error i=7, failures=20, total_kb=22708, diff_kb=4
NOTICE: error i=8, failures=20, total_kb=22704, diff_kb=-8
NOTICE: error i=9, failures=20, total_kb=22708, diff_kb=4
NOTICE: error i=10, failures=20, total_kb=22708, diff_kb=-4

--
Andrey Chernyy

Attachments:

xml2-pgxml-xpath-error-leak-repro.sqlapplication/sqlDownload
v2-0004-xml2-Avoid-libxml-leaks-in-pgxml_xpath-error-path.patchtext/x-patchDownload+32-20
#4Michael Paquier
michael@paquier.xyz
In reply to: Andrey Chernyy (#3)
Re: [PATCH] Fix libxml leaks in contrib/xml2 XPath functions

On Tue, Jun 02, 2026 at 03:35:44AM +0300, Andrey Chernyy wrote:

While looking at pgxml_xpath(), I noticed one more nearby cleanup gap.
pgxml_xpath() builds an xpath_workspace before returning it to callers.
The callers already wrap pgxml_xpath() with PG_TRY/PG_CATCH, but if an
ERROR is thrown before pgxml_xpath() returns, the assignment of the
workspace pointer in the caller has not completed yet. The caller-side
PG_CATCH blocks therefore cannot call cleanup_workspace() for the
partially-built workspace.

Aha, nice. That's because xmlXPathNewContext() can fail internally on
a call of xmlMalloc().

One thing that slightly confused me is that it could be possible to do
twice cleanup_workspace() for the callers of pgxml_xpath(), but that's
fine: once if the TRY/CATCH block of pgxml_xpath() fails and a second
time in the TRY/CATCH block of xpath_string(), xpath_number() or
xpath_bool().

Now wait a minute, something is off in upstream with
xmlXPathCompiledEval(), no? This stuff does a chain of
xmlXPathCompiledEval() -> xmlXPathCompiledEvalInternal() ->
xmlXPathCompParserContext(). xmlXPathCompParserContext() may fail on
a xmlMalloc(), and xmlXPathCompiledEvalInternal() has the idea to
return the same result if given NULL in input by its caller or on OOM.
That means that we could incorrectly assign a NULL result that should
not be. I don't think that there is much we can do in the Postgres
code because we have no idea of the error state (aka valid NULL or
just an OOM), but that may be worth mentioning to upstream, where they
would need a new "extensible" API with an error reason or a different
error code depending on the failure. As far as I can see we are doing
nothing wrong in xml2. Well, at least nothing worse than the current
deal. :)

The set of v2-0001~0004 merged together should be fine as final
solution, after double-checking all the callers with the set applied.
--
Michael

#5Andrey Chernyy
andrey.cherny@tantorlabs.com
In reply to: Michael Paquier (#4)
Re: [PATCH] Fix libxml leaks in contrib/xml2 XPath functions

Thanks, Michael.

Agreed about xmlXPathCompiledEval(); without a reliable error indication
from libxml2, xml2 cannot distinguish a valid NULL result from an
internal failure locally.

The v2-0001..v2-0004 merge sounds good to me.

--
Andrey Chernyy