Valgrind - showing memory leaks

Started by Yasir8 months ago9 messages
#1Yasir
yasir.hussain.shah@gmail.com
1 attachment(s)

Hi Hackers,

I ran a postgres server with valgrind looking for memory leaks at a
particular extension, but I am experiencing something strange. Here are the
steps:

1. PG configured & compile as follows:
---------------------------------------------------
$ ./configure --enable-debug CFLAGS="-ggdb -Og -g3 -fno-omit-frame-pointer"
--enable-cassert \
--enable-rpath --enable-tap-tests --with-ssl=openssl --with-systemd
--enable-depend \
--enable-injection-points --with-liburing --enable-dtrace --prefix=/tmp/pg18
$ make install

2. In terminal-1, PG Server launched as follows:
----------------------------------------------------------------
$ valgrind --leak-check=full --gen-suppressions=all
--suppressions=postgres/src/tools/valgrind.supp \
--time-stamp=yes
--error-markers=VALGRINDERROR-BEGIN,VALGRINDERROR-END
--log-file=/valgrind_%p.log
\
--trace-children=yes /tmp/pg18/bin/postgres -D /tmp/data1
--log_line_prefix="%m %p " \
--log_statement=all --shared_buffers=64MB 2>&1 | tee
/tmp/valgrind_postmaster.log

3. In terminal-2, psql launched as follows:
-------------------------------------------------------
$ /tmp/pg18/bin/psql postgres
psql (18devel)
Type "help" for help.

postgres=# select pg_backend_pid();
pg_backend_pid
----------------
2293900
(1 row)

postgres=# CREATE TABLE tbl1 (scope text NOT NULL);
CREATE TABLE
postgres=# INSERT INTO tbl1 VALUES ('some'),('full');
INSERT 0 2
postgres=# select * from tbl1;
scope
-------
some
full
(2 rows)

postgres=# drop table tbl1;
DROP TABLE
postgres=# \q

Attached is valgrind output generated showing memory leaks.

5. Question:
----------------
I believe that the valgrind should not report any memory leaks in such
simple/common commands. What am I doing wrong here?

Regards,

Yasir Hussain
Data Bene

Attachments:

valgrind_2293900.logtext/x-log; charset=US-ASCII; name=valgrind_2293900.logDownload
#2Álvaro Herrera
alvherre@kurilemu.de
In reply to: Yasir (#1)
Re: Valgrind - showing memory leaks

On 2025-05-08, Yasir wrote:

I believe that the valgrind should not report any memory leaks in such
simple/common commands. What am I doing wrong here?

Hmm, that function was modified relatively recently so I think you found a genuine bug, which is probably mine. I'll have a look ...

--
Álvaro Herrera

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yasir (#1)
Re: Valgrind - showing memory leaks

Yasir <yasir.hussain.shah@gmail.com> writes:

I believe that the valgrind should not report any memory leaks in such
simple/common commands. What am I doing wrong here?

I think you are vastly overestimating both the intelligence of
valgrind, and our level of concern about minor one-time leaks.
Most of these are probably not really leaks at all, but failure
on valgrind's part to notice the relevant pointers. Moreover,
almost all are blamed on catcache setup, which is a one-time
operation; so even if it is losing track of some allocations,
it's not likely to be something worth worrying about.

Alvaro seems to think CheckNNConstraintFetch is worth taking
a second look at, and maybe he's right, but the amount of
storage involved there seems unexciting too.

regards, tom lane

#4Yasir
yasir.hussain.shah@gmail.com
In reply to: Álvaro Herrera (#2)
Re: Valgrind - showing memory leaks

On Thu, May 8, 2025 at 6:17 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:

On 2025-05-08, Yasir wrote:

I believe that the valgrind should not report any memory leaks in such
simple/common commands. What am I doing wrong here?

Hmm, that function was modified relatively recently so I think you found a
genuine bug, which is probably mine. I'll have a look ...

I will wait for your next response, after your investigation.

Regards,

Yasir
Data Bene

#5Yasir
yasir.hussain.shah@gmail.com
In reply to: Tom Lane (#3)
Re: Valgrind - showing memory leaks

On Thu, May 8, 2025 at 7:27 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yasir <yasir.hussain.shah@gmail.com> writes:

I believe that the valgrind should not report any memory leaks in such
simple/common commands. What am I doing wrong here?

I think you are vastly overestimating both the intelligence of
valgrind, and our level of concern about minor one-time leaks.
Most of these are probably not really leaks at all, but failure
on valgrind's part to notice the relevant pointers. Moreover,
almost all are blamed on catcache setup, which is a one-time
operation; so even if it is losing track of some allocations,
it's not likely to be something worth worrying about.

Alvaro seems to think CheckNNConstraintFetch is worth taking
a second look at, and maybe he's right, but the amount of
storage involved there seems unexciting too.

regards, tom lane

Makes sense, these look like valgrind false positives tied to long-lived
allocations.
Since Alvaro flagged CheckNNConstraintFetch, I’ll leave it to him to take a
closer look if he still thinks it’s worth digging into.

Best,
Yasir
Data Bene

#6Álvaro Herrera
alvherre@kurilemu.de
In reply to: Tom Lane (#3)
Re: Valgrind - showing memory leaks

On 2025-May-08, Tom Lane wrote:

Alvaro seems to think CheckNNConstraintFetch is worth taking
a second look at, and maybe he's right, but the amount of
storage involved there seems unexciting too.

Yeah, I think the new issue here is that we're calling that function at
all; previously we would only call that function if the table had check
constraints, but now we call it if it has either that or not-null
constraints.

If the table doesn't have check constraints, we end up doing
MemoryContextAllocZero() with size 0 in CacheMemoryContext, which isn't
great (IIUC we innecessarily allocate a chunk of size 8 in this case).
I think we should make the allocation conditional on nchecks not being
zero, otherwise I think we're indeed leaking memory permanently in
CacheMemoryContext, since that allocation is not recorded anywhere:

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 68ff67de549..a5ae3d8de5b 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -4598,10 +4598,11 @@ CheckNNConstraintFetch(Relation relation)
 	HeapTuple	htup;
 	int			found = 0;
-	/* Allocate array with room for as many entries as expected */
-	check = (ConstrCheck *)
-		MemoryContextAllocZero(CacheMemoryContext,
-							   ncheck * sizeof(ConstrCheck));
+	/* Allocate array with room for as many entries as expected, if needed */
+	if (ncheck > 0)
+		check = (ConstrCheck *)
+			MemoryContextAllocZero(CacheMemoryContext,
+								   ncheck * sizeof(ConstrCheck));

/* Search pg_constraint for relevant entries */
ScanKeyInit(&skey[0],

On the other hand, the bug I was thinking about, is that if the table
has an invalid not-null constraint, we leak during detoasting in
extractNotNullColumn(). We purposefully made that function leak that
memory, because it was only used in DDL code (so the leak didn't
matter), and to simplify code; commit ff239c3bf4e8. This uses the
caller memory context, so it's not a permanent leak and I don't think we
need any fixes. But it's no longer so obvious that extractNotNullColumn
is okay to leak those few bytes.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Álvaro Herrera (#6)
Re: Valgrind - showing memory leaks

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:

If the table doesn't have check constraints, we end up doing
MemoryContextAllocZero() with size 0 in CacheMemoryContext, which isn't
great (IIUC we innecessarily allocate a chunk of size 8 in this case).
I think we should make the allocation conditional on nchecks not being
zero, otherwise I think we're indeed leaking memory permanently in
CacheMemoryContext, since that allocation is not recorded anywhere:

Uh ... yeah it is, down at the bottom of the function:

/* Install array only after it's fully valid */
relation->rd_att->constr->check = check;
relation->rd_att->constr->num_check = found;

So it seems like valgrind is wrong here, or else we're leaking the
whole rd_att structure later on somehow.

In any case, you're right that asking for a zero-size chunk is
pretty pointless. I'd support doing

+	if (ncheck > 0)
+		check = (ConstrCheck *)
+			MemoryContextAllocZero(CacheMemoryContext,
+								   ncheck * sizeof(ConstrCheck));
+	else
+		check = NULL;

but I think we have to make sure it's null if we don't palloc it.

On the other hand, the bug I was thinking about, is that if the table
has an invalid not-null constraint, we leak during detoasting in
extractNotNullColumn(). We purposefully made that function leak that
memory, because it was only used in DDL code (so the leak didn't
matter), and to simplify code; commit ff239c3bf4e8. This uses the
caller memory context, so it's not a permanent leak and I don't think we
need any fixes. But it's no longer so obvious that extractNotNullColumn
is okay to leak those few bytes.

Given your description it still sounds fine to me.

regards, tom lane

#8Álvaro Herrera
alvherre@kurilemu.de
In reply to: Tom Lane (#7)
Re: Valgrind - showing memory leaks

On 2025-May-08, Tom Lane wrote:

Uh ... yeah it is, down at the bottom of the function:

/* Install array only after it's fully valid */
relation->rd_att->constr->check = check;
relation->rd_att->constr->num_check = found;

So it seems like valgrind is wrong here, or else we're leaking the
whole rd_att structure later on somehow.

Well, the problem is that if num_check is zero, FreeTupleDesc() doesn't
free ->check.

In any case, you're right that asking for a zero-size chunk is pretty
pointless. I'd support doing

+	if (ncheck > 0)
+		check = (ConstrCheck *)
+			MemoryContextAllocZero(CacheMemoryContext,
+								   ncheck * sizeof(ConstrCheck));
+	else
+		check = NULL;

but I think we have to make sure it's null if we don't palloc it.

Done that way, thanks.

On the other hand, the bug I was thinking about, is that if the table
has an invalid not-null constraint, we leak during detoasting in
extractNotNullColumn(). [...] But it's no longer so obvious that
extractNotNullColumn is okay to leak those few bytes.

Given your description it still sounds fine to me.

Cool, I left it alone.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
“Cuando no hay humildad las personas se degradan” (A. Christie)

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Álvaro Herrera (#8)
Re: Valgrind - showing memory leaks

=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:

On 2025-May-08, Tom Lane wrote:

So it seems like valgrind is wrong here, or else we're leaking the
whole rd_att structure later on somehow.

Well, the problem is that if num_check is zero, FreeTupleDesc() doesn't
free ->check.

Ah-hah.

Done that way, thanks.

Thanks! I've been putting together a list of fixes to make
Valgrind less noisy, and it just got one shorter.

regards, tom lane