pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()
Hi -hackers,
We've got customer report that high max_connections (3k) with high
pgstat_track_activity_query_size (1MB) ends up with:
postgres=# select * from pg_stat_get_activity(NULL);
ERROR: invalid memory alloc request size 18446744072590721024
postgres=# select version();
version
------------------------------------------------------------------------------------------------------
PostgreSQL 17devel on x86_64-pc-linux-gnu, compiled by gcc (Debian
10.2.1-6) 10.2.1 20210110, 64-bit
(1 row)
it's in:
#0 errstart (elevel=elevel@entry=21, domain=domain@entry=0x0) at elog.c:358
#1 0x000055971cafc9a8 in errstart_cold (elevel=elevel@entry=21,
domain=domain@entry=0x0) at elog.c:333
#2 0x000055971cb018b7 in MemoryContextAllocHuge (context=<optimized
out>, size=18446744072590721024) at mcxt.c:1594
#3 0x000055971ce2fd59 in pgstat_read_current_status () at backend_status.c:767
#4 0x000055971ce30ab1 in pgstat_read_current_status () at backend_status.c:1167
#5 pgstat_fetch_stat_numbackends () at backend_status.c:1168
#6 0x000055971ceccc7f in pg_stat_get_activity (fcinfo=0x55971e239ff8)
at pgstatfuncs.c:308
It seems to be integer overflow due to (int) * (int), while
MemoryContextAllocHuge() allows taking Size(size_t) as parameter. I
get similar behaviour with:
size_t val = (int)1048576 * (int)3022;
Attached patch adjusts pgstat_track_activity_query_size to be of
size_t from int and fixes the issue.
Regards,
-Jakub Wartak.
Attachments:
0001-Adjust-pgstat_track_activity_query_size-to-be-of-siz.patchapplication/octet-stream; name=0001-Adjust-pgstat_track_activity_query_size-to-be-of-siz.patchDownload
From 7517e6a673644b1ebfc1322438a2d82f92d15d8f Mon Sep 17 00:00:00 2001
From: Jakub Wartak <jakub.wartak@enterprisedb.com>
Date: Wed, 27 Sep 2023 08:26:22 +0200
Subject: [PATCH] Adjust pgstat_track_activity_query_size to be of size_t
instead of int.
This prevents integer overflow for MemoryContextAllocHuge() in pg_stat_get_activity()
SQL function when it is being called on system with high values of
max_connections and high pgstat_track_activity_query_size (e.g. 1MB).
postgres=# select * from pg_stat_get_activity(NULL);
ERROR: invalid memory alloc request size 18446744072590721024
---
src/backend/utils/activity/backend_status.c | 2 +-
src/include/utils/backend_status.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 722c5acf38..a08a38c78c 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -43,7 +43,7 @@
* ----------
*/
bool pgstat_track_activities = false;
-int pgstat_track_activity_query_size = 1024;
+size_t pgstat_track_activity_query_size = 1024;
/* exposed so that backend_progress.c can access it */
diff --git a/src/include/utils/backend_status.h b/src/include/utils/backend_status.h
index d51c840daf..a487aeddd7 100644
--- a/src/include/utils/backend_status.h
+++ b/src/include/utils/backend_status.h
@@ -286,7 +286,7 @@ typedef struct LocalPgBackendStatus
* ----------
*/
extern PGDLLIMPORT bool pgstat_track_activities;
-extern PGDLLIMPORT int pgstat_track_activity_query_size;
+extern PGDLLIMPORT size_t pgstat_track_activity_query_size;
/* ----------
--
2.30.2
On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote:
Attached patch adjusts pgstat_track_activity_query_size to be of
size_t from int and fixes the issue.
This cannot be backpatched, and using size_t is not really needed as
track_activity_query_size is capped at 1MB. Why don't you just tweak
the calculation done in pgstat_read_current_status() instead, say with
a cast?
--
Michael
On Wed, Sep 27, 2023 at 10:08 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote:
Attached patch adjusts pgstat_track_activity_query_size to be of
size_t from int and fixes the issue.This cannot be backpatched, and using size_t is not really needed as
track_activity_query_size is capped at 1MB. Why don't you just tweak
the calculation done in pgstat_read_current_status() instead, say with
a cast?
Thanks Michael, sure, that is probably a better alternative. I've attached v2.
BTW: CF entry is https://commitfest.postgresql.org/45/4592/
Regards,
-Jakub Wartak.
Attachments:
v2-0001-Cast-MemoryContextAllocHuge-calculations-in-pg_st.patchapplication/octet-stream; name=v2-0001-Cast-MemoryContextAllocHuge-calculations-in-pg_st.patchDownload
From 9a43cbb2b9b458e0f950cdab7ae557aa8f914be9 Mon Sep 17 00:00:00 2001
From: Jakub Wartak <jakub.wartak@enterprisedb.com>
Date: Wed, 27 Sep 2023 10:13:39 +0200
Subject: [PATCH v2] Cast MemoryContextAllocHuge() calculations in
pg_stat_get_activity() SQL function to size_t.
This prevents integer overflow for MemoryContextAllocHuge() in pg_stat_get_activity()
SQL function when it is being called on system with high values of
max_connections and high pgstat_track_activity_query_size (e.g. 1MB):
postgres=# select * from pg_stat_get_activity(NULL);
ERROR: invalid memory alloc request size 18446744072590721024
---
src/backend/utils/activity/backend_status.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 722c5acf38..e6d1397db2 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -765,7 +765,7 @@ pgstat_read_current_status(void)
NAMEDATALEN * NumBackendStatSlots);
localactivity = (char *)
MemoryContextAllocHuge(backendStatusSnapContext,
- pgstat_track_activity_query_size * NumBackendStatSlots);
+ (size_t)pgstat_track_activity_query_size * (size_t)NumBackendStatSlots);
#ifdef USE_SSL
localsslstatus = (PgBackendSSLStatus *)
MemoryContextAlloc(backendStatusSnapContext,
--
2.30.2
On 27.09.23 09:08, Michael Paquier wrote:
On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote:
Attached patch adjusts pgstat_track_activity_query_size to be of
size_t from int and fixes the issue.This cannot be backpatched, and using size_t is not really needed as
track_activity_query_size is capped at 1MB. Why don't you just tweak
the calculation done in pgstat_read_current_status() instead, say with
a cast?
I think it's preferable to use the right type to begin with, rather than
fixing it up afterwards with casts.
Hi,
On 2023-09-27 15:42:15 +0100, Peter Eisentraut wrote:
On 27.09.23 09:08, Michael Paquier wrote:
On Wed, Sep 27, 2023 at 08:41:55AM +0200, Jakub Wartak wrote:
Attached patch adjusts pgstat_track_activity_query_size to be of
size_t from int and fixes the issue.This cannot be backpatched, and using size_t is not really needed as
track_activity_query_size is capped at 1MB. Why don't you just tweak
the calculation done in pgstat_read_current_status() instead, say with
a cast?I think it's preferable to use the right type to begin with, rather than
fixing it up afterwards with casts.
I don't think going for size_t is a viable path for fixing this. I'm pretty
sure the initial patch would trigger a type mismatch from guc_tables.c - we
don't have infrastructure for size_t GUCs.
Nor do I think either of the patches here is a complete fix - there will still
be overflows on 32bit systems.
Perhaps we ought to error out (in BackendStatusShmemSize() or such) if
pgstat_track_activity_query_size * MaxBackends >= 4GB?
Frankly, it seems like a quite bad idea to have such a high limit for
pgstat_track_activity_query_size. The overhead such a high value has will
surprise people...
Greetings,
Andres Freund
On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote:
I don't think going for size_t is a viable path for fixing this. I'm pretty
sure the initial patch would trigger a type mismatch from guc_tables.c - we
don't have infrastructure for size_t GUCs.
Nothing marked as PGDLLIMPORT uses size_t in the tree currently, FWIW.
Perhaps we ought to error out (in BackendStatusShmemSize() or such) if
pgstat_track_activity_query_size * MaxBackends >= 4GB?
Yeah, agreed that putting a check like that could catch errors more
quickly.
Frankly, it seems like a quite bad idea to have such a high limit for
pgstat_track_activity_query_size. The overhead such a high value has will
surprise people...
Still it could have some value for some users with large analytical
queries where the syslogger is not going to be a bottleneck? It seems
too late to me to change that, but perhaps the docs could be improved
to tell that using a too high value can have performance consequences,
while mentioning the maximum value.
--
Michael
Hi,
On 2023-09-28 07:53:45 +0900, Michael Paquier wrote:
On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote:
Frankly, it seems like a quite bad idea to have such a high limit for
pgstat_track_activity_query_size. The overhead such a high value has will
surprise people...Still it could have some value for some users with large analytical
queries where the syslogger is not going to be a bottleneck? It seems
too late to me to change that, but perhaps the docs could be improved
to tell that using a too high value can have performance consequences,
while mentioning the maximum value.
I don't think the issue is syslogger, the problem is that suddenly accessing
pg_stat_activity requires gigabytes of memory. That's insane.
Greetings,
Andres Freund
On Thu, Sep 28, 2023 at 12:53 AM Michael Paquier <michael@paquier.xyz> wrote:
On Wed, Sep 27, 2023 at 10:29:25AM -0700, Andres Freund wrote:
I don't think going for size_t is a viable path for fixing this. I'm pretty
sure the initial patch would trigger a type mismatch from guc_tables.c - we
don't have infrastructure for size_t GUCs.Nothing marked as PGDLLIMPORT uses size_t in the tree currently, FWIW.
Perhaps we ought to error out (in BackendStatusShmemSize() or such) if
pgstat_track_activity_query_size * MaxBackends >= 4GB?Yeah, agreed that putting a check like that could catch errors more
quickly.
Hi,
v3 attached. I had a problem coming out with a better error message,
so suggestions are welcome. The cast still needs to be present as per
above suggestion as 3GB is still valid buf size and still was causing
integer overflow. We just throw an error on >= 4GB with v3.
-J.
Attachments:
v3-0001-Introduce-memory-limit-for-BackendActivityBuffer-.patchapplication/octet-stream; name=v3-0001-Introduce-memory-limit-for-BackendActivityBuffer-.patchDownload
From eae6ab255b390404633709e2a150e3af0aa13b00 Mon Sep 17 00:00:00 2001
From: Jakub Wartak <jakub.wartak@enterprisedb.com>
Date: Thu, 28 Sep 2023 10:56:57 +0200
Subject: [PATCH v3] Introduce memory limit for BackendActivityBuffer size, and
cast calculations to size_t.
As per email discussion put a limit on BackendActivityBuffer max size to 4GB.
In addition to this, cast related MemoryContextAllocHuge() calculations
in pg_stat_get_activity() SQL function to size_t to avoid errors with
with still buffer size but lower than 4GB. This prevents integer overflow
for MemoryContextAllocHuge() when it is being called on system with
high values of max_connections (3000) and high
pgstat_track_activity_query_size (e.g. 1MB):
postgres=# select * from pg_stat_get_activity(NULL);
ERROR: invalid memory alloc request size 18446744072590721024
---
src/backend/utils/activity/backend_status.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 722c5acf38..64d07b15da 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -37,6 +37,8 @@
*/
#define NumBackendStatSlots (MaxBackends + NUM_AUXPROCTYPES)
+/* Safety net to prevent requesting huge memory by each query to pg_stat_activity */
+#define PGSTAT_MAX_ACTIVITY_BUF_SIZE 4 * 1024 * 1024 * 1024L
/* ----------
* GUC parameters
@@ -84,6 +86,7 @@ Size
BackendStatusShmemSize(void)
{
Size size;
+ Size pgstat_track_size;
/* BackendStatusArray: */
size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
@@ -94,8 +97,12 @@ BackendStatusShmemSize(void)
size = add_size(size,
mul_size(NAMEDATALEN, NumBackendStatSlots));
/* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ pgstat_track_size = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ if(pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
+ elog(FATAL, "too big Backend Activity Buffer allocation of %zu bytes", pgstat_track_size);
+ size = add_size(size, pgstat_track_size);
+
#ifdef USE_SSL
/* BackendSslStatusBuffer: */
size = add_size(size,
@@ -765,7 +772,7 @@ pgstat_read_current_status(void)
NAMEDATALEN * NumBackendStatSlots);
localactivity = (char *)
MemoryContextAllocHuge(backendStatusSnapContext,
- pgstat_track_activity_query_size * NumBackendStatSlots);
+ (size_t)pgstat_track_activity_query_size * (size_t)NumBackendStatSlots);
#ifdef USE_SSL
localsslstatus = (PgBackendSSLStatus *)
MemoryContextAlloc(backendStatusSnapContext,
--
2.30.2
On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote:
v3 attached. I had a problem coming out with a better error message,
so suggestions are welcome. The cast still needs to be present as per
above suggestion as 3GB is still valid buf size and still was causing
integer overflow. We just throw an error on >= 4GB with v3.
+/* Safety net to prevent requesting huge memory by each query to pg_stat_activity */
+#define PGSTAT_MAX_ACTIVITY_BUF_SIZE 4 * 1024 * 1024 * 1024L
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ pgstat_track_size = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ if(pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
+ elog(FATAL, "too big Backend Activity Buffer allocation of %zu bytes", pgstat_track_size);
+ size = add_size(size, pgstat_track_size);
That should be enough to put in a hardcoded 4GB safety limit, while
mul_size() detects it at a higher range. Note, however, that elog()
is only used for internal errors that users should never face, but
this one can come from a misconfiguration. This would be better as an
ereport(), with ERRCODE_CONFIG_FILE_ERROR as errcode, I guess.
"Backend Activity Buffer" is the name of the internal struct. Sure,
it shows up on the system views for shmem, but I wouldn't use this
term in a user-facing error message. Perhaps something like "size
requested for backend status is out of range" would be cleaner. Other
ideas are welcome.
--
Michael
On Fri, Sep 29, 2023 at 4:00 AM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Sep 28, 2023 at 11:01:14AM +0200, Jakub Wartak wrote:
v3 attached. I had a problem coming out with a better error message,
so suggestions are welcome. The cast still needs to be present as per
above suggestion as 3GB is still valid buf size and still was causing
integer overflow. We just throw an error on >= 4GB with v3.+/* Safety net to prevent requesting huge memory by each query to pg_stat_activity */ +#define PGSTAT_MAX_ACTIVITY_BUF_SIZE 4 * 1024 * 1024 * 1024L- size = add_size(size, - mul_size(pgstat_track_activity_query_size, NumBackendStatSlots)); + pgstat_track_size = mul_size(pgstat_track_activity_query_size, + NumBackendStatSlots); + if(pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE) + elog(FATAL, "too big Backend Activity Buffer allocation of %zu bytes", pgstat_track_size); + size = add_size(size, pgstat_track_size);That should be enough to put in a hardcoded 4GB safety limit, while
mul_size() detects it at a higher range. Note, however, that elog()
is only used for internal errors that users should never face, but
this one can come from a misconfiguration. This would be better as an
ereport(), with ERRCODE_CONFIG_FILE_ERROR as errcode, I guess."Backend Activity Buffer" is the name of the internal struct. Sure,
it shows up on the system views for shmem, but I wouldn't use this
term in a user-facing error message. Perhaps something like "size
requested for backend status is out of range" would be cleaner. Other
ideas are welcome.
Hi Michael,
I've attached v4 that covers your suggestions.
-J.
Attachments:
v4-0001-Introduce-memory-limit-for-BackendActivityBuffer-.patchapplication/octet-stream; name=v4-0001-Introduce-memory-limit-for-BackendActivityBuffer-.patchDownload
From 039a176da01f4a7bf6a28f434683b7da6c9f0965 Mon Sep 17 00:00:00 2001
From: Jakub Wartak <jakub.wartak@enterprisedb.com>
Date: Mon, 2 Oct 2023 10:19:46 +0200
Subject: [PATCH v4] Introduce memory limit for BackendActivityBuffer size, and
cast calculations to size_t.
As per email discussion put a limit on BackendActivityBuffer max size to 4GB.
In addition to this, cast related MemoryContextAllocHuge() calculations
in pg_stat_get_activity() SQL function to size_t to avoid errors with
with still buffer size but lower than 4GB. This prevents integer overflow
for MemoryContextAllocHuge() when it is being called on system with
high values of max_connections (3000) and high
pgstat_track_activity_query_size (e.g. 1MB):
postgres=# select * from pg_stat_get_activity(NULL);
ERROR: invalid memory alloc request size 18446744072590721024
---
src/backend/utils/activity/backend_status.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c
index 722c5acf38..f638058cc3 100644
--- a/src/backend/utils/activity/backend_status.c
+++ b/src/backend/utils/activity/backend_status.c
@@ -37,6 +37,8 @@
*/
#define NumBackendStatSlots (MaxBackends + NUM_AUXPROCTYPES)
+/* Safety net to prevent requesting huge memory by each query to pg_stat_activity */
+#define PGSTAT_MAX_ACTIVITY_BUF_SIZE 4 * 1024 * 1024 * 1024L
/* ----------
* GUC parameters
@@ -84,6 +86,7 @@ Size
BackendStatusShmemSize(void)
{
Size size;
+ Size pgstat_track_size;
/* BackendStatusArray: */
size = mul_size(sizeof(PgBackendStatus), NumBackendStatSlots);
@@ -94,8 +97,14 @@ BackendStatusShmemSize(void)
size = add_size(size,
mul_size(NAMEDATALEN, NumBackendStatSlots));
/* BackendActivityBuffer: */
- size = add_size(size,
- mul_size(pgstat_track_activity_query_size, NumBackendStatSlots));
+ pgstat_track_size = mul_size(pgstat_track_activity_query_size,
+ NumBackendStatSlots);
+ if(pgstat_track_size >= PGSTAT_MAX_ACTIVITY_BUF_SIZE)
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIG_FILE_ERROR),
+ errmsg("size requested for backend status is out of range")));
+ size = add_size(size, pgstat_track_size);
+
#ifdef USE_SSL
/* BackendSslStatusBuffer: */
size = add_size(size,
@@ -765,7 +774,7 @@ pgstat_read_current_status(void)
NAMEDATALEN * NumBackendStatSlots);
localactivity = (char *)
MemoryContextAllocHuge(backendStatusSnapContext,
- pgstat_track_activity_query_size * NumBackendStatSlots);
+ (size_t)pgstat_track_activity_query_size * (size_t)NumBackendStatSlots);
#ifdef USE_SSL
localsslstatus = (PgBackendSSLStatus *)
MemoryContextAlloc(backendStatusSnapContext,
--
2.30.2
On Mon, Oct 02, 2023 at 10:22:06AM +0200, Jakub Wartak wrote:
I've attached v4 that covers your suggestions.
Hmm. I was looking at all that and pondered quite a bit about the
addition of the restriction when starting up the server, particularly
why there would be any need to include it in the same commit as the
one fixing the arguments given to AllocHuge(). At the end, as the
restriction goes a bit against 8d0ddccec636, I have removed it and
went to the simplest solution of adding the casts to Size, which is
the same thing as we do in all the other callers that deal with signed
variables (like mbutils.c).
Regarding any restrictions, perhaps we should improve the docs, at
least. It would be better to discuss that separately.
--
Michael