BUG #17280: global-buffer-overflow on select from pg_stat_slru

Started by PG Bug reporting formover 4 years ago8 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17280
Logged by: Alexander Kozhemyakin
Email address: a.kozhemyakin@postgrespro.ru
PostgreSQL version: 14.0
Operating system: Ubuntu 21.04
Description:

The following simple query:
select * from pg_catalog.pg_stat_slru
leads to the sanitizer-detected error:
==23911==ERROR: AddressSanitizer: global-buffer-overflow on address
0x5582bec7c5e0 at pc 0x5582bbd2c01c bp 0x7fff0b73a470 sp 0x7fff0b73a460
READ of size 64 at 0x5582bec7c5e0 thread T0
#0 0x5582bbd2c01b in pg_stat_get_slru
/home/postgres/postgres/src/backend/utils/adt/pgstatfuncs.c:1914
#1 0x5582bb405b83 in ExecMakeTableFunctionResult
/home/postgres/postgres/src/backend/executor/execSRF.c:234
#2 0x5582bb45dfd5 in FunctionNext
/home/postgres/postgres/src/backend/executor/nodeFunctionscan.c:95
#3 0x5582bb408a6f in ExecScanFetch
/home/postgres/postgres/src/backend/executor/execScan.c:133
#4 0x5582bb408cba in ExecScan
/home/postgres/postgres/src/backend/executor/execScan.c:182
#5 0x5582bb45db99 in ExecFunctionScan
/home/postgres/postgres/src/backend/executor/nodeFunctionscan.c:270
#6 0x5582bb3fd916 in ExecProcNodeFirst
/home/postgres/postgres/src/backend/executor/execProcnode.c:463
#7 0x5582bb3ddf35 in ExecProcNode
../../../src/include/executor/executor.h:257
#8 0x5582bb3ddf35 in ExecutePlan
/home/postgres/postgres/src/backend/executor/execMain.c:1551
#9 0x5582bb3de54b in standard_ExecutorRun
/home/postgres/postgres/src/backend/executor/execMain.c:361
#10 0x5582bb3de75a in ExecutorRun
/home/postgres/postgres/src/backend/executor/execMain.c:305
#11 0x5582bbabc326 in PortalRunSelect
/home/postgres/postgres/src/backend/tcop/pquery.c:921
#12 0x5582bbac25e3 in PortalRun
/home/postgres/postgres/src/backend/tcop/pquery.c:765
#13 0x5582bbab6277 in exec_simple_query
/home/postgres/postgres/src/backend/tcop/postgres.c:1214
#14 0x5582bbabb2e1 in PostgresMain
/home/postgres/postgres/src/backend/tcop/postgres.c:4497
#15 0x5582bb86dadd in BackendRun
/home/postgres/postgres/src/backend/postmaster/postmaster.c:4584
#16 0x5582bb876e01 in BackendStartup
/home/postgres/postgres/src/backend/postmaster/postmaster.c:4312
#17 0x5582bb8775a9 in ServerLoop
/home/postgres/postgres/src/backend/postmaster/postmaster.c:1801
#18 0x5582bb879d6f in PostmasterMain
/home/postgres/postgres/src/backend/postmaster/postmaster.c:1473
#19 0x5582bb563465 in main
/home/postgres/postgres/src/backend/main/main.c:198
#20 0x7fac88170564 in __libc_start_main
(/lib/x86_64-linux-gnu/libc.so.6+0x28564)
#21 0x5582baba0ded in _start
(/home/postgres/rel_master/bin/postgres+0x1a72ded)

0x5582bec7c5e0 is located 32 bytes to the left of global variable 'walStats'
defined in 'pgstat.c:282:24' (0x5582bec7c600) of size 72
0x5582bec7c5e0 is located 0 bytes to the right of global variable
'slruStats' defined in 'pgstat.c:283:25' (0x5582bec7c3e0) of size 512
SUMMARY: AddressSanitizer: global-buffer-overflow
/home/postgres/postgres/src/backend/utils/adt/pgstatfuncs.c:1914 in
pg_stat_get_slru
Shadow bytes around the buggy address:
0x0ab0d7d87860: f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
0x0ab0d7d87870: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00
0x0ab0d7d87880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab0d7d87890: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab0d7d878a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ab0d7d878b0: 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9 f9
0x0ab0d7d878c0: 00 00 00 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
0x0ab0d7d878d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9 f9
0x0ab0d7d878e0: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
0x0ab0d7d878f0: 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
0x0ab0d7d87900: f9 f9 f9 f9 00 f9 f9 f9 f9 f9 f9 f9 00 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc

#2Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

At Wed, 10 Nov 2021 13:42:25 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in

The following bug has been logged on the website:

Bug reference: 17280
Logged by: Alexander Kozhemyakin
Email address: a.kozhemyakin@postgrespro.ru
PostgreSQL version: 14.0
Operating system: Ubuntu 21.04
Description:

The following simple query:
select * from pg_catalog.pg_stat_slru
leads to the sanitizer-detected error:
==23911==ERROR: AddressSanitizer: global-buffer-overflow on address
0x5582bec7c5e0 at pc 0x5582bbd2c01c bp 0x7fff0b73a470 sp 0x7fff0b73a460
READ of size 64 at 0x5582bec7c5e0 thread T0
#0 0x5582bbd2c01b in pg_stat_get_slru
/home/postgres/postgres/src/backend/utils/adt/pgstatfuncs.c:1914

slruStats has 8 elements (SLRU_NUM_ELEMENTS).

In the loop in pg_stat_get_slru digested as below,

stats = pgstat_fetch_slru(); - returns slruStats

for (i = 0;; i++)
{

!> PgStat_SLRUStats stat = stats[i];

name = pgstat_slru_name(i);

if (!name)
break;

The line prefixed by '!' runs with i = 8, which is actually overrun.

I see three ways to fix it. One is to move the assignment to stat to
after the break. Another is to limit the for loop using
SLRU_NUM_ELEMENTS. The last one is limit the for loop using
pgstat_slru_name().

The loop is designed not to directly rely on SRLU_NUM_ELEMENTS so if
we honor that design, we would take the first or the third way. The
first way is smallest but I prefer the third way as it is
straightforward as such kind of loops. The attached is that for the
master.

The code was introduced at 13 and the attached applies to the versions
back to 13.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-Fix-memory-overrun-of-pg_stat_get_slru.patchtext/x-patch; charset=us-asciiDownload+2-8
#3Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#2)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

On Thu, Nov 11, 2021 at 10:39:23AM +0900, Kyotaro Horiguchi wrote:

The loop is designed not to directly rely on SRLU_NUM_ELEMENTS so if
we honor that design, we would take the first or the third way. The
first way is smallest but I prefer the third way as it is
straightforward as such kind of loops. The attached is that for the
master.

The code was introduced at 13 and the attached applies to the versions
back to 13.

Or it would be easier for the reader to assign stat after checking for
the result of pgstat_slru_name(), no? I am not much a fan of this
code style that uses a counter, FWIW, but at the same time
SLRU_NUM_ELEMENTS is local to pgstat.c, so..
--
Michael

#4Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#3)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

At Thu, 11 Nov 2021 11:52:27 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Thu, Nov 11, 2021 at 10:39:23AM +0900, Kyotaro Horiguchi wrote:

The loop is designed not to directly rely on SRLU_NUM_ELEMENTS so if
we honor that design, we would take the first or the third way. The
first way is smallest but I prefer the third way as it is
straightforward as such kind of loops. The attached is that for the
master.

The code was introduced at 13 and the attached applies to the versions
back to 13.

Or it would be easier for the reader to assign stat after checking for
the result of pgstat_slru_name(), no? I am not much a fan of this
code style that uses a counter, FWIW, but at the same time
SLRU_NUM_ELEMENTS is local to pgstat.c, so..

I'm not sure which is easier to read, but it might be a bit hard since
the conditino term in not mention counter itself. I don't object to
that way. And, yes SLRU_NUM_ELEMENTS cannot be used here:p

The attached is the first way in the choices.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v2-0001-Fix-memory-overrun-of-pg_stat_get_slru.patchtext/x-patch; charset=us-asciiDownload+2-2
#5Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#4)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

On Thu, Nov 11, 2021 at 12:19:09PM +0900, Kyotaro Horiguchi wrote:

I'm not sure which is easier to read, but it might be a bit hard since
the conditino term in not mention counter itself. I don't object to
that way. And, yes SLRU_NUM_ELEMENTS cannot be used here:p

The attached is the first way in the choices.

That looks fine.

Also, what do you think about adding a test in sysviews.sql? This
could be as simple as that:
SELECT count(*) > 0 AS ok FROM pg_stat_slru;

I am not sure if the buildfarm animals running valgrind would have
caught this issue, but having something would be better than nothing
for this case.
--
Michael

#6Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#5)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

At Thu, 11 Nov 2021 17:08:34 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Thu, Nov 11, 2021 at 12:19:09PM +0900, Kyotaro Horiguchi wrote:

I'm not sure which is easier to read, but it might be a bit hard since
the conditino term in not mention counter itself. I don't object to
that way. And, yes SLRU_NUM_ELEMENTS cannot be used here:p

The attached is the first way in the choices.

That looks fine.

Also, what do you think about adding a test in sysviews.sql? This
could be as simple as that:
SELECT count(*) > 0 AS ok FROM pg_stat_slru;

Rahter it is touching only pg_stat_wal and pg_stat_walreceiver among
42 pg_stat_* views. However I agree to it is good to add it for the
reason mentioned below.

I am not sure if the buildfarm animals running valgrind would have
caught this issue, but having something would be better than nothing
for this case.

Valgrind didn't detect that for me (-O0). The address area accessed
by the overun is allocated to other variables. That being said. I
agree that causing access to the full range of the variable should
help checker tools. And that ends in a time shorter than a blink.

Please find the attached.

regrds.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

v2-0001-Fix-memory-overrun-of-pg_stat_get_slru.patchtext/x-patch; charset=us-asciiDownload+12-2
#7Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#6)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

On Fri, Nov 12, 2021 at 04:25:46PM +0900, Kyotaro Horiguchi wrote:

Please find the attached.

Thanks, done.
--
Michael

#8Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Michael Paquier (#7)
Re: BUG #17280: global-buffer-overflow on select from pg_stat_slru

At Fri, 12 Nov 2021 21:54:14 +0900, Michael Paquier <michael@paquier.xyz> wrote in

On Fri, Nov 12, 2021 at 04:25:46PM +0900, Kyotaro Horiguchi wrote:

Please find the attached.

Thanks, done.

Thanks for committing!

--
Kyotaro Horiguchi
NTT Open Source Software Center