Recovery mode with partitioned tables and ANY(NULL) in WHERE clause
Hello everyone,
We're going to introduce in production environment solution based on table
partitioning.
I think we found BUG in version from 11.4 to 11.5 (maybe and previous).
We have one big table, let's say "book". To improve manage and maintenance
we've decided to partition the table in two partitions levels. First based
on ranges of one column and level on another one. After release our changes
on develop stage we've met unexpected behaviour: database randomly went
into recovery mode. After a little investigation we found query that cause
database crash. We've noticed before that in good practices in partitioning
was point saying about limit for one hundred of partitions. We have +/- 400
so we thought it can be problem. For example with memory or something like
that. We've tried to "explain" problematic query to show planner but
explain fails too (without analyze). Finally we've discovered that problem
occures likewise when we operate on empty tables with two ranged partitions
and one default on one level (probably error occurs too on smaller count of
partitions). Below reproduction path:
PostgreSQL 11.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5
20150623 (Red Hat 4.8.5-36), 64-bit, or
PostgreSQL 11.5, compiled by Visual C++ build 1914, 64-bit.
Table create:
CREATE TABLE book (
id SERIAL,
id_publishing_house BIGINT,
date_of_publish DATE
) PARTITION BY RANGE (id_publishing_house);
-- First level
CREATE TABLE book_default PARTITION OF book DEFAULT;
CREATE TABLE book_10000000 PARTITION OF book FOR VALUES FROM (1) TO
(10000001);
CREATE TABLE book_20000000 PARTITION OF book FOR VALUES FROM (10000001) TO
(20000001);
QUERY:
SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);
This query is crashing database. What is more interesting:
SELECT * FROM book WHERE id_publishing_house IN (NULL);
works fine.
Logs:
2019-08-09 18:02:29.560 CEST [13500] LOG: database system was shut down at
2019-08-09 18:02:20 CEST
2019-08-09 18:02:29.591 CEST [18256] LOG: database system is ready to
accept connections
2019-08-09 18:02:53.566 CEST [18256] LOG: server process (PID 2076) was
terminated by exception 0xC0000005
2019-08-09 18:02:53.566 CEST [18256] DETAIL: Failed process was running:
SELECT * FROM book WHERE id_publishing_house = ANY(NULL::BIGINT[]);
2019-08-09 18:02:53.566 CEST [18256] HINT: See C include file "ntstatus.h"
for a description of the hexadecimal value.
2019-08-09 18:02:53.566 CEST [18256] LOG: terminating any other active
server processes
2019-08-09 18:02:53.571 CEST [16400] WARNING: terminating connection
because of crash of another server process
2019-08-09 18:02:53.571 CEST [16400] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2019-08-09 18:02:53.571 CEST [16400] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2019-08-09 18:02:53.594 CEST [4980] WARNING: terminating connection
because of crash of another server process
2019-08-09 18:02:53.594 CEST [4980] DETAIL: The postmaster has commanded
this server process to roll back the current transaction and exit, because
another server process exited abnormally and possibly corrupted shared
memory.
2019-08-09 18:02:53.594 CEST [4980] HINT: In a moment you should be able
to reconnect to the database and repeat your command.
2019-08-09 18:02:53.634 CEST [18256] LOG: all server processes terminated;
reinitializing
2019-08-09 18:02:53.756 CEST [14972] FATAL: the database system is in
recovery mode
2019-08-09 18:02:53.769 CEST [13816] LOG: database system was interrupted;
last known up at 2019-08-09 18:02:29 CEST
2019-08-09 18:02:53.778 CEST [15388] FATAL: the database system is in
recovery mode
2019-08-09 18:02:54.640 CEST [12560] FATAL: the database system is in
recovery mode
2019-08-09 18:02:54.904 CEST [18008] FATAL: the database system is in
recovery mode
2019-08-09 18:02:55.637 CEST [16120] FATAL: the database system is in
recovery mode
2019-08-09 18:02:55.965 CEST [13816] LOG: database system was not properly
shut down; automatic recovery in progress
2019-08-09 18:02:55.972 CEST [13816] LOG: redo starts at 0/168E160
2019-08-09 18:02:55.973 CEST [13816] LOG: redo done at 0/168E160
2019-08-09 18:02:56.037 CEST [18256] LOG: database system is ready to
accept connections
--
Pozdrawiam
Piotr Włodarczyk
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
I think we found BUG in version from 11.4 to 11.5 (maybe and previous).
Yeah, looks like the ScalarArrayOpExpr path in
match_clause_to_partition_key forgot to consider the possibility
of a constant-null array. Will fix, thanks for the report!
regards, tom lane
Thanks for replay Tom,
Can you tell me in which versions it'll be fixed? For me it's critical
situation because we can have some potentially dangerous places in
application.
On Fri, Aug 9, 2019 at 6:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
I think we found BUG in version from 11.4 to 11.5 (maybe and previous).
Yeah, looks like the ScalarArrayOpExpr path in
match_clause_to_partition_key forgot to consider the possibility
of a constant-null array. Will fix, thanks for the report!regards, tom lane
--
Pozdrawiam
Piotr Włodarczyk
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
Can you tell me in which versions it'll be fixed? For me it's critical
situation because we can have some potentially dangerous places in
application.
Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.
If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:
https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec
regards, tom lane
Cool, thanks!
pt., 9 sie 2019, 19:21 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał:
Show quoted text
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
Can you tell me in which versions it'll be fixed? For me it's critical
situation because we can have some potentially dangerous places in
application.Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec
regards, tom lane
Hi,
I've checked your commit and let me ensue it's fulfill all cases.
For example what with query like that: select * from coercepart where a =
any(null); ? there is no casting to array of specified type. I'm not master
in C but looking in source I believe that "if (array->const snull)" is
enough. But for better check you can add that query (and similar) to
tests/regress
On Fri, Aug 9, 2019 at 7:23 PM Piotr Włodarczyk <piotrwlodarczyk89@gmail.com>
wrote:
Cool, thanks!
pt., 9 sie 2019, 19:21 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał:
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
Can you tell me in which versions it'll be fixed? For me it's critical
situation because we can have some potentially dangerous places in
application.Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.If you're sufficiently worried about it you could apply the patch locally.
It's pretty trivial:https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec
regards, tom lane
--
Pozdrawiam
Piotr Włodarczyk
Hi,
In V12 this bug exists too. Is your commit going to be part of V12?
pt., 9 sie 2019, 19:44 użytkownik Piotr Włodarczyk <
piotrwlodarczyk89@gmail.com> napisał:
Show quoted text
Hi,
I've checked your commit and let me ensue it's fulfill all cases.
For example what with query like that: select * from coercepart where a =
any(null); ? there is no casting to array of specified type. I'm not master
in C but looking in source I believe that "if (array->const snull)" is
enough. But for better check you can add that query (and similar) to
tests/regressOn Fri, Aug 9, 2019 at 7:23 PM Piotr Włodarczyk <
piotrwlodarczyk89@gmail.com> wrote:Cool, thanks!
pt., 9 sie 2019, 19:21 użytkownik Tom Lane <tgl@sss.pgh.pa.us> napisał:
=?UTF-8?Q?Piotr_W=C5=82odarczyk?= <piotrwlodarczyk89@gmail.com> writes:
Can you tell me in which versions it'll be fixed? For me it's critical
situation because we can have some potentially dangerous places in
application.Unfortunately this just missed this week's releases, so it'll be the
next quarterly releases in November.If you're sufficiently worried about it you could apply the patch
locally.
It's pretty trivial:https://git.postgresql.org/pg/commitdiff/2f729d83226705d1149419a2aef7c1678fe641ec
regards, tom lane
--
Pozdrawiam
Piotr Włodarczyk
On Tue, 13 Aug 2019 at 22:51, Piotr Włodarczyk
<piotrwlodarczyk89@gmail.com> wrote:
In V12 this bug exists too. Is your commit going to be part of V12?
The fix Tom pushed is in master, v12 and v11. v12's commit is
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=59d3789294cf6d42325e92486b053f1ee5934eb8
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services