BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

Started by PG Bug reporting formabout 5 years ago6 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 16840
Logged by: Michał Albrycht
Email address: michalalbrycht@gmail.com
PostgreSQL version: 13.1
Operating system: Ubuntu 18
Description:

Lets create a table partitioned by hash:

CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id);

Now let's create just one partition:

CREATE TABLE dir_part_2 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 2);

Insert sample value:

INSERT INTO dir (volume_id, path) VALUES
(1, 'abc'),
(1, 'def'),
(1, 'ghi');

Queries to verify db state:
SELECT * FROM dir WHERE volume_id=1 AND path='abc'; -- returns 1 row - OK
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 0 rows - NOT OK!
SELECT * FROM dir_part_2 WHERE volume_id=1 AND (path='abc' or path='def') --
returns 2 rows - OK

Now lets add missing partitions:
CREATE TABLE dir_part_0 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 0);
CREATE TABLE dir_part_1 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 1);

And run problematic query one more time:
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 2 rows - OK

So Postgres silently fails to find a rows, and does not throw any error when
not all hash partitions are present.

#2Michał Albrycht
michalalbrycht@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

Argh! I pasted the wrong value of volume_id in my examples. Number 1 should
be replaced with 2 so correct insert should look like this:

INSERT INTO dir (volume_id, path) VALUES
(2, 'abc'),
(2, 'def'),
(2, 'ghi');

And queries:
SELECT * FROM dir WHERE volume_id=2 AND path='abc';
SELECT * FROM dir WHERE volume_id=2 AND (path='abc' OR path='def');
SELECT * FROM dir_part_2 WHERE volume_id=2 AND (path='abc' or path='def');

Sorry for creating unnecessary confusion.

Michał Albrycht

śr., 27 sty 2021 o 18:03 PG Bug reporting form <noreply@postgresql.org>
napisał(a):

Show quoted text

The following bug has been logged on the website:

Bug reference: 16840
Logged by: Michał Albrycht
Email address: michalalbrycht@gmail.com
PostgreSQL version: 13.1
Operating system: Ubuntu 18
Description:

Lets create a table partitioned by hash:

CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id);

Now let's create just one partition:

CREATE TABLE dir_part_2 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 2);

Insert sample value:

INSERT INTO dir (volume_id, path) VALUES
(1, 'abc'),
(1, 'def'),
(1, 'ghi');

Queries to verify db state:
SELECT * FROM dir WHERE volume_id=1 AND path='abc'; -- returns 1 row - OK
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 0 rows - NOT OK!
SELECT * FROM dir_part_2 WHERE volume_id=1 AND (path='abc' or path='def')
--
returns 2 rows - OK

Now lets add missing partitions:
CREATE TABLE dir_part_0 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 0);
CREATE TABLE dir_part_1 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 1);

And run problematic query one more time:
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 2 rows - OK

So Postgres silently fails to find a rows, and does not throw any error
when
not all hash partitions are present.

#3Michał Albrycht
michalalbrycht@gmail.com
In reply to: Michał Albrycht (#2)
Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

Eh, too much work today. Please ignore my second email. All examples from
the first message are ok. Sorry.

Michał Albrycht

śr., 27 sty 2021 o 18:10 Michał Albrycht <michalalbrycht@gmail.com>
napisał(a):

Show quoted text

Argh! I pasted the wrong value of volume_id in my examples. Number 1
should be replaced with 2 so correct insert should look like this:

INSERT INTO dir (volume_id, path) VALUES
(2, 'abc'),
(2, 'def'),
(2, 'ghi');

And queries:
SELECT * FROM dir WHERE volume_id=2 AND path='abc';
SELECT * FROM dir WHERE volume_id=2 AND (path='abc' OR path='def');
SELECT * FROM dir_part_2 WHERE volume_id=2 AND (path='abc' or path='def');

Sorry for creating unnecessary confusion.

Michał Albrycht

śr., 27 sty 2021 o 18:03 PG Bug reporting form <noreply@postgresql.org>
napisał(a):

The following bug has been logged on the website:

Bug reference: 16840
Logged by: Michał Albrycht
Email address: michalalbrycht@gmail.com
PostgreSQL version: 13.1
Operating system: Ubuntu 18
Description:

Lets create a table partitioned by hash:

CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id);

Now let's create just one partition:

CREATE TABLE dir_part_2 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 2);

Insert sample value:

INSERT INTO dir (volume_id, path) VALUES
(1, 'abc'),
(1, 'def'),
(1, 'ghi');

Queries to verify db state:
SELECT * FROM dir WHERE volume_id=1 AND path='abc'; -- returns 1 row - OK
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 0 rows - NOT OK!
SELECT * FROM dir_part_2 WHERE volume_id=1 AND (path='abc' or path='def')
--
returns 2 rows - OK

Now lets add missing partitions:
CREATE TABLE dir_part_0 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 0);
CREATE TABLE dir_part_1 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 1);

And run problematic query one more time:
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 2 rows - OK

So Postgres silently fails to find a rows, and does not throw any error
when
not all hash partitions are present.

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: PG Bug reporting form (#1)
Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

PG Bug reporting form <noreply@postgresql.org> writes:

CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id);
CREATE TABLE dir_part_2 PARTITION OF dir FOR VALUES WITH (modulus 3,
remainder 2);
SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def') --
returns 0 rows - NOT OK!

Hmm, seems to be a case of faulty partition exclusion, because the
plan isn't scanning anything:

=# explain SELECT * FROM dir WHERE volume_id=1 AND (path='abc' OR path='def');
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
One-Time Filter: false
(2 rows)

Probably the reason we'd not noticed is that an incomplete set of
hash partitions isn't a very useful situation: if you don't
populate all of the partitions, you risk unexpected insertion
failures, since you really shouldn't be assuming which partition
any given key value will map into. Still, it's clearly a bug.
Thanks for the report!

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

I wrote:

Hmm, seems to be a case of faulty partition exclusion, because the
plan isn't scanning anything:

Here's a proposed patch for this. The core of the problem is confusion
around the number of entries in the PartitionBoundInfoData.indexes array.
Each of the three types of partitioning has a different rule for that,
despite which we were expecting assorted code to know what to do, and
some places got it wrong for hash --- even hash-specific code :-(

I propose here to solve that by explicitly storing the number of entries
in PartitionBoundInfoData, and thereby removing the need for partition-
strategy-independent code to know anything about the rules. I think
we can get away with that in the back branches by adding "nindexes"
at the end of the struct. This could break extensions that are
manufacturing their own PartitionBoundInfoData structs, but it seems
unlikely that there are any.

Most of the patch just straightforwardly sets or uses the new field.
Notably, partition_bounds_equal() and partition_bounds_copy() get
significantly simpler and safer. The actual bug fix is in
get_matching_hash_bounds() and perform_pruning_combine_step(), where
"all partitions" needs to be 0 .. nindexes-1 not 0 .. ndatums-1.
(The reason your example fails is that the OR clause should produce
"all partitions potentially match", but because of this bug, it's
producing a bitmask that doesn't include the partition we need.)

regards, tom lane

Attachments:

fix-bug-16840.patchtext/x-diff; charset=us-ascii; name=fix-bug-16840.patchDownload+126-139
#6Michał Albrycht
michalalbrycht@gmail.com
In reply to: Tom Lane (#5)
Re: BUG #16840: Rows not found in table partitioned by hash when not all partitions exists

Thanks for quick bug confirmation and patch. I agree that the way code is
presented here is unlikely to happen on production, but I was experimenting
with a custom, dummy hash function which would guarantee that all rows with
volume_id=1 would go to partition 1, volume_id=2 to partition 2 and so on.

CREATE OR REPLACE FUNCTION partition_custom_bigint_hash(value INT8, seed INT8)
RETURNS INT8 AS $$
-- this number is UINT64CONST(0x49a0f4dd15e5a8e3) from
https://github.com/postgres/postgres/blob/REL_13_STABLE/src/include/common/hashfn.h#L83
<https://doxygen.postgresql.org/hashfn_8h_source.html&gt;
SELECT value - 5305509591434766563;
$$ LANGUAGE SQL IMMUTABLE;

CREATE OPERATOR CLASS partition_custom_bigint_hash_op
FOR TYPE int8
USING hash AS
OPERATOR 1 =,
FUNCTION 2 partition_custom_bigint_hash(int8, int8);

Then adding that operator class to table definition:

CREATE TABLE dir (
id SERIAL,
volume_id BIGINT,
path TEXT
) PARTITION BY HASH (volume_id partition_custom_bigint_hash_op);

Now I'm able to create a partition only when it's needed and I know
which partition should be created for a given volume_id
(partition_number = volume_id % number_of_partitions).

Michał Albrycht

śr., 27 sty 2021 o 23:54 Tom Lane <tgl@sss.pgh.pa.us> napisał(a):

Show quoted text

I wrote:

Hmm, seems to be a case of faulty partition exclusion, because the
plan isn't scanning anything:

Here's a proposed patch for this. The core of the problem is confusion
around the number of entries in the PartitionBoundInfoData.indexes array.
Each of the three types of partitioning has a different rule for that,
despite which we were expecting assorted code to know what to do, and
some places got it wrong for hash --- even hash-specific code :-(

I propose here to solve that by explicitly storing the number of entries
in PartitionBoundInfoData, and thereby removing the need for partition-
strategy-independent code to know anything about the rules. I think
we can get away with that in the back branches by adding "nindexes"
at the end of the struct. This could break extensions that are
manufacturing their own PartitionBoundInfoData structs, but it seems
unlikely that there are any.

Most of the patch just straightforwardly sets or uses the new field.
Notably, partition_bounds_equal() and partition_bounds_copy() get
significantly simpler and safer. The actual bug fix is in
get_matching_hash_bounds() and perform_pruning_combine_step(), where
"all partitions" needs to be 0 .. nindexes-1 not 0 .. ndatums-1.
(The reason your example fails is that the OR clause should produce
"all partitions potentially match", but because of this bug, it's
producing a bitmask that doesn't include the partition we need.)

regards, tom lane