hash partitioning

Started by David Westover 17 years ago5 messagesgeneral
Jump to latest
#1David West
david.west@cusppoint.com

Hi folks,

I'm wondering why the postgres planner is not capable of determining the
correct partition for a simple select for the following partitioning scheme,
in which I'd like to automatically divide rows into four sub-tables, ie, a
simple form of hash partitioning.

Any ideas why this doesn't work, or a work around to make it work? I would
have expected the query plan below to only query the test_1 table.

Regards

David

CREATE TABLE test (

id int not null primary key

);

CREATE TABLE test_0 ( CHECK ( id % 4 = 0) ) INHERITS (test);

CREATE TABLE test_1 ( CHECK ( id % 4 = 1) ) INHERITS (test);

CREATE TABLE test_2 ( CHECK ( id % 4 = 2) ) INHERITS (test);

CREATE TABLE test_3 ( CHECK ( id % 4 = 3) ) INHERITS (test);

CREATE RULE test_0 AS ON INSERT TO test WHERE ( id % 4 = 0 ) DO INSTEAD
INSERT INTO test_0 VALUES ( NEW.id );

CREATE RULE test_1 AS ON INSERT TO test WHERE ( id % 4 = 1 ) DO INSTEAD
INSERT INTO test_1 VALUES ( NEW.id );

CREATE RULE test_2 AS ON INSERT TO test WHERE ( id % 4 = 2 ) DO INSTEAD
INSERT INTO test_2 VALUES ( NEW.id );

CREATE RULE test_3 AS ON INSERT TO test WHERE ( id % 4 = 3 ) DO INSTEAD
INSERT INTO test_3 VALUES ( NEW.id );

insert into test values(1);

explain analyse select * from test;

"Result (cost=0.00..170.00 rows=12000 width=4) (actual time=0.027..0.042
rows=1 loops=1)"

" -> Append (cost=0.00..170.00 rows=12000 width=4) (actual
time=0.020..0.032 rows=1 loops=1)"

" -> Seq Scan on test (cost=0.00..34.00 rows=2400 width=4) (actual
time=0.002..0.002 rows=0 loops=1)"

" -> Seq Scan on test_0 test (cost=0.00..34.00 rows=2400 width=4)
(actual time=0.001..0.001 rows=0 loops=1)"

" -> Seq Scan on test_1 test (cost=0.00..34.00 rows=2400 width=4)
(actual time=0.007..0.009 rows=1 loops=1)"

" -> Seq Scan on test_2 test (cost=0.00..34.00 rows=2400 width=4)
(actual time=0.001..0.001 rows=0 loops=1)"

" -> Seq Scan on test_3 test (cost=0.00..34.00 rows=2400 width=4)
(actual time=0.001..0.001 rows=0 loops=1)"

"Total runtime: 0.115 ms"

#2Scott Marlowe
scott.marlowe@gmail.com
In reply to: David West (#1)
Re: hash partitioning

On Wed, Sep 3, 2008 at 10:24 AM, David West <david.west@cusppoint.com> wrote:

Hi folks,

I'm wondering why the postgres planner is not capable of determining the
correct partition for a simple select for the following partitioning scheme,
in which I'd like to automatically divide rows into four sub-tables, ie, a
simple form of hash partitioning.

Have you got constraint_exclusion turned on?

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David West (#1)
Re: hash partitioning

"David West" <david.west@cusppoint.com> writes:

I'm wondering why the postgres planner is not capable of determining the
correct partition for a simple select for the following partitioning scheme,

The planner doesn't know anything about the behavior of %.
Heed the fine manual's advice:

Keep the partitioning constraints simple, else the planner may not be
able to prove that partitions don't need to be visited. Use simple
equality conditions for list partitioning, or simple range tests for
range partitioning, as illustrated in the preceding examples. A good
rule of thumb is that partitioning constraints should contain only
comparisons of the partitioning column(s) to constants using
B-tree-indexable operators.

regards, tom lane

#4William Garrison
postgres@mobydisk.com
In reply to: David West (#1)
Re: hash partitioning

When I attended the PostgreSQL East conference, someone presented a way
of doing this that they used for http://www.mailermailer.com/ and they
did this:

SET constraint_exclusion = on;
EXPLAIN
SELECT
*
FROM
test
WHERE
id = 7
AND id % 4 = 3

Their business layer then generated the "AND id % 4 = 3" part of the
SQL. :(

Does anyone know if Oracle or any other database can handle this?

Does this work with stored procs? Ex, suppose a stored procedure like this:
get_from_test(id int, id_mod_4 int)
SELECT id FROM test WHERE id = $1 and id % 4 = $2;
Would the optimizer know the correct table to use in that case?

David West wrote:

Show quoted text

Hi folks,

I'm wondering why the postgres planner is not capable of determining
the correct partition for a simple select for the following
partitioning scheme, in which I'd like to automatically divide rows
into four sub-tables, ie, a simple form of hash partitioning.

Any ideas why this doesn't work, or a work around to make it work? I
would have expected the query plan below to only query the test_1 table.

Regards

David

CREATE TABLE test (

id int not null primary key

);

CREATE TABLE test_0 ( CHECK ( id % 4 = 0) ) INHERITS (test);

CREATE TABLE test_1 ( CHECK ( id % 4 = 1) ) INHERITS (test);

CREATE TABLE test_2 ( CHECK ( id % 4 = 2) ) INHERITS (test);

CREATE TABLE test_3 ( CHECK ( id % 4 = 3) ) INHERITS (test);

CREATE RULE test_0 AS ON INSERT TO test WHERE ( id % 4 = 0 ) DO
INSTEAD INSERT INTO test_0 VALUES ( NEW.id );

CREATE RULE test_1 AS ON INSERT TO test WHERE ( id % 4 = 1 ) DO
INSTEAD INSERT INTO test_1 VALUES ( NEW.id );

CREATE RULE test_2 AS ON INSERT TO test WHERE ( id % 4 = 2 ) DO
INSTEAD INSERT INTO test_2 VALUES ( NEW.id );

CREATE RULE test_3 AS ON INSERT TO test WHERE ( id % 4 = 3 ) DO
INSTEAD INSERT INTO test_3 VALUES ( NEW.id );

insert into test values(1);

explain analyse select * from test;

"Result (cost=0.00..170.00 rows=12000 width=4) (actual
time=0.027..0.042 rows=1 loops=1)"

" -> Append (cost=0.00..170.00 rows=12000 width=4) (actual
time=0.020..0.032 rows=1 loops=1)"

" -> Seq Scan on test (cost=0.00..34.00 rows=2400 width=4)
(actual time=0.002..0.002 rows=0 loops=1)"

" -> Seq Scan on test_0 test (cost=0.00..34.00 rows=2400
width=4) (actual time=0.001..0.001 rows=0 loops=1)"

" -> Seq Scan on test_1 test (cost=0.00..34.00 rows=2400
width=4) (actual time=0.007..0.009 rows=1 loops=1)"

" -> Seq Scan on test_2 test (cost=0.00..34.00 rows=2400
width=4) (actual time=0.001..0.001 rows=0 loops=1)"

" -> Seq Scan on test_3 test (cost=0.00..34.00 rows=2400
width=4) (actual time=0.001..0.001 rows=0 loops=1)"

"Total runtime: 0.115 ms"

#5Ian Harding
harding.ian@gmail.com
In reply to: William Garrison (#4)
Re: hash partitioning

On Wed, Sep 3, 2008 at 10:36 AM, William Garrison <postgres@mobydisk.com> wrote:

When I attended the PostgreSQL East conference, someone presented a way of
doing this that they used for http://www.mailermailer.com/ and they did
this:

SET constraint_exclusion = on;
EXPLAIN
SELECT
*
FROM
test
WHERE
id = 7
AND id % 4 = 3

Their business layer then generated the "AND id % 4 = 3" part of the SQL.
:(

Does anyone know if Oracle or any other database can handle this?

Oracle has support for hash partitioning like so:

CREATE TABLE sales_hash
(salesman_id NUMBER(5),
salesman_name VARCHAR2(30),
sales_amount NUMBER(10),
week_no NUMBER(2))
PARTITION BY HASH(salesman_id)
PARTITIONS 4
STORE IN (data1, data2, data3, data4);

There is no need to specify which partition to search or reference any
hash function in queries, it's all magic.

Show quoted text

David West wrote:

Hi folks,

I'm wondering why the postgres planner is not capable of determining the
correct partition for a simple select for the following partitioning scheme,
in which I'd like to automatically divide rows into four sub-tables, ie, a
simple form of hash partitioning.

Any ideas why this doesn't work, or a work around to make it work? I would
have expected the query plan below to only query the test_1 table.