Re: [POC] hash partitioning

Started by yangjie@highgo.comalmost 9 years ago67 messageshackers

almost 9 years ago

Hello

Looking at your hash partitioning syntax, I implemented a hash partition in a more concise way, with no need to determine the number of sub-tables, and dynamically add partitions.

Description

The hash partition's implement is on the basis of the original range / list partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think the partition key can support multiple values,
which may be difficult to implement when querying, but it is not impossible.

A partition table can be create as bellow:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

FOR VALUES clause cannot be used, and the partition bound is calclulated automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts/* Number of partitions */
;
In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 3
h1 | 5
h1 | 17
h1 | 19
h2 | 2
h2 | 6
h2 | 7
h2 | 11
h2 | 12
h2 | 14
h2 | 15
h2 | 18
h2 | 20
h3 | 1
h3 | 4
h3 | 8
h3 | 9
h3 | 10
h3 | 13
h3 | 16
(20 rows)

The number of partitions here can be dynamically added, and if a new partition is created, the number of partitions changes, the calculated target partitions will change, and the same data is not reasonable in different partitions,So you need to re-calculate the existing data and insert the target partition when you create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 5
h1 | 17
h1 | 19
h1 | 6
h1 | 12
h1 | 8
h1 | 13
h2 | 11
h2 | 14
h3 | 1
h3 | 9
h3 | 2
h3 | 15
h4 | 3
h4 | 7
h4 | 18
h4 | 20
h4 | 4
h4 | 10
h4 | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the insertion, and filters out the table that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1)
-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 3
Planning time: 0.346 ms
Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 3
Planning time: 0.720 ms
Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 3
Planning time: 0.396 ms
Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.

Best regards,
young

yonj1e.github.io
yangjie@highgo.com

Yugo Nagata

nagata@sraoss.co.jp

almost 9 years ago

In reply to: yangjie@highgo.com (#1)

Hi young,

On Mon, 28 Aug 2017 15:33:46 +0800
"yangjie@highgo.com" <yangjie@highgo.com> wrote:

Hello

Looking at your hash partitioning syntax, I implemented a hash partition in a more concise way, with no need to determine the number of sub-tables, and dynamically add partitions.

I think it is great work, but the current consensus about hash-partitioning supports
Amul's patch[1]/messages/by-id/CAAJ_b965A2oog=6eFUhELexL3RmgFssB3G7LwkVA1bw0WUJJoA@mail.gmail.com, in which the syntax is different from the my original proposal.
So, you will have to read Amul's patch and make a discussion if you still want to
propose your implementation.

Regards,

[1]: /messages/by-id/CAAJ_b965A2oog=6eFUhELexL3RmgFssB3G7LwkVA1bw0WUJJoA@mail.gmail.com

Description

The hash partition's implement is on the basis of the original range / list partition,and using similar syntax.

To create a partitioned table ,use:

CREATE TABLE h (id int) PARTITION BY HASH(id);

The partitioning key supports only one value, and I think the partition key can support multiple values,
which may be difficult to implement when querying, but it is not impossible.

A partition table can be create as bellow:

CREATE TABLE h1 PARTITION OF h;
CREATE TABLE h2 PARTITION OF h;
CREATE TABLE h3 PARTITION OF h;

FOR VALUES clause cannot be used, and the partition bound is calclulated automatically as partition index of single integer value.

An inserted record is stored in a partition whose index equals
DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, values[0])) % nparts/* Number of partitions */
;
In the above example, this is DatumGetUInt32(OidFunctionCall1(lookup_type_cache(key->parttypid[0], TYPECACHE_HASH_PROC)->hash_proc, id)) % 3;

postgres=# insert into h select generate_series(1,20);
INSERT 0 20
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 3
h1 | 5
h1 | 17
h1 | 19
h2 | 2
h2 | 6
h2 | 7
h2 | 11
h2 | 12
h2 | 14
h2 | 15
h2 | 18
h2 | 20
h3 | 1
h3 | 4
h3 | 8
h3 | 9
h3 | 10
h3 | 13
h3 | 16
(20 rows)

The number of partitions here can be dynamically added, and if a new partition is created, the number of partitions changes, the calculated target partitions will change, and the same data is not reasonable in different partitions,So you need to re-calculate the existing data and insert the target partition when you create a new partition.

postgres=# create table h4 partition of h;
CREATE TABLE
postgres=# select tableoid::regclass,* from h;
tableoid | id
----------+----
h1 | 5
h1 | 17
h1 | 19
h1 | 6
h1 | 12
h1 | 8
h1 | 13
h2 | 11
h2 | 14
h3 | 1
h3 | 9
h3 | 2
h3 | 15
h4 | 3
h4 | 7
h4 | 18
h4 | 20
h4 | 4
h4 | 10
h4 | 16
(20 rows)

When querying the data, the hash partition uses the same algorithm as the insertion, and filters out the table that does not need to be scanned.

postgres=# explain analyze select * from h where id = 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..41.88 rows=13 width=4) (actual time=0.020..0.023 rows=1 loops=1)
-> Seq Scan on h3 (cost=0.00..41.88 rows=13 width=4) (actual time=0.013..0.016 rows=1 loops=1)
Filter: (id = 1)
Rows Removed by Filter: 3
Planning time: 0.346 ms
Execution time: 0.061 ms
(6 rows)

postgres=# explain analyze select * from h where id in (1,5);;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..83.75 rows=52 width=4) (actual time=0.016..0.028 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..41.88 rows=26 width=4) (actual time=0.015..0.018 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..41.88 rows=26 width=4) (actual time=0.005..0.007 rows=1 loops=1)
Filter: (id = ANY ('{1,5}'::integer[]))
Rows Removed by Filter: 3
Planning time: 0.720 ms
Execution time: 0.074 ms
(9 rows)

postgres=# explain analyze select * from h where id = 1 or id = 5;;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Append (cost=0.00..96.50 rows=50 width=4) (actual time=0.017..0.078 rows=2 loops=1)
-> Seq Scan on h1 (cost=0.00..48.25 rows=25 width=4) (actual time=0.015..0.019 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 6
-> Seq Scan on h3 (cost=0.00..48.25 rows=25 width=4) (actual time=0.005..0.010 rows=1 loops=1)
Filter: ((id = 1) OR (id = 5))
Rows Removed by Filter: 3
Planning time: 0.396 ms
Execution time: 0.139 ms
(9 rows)

Can not detach / attach / drop partition table.

Best regards,
young

yonj1e.github.io
yangjie@highgo.com

--
Yugo Nagata <nagata@sraoss.co.jp>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

yangjie@highgo.com

almost 9 years ago

In reply to: Yugo Nagata (#2)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: yangjie@highgo.com (#3)

I've updated patch to use an extended hash function (Commit #
81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Regards,
Amul

On Thu, Jul 27, 2017 at 5:11 PM, amul sul <sulamul@gmail.com> wrote:

Show quoted text

Attaching newer patches rebased against the latest master head. Thanks !

Regards,
Amul

Rajkumar Raghuwanshi

rajkumar.raghuwanshi@enterprisedb.com

over 8 years ago

In reply to: Amul Sul (#4)

On Mon, Sep 4, 2017 at 4:08 PM, amul sul <sulamul@gmail.com> wrote:

I've updated patch to use an extended hash function (Commit #
81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

I have done some testing with these patches, everything looks fine,

attaching sql and out file for reference.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#4)

On Mon, Sep 4, 2017 at 6:38 AM, amul sul <sulamul@gmail.com> wrote:

I've updated patch to use an extended hash function (Commit #
81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Committed 0001 after noticing that Jeevan Ladhe also found that change
convenient for default partitioning. I made a few minor cleanups;
hopefully I didn't break anything.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Robert Haas (#6)

On Fri, Sep 8, 2017 at 6:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Sep 4, 2017 at 6:38 AM, amul sul <sulamul@gmail.com> wrote:

I've updated patch to use an extended hash function (Commit #
81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Committed 0001 after noticing that Jeevan Ladhe also found that change
convenient for default partitioning. I made a few minor cleanups;
hopefully I didn't break anything.

Thanks you.

Rebased 0002 against this commit & renamed to 0001, PFA.

Regards,
Amul

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Amul Sul (#7)

On Fri, Sep 8, 2017 at 6:10 PM, amul sul <sulamul@gmail.com> wrote:

On Fri, Sep 8, 2017 at 6:45 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Sep 4, 2017 at 6:38 AM, amul sul <sulamul@gmail.com> wrote:

I've updated patch to use an extended hash function (Commit #
81c5e46c490e2426db243eada186995da5bb0ba7) for the partitioning.

Committed 0001 after noticing that Jeevan Ladhe also found that change
convenient for default partitioning. I made a few minor cleanups;
hopefully I didn't break anything.

Thanks you.

Rebased 0002 against this commit & renamed to 0001, PFA.

Given that we have default partition support now, I am wondering
whether hash partitioned tables also should have default partitions.
The way we have structured hash partitioning syntax, there can be
"holes" in partitions. Default partition would help plug those holes.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Ashutosh Bapat (#8)

On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Rebased 0002 against this commit & renamed to 0001, PFA.

Given that we have default partition support now, I am wondering
whether hash partitioned tables also should have default partitions.
The way we have structured hash partitioning syntax, there can be
"holes" in partitions. Default partition would help plug those holes.

Yeah, I was thinking about that, too. On the one hand, it seems like
it's solving the problem the wrong way: if you've set up hash
partitioning properly, you shouldn't have any holes. On the other
hand, supporting it probably wouldn't cost anything noticeable and
might make things seem more consistent. I'm not sure which way to
jump on this one.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Alvaro Herrera

alvherre@2ndquadrant.com

over 8 years ago

In reply to: Robert Haas (#9)

Robert Haas wrote:

On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Rebased 0002 against this commit & renamed to 0001, PFA.

Given that we have default partition support now, I am wondering
whether hash partitioned tables also should have default partitions.
The way we have structured hash partitioning syntax, there can be
"holes" in partitions. Default partition would help plug those holes.

Yeah, I was thinking about that, too. On the one hand, it seems like
it's solving the problem the wrong way: if you've set up hash
partitioning properly, you shouldn't have any holes. On the other
hand, supporting it probably wouldn't cost anything noticeable and
might make things seem more consistent. I'm not sure which way to
jump on this one.

How difficult/tedious/troublesome would be to install the missing
partitions if you set hash partitioning with a default partition and
only later on notice that some partitions are missing? I think if the
answer is that you need to exclusive-lock something for a long time and
this causes a disruption in production systems, then it's better not to
allow a default partition at all and just force all the hash partitions
to be there from the start.

On the other hand, if you can get tuples out of the default partition
into their intended regular partitions without causing any disruption,
then it seems okay to allow default partitions in hash partitioning
setups.

(I, like many others, was unable to follow the default partition stuff
as closely as I would have liked.)

--
ï¿½lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Alvaro Herrera (#10)

On Mon, Sep 11, 2017 at 8:00 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

How difficult/tedious/troublesome would be to install the missing
partitions if you set hash partitioning with a default partition and
only later on notice that some partitions are missing? I think if the
answer is that you need to exclusive-lock something for a long time and
this causes a disruption in production systems, then it's better not to
allow a default partition at all and just force all the hash partitions
to be there from the start.

On the other hand, if you can get tuples out of the default partition
into their intended regular partitions without causing any disruption,
then it seems okay to allow default partitions in hash partitioning
setups.

I think there's no real use case for default partitioning, and yeah,
you do need exclusive locks to repartition things (whether hash
partitioning or otherwise). It would be nice to fix that eventually,
but it's hard, because the executor has to cope with the floor moving
under it, and as of today, it really can't cope with that at all - not
because of partitioning specifically, but because of existing design
decisions that will require a lot of work (and probably arguing) to
revisit.

I think the way to get around the usability issues for hash
partitioning is to eventually add some syntax that does things like
(1) automatically create the table with N properly-configured
partitions, (2) automatically split an existing partition into N
pieces, and (3) automatically rewrite the whole table using a
different partition count.

People seem to find the hash partitioning stuff a little arcane. I
don't want to discount that confusion with some sort of high-handed "I
know better" attitude, I think the interface that users will actually
see can end up being pretty straightforward. The complexity that is
there in the syntax is to allow pg_upgrade and pg_dump/restore to work
properly. But users don't necessarily have to use the same syntax
that pg_dump does, just as you can say CREATE INDEX ON a (b) and let
the system specify the index name, but at dump time the index name is
specified explicitly.

(I, like many others, was unable to follow the default partition stuff
as closely as I would have liked.)

Uh, sorry about that. Would it help if I wrote a blog post on it or
something? The general idea is simple: any tuples that don't route to
any other partition get routed to the default partition.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Alvaro Herrera (#10)

On Mon, Sep 11, 2017 at 5:30 PM, Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

Robert Haas wrote:

On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Rebased 0002 against this commit & renamed to 0001, PFA.

Given that we have default partition support now, I am wondering
whether hash partitioned tables also should have default partitions.
The way we have structured hash partitioning syntax, there can be
"holes" in partitions. Default partition would help plug those holes.

Yeah, I was thinking about that, too. On the one hand, it seems like
it's solving the problem the wrong way: if you've set up hash
partitioning properly, you shouldn't have any holes. On the other
hand, supporting it probably wouldn't cost anything noticeable and
might make things seem more consistent. I'm not sure which way to
jump on this one.

How difficult/tedious/troublesome would be to install the missing
partitions if you set hash partitioning with a default partition and
only later on notice that some partitions are missing? I think if the
answer is that you need to exclusive-lock something for a long time and
this causes a disruption in production systems, then it's better not to
allow a default partition at all and just force all the hash partitions
to be there from the start.

I am also leaning toward not to support a default partition for a hash
partitioned table.

The major drawback I can see is the constraint get created on the default
partition
table. IIUC, constraint on the default partition table are just negation
of partition
constraint on all its sibling partitions.

Consider a hash partitioned table having partitions with (modulus 64,
remainder 0) ,
...., (modulus 64, remainder 62) hash bound and partition column are col1,
col2,...,so on,
then constraint for the default partition will be :

NOT( (satisfies_hash_partition(64, 0, hash_fn1(col1), hash_fn2(col2), ...)
&& ... &&
satisfies_hash_partition(64, 62, hash_fn1(col1),hash_fn2(col2), ...))

Which will be much harmful to the performance than any other partitioning
strategy because it calculate a hash for the same partitioning key multiple
time.
We could overcome this by having an another SQL function (e.g
satisfies_default_hash_partition)
which calculates hash value once and checks the remainder, and that would be
a different path from the current default partition framework.

Regards,
Amul

#13

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Amul Sul (#7)

Hi Amul,

On 09/08/2017 08:40 AM, amul sul wrote:

Rebased 0002 against this commit & renamed to 0001, PFA.

This patch needs a rebase.

Best regards,
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Jesper Pedersen (#13)

On Wed, Sep 13, 2017 at 7:43 PM, Jesper Pedersen <jesper.pedersen@redhat.com

wrote:

Hi Amul,

On 09/08/2017 08:40 AM, amul sul wrote:

Rebased 0002 against this commit & renamed to 0001, PFA.

This patch needs a rebase.

Thanks for your note.

Attached is the patch rebased on the latest master head.
Also added error on
creating
d
efault partition
for the hash partitioned table
,
and updated document &

test script for the same.

Regards,
Amul

#15

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Amul Sul (#14)

Hi Amul,

On 09/14/2017 04:58 AM, amul sul wrote:

On Wed, Sep 13, 2017 at 7:43 PM, Jesper Pedersen <jesper.pedersen@redhat.com

This patch needs a rebase.

Thanks for your note.

Attached is the patch rebased on the latest master head.
Also added error on creating default partition for the hash partitioned table,
and updated document & test script for the same.

Thanks !

When I do

CREATE TABLE mytab (
a integer NOT NULL,
b integer NOT NULL,
c integer,
d integer
) PARTITION BY HASH (b);

and create 64 partitions;

CREATE TABLE mytab_p00 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 0);
...
CREATE TABLE mytab_p63 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 63);

and associated indexes

CREATE INDEX idx_p00 ON mytab_p00 USING btree (b, a);
...
CREATE INDEX idx_p63 ON mytab_p63 USING btree (b, a);

Populate the database, and do ANALYZE.

Given

EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT a, b, c, d FROM mytab
WHERE b = 42

gives

Append
-> Index Scan using idx_p00 (cost rows=7) (actual rows=0)
...
-> Index Scan using idx_p63 (cost rows=7) (actual rows=0)

E.g. all partitions are being scanned. Of course one partition will
contain the rows I'm looking for.

Best regards,
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Jesper Pedersen (#15)

On Thu, Sep 14, 2017 at 11:39 AM, Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:

When I do

CREATE TABLE mytab (
a integer NOT NULL,
b integer NOT NULL,
c integer,
d integer
) PARTITION BY HASH (b);

and create 64 partitions;

CREATE TABLE mytab_p00 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 0);
...
CREATE TABLE mytab_p63 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 63);

and associated indexes

CREATE INDEX idx_p00 ON mytab_p00 USING btree (b, a);
...
CREATE INDEX idx_p63 ON mytab_p63 USING btree (b, a);

Populate the database, and do ANALYZE.

Given

EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT a, b, c, d FROM mytab WHERE b
= 42

gives

Append
-> Index Scan using idx_p00 (cost rows=7) (actual rows=0)
...
-> Index Scan using idx_p63 (cost rows=7) (actual rows=0)

E.g. all partitions are being scanned. Of course one partition will contain
the rows I'm looking for.

Yeah, we need Amit Langote's work in
/messages/by-id/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
to land and this patch to be adapted to make use of it. I think
that's the major thing still standing in the way of this. Concerns
were also raised about not having a way to see the hash function, but
we fixed that in 81c5e46c490e2426db243eada186995da5bb0ba7 and
hopefully this patch has been updated to use a seed (I haven't looked
yet). And there was a concern about hash functions not being
portable, but the conclusion of that was basically that most people
think --load-via-partition-root will be a satisfactory workaround for
cases where that becomes a problem (cf. commit
23d7680d04b958de327be96ffdde8f024140d50e). So this is the major
remaining issue that I know about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Robert Haas (#16)

Hi,

On 09/14/2017 12:05 PM, Robert Haas wrote:

On Thu, Sep 14, 2017 at 11:39 AM, Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:

When I do

CREATE TABLE mytab (
a integer NOT NULL,
b integer NOT NULL,
c integer,
d integer
) PARTITION BY HASH (b);

and create 64 partitions;

CREATE TABLE mytab_p00 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 0);
...
CREATE TABLE mytab_p63 PARTITION OF mytab FOR VALUES WITH (MODULUS 64,
REMAINDER 63);

and associated indexes

CREATE INDEX idx_p00 ON mytab_p00 USING btree (b, a);
...
CREATE INDEX idx_p63 ON mytab_p63 USING btree (b, a);

Populate the database, and do ANALYZE.

Given

EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT a, b, c, d FROM mytab WHERE b
= 42

gives

Append
-> Index Scan using idx_p00 (cost rows=7) (actual rows=0)
...
-> Index Scan using idx_p63 (cost rows=7) (actual rows=0)

E.g. all partitions are being scanned. Of course one partition will contain
the rows I'm looking for.

Yeah, we need Amit Langote's work in
/messages/by-id/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
to land and this patch to be adapted to make use of it. I think
that's the major thing still standing in the way of this. Concerns
were also raised about not having a way to see the hash function, but
we fixed that in 81c5e46c490e2426db243eada186995da5bb0ba7 and
hopefully this patch has been updated to use a seed (I haven't looked
yet). And there was a concern about hash functions not being
portable, but the conclusion of that was basically that most people
think --load-via-partition-root will be a satisfactory workaround for
cases where that becomes a problem (cf. commit
23d7680d04b958de327be96ffdde8f024140d50e). So this is the major
remaining issue that I know about.

Thanks for the information, Robert !

Best regards,
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

David Fetter

david@fetter.org

over 8 years ago

In reply to: Robert Haas (#9)

On Mon, Sep 11, 2017 at 07:43:29AM -0400, Robert Haas wrote:

On Mon, Sep 11, 2017 at 4:17 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:

Rebased 0002 against this commit & renamed to 0001, PFA.

Given that we have default partition support now, I am wondering
whether hash partitioned tables also should have default
partitions. The way we have structured hash partitioning syntax,
there can be "holes" in partitions. Default partition would help
plug those holes.

Yeah, I was thinking about that, too. On the one hand, it seems
like it's solving the problem the wrong way: if you've set up hash
partitioning properly, you shouldn't have any holes.

Should we be pointing the gun away from people's feet by making hash
partitions that cover the space automagically when the partitioning
scheme[1]For now, that's just the modulus, but the PoC included specifying hashing functions, so I assume other ways to specify the partitioning scheme could eventually be proposed. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com is specified? In other words, do we have a good reason to have
only some of the hash partitions so defined by default?

Best,
David.

[1]: For now, that's just the modulus, but the PoC included specifying hashing functions, so I assume other ways to specify the partitioning scheme could eventually be proposed. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com
hashing functions, so I assume other ways to specify the partitioning
scheme could eventually be proposed.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: David Fetter (#18)

On Thu, Sep 14, 2017 at 12:54 PM, David Fetter <david@fetter.org> wrote:

Should we be pointing the gun away from people's feet by making hash
partitions that cover the space automagically when the partitioning
scheme[1] is specified? In other words, do we have a good reason to have
only some of the hash partitions so defined by default?

Sure, we can add some convenience syntax for that, but I'd like to get
the basic stuff working before doing that kind of polishing.

If nothing else, I assume Keith Fiske's pg_partman will provide a way
to magically DTRT about an hour after this goes in. But probably we
can do better in core easily enough.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Robert Haas (#19)

On 09/14/2017 12:56 PM, Robert Haas wrote:

On Thu, Sep 14, 2017 at 12:54 PM, David Fetter <david@fetter.org> wrote:

Should we be pointing the gun away from people's feet by making hash
partitions that cover the space automagically when the partitioning
scheme[1] is specified? In other words, do we have a good reason to have
only some of the hash partitions so defined by default?

Sure, we can add some convenience syntax for that, but I'd like to get
the basic stuff working before doing that kind of polishing.

If nothing else, I assume Keith Fiske's pg_partman will provide a way
to magically DTRT about an hour after this goes in. But probably we
can do better in core easily enough.

Yeah, it would be nice to have a syntax like

) PARTITION BY HASH (col) WITH (AUTO_CREATE = 64);

But then there also needs to be a way to create the 64 associated
indexes too for everything to be easy.

Best regards,
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Jesper Pedersen (#20)

#22

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Robert Haas (#21)

#23

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Jesper Pedersen (#22)

#24

Thom Brown

thom@linux.com

over 8 years ago

In reply to: Amul Sul (#14)

#25

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Thom Brown (#24)

#26

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Amul Sul (#25)

#27

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Jesper Pedersen (#26)

#28

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Amul Sul (#27)

#29

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 8 years ago

In reply to: Jesper Pedersen (#28)

#30

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Amit Langote (#29)

#31

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amit Langote (#29)

#32

Jesper Pedersen

jesper.pedersen@redhat.com

over 8 years ago

In reply to: Amul Sul (#30)

#33

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Jesper Pedersen (#32)

#34

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Amul Sul (#33)

#35

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Amul Sul (#34)

#36

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Ashutosh Bapat (#35)

#37

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Amul Sul (#36)

#38

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Amul Sul (#36)

#39

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Amul Sul (#37)

#40

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Ashutosh Bapat (#38)

#41

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#40)

#42

Amit Langote

Langote_Amit_f8@lab.ntt.co.jp

over 8 years ago

In reply to: Robert Haas (#31)

#43

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Robert Haas (#41)

#44

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#43)

#45

Andres Freund

andres@anarazel.de

over 8 years ago

In reply to: Robert Haas (#44)

#46

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Andres Freund (#45)

#47

Andres Freund

andres@anarazel.de

over 8 years ago

In reply to: Robert Haas (#46)

#48

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Andres Freund (#47)

#49

Andres Freund

andres@anarazel.de

over 8 years ago

In reply to: Robert Haas (#48)

#50

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Amul Sul (#40)

#51

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Ashutosh Bapat (#50)

#52

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Andres Freund (#49)

#53

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Amul Sul (#43)

#54

Andres Freund

andres@anarazel.de

over 8 years ago

In reply to: Amul Sul (#52)

#55

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Andres Freund (#54)

#56

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#53)

#57

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Robert Haas (#56)

#58

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#57)

#59

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Robert Haas (#58)

#60

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Amul Sul (#59)

#61

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#60)

#62

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Robert Haas (#61)

#63

Ashutosh Bapat

ashutosh.bapat@enterprisedb.com

over 8 years ago

In reply to: Robert Haas (#61)

#64

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#62)

#65

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Ashutosh Bapat (#63)

#66

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amul Sul (#60)

#67

Amul Sul

sulamul@gmail.com

over 8 years ago

In reply to: Robert Haas (#66)

Re: [POC] hash partitioning

Attachments:

Attachments:

Attachments:

Attachments:

Attachments: