Automating Partitions in PostgreSQL - Query on syntax

Started by Kedar Potdarabout 17 years ago32 messageshackers

kedar.potdar@gmail.com

about 17 years ago

Hi ,

We are working on a
patch<http://archives.postgresql.org/pgsql-hackers/2009-03/msg00897.php>to
automate partitioning in
PostgreSQL.

For Range partitions, we have proposed the syntax which is as follows –

*CREATE TABLE emp (*

* emp_id int not null primary key,*

* designation text not null,*

* location varchar(50) not null,*

* jdate date not null,*

* ctc float not null*

* *

*)*

*PARTITION BY RANGE (emp_id)*

*(*

*emp_500 (START 1 END 500),*

*emp_1500 (START 500 END 1500),*

*emp_4000 (START 1520 END 4000)*

*);*

As observed in this syntax, user needs to specify explicitly, the min and
max values of a range for a given partition.

With this design, partition ranges are inherently allowed to be fragmented
and non-contiguous. As ‘gaps’ are allowed

in the ranges, we’re also supporting an ‘overflow’ partition, so that any
row, which does not satisfy constraints of any

existing partitions, does not stall a big UPDATE operation and such rows are
preserved.(in overflow table)

However, Oracle uses user-friendly syntax but makes it compulsion that
partition ranges *have* to be contiguous.

*PARTITION BY RANGE (emp_id)*

*(*

*Partition emp_500 values less than (500),*

*Partition emp_1500 values less than (1500),*

*Partition emp_4000 values less than (4000),*

*Partition emp_max values less than (maxvalue)*

*);*

As it does not allow fragmented ranges, it automatically removes the need
for an ‘overflow’ partition.

The syntax proposed by us is more flexible and would handle both the cases
of ranges with gaps or ranges without gaps.

I want to seek general opinion from the community on preferences between
user-friendly ‘Oracle’ syntax, and a more generic syntax that allows ‘gaps’
in partition ranges?

Regards,

Kedar

Bruce Momjian

bruce@momjian.us

about 17 years ago

In reply to: Kedar Potdar (#1)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 12:50 PM, Kedar Potdar <kedar.potdar@gmail.com> wrote:

I want to seek general opinion from the community on preferences between
user-friendly ‘Oracle’ syntax, and a more generic syntax that allows ‘gaps’
in partition ranges?

What happens to records in the overflow table when you add a new
partition whose range covers their values?

--
greg

Kedar Potdar

kedar.potdar@gmail.com

about 17 years ago

In reply to: Bruce Momjian (#2)

Re: Automating Partitions in PostgreSQL - Query on syntax

Hi Greg,

Thanks for your response.

Currently, such records are left in the overflow partition and its
responsibility
of user to insert them into partitioned-table which will then re-direct
those to
appropriate partitions.

Regards,
--
Kedar.

On Tue, Apr 21, 2009 at 5:29 PM, Greg Stark <stark@enterprisedb.com> wrote:

Show quoted text

On Tue, Apr 21, 2009 at 12:50 PM, Kedar Potdar <kedar.potdar@gmail.com>
wrote:

I want to seek general opinion from the community on preferences between
user-friendly ‘Oracle’ syntax, and a more generic syntax that allows

‘gaps’

in partition ranges?

What happens to records in the overflow table when you add a new
partition whose range covers their values?

--
greg

Dickson S. Guedes

listas@guedesoft.net

about 17 years ago

In reply to: Kedar Potdar (#1)

Re: Automating Partitions in PostgreSQL - Query on syntax

Em Ter, 2009-04-21 às 17:20 +0530, Kedar Potdar escreveu:

Hi ,

We are working on a patch to automate partitioning in PostgreSQL.

Nice. ":)

For Range partitions, we have proposed the syntax which is as
follows

(...)
PARTITION BY RANGE (emp_id)
(
emp_500 (START 1 END 500),
emp_1500 (START 500 END 1500),
emp_4000 (START 1520 END 4000)
);

What if I need more columns to set the partitions?

[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Kedar Potdar

kedar.potdar@gmail.com

about 17 years ago

In reply to: Dickson S. Guedes (#4)

Re: Automating Partitions in PostgreSQL - Query on syntax

2009/4/21 Dickson S. Guedes <listas@guedesoft.net>

Em Ter, 2009-04-21 às 17:20 +0530, Kedar Potdar escreveu:

Hi ,

We are working on a patch to automate partitioning in PostgreSQL.

Nice. ":)

For Range partitions, we have proposed the syntax which is as
follows

(...)
PARTITION BY RANGE (emp_id)
(
emp_500 (START 1 END 500),
emp_1500 (START 500 END 1500),
emp_4000 (START 1520 END 4000)
);

What if I need more columns to set the partitions?

You can do so by using command like this,

CREATE TABLE emp (
emp_id int not null primary key,
designation text not null,
location varchar(50) not null,
jdate date not null,
ctc float not null

)
PARTITION BY RANGE (jdate, emp_id)
(
emp_500 (START '01-01-1980',1 END '01-01-1990',500),
emp_1500(START '01-01-1990',500 END '01-01-2000',1500+10),
emp_4000 (START '1 Jan 2000',1500+20 END 'Jan 1, 2010', 4000-50)
);

You can have multiple columns as partition key attributes and values for
these attributes should appear in the order specified.

Show quoted text

[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Dickson S. Guedes

listas@guedesoft.net

about 17 years ago

In reply to: Kedar Potdar (#1)

Re: Automating Partitions in PostgreSQL - Query on syntax

Vacuum, I guess you forgot to reply-to-all. ":)

Em Ter, 2009-04-21 às 16:05 +0200, vacuum@quantentunnel.de escreveu:

why not partitions by "conditions"?

you can do that in similar way how "case when then " works ..

CREATE PARTITIONS ON <table>
SWITCH <any expression>
CASE <value> THEN <part_1>,
CASE <value> THEN <part_2>,
CASE <value> THEN <part_3>,
DEFAULT emp_default;

with a difference to case: <value> may not be a boolean expression

or you can do it as CASE works

forinstance:

CREATE PARTITIONS ON emp (
CASE WHEN <value> THEN emp_xxx,
CASE WHEN <value> THEN emp_yyy,
CASE WHEN <value> emp_zzz,
ELSE emp_default
END CASE;
);

-------- Original-Nachricht --------

Datum: Tue, 21 Apr 2009 10:46:41 -0300
Von: "Dickson S. Guedes" <listas@guedesoft.net>
An: Kedar Potdar <kedar.potdar@gmail.com>, pgsql-hackers@postgresql.org
Betreff: Re: [HACKERS] Automating Partitions in PostgreSQL - Query on syntax

Em Ter, 2009-04-21 às 17:20 +0530, Kedar Potdar escreveu:

Hi ,

We are working on a patch to automate partitioning in PostgreSQL.

Nice. ":)

For Range partitions, we have proposed the syntax which is as
follows

(...)
PARTITION BY RANGE (emp_id)
(
emp_500 (START 1 END 500),
emp_1500 (START 500 END 1500),
emp_4000 (START 1520 END 4000)
);

What if I need more columns to set the partitions?

[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Import Notes

Reply to msg id not found: 20090421140536.281500@gmx.net

Pavel Stehule

pavel.stehule@gmail.com

about 17 years ago

In reply to: Kedar Potdar (#5)

Re: Automating Partitions in PostgreSQL - Query on syntax

2009/4/21 Kedar Potdar <kedar.potdar@gmail.com>:

2009/4/21 Dickson S. Guedes <listas@guedesoft.net>

Em Ter, 2009-04-21 às 17:20 +0530, Kedar Potdar escreveu:

Hi ,

We are working on a patch to automate partitioning in PostgreSQL.

Nice. ":)

For Range partitions, we have proposed the syntax which is as
follows

(...)
PARTITION BY RANGE (emp_id)
(
emp_500 (START 1 END 500),
emp_1500 (START 500 END 1500),
emp_4000 (START 1520 END 4000)
);

What if I need more columns to set the partitions?

You can do so by using command like this,

CREATE TABLE emp (
    emp_id          int not null primary key,
    designation     text not null,
    location        varchar(50) not null,
    jdate           date not null,
    ctc             float not null

)
PARTITION BY RANGE (jdate, emp_id)
(
emp_500 (START '01-01-1980',1 END '01-01-1990',500),
emp_1500(START '01-01-1990',500 END '01-01-2000',1500+10),
emp_4000 (START '1 Jan 2000',1500+20 END 'Jan 1, 2010', 4000-50)
);

You can have multiple columns as partition key attributes and values for
these attributes should appear in the order specified.

-1, this is really ugly

Pavel

Show quoted text

[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

Nikhil Sontakke

nikhil.sontakke@enterprisedb.com

about 17 years ago

In reply to: Pavel Stehule (#7)

Re: Automating Partitions in PostgreSQL - Query on syntax

(...)
PARTITION BY RANGE (emp_id)
(
emp_500 (START 1 END 500),
emp_1500 (START 500 END 1500),
emp_4000 (START 1520 END 4000)
);

What if I need more columns to set the partitions?

You can do so by using command like this,

CREATE TABLE emp (
emp_id int not null primary key,
designation text not null,
location varchar(50) not null,
jdate date not null,
ctc float not null

)
PARTITION BY RANGE (jdate, emp_id)
(
emp_500 (START '01-01-1980',1 END '01-01-1990',500),
emp_1500(START '01-01-1990',500 END '01-01-2000',1500+10),
emp_4000 (START '1 Jan 2000',1500+20 END 'Jan 1, 2010', 4000-50)
);

You can have multiple columns as partition key attributes and values for
these attributes should appear in the order specified.

-1, this is really ugly

Yeah, but what is the syntax for multiple column ranges in Orcl - is it
similarly ugly?

As to Kedar's original question about gap-based ranges or Orcl style no-gap
ranges, I don't think while partitioning such fine-grained gap-based ranges
is such a stringent requirement.

Also specification of the maxvalue range in Orcl's case nicely maps to the
overflow partition that is being mentioned upthread.

So whichever way we go maybe we can also consider accomodating the maxval
for different datatypes if possible within the syntax itself.

Regards,
Nikhils

Pavel

[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
http://www.enterprisedb.com

steven king

vacuum@quantentunnel.de

about 17 years ago

In reply to: Dickson S. Guedes (#6)

Re: Automating Partitions in PostgreSQL - Query on syntax

oh sorry ..

but what I miss in this discussion - where we can define tablespace?

CREATE PARTITIONS ON <table> (

CASE WHEN <condition> THEN emp_xxx [IN <tablespace>],
CASE WHEN <condition> THEN emp_yyy [IN <tablespace>],
CASE WHEN <condition> emp_zzz [IN <tablespace>],
ELSE emp_default [IN <tablespace>]
)

-------- Original-Nachricht --------

Datum: Tue, 21 Apr 2009 11:11:54 -0300
Von: "Dickson S. Guedes" <listas@guedesoft.net>
An: vacuum@quantentunnel.de, pgsql-hackers@postgresql.org
Betreff: Re: [HACKERS] Automating Partitions in PostgreSQL - Query on syntax

Vacuum, I guess you forgot to reply-to-all. ":)

Em Ter, 2009-04-21 ï¿½s 16:05 +0200, vacuum@quantentunnel.de escreveu:

why not partitions by "conditions"?

you can do that in similar way how "case when then " works ..

CREATE PARTITIONS ON <table>
SWITCH <any expression>
CASE <value> THEN <part_1>,
CASE <value> THEN <part_2>,
CASE <value> THEN <part_3>,
DEFAULT emp_default;

with a difference to case: <value> may not be a boolean expression

or you can do it as CASE works

forinstance:

CREATE PARTITIONS ON emp (
CASE WHEN <value> THEN emp_xxx,
CASE WHEN <value> THEN emp_yyy,
CASE WHEN <value> emp_zzz,
ELSE emp_default
END CASE;
);

-------- Original-Nachricht --------

Datum: Tue, 21 Apr 2009 10:46:41 -0300
Von: "Dickson S. Guedes" <listas@guedesoft.net>
An: Kedar Potdar <kedar.potdar@gmail.com>,

pgsql-hackers@postgresql.org

Betreff: Re: [HACKERS] Automating Partitions in PostgreSQL - Query on

syntax

Em Ter, 2009-04-21 ï¿½s 17:20 +0530, Kedar Potdar escreveu:

Hi ,

We are working on a patch to automate partitioning in PostgreSQL.

Nice. ":)

For Range partitions, we have proposed the syntax which is as
follows

(...)
PARTITION BY RANGE (emp_id)
(
emp_500 (START 1 END 500),
emp_1500 (START 500 END 1500),
emp_4000 (START 1520 END 4000)
);

What if I need more columns to set the partitions?

[]s
--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

--
Dickson S. Guedes
mail/xmpp: guedes@guedesoft.net - skype: guediz
http://guedesoft.net - http://planeta.postgresql.org.br

--
Neu: GMX FreeDSL Komplettanschluss mit DSL 6.000 Flatrate + Telefonanschluss fï¿½r nur 17,95 Euro/mtl.!* http://dslspecial.gmx.de/freedsl-surfflat/?ac=OM.AD.PD003K11308T4569a

#10

Bruce Momjian

bruce@momjian.us

about 17 years ago

In reply to: steven king (#9)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 4:13 PM, <vacuum@quantentunnel.de> wrote:

oh sorry ..

but what I miss in this discussion - where we can define tablespace?

CREATE PARTITIONS ON <table> (

CASE WHEN <condition> THEN emp_xxx [IN <tablespace>],

One thing you should realize is that the actual details of the syntax
will be knocked around and redefined later when there's actual code
anyways. People love to add their contribution at that point, there's
a nickname for it "bike shedding". You should worry more about the
functionality rather than the precise presentation of it. What are you
aiming to achieve here that the existing system lacks?

The point that you want to be able to set tablespaces is a good one
though. The general point is that you're going to want to be able to
set arbitrary table storage options though. That might mean you need a
separate DDL command for each partition.

The main thing the existing system lacks is an idea of what the
partition key is and how to determine which partition a given key
belongs in without checking every single partition. Figure out how to
accomplish this in the backend and people will be happy to offer their
suggestions for syntax.

--
greg

#11

Robert Haas

robertmhaas@gmail.com

about 17 years ago

In reply to: steven king (#9)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 11:13 AM, <vacuum@quantentunnel.de> wrote:

oh sorry ..

but what I miss in this discussion - where we can define tablespace?

CREATE PARTITIONS ON <table> (

CASE WHEN <condition> THEN emp_xxx [IN <tablespace>],
CASE WHEN <condition> THEN emp_yyy [IN <tablespace>],
CASE WHEN <condition> emp_zzz [IN <tablespace>],
ELSE emp_default [IN <tablespace>]
)

Oh blech. That IN <tablespace> syntax is awful. What about the other
relevant parameters to CREATE TABLE - where are you going to put
those?

WITH (<storage parameter>)
WITH OIDS
WITHOUT OIDS

And what happens when I need to alter a partition after the fact, say
to move it to a new tablespace, or change the fillfactor? Since your
syntax doesn't give names to the partitions, how will I refer to them
later?

...Robert

#12

Robert Haas

robertmhaas@gmail.com

about 17 years ago

In reply to: Kedar Potdar (#3)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 9:38 AM, Kedar Potdar <kedar.potdar@gmail.com> wrote:

Currently, such records are left in the overflow partition and its
responsibility
of user to insert them into partitioned-table which will then re-direct
those to
appropriate partitions.

This doesn't sound like a very good idea, because the planner cannot
then rely on the overflow table not containing tuples that ought to be
within some other partition.

The big win that is associated with table partitioning is using
constraint exclusion to avoid unnecessary partitions scans.

...Robert

#13

Csaba Nagy

nagy@ecircle-ag.com

about 17 years ago

In reply to: Robert Haas (#12)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, 2009-04-21 at 11:43 -0400, Robert Haas wrote:

This doesn't sound like a very good idea, because the planner cannot
then rely on the overflow table not containing tuples that ought to be
within some other partition.

The big win that is associated with table partitioning is using
constraint exclusion to avoid unnecessary partitions scans.

Well it could always check 2 partitions: the overflow and the one
selected by the constraint exclusion. If the overflow is kept empty by
properly setting up the partitions so that all insertions always go to
one of the active partitions, that would be cheap enough too while still
providing a way to catch unexpected data. Then when a new partition is
defined, there's no need to shuffle around data immediately, but there
could be a maintenance command to clean up the overflow... not to
mention that you could define a trigger to create the new partition once
you get something in the overflow (how cool would that be if it would
work ?).

Cheers,
Csaba.

#14

steven king

vacuum@quantentunnel.de

about 17 years ago

In reply to: Robert Haas (#11)

Re: Automating Partitions in PostgreSQL - Query on syntax

the idea is, that this statement does not create a new kind of db-object. this "CREATE PARTITIONS" statement should be a meta-statement -> "meta" like the serial type.

with create paritions you can create all defined tables (maybe by inheritin from <table>) at once.

but you're right - why this statement should not be able to define some other table-properties (except columns)?!

please notice ... the statement below was only a draft - to show the intention. the major difference is the usage of conditions .. this could lead into creation of triggers with a lot of PL/pqSQL code or invocations of built-in functions.

open your mind :)

-------- Original-Nachricht --------

Datum: Tue, 21 Apr 2009 11:39:37 -0400
Von: Robert Haas <robertmhaas@gmail.com>
An: vacuum@quantentunnel.de
CC: "Dickson S. Guedes" <listas@guedesoft.net>, pgsql-hackers@postgresql.org
Betreff: Re: [HACKERS] Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 11:13 AM, <vacuum@quantentunnel.de> wrote:

oh sorry ..

but what I miss in this discussion - where we can define tablespace?

CREATE PARTITIONS ON <table> (

CASE WHEN <condition> THEN emp_xxx [IN <tablespace>],
CASE WHEN <condition> THEN emp_yyy [IN <tablespace>],
CASE WHEN <condition> emp_zzz [IN <tablespace>],
ELSE emp_default [IN <tablespace>]
)

Oh blech. That IN <tablespace> syntax is awful. What about the other
relevant parameters to CREATE TABLE - where are you going to put
those?

WITH (<storage parameter>)
WITH OIDS
WITHOUT OIDS

And what happens when I need to alter a partition after the fact, say
to move it to a new tablespace, or change the fillfactor? Since your
syntax doesn't give names to the partitions, how will I refer to them
later?

...Robert

--
Neu: GMX FreeDSL Komplettanschluss mit DSL 6.000 Flatrate + Telefonanschluss fï¿½r nur 17,95 Euro/mtl.!* http://dslspecial.gmx.de/freedsl-surfflat/?ac=OM.AD.PD003K11308T4569a

#15

steven king

vacuum@quantentunnel.de

about 17 years ago

In reply to: Csaba Nagy (#13)

Re: Automating Partitions in PostgreSQL - Query on syntax

-------- Original-Nachricht --------

Datum: Tue, 21 Apr 2009 17:50:02 +0200
Von: Csaba Nagy <nagy@ecircle-ag.com>
An: Robert Haas <robertmhaas@gmail.com>
CC: Kedar Potdar <kedar.potdar@gmail.com>, Greg Stark <stark@enterprisedb.com>, pgsql-hackers@postgresql.org
Betreff: Re: [HACKERS] Automating Partitions in PostgreSQL - Query on syntax

The big win that is associated with table partitioning is using
constraint exclusion to avoid unnecessary partitions scans.

there is no reason for not using the 'CASE condition' for constraint exclusion.

--
Neu: GMX FreeDSL Komplettanschluss mit DSL 6.000 Flatrate + Telefonanschluss fï¿½r nur 17,95 Euro/mtl.!* http://dslspecial.gmx.de/freedsl-surfflat/?ac=OM.AD.PD003K11308T4569a

#16

steven king

vacuum@quantentunnel.de

about 17 years ago

In reply to: Bruce Momjian (#10)

Re: Automating Partitions in PostgreSQL - Query on syntax

-------- Original-Nachricht --------

Datum: Tue, 21 Apr 2009 16:37:15 +0100
Von: Greg Stark <stark@enterprisedb.com>
An: vacuum@quantentunnel.de
CC: "Dickson S. Guedes" <listas@guedesoft.net>, pgsql-hackers@postgresql.org
Betreff: Re: [HACKERS] Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 4:13 PM, <vacuum@quantentunnel.de> wrote:

oh sorry ..

but what I miss in this discussion - where we can define tablespace?

CREATE PARTITIONS ON <table> (

CASE WHEN <condition> THEN emp_xxx [IN <tablespace>],

The point that you want to be able to set tablespaces is a good one
though. The general point is that you're going to want to be able to
set arbitrary table storage options though. That might mean you need a
separate DDL command for each partition.

not in real - currently, when i create a table, the specified tablespaces must exist. this restriction you can apply to CREATE PARTITIONS too.

The main thing the existing system lacks is an idea of what the
partition key is and how to determine which partition a given key
belongs in without checking every single partition. Figure out how to
accomplish this in the backend and people will be happy to offer their
suggestions for syntax.

why a partition-key cannot be a complex expression?

like this:

SWITCH <expression>
CASE <key_value> TABLE <table> [IN <table_space>]
CASE <key_value> TABLE <table> [IN <table_space>]
CASE <key_value> TABLE <table> [IN <table_space>]
DEFAULT <table> [IN <table_space>]

that is generic

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Psssst! Schon vom neuen GMX MultiMessenger gehï¿½rt? Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger01

#17

steven king

vacuum@quantentunnel.de

about 17 years ago

In reply to: steven king (#16)

Re: Automating Partitions in PostgreSQL - Query on syntax

SWITCH <expression>
CASE <key_value> TABLE <table> [IN <table_space>]
CASE <key_value> TABLE <table> [IN <table_space>]
CASE <key_value> TABLE <table> [IN <table_space>]
DEFAULT <table> [IN <table_space>]

some examples:

CREATE TABLE ttt (
id integer,
txt varchar
)

SWITCH id % 3
CASE 0 TABLE <table> [IN <table_space>]
CASE 1 TABLE <table> [IN <table_space>]
CASE 2 TABLE <table> [IN <table_space>]
DEFAULT <table> [IN <table_space>]

SWITCH lower(txt)
CASE 'hello' TABLE <table> [IN <table_space>]
CASE 'world' TABLE <table> [IN <table_space>]
CASE 'foo' TABLE <table> [IN <table_space>]
CASE 'bar' TABLE <table> [IN <table_space>]
DEFAULT <table> [IN <table_space>]

--
Neu: GMX FreeDSL Komplettanschluss mit DSL 6.000 Flatrate + Telefonanschluss fï¿½r nur 17,95 Euro/mtl.!* http://dslspecial.gmx.de/freedsl-surfflat/?ac=OM.AD.PD003K11308T4569a

#18

Robert Haas

robertmhaas@gmail.com

about 17 years ago

In reply to: Csaba Nagy (#13)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 11:50 AM, Csaba Nagy <nagy@ecircle-ag.com> wrote:

On Tue, 2009-04-21 at 11:43 -0400, Robert Haas wrote:

This doesn't sound like a very good idea, because the planner cannot
then rely on the overflow table not containing tuples that ought to be
within some other partition.

The big win that is associated with table partitioning is using
constraint exclusion to avoid unnecessary partitions scans.

Well it could always check 2 partitions: the overflow and the one
selected by the constraint exclusion. If the overflow is kept empty by
properly setting up the partitions so that all insertions always go to
one of the active partitions, that would be cheap enough too while still
providing a way to catch unexpected data. Then when a new partition is
defined, there's no need to shuffle around data immediately, but there
could be a maintenance command to clean up the overflow... not to
mention that you could define a trigger to create the new partition once
you get something in the overflow (how cool would that be if it would
work ?).

Sure, you could do it that way. But it will cause problems for people
who want to have a million rows in each of 100 partitions, and another
million rows in the overflow partition. Now all operations that can
be done on a single partition must scan 2 million rows instead of 1
million, just on the off chance that someone executed a DDL command
and didn't clean up after themselves.

...Robert

#19

Robert Haas

robertmhaas@gmail.com

about 17 years ago

In reply to: steven king (#14)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 12:17 PM, steven king <vacuum@quantentunnel.de> wrote:

the idea is, that this statement does not create a new kind of db-object. this "CREATE PARTITIONS" statement should be a meta-statement -> "meta" like the serial type.

That was my assumption as well.

with create paritions you can create all defined tables (maybe by inheritin from <table>) at once.

but you're right - why this statement should not be able to define some other table-properties (except columns)?!

I am of the opinion that defining partitions is a sufficiently
heavyweight operation that no one should worry too much about whether
or not each one needs to be created separately. As Greg Stark pointed
out, it's a lot more interesting to try to figure out how it's
actually going to be implemented. We can invent pretty syntax after
the fact easily enough.

please notice ... the statement below was only a draft - to show the intention. the major difference is the usage of conditions .. this could lead into creation of triggers with a lot of PL/pqSQL code or invocations of built-in functions.

Sure, I'm just providing my feedback, since, hey, someone posted to
the mailing list and asked for input...

I am not sold on the "CASE" method of defining partitions. It seems
to me that one of the issues that needs to be tackled is how to select
a partition efficiently, and I don't think CASE is going to be ideal
for that, because it presupposes an iterative model working down from
top to bottom. If there are 1000 partitions and the conditions are
complex, that could start to add up to significant overhead.

I think we should aim to set up range partitioning using a data
structure that is amenable to binary search. For example, you might
think of keeping a sorted array of length N of values (perhaps a row
type if a composite key is being used) and an array of length N+1 of
partitions. Now you can use binary search to find the first value in
the list which is greater than the key (if any) and then look up that
index in the second list to figure out where to put the tuple.

Now, you might think that's a bad design... you're welcome to propose
your own. But I think the design should come first and the syntax
afterward.

open your mind :)

Not trying to be close-minded...

...Robert

#20

Robert Haas

robertmhaas@gmail.com

about 17 years ago

In reply to: steven king (#16)

Re: Automating Partitions in PostgreSQL - Query on syntax

On Tue, Apr 21, 2009 at 12:35 PM, steven king <vacuum@quantentunnel.de> wrote:

why a partition-key cannot be a complex expression?

like this:

SWITCH <expression>
CASE <key_value> TABLE <table> [IN <table_space>]
CASE <key_value> TABLE <table> [IN <table_space>]
CASE <key_value> TABLE <table> [IN <table_space>]
DEFAULT <table> [IN <table_space>]

that is generic

Rather than SWITCH <expression> CASE <value> ... you probably would
want to reuse the existing PostgreSQL syntax of CASE <expression> WHEN
<value>...

But see my previous email for concerns about the performance of this
approach when the number of partitions is large.

...Robert

#21

Robert Haas

robertmhaas@gmail.com

about 17 years ago

In reply to: steven king (#15)

#22

Bruce Momjian

bruce@momjian.us

about 17 years ago