Proposal: Automatic partition creation

Started by Anastasia Lubennikovaalmost 6 years ago51 messageshackers
Jump to latest
#1Anastasia Lubennikova
a.lubennikova@postgrespro.ru

The previous discussion of automatic partition creation [1]/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre has
addressed static and dynamic creation of partitions and ended up with
several syntax proposals.
In this thread, I want to continue this work.

Attached is PoC for static partition creation. The patch core is quite
straightforward. It adds one more transform clause to convert given
partitioning specification into several CREATE TABLE statements.

The patch implements following syntax:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL
range_step FROM range_start TO range_end

For more examples check auto_partitions.sql in the patch.

TODO:

- CONFIGURATION is just an existing keyword, that I picked as a stub.
 Ideas on better wording are welcome.

- IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
I wonder, is it worth placing a stub for dynamic partitioning, or we can
rather add these keywords later.

- HASH and LIST static partitioning works as expected.
Testing and feedback are welcome.

- RANGE partitioning is not really implemented in this patch.
Now it only accepts interval data type as 'interval' and respectively
date types as range_start and range_end expressions.
Only one partition is created. I found it difficult to implement the
generation of bounds using internal functions and data types.
Both existing solutions (pg_pathman and pg_partman) rely on SQL level
routines [2]https://github.com/postgrespro/pg_pathman/blob/dbcbd02e411e6acea6d97f572234746007979538/range.sql#L99.
I am going to implement this via SPI, which allow to simplify checks and
calculations. Do you see any pitfalls in this approach?

- Partition naming. Now partition names for all methods look like
$tablename_$partnum
Do we want more intelligence here? Now we have
RunObjectPostCreateHook(), which allows to rename the table.
To make it more user-friendly, we can later implement pl/pgsql function
that sets the callback, as it is done in pg_pathman set_init_callback() [3]https://github.com/postgrespro/pg_pathman#additional-parameters.

- Current design doesn't allow to create default partition
automatically. Do we need this functionality?

- Do you see any restrictions for future extensibility (dynamic
partitioning, init_callback, etc.) in the proposed design ?

I expect this to be a long discussion, so here is the wiki page [4]https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements to
fix important questions and final agreements.

[1]: /messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre
/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre
[2]: https://github.com/postgrespro/pg_pathman/blob/dbcbd02e411e6acea6d97f572234746007979538/range.sql#L99
https://github.com/postgrespro/pg_pathman/blob/dbcbd02e411e6acea6d97f572234746007979538/range.sql#L99
[3]: https://github.com/postgrespro/pg_pathman#additional-parameters
[4]: https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-WIP-create-partitions-automatically.patchtext/x-patch; charset=UTF-8; name=0001-WIP-create-partitions-automatically.patchDownload+523-30
#2Justin Pryzby
pryzby@telsasoft.com
In reply to: Anastasia Lubennikova (#1)
Re: Proposal: Automatic partition creation

On Mon, Jul 06, 2020 at 01:45:52PM +0300, Anastasia Lubennikova wrote:

The previous discussion of automatic partition creation [1] has addressed
static and dynamic creation of partitions and ended up with several syntax
proposals.

...

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

- IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
I wonder, is it worth placing a stub for dynamic partitioning, or we can
rather add these keywords later.

I understand by "deferred" you mean that the partition isn't created at the
time CREATE TABLE is run but rather deferred until needed by INSERT.

For deferred, range partitioned tables, I think maybe what you'd want to
specify (and store) is the INTERVAL. If the table is partitioned by day, then
we'd date_trunc('day', time) and dynamically create that day. But if it was
partitioned by month, we'd create the month. I think you'd want to have an
ALTER command for that (we would use that to change tables between
daily/monthly based on their current size). That should also support setting
the MODULUS of a HASH partitioned table, to allow changing the size of its
partitions (currently, the user would have to more or less recreate the table
and move all its data into different partitions, but that's not ideal).

I don't know if it's important for anyone, but it would be interesting to think
about supporting sub-partitioning: partitions which are themselvese partitioned.
Like something => something_YYYY => something_YYYY_MM => something_YYYY_MM_DD.
You'd need to specify how to partition each layer of the heirarchy. In the
most general case, it could be different partition strategy.

If you have a callback function for partition renaming, I think you'd want to
pass it not just the current name of the partition, but also the "VALUES" used
in partition creation. Like (2020-04-05)TO(2020-05-06). Maybe instead, we'd
allow setting a "format" to use to construct the partition name. Like
"child.foo_bar_%Y_%m_%d". Ideally, the formats would be fixed-length
(zero-padded, etc), so failures with length can happen at "parse" time of the
statement and not at "run" time of the creation. You'd still have to handle
the case that the name already exists but isn't a partition (or is a partition
by doesn't handle the incoming tuple for some reason).

Also, maybe your "configuration" syntax would allow specifying other values.
Maybe including a retention period (as an INTERVAL for RANGE tables). That's
useful if you had a command to PRUNE the oldest partitions, like ALTER..PRUNE.

--
Justin

#3Robert Haas
robertmhaas@gmail.com
In reply to: Anastasia Lubennikova (#1)
Re: Proposal: Automatic partition creation

On Mon, Jul 6, 2020 at 6:46 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] | INTERVAL
range_step FROM range_start TO range_end

Might be good to compare this to what other databases support.

- IMMEDIATE| DEFERRED is optional, DEFERRED is not implemented yet
I wonder, is it worth placing a stub for dynamic partitioning, or we can
rather add these keywords later.

I think we should not add any keywords we don't need immediately - and
should seek to minimize the number of new keywords that we need to
add, though compatibility with other implementations might be a good
reason for accepting some new ones.

- HASH and LIST static partitioning works as expected.
Testing and feedback are welcome.

- RANGE partitioning is not really implemented in this patch.
Now it only accepts interval data type as 'interval' and respectively
date types as range_start and range_end expressions.
Only one partition is created. I found it difficult to implement the
generation of bounds using internal functions and data types.
Both existing solutions (pg_pathman and pg_partman) rely on SQL level
routines [2].
I am going to implement this via SPI, which allow to simplify checks and
calculations. Do you see any pitfalls in this approach?

I don't really see why we need SPI here. Why can't we just try to
evaluate the impression and see if we get a constant of the right
type, then use that?

I think the big problem here is identifying the operator to use. We
have no way of identifying the "plus" or "minus" operator associated
with a datatype; indeed, that constant doesn't exist. So either we (a)
limit this to a short list of data types and hard-code the operators
to be used (which is kind of sad given how extensible our type system
is) or we (b) invent some new mechanism for identifying the +/-
operators that should be used for a datatype, which was also proposed
in the context of some previous discussion of window framing options,
but which I don't think ever went anywhere (which is a lot of work) or
we (c) just look for operators called '+' and/or '-' by operator name
(which will probably make Tom throw up in his mouth a little).

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: Proposal: Automatic partition creation

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jul 6, 2020 at 6:46 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:

I am going to implement this via SPI, which allow to simplify checks and
calculations. Do you see any pitfalls in this approach?

I don't really see why we need SPI here.

I would vote against any core facility that is implemented via SPI
queries. It is just too darn hard to control the semantics completely in
the face of fun stuff like varying search_path. Look at what a mess the
queries generated by the RI triggers are --- and they only have a very
small set of behaviors to worry about. I'm still only about 95% confident
they don't have security issues, too.

If you're using SPI to try to look up appropriate operators, I think
the chances of being vulnerable to security problems are 100%.

I think the big problem here is identifying the operator to use. We
have no way of identifying the "plus" or "minus" operator associated
with a datatype; indeed, that constant doesn't exist.

We did indeed solve this in connection with window functions, cf
0a459cec9. I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: Proposal: Automatic partition creation

On Mon, Jul 6, 2020 at 12:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

We did indeed solve this in connection with window functions, cf
0a459cec9. I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

Ah, nice. I didn't realize that we'd added that. But I'm not sure that
it helps here, because I think we need to compute the end of the
range, not just test whether something is in a range. Like, if someone
wants monthly range partitions starting on 2020-01-01, we need to be
able to figure out that the subsequent months start on 2020-02-01,
2020-03-01, 2020-04-01, etc. Is there a way to use in_range to achieve
that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#5)
Re: Proposal: Automatic partition creation

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jul 6, 2020 at 12:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

We did indeed solve this in connection with window functions, cf
0a459cec9. I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

Ah, nice. I didn't realize that we'd added that. But I'm not sure that
it helps here, because I think we need to compute the end of the
range, not just test whether something is in a range.

Yeah, I was thinking about that later, and I agree that the in_range
support function doesn't quite do the job. But we could expand on the
principle, and register addition (and subtraction?) functions as btree
support functions under the same rules as for in_range functions.

The reason in_range isn't just addition is that we wanted it to be able
to give correct answers even in cases where addition would overflow.
That's still valid for that use-case, but it doesn't apply here.

So it'd be something like "btree support function 4, registered under
amproclefttype x and amprocrighttype y, must have the signature
plus(x, y) returns x
and it gives results compatible with the opfamily's ordering of type x".
Similarly for subtraction if we think we need that.

I'm not sure if we need a formal notion of what "compatible results"
means, but it probably would be something like "if x < z according to the
opfamily sort ordering, then plus(x, y) < plus(z, y) for any given y".
Now this falls to the ground when y is a weird value like Inf or NaN,
but we'd want to exclude those as partitioning values anyway. Do we
also need some datatype-independent way of identifying such "weird
values"?

regards, tom lane

#7Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Anastasia Lubennikova (#1)
Re: Proposal: Automatic partition creation

Hello Anastasia,

My 0.02 €:

The patch implements following syntax:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] |  INTERVAL range_step
FROM range_start TO range_end

ISTM That we should avoid new specific syntaxes when possible, and prefer
free keyword option style, like it is being discussed for some other
commands, because it reduces the impact on the parser.

That would suggest a more versatile partition_bound_spec which could look
like (<keyword> <constant-or-maybe-even-expr>[, …]):

For modulus, looks easy:

(MODULUS 8)

For interval, maybe something like:

(STEP ..., FROM/START ..., TO/END ...)

The key point is that for dynamic partitioning there would be no need for
boundaries, so that it could just set a point and an interval

(START/INIT/FROM??? ..., STEP ...)

For lists of values, probably it would make little sense to have dynamic
partitioning? Or maybe yes, if we could partition on a column
value/expression?! eg "MOD(id, 8)"??

What about pg_dump? Should it be able to regenerate the initial create?

[4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

Good point, a wiki is better than a thread for that type of things. I'll
look at this page.

--
Fabien.

#8Amul Sul
sulamul@gmail.com
In reply to: Fabien COELHO (#7)
Re: Proposal: Automatic partition creation

On Wed, Jul 8, 2020 at 10:24 AM Fabien COELHO <coelho@cri.ensmp.fr> wrote:

Hello Anastasia,

My 0.02 €:

The patch implements following syntax:

CREATE TABLE ... PARTITION BY partition_method (list_of_columns)
partition_auto_create_clause

where partition_auto_create_clause is

CONFIGURATION [IMMEDIATE| DEFERRED] USING partition_bound_spec

and partition_bound_spec is:

MODULUS integer | VALUES IN (expr [,...]) [, ....] | INTERVAL range_step
FROM range_start TO range_end

ISTM That we should avoid new specific syntaxes when possible, and prefer
free keyword option style, like it is being discussed for some other
commands, because it reduces the impact on the parser.

That would suggest a more versatile partition_bound_spec which could look
like (<keyword> <constant-or-maybe-even-expr>[, …]):

For modulus, looks easy:

(MODULUS 8)

For interval, maybe something like:

(STEP ..., FROM/START ..., TO/END ...)

The key point is that for dynamic partitioning there would be no need for
boundaries, so that it could just set a point and an interval

(START/INIT/FROM??? ..., STEP ...)

For lists of values, probably it would make little sense to have dynamic
partitioning? Or maybe yes, if we could partition on a column
value/expression?! eg "MOD(id, 8)"??

What about pg_dump? Should it be able to regenerate the initial create?

I don't think this is needed for the proposed "Automatic partitioning (static)"
which generates a bunch of CREATE TABLE statements, IIUC. Might be needed later
for "Automatic partitioning (dynamic)" where dynamic specifications need to be
stored.

[4] https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements

Good point, a wiki is better than a thread for that type of things. I'll
look at this page.

+1

Regards,
Amul

#9Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Tom Lane (#4)
Re: Proposal: Automatic partition creation

On 06.07.2020 19:10, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jul 6, 2020 at 6:46 AM Anastasia Lubennikova
<a.lubennikova@postgrespro.ru> wrote:

I am going to implement this via SPI, which allow to simplify checks and
calculations. Do you see any pitfalls in this approach?

I don't really see why we need SPI here.

I would vote against any core facility that is implemented via SPI
queries. It is just too darn hard to control the semantics completely in
the face of fun stuff like varying search_path. Look at what a mess the
queries generated by the RI triggers are --- and they only have a very
small set of behaviors to worry about. I'm still only about 95% confident
they don't have security issues, too.

If you're using SPI to try to look up appropriate operators, I think
the chances of being vulnerable to security problems are 100%.

Good to know, thank you for that. I had doubts about the internal usage
of SPI,
but didn't know what exactly can go wrong.

I think the big problem here is identifying the operator to use. We
have no way of identifying the "plus" or "minus" operator associated
with a datatype; indeed, that constant doesn't exist.

We did indeed solve this in connection with window functions, cf
0a459cec9. I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

Do we need to introduce a new support function? Is there a reason why we
can
not rely on '+' operator? I understand that the addition operator may
lack or
be overloaded for some complex datatypes, but I haven't found any
examples that
are useful for range partitioning. Both pg_pathman and pg_partman also
use '+'
to generate bounds.

I explored the code a bit more and came up with this function, which is
very
similar to generate_series_* functions, but it doesn't use SPI and looks
for
the function that implements the '+' operator, instead of direct call:

// almost pseudocode

static Const *
generate_next_bound(Const *start, Const *interval)
{
    ObjectWithArgs *sum_oper_object = makeNode(ObjectWithArgs);

    sum_oper_object->type = OBJECT_OPERATOR;
    /* hardcode '+' operator for addition */
    sum_oper_object->objname = list_make1(makeString("+"));

    ltype = makeTypeNameFromOid(start->consttype, start->consttypmod);
    rtype = makeTypeNameFromOid(interval->consttype,
interval->consttypmod);

    sum_oper_object->objargs = list_make2(ltype, rtype);

    sum_oper_oid = LookupOperWithArgs(sum_oper_object, false);
    oprcode = get_opcode(sum_oper_oid);
    fmgr_info(oprcode, &opproc);

next_bound->constvalue = FunctionCall2(&opproc,
                             start->constvalue,
                             interval->constvalue);
}

Thoughts?

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Anastasia Lubennikova (#9)
Re: Proposal: Automatic partition creation

Anastasia Lubennikova <a.lubennikova@postgrespro.ru> writes:

On 06.07.2020 19:10, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I think the big problem here is identifying the operator to use. We
have no way of identifying the "plus" or "minus" operator associated
with a datatype; indeed, that constant doesn't exist.

We did indeed solve this in connection with window functions, cf
0a459cec9. I may be misunderstanding what the problem is here,
but I think trying to reuse that infrastructure might help.

Do we need to introduce a new support function? Is there a reason why we
can not rely on '+' operator?

(1) the appropriate operator might not be named '+'
(2) even if it is, it might not be in your search_path
(3) you're vulnerable to security problems from someone capturing the
'+' operator with a better match; since you aren't writing the
operator explicitly, you can't fix that by qualifying it
(4) if the interval constant is written as an undecorated string
literal, the parser may have trouble resolving a match at all

I understand that the addition operator may lack or be overloaded for
some complex datatypes, but I haven't found any examples that are useful
for range partitioning.

"It works for all the built-in data types" isn't really a satisfactory
answer. But even just in the built-in types, consider "date":

# select oid::regoperator from pg_operator where oprname ='+' and oprleft = 'date'::regtype;
              oid               
--------------------------------
 +(date,interval)
 +(date,integer)
 +(date,time without time zone)
 +(date,time with time zone)
(4 rows)

It's not that immediately obvious which of these would make sense to use.

But the short answer here is that we did not accept relying on '+' being
the right thing for window function ranges, and I don't see why it is more
acceptable for partitioning ranges. The existing places where our parser
relies on implicit operator names are, without exception, problematic [1]/messages/by-id/ffefc172-a487-aa87-a0e7-472bf29735c8@gmail.com.

regards, tom lane

[1]: /messages/by-id/ffefc172-a487-aa87-a0e7-472bf29735c8@gmail.com

#11Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Anastasia Lubennikova (#1)
Re: Proposal: Automatic partition creation

On 06.07.2020 13:45, Anastasia Lubennikova wrote:

The previous discussion of automatic partition creation [1] has
addressed static and dynamic creation of partitions and ended up with
several syntax proposals.
In this thread, I want to continue this work.

...
[1]
/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre

Syntax proposal v2, that takes into account received feedback.

I compared the syntax of other databases. You can find an overview here
[1]: https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Other_DBMS
seems that there is no industry standard, so every DBMS has its own
implementation. I decided to rely on a Greenplum syntax, as the most
similar to
the original PostgreSQL syntax.

New proposal is:

CREATE TABLE numbers(int number)
PARTITION BY partition_method (list_of_columns)
USING (partition_desc)

where partition_desc is:

MODULUS n
| VALUES IN (value_list), [DEFAULT PARTITION part_name]
| START ([datatype] 'start_value')
END ([datatype] 'end_value')
EVERY (partition_step), [DEFAULT PARTITION part_name]

where partition_step is:
[datatype] [number | INTERVAL] 'interval_value'

example:

CREATE TABLE years(int year)
PARTITION BY RANGE (year)
USING
(START (2006) END (2016) EVERY (1),
DEFAULT PARTITION other_years);

It is less wordy than the previous version. It uses a free keyword option
style. It covers static partitioning for all methods, default partition for
list and range methods, and can be extended to implement dynamic
partitioning
for range partitions.

[1]: https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Other_DBMS
https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Other_DBMS
[2]: https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Proposal_.28is_subject_to_change.29
https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Proposal_.28is_subject_to_change.29

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#12Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Justin Pryzby (#2)
Re: Proposal: Automatic partition creation

On 06.07.2020 17:59, Justin Pryzby wrote:

I think you'd want to have an
ALTER command for that (we would use that to change tables between
daily/monthly based on their current size). That should also support setting
the MODULUS of a HASH partitioned table, to allow changing the size of its
partitions (currently, the user would have to more or less recreate the table
and move all its data into different partitions, but that's not ideal).

New syntax fits to the ALTER command as well.

ALTER TABLE tbl
PARTITION BY HASH (number)
USING (partition_desc)

In simple cases (i.e. range partitioning granularity), it will simply
update
the rule of bound generation, saved in the catalog. More complex hash
partitions will require some rebalancing. Though, the syntax is pretty
straightforward for all cases. In the next versions, we can also add a
CONCURRENTLY keyword to cover partitioning of an existing
non-partitioned table
with data.

I don't know if it's important for anyone, but it would be interesting to think
about supporting sub-partitioning: partitions which are themselvese partitioned.
Like something => something_YYYY => something_YYYY_MM => something_YYYY_MM_DD.
You'd need to specify how to partition each layer of the heirarchy. In the
most general case, it could be different partition strategy.

I suppose it will be a natural extension of this work. Now we need to
ensure
that the proposed syntax is extensible. Greenplum syntax, which I choose
as an
example, provides subpartition syntax as well.

If you have a callback function for partition renaming, I think you'd want to
pass it not just the current name of the partition, but also the "VALUES" used
in partition creation. Like (2020-04-05)TO(2020-05-06). Maybe instead, we'd
allow setting a "format" to use to construct the partition name. Like
"child.foo_bar_%Y_%m_%d". Ideally, the formats would be fixed-length
(zero-padded, etc), so failures with length can happen at "parse" time of the
statement and not at "run" time of the creation. You'd still have to handle
the case that the name already exists but isn't a partition (or is a partition
by doesn't handle the incoming tuple for some reason).

In callback design, I want to use the best from pg_pathman's
set_init_callback().
The function accepts jsonb argument, which contains all the data about the
parent table, bounds, and so on. This information can be used to
construct name
for the partition and generate RENAME statement.

Also, maybe your "configuration" syntax would allow specifying other values.
Maybe including a retention period (as an INTERVAL for RANGE tables). That's
useful if you had a command to PRUNE the oldest partitions, like ALTER..PRUNE.

In this version, I got rid of the 'configuration' keyword. Speaking of
retention, I think that it would be hard to cover all use-cases with a
declarative syntax. While it is relatively easy to implement deletion
within a
callback function. See rotation_callback example in pg_pathman [1]https://github.com/postgrespro/pg_pathman/blob/79e11d94a147095f6e131e980033018c449f8e2e/sql/pathman_callbacks.sql#L107.

[1]: https://github.com/postgrespro/pg_pathman/blob/79e11d94a147095f6e131e980033018c449f8e2e/sql/pathman_callbacks.sql#L107
https://github.com/postgrespro/pg_pathman/blob/79e11d94a147095f6e131e980033018c449f8e2e/sql/pathman_callbacks.sql#L107

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#13Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Anastasia Lubennikova (#11)
[PATCH] Automatic HASH and LIST partition creation

On 14.07.2020 00:11, Anastasia Lubennikova wrote:

On 06.07.2020 13:45, Anastasia Lubennikova wrote:

The previous discussion of automatic partition creation [1] has
addressed static and dynamic creation of partitions and ended up with
several syntax proposals.
In this thread, I want to continue this work.

...
[1]
/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre

Syntax proposal v2, that takes into account received feedback.

CREATE TABLE numbers(int number)
PARTITION BY partition_method (list_of_columns)
USING (partition_desc)

where partition_desc is:

MODULUS n
| VALUES IN (value_list), [DEFAULT PARTITION part_name]
| START ([datatype] 'start_value')
END ([datatype] 'end_value')
EVERY (partition_step), [DEFAULT PARTITION part_name]

where partition_step is:
[datatype] [number | INTERVAL] 'interval_value'

It is less wordy than the previous version. It uses a free keyword option
style. It covers static partitioning for all methods, default
partition for
list and range methods, and can be extended to implement dynamic
partitioning
for range partitions.

[1]
https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Other_DBMS
[2]
https://wiki.postgresql.org/wiki/Declarative_partitioning_improvements#Proposal_.28is_subject_to_change.29

Here is the patch for automated HASH and LIST partitioning, that
implements proposed syntax.

Range partitioning is more complicated. It will require new support
function to calculate bounds, new catalog attribute to store them and so
on. So I want to start small and implement automated range partitioning
in a separate patch later.

1) Syntax

New syntax is heavily based on Greenplum syntax for automated
partitioning with one change. Keyword "USING", that was suggested above,
causes shift/reduce conflict with "USING method" syntax of a table
access method. It seems that Greenplum folks will face this problem later.

I stick to CONFIGURATION as an existing keyword that makes sense in this
context.
Any better ideas are welcome.

Thus, current version is:

CREATE TABLE table_name (attrs)
PARTITION BY partition_method (list_of_columns)
CONFIGURATION (partition_desc)

where partition_desc is:

MODULUS n
| VALUES IN (value_list) [DEFAULT PARTITION part_name]

This syntax can be easily extended for range partitioning as well.

2) Implementation

PartitionBoundAutoSpec is a new part of PartitionSpec, that contains
information needed to generate partition bounds.

For HASH and LIST automatic partition creation, transformation happens
during parse analysis of CREATE TABLE statement.
transformPartitionAutoCreate() calculates bounds and generates
statements to create partition tables.

Partitions are named in a format: $tablename_$partnum. One can use post
create hook to rename relations.

For LIST partition one can also define a default partition.

3) TODO

The patch lacks documentation, because I expect some details may change
during discussion. Other than that, the feature is ready for review.

Regards

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

auto_part_hash_list_v0.patchtext/x-patch; charset=UTF-8; name=auto_part_hash_list_v0.patchDownload+369-2
#14Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Anastasia Lubennikova (#13)
Re: [PATCH] Automatic HASH and LIST partition creation

The patch lacks documentation, because I expect some details may change
during discussion. Other than that, the feature is ready for review.

Hi, hackers!

From what I've read I see there is much interest in automatic partitions
creation. (Overall discussion on the topic is partitioned into two threads:
(1)
/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre
and
(2)
/messages/by-id/7fec3abb-c663-c0d2-8452-a46141be6d4a@postgrespro.ru
(current thread) )

There were many syntax proposals and finally, there is a patch realizing
one of them. So I'd like to review it.

The syntax proposed in the patch seems good enough for me and is in
accordance with one of the proposals in the discussion. Maybe I'd prefer
using the word AUTOMATICALLY/AUTO instead of CONFIGURATION with explicit
meaning that using this syntax we'd get already (automatically) created
partitions and don't need to create them manually, as in the existing state
of postgresql declarative partitioning.

CREATE TABLE tbl (i int) PARTITION BY HASH (i) AUTOMATICALLY (MODULUS
3); (partitions are created automatically)

vs

CREATE TABLE tbl (i int) PARTITION BY HASH (i); (partitions should be
created manually by use of PARTITION OF)

CREATE TABLE tbl (i char) PARTITION BY LIST (i) AUTOMATICALLY (VALUES
IN ('a', 'b'), ('c', 'd'), ('e','f') DEFAULT PARTITION tbl_default);

vs

CREATE TABLE tbl (i char) PARTITION BY LIST (i); (partitions should be
created manually by use of PARTITION OF)

I think this syntax can also be extended later with adding automatic
creation of RANGE partitions, with IMMEDIATE/DEFERRED for dynamic/on-demand
automatic partition creation, and with SUBPARTITION possibility.

But I don't have a strong preference for the word AUTOMATICALLY, moreover I
saw opposition to using AUTO at the top of the discussion. I suppose we can
go with the existing CONFIGURATION word.

If compare with existing declarative partitions, I think automatic creation
simplifies the process for the end-user and I'd vote for its committing
into Postgres. The patch is short and clean in code style. It has enough
comments Tests covering the new functionality are included. Yet it doesn't
have documentation and I'd suppose it's worth adding it. Even if there will
be syntax changes, I hope they will not be more than the replacement of
several words. Current syntax is described in the text of a patch.

The patch applies cleanly and installcheck-world is passed.

Some minor things:

I've got a compiler warning:
parse_utilcmd.c:4280:15: warning: unused variable 'lc' [-Wunused-variable]

When the number of partitions is over the maximum value of int32 the output
shows a generic syntax error. I don't think it is very important as it is
not the case someone will make deliberately, but maybe it's better to
output something like "Partitions number is more than the maximum supported
value"
create table test (i int, t text) partition by hash (i) configuration
(modulus 888888888888);
ERROR: syntax error at or near "888888888888"

One more piece of nitpicking. Probably we can go just with a mention in
documentation.
create table test (i int, t text) partition by hash (i) configuration
(modulus 8888);
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.

Typo:
+ /* Add statemets to create each partition after we create parent table */

Overall I see the patch almost ready for commit and I'd like to meet this
functionality in v14.

Tested it and see this feature very cool and much simpler to use compared
to declarative partitioning to date.

Thanks!
--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com&gt;

#15Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Pavel Borisov (#14)
Re: [PATCH] Automatic HASH and LIST partition creation

On 08.09.2020 17:03, Pavel Borisov wrote:

The patch lacks documentation, because I expect some details may
change during discussion. Other than that, the feature is ready
for review.

Hi, hackers!

From what I've read I see there is much interest in automatic
partitions creation. (Overall discussion on the topic is partitioned
into two threads: (1)
/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre%C2%A0and
(2)
/messages/by-id/7fec3abb-c663-c0d2-8452-a46141be6d4a@postgrespro.ru
(current thread) )

There were many syntax proposals and finally, there is a patch
realizing one of them. So I'd like to review it.

The syntax proposed in the patch seems good enough for me and is in
accordance with one of the proposals in the discussion. Maybe I'd
prefer using the word AUTOMATICALLY/AUTO instead of CONFIGURATION with
explicit meaning that using this syntax we'd get already
(automatically) created partitions and don't need to create them
manually, as in the existing state of postgresql declarative
partitioning.

CREATE TABLE tbl (iint) PARTITION BY HASH (i) AUTOMATICALLY (MODULUS 3); (partitions are created automatically)
vs
CREATE TABLE tbl (iint) PARTITION BY HASH (i); (partitions should be created manually by use of PARTITION OF)
CREATE TABLE tbl (i char) PARTITION BY LIST (i) AUTOMATICALLY (VALUES
IN ('a', 'b'), ('c', 'd'), ('e','f') DEFAULTPARTITION tbl_default);
vs
CREATE TABLE tbl (ichar) PARTITION BY LIST (i); (partitions should be created manually by use of PARTITION OF)

I think this syntax can also be extended later with adding automatic
creation of RANGE partitions, with IMMEDIATE/DEFERRED for
dynamic/on-demand automatic partition creation, and with SUBPARTITION
possibility.

But I don't have a strong preference for the word AUTOMATICALLY,
moreover I saw opposition to using AUTO at the top of the discussion.
I suppose we can go with the existing CONFIGURATION word.

I agree that 'AUTOMATICALLY' keyword is more specific and probably less
confusing for users. I've picked 'CONFIGURATION' simply because it is an
already existing keyword. It would like to hear other opinions on that.

If compare with existing declarative partitions, I think automatic
creation simplifies the process for the end-user and  I'd vote for its
committing into Postgres. The patch is short and clean in code style.
It has enough comments Tests covering the new functionality are
included. Yet it doesn't have documentation and I'd suppose it's worth
adding it. Even if there will be syntax changes, I hope they will not
be more than the replacement of several words. Current syntax is
described in the text of a patch.

Fair enough. New patch contains a documentation draft. While writing it,
I also noticed, that the syntax, introduced in the patch differs from
greenpulm one. For now, list partitioning clause doesn't support
'PARTITION name' part, that is supported in greenplum. I don't think
that we aim for 100% compatibility here. Still, the ability to provide
table names is probably a good optional feature, especially for list
partitions.

What do you think?

The patch applies cleanly and installcheck-world is passed.

Some minor things:

I've got a compiler warning:
parse_utilcmd.c:4280:15: warning: unused variable 'lc' [-Wunused-variable]

Fixed. This was also caught by cfbot. This version should pass it clean.

When the number of partitions is over the maximum value of int32 the
output shows a generic syntax error. I don't think it is very
important as it is not the case someone will make deliberately, but
maybe it's better to output something like "Partitions number is more
than the maximum supported value"
create table test (i int, t text) partition by hash (i) configuration
(modulus 888888888888);
ERROR:  syntax error at or near "888888888888"

This value is not a valid int32 number, thus parser throws the error
before we have a chance to handle it more gracefully.

One more piece of nitpicking. Probably we can go just with a mention
in documentation.
create table test (i int, t text) partition by hash (i) configuration
(modulus 8888);
ERROR:  out of shared memory
HINT:  You might need to increase max_locks_per_transaction.

Well, it looks like a legit error, when we try to lock a lot of objects
in one transaction. I will double check if we don't release a lock
somewhere.

Do we need to restrict the number of partitions, that can be created by
this statement? With what number?  As far as I see, there is no such
restriction for now, just a recommendation about performance issues.
With automatic creation it becomes easier to mess with it.

Probably, it's enough to mention it in documentation and rely on users
common sense.

Typo:
+ /* Add statemets to create each partition after we create parent
table */

Fixed.

Overall I see the patch almost ready for commit and I'd like to meet
this functionality in v14.

I also hope that this patch will make it to v14, but for now, I don't
see a consensus on the syntax and some details, so I wouldn't rush.

Besides, it definitely needs more testing. I haven't thoroughly tested
following cases yet:
- how triggers and constraints are propagated to partitions;
- how does it handle some tricky clauses in list partitioning expr_list;
and so on.

Also, there is an open question about partition naming. Currently, the
patch implements dummy %tbl_%partnum name generation, which is far from
user-friendly. I think we must provide some hook or trigger function to
rename partitions after they were created.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

auto_part_hash_list_v1.patchtext/x-patch; charset=UTF-8; name=auto_part_hash_list_v1.patchDownload+413-2
#16Michael Paquier
michael@paquier.xyz
In reply to: Anastasia Lubennikova (#15)
Re: [PATCH] Automatic HASH and LIST partition creation

On Mon, Sep 14, 2020 at 02:38:56PM +0300, Anastasia Lubennikova wrote:

Fixed. This was also caught by cfbot. This version should pass it clean.

Please note that regression tests are failing, because of 6b2c4e59.
--
Michael

#17Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Michael Paquier (#16)
Re: [PATCH] Automatic HASH and LIST partition creation

On 24.09.2020 06:27, Michael Paquier wrote:

On Mon, Sep 14, 2020 at 02:38:56PM +0300, Anastasia Lubennikova wrote:

Fixed. This was also caught by cfbot. This version should pass it clean.

Please note that regression tests are failing, because of 6b2c4e59.
--
Michael

Thank you. Updated patch is attached.

Open issues for review:
- new syntax;
- generation of partition names;
- overall patch review and testing, especially with complex partitioning
clauses.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

auto_part_hash_list_v2.patchtext/x-patch; charset=UTF-8; name=auto_part_hash_list_v2.patchDownload+415-2
#18Rahila Syed
rahilasyed90@gmail.com
In reply to: Anastasia Lubennikova (#17)
Re: [PATCH] Automatic HASH and LIST partition creation

Hi Anastasia,

I tested the syntax with some basic commands and it works fine, regression
tests also pass.

Couple of comments:
1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords suggested in
the earlier discussions. I think it is intuitive to include IMMEDIATE with
the current implementation
so that the syntax can be extended with a DEFERRED clause in future for
dynamic partitions.

CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);

2. One suggestion for generation of partition names is to append a unique
id to
avoid conflicts.

3. Probably, here you mean to write list and hash instead of range and list
as
per the current state.

<para>

Range and list partitioning also support automatic creation of
partitions
with an optional <literal>CONFIGURATION</literal> clause.
</para>

4. Typo in default_part_name

+VALUES IN ( <replaceable

class="parameter">partition_bound_expr</replaceable> [, ...] ), [(
<replaceable class="parameter">partition_bound_expr</replaceable> [, ...]
)] [, ...] [DEFAULT PARTITION <replaceable
class="parameter">defailt_part_name</replaceable>]
+MODULUS <replaceable class="parameter">numeric_literal</replaceable>

Thank you,
Rahila Syed

#19Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Rahila Syed (#18)
Re: [PATCH] Automatic HASH and LIST partition creation

On 30.09.2020 22:58, Rahila Syed wrote:

Hi Anastasia,

I tested the syntax with some basic commands and it works fine,
regression tests also pass.

Thank you for your review.

Couple of comments:
1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords
suggested in
the earlier discussions. I think it is intuitive to include IMMEDIATE
with the current implementation
so that the syntax can be extended with a  DEFERRED clause in future
for dynamic partitions.

  CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
 CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION
tbl_default);

After some consideration, I decided that we don't actually need to
introduce IMMEDIATE | DEFERRED keyword. For hash and list partitions it
will always be immediate, as the number of partitions cannot change
after we initially set it. For range partitions, on the contrary, it
doesn't make much sense to make partitions immediately, because in many
use-cases one bound will be open.

2. One suggestion for generation of partition names is to append a
unique id to
avoid conflicts.

Can you please give an example of such a conflict? I agree that current
naming scheme is far from perfect, but I think that 'tablename'_partnum
provides unique name for each partition.

3. Probably, here you mean to write list and hash instead of range and
list as
per the current state.

     <para>
     Range and list partitioning also support automatic creation
of partitions
      with an optional <literal>CONFIGURATION</literal> clause.
    </para>

4. Typo in default_part_name

+VALUES IN ( <replaceable
class="parameter">partition_bound_expr</replaceable> [, ...] ), [(
<replaceable class="parameter">partition_bound_expr</replaceable>
[, ...] )] [, ...] [DEFAULT PARTITION <replaceable
class="parameter">defailt_part_name</replaceable>]
+MODULUS <replaceable class="parameter">numeric_literal</replaceable>

Yes, you're right. I will fix these typos in next version of the patch.

Thank you,
Rahila Syed

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#20Rahila Syed
rahilasyed90@gmail.com
In reply to: Anastasia Lubennikova (#19)
Re: [PATCH] Automatic HASH and LIST partition creation

Hi,

Couple of comments:

1. The syntax used omits the { IMMEDIATE | DEFERRED} keywords suggested in
the earlier discussions. I think it is intuitive to include IMMEDIATE with
the current implementation
so that the syntax can be extended with a DEFERRED clause in future for
dynamic partitions.

CREATE TABLE tbl_lst (i int) PARTITION BY LIST (i)
CONFIGURATION (values in (1, 2), (3, 4) DEFAULT PARTITION tbl_default);

After some consideration, I decided that we don't actually need to
introduce IMMEDIATE | DEFERRED keyword. For hash and list partitions it
will always be immediate, as the number of partitions cannot change after
we initially set it. For range partitions, on the contrary, it doesn't make
much sense to make partitions immediately, because in many use-cases one
bound will be open.

As per discussions on this thread:
/messages/by-id/alpine.DEB.2.21.1907150711080.22273@lancre
DEFERRED clause refers to creating partitions on the fly, while the data is
being inserted.
The number of partitions and partition bounds can be the same as specified
initially
during partitioned table creation, but the actual creation of
partitions can be deferred.
This seems like a potential extension to statically created partitions even
in the case of
hash and list partitions, as it won't involve moving any existing data.

2. One suggestion for generation of partition names is to append a

unique id to

avoid conflicts.

Can you please give an example of such a conflict? I agree that current
naming scheme is far from perfect, but I think that 'tablename'_partnum
provides unique name for each partition.

Sorry for not being clear earlier, I mean the partition name

'tablename_partnum' can conflict with any existing table name.
As per current impemetation, if I do the following it results in the table
name conflict.

postgres=# create table tbl_test_5_1(i int);
CREATE TABLE
postgres=# CREATE TABLE tbl_test_5 (i int) PARTITION BY LIST((tbl_test_5))

CONFIGURATION (values in
('(1)'::tbl_test_5), ('(3)'::tbl_test_5) default partition tbl_default_5);
ERROR: relation "tbl_test_5_1" already exists

Thank you,
Rahila Syed

Show quoted text
#21Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Rahila Syed (#20)
#22Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#21)
#23Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Pavel Borisov (#22)
#24Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Anastasia Lubennikova (#23)
#25Anastasia Lubennikova
a.lubennikova@postgrespro.ru
In reply to: Rahila Syed (#20)
#26Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Anastasia Lubennikova (#25)
#27Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#26)
#28Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#27)
#29Maxim Orlov
m.orlov@postgrespro.ru
In reply to: Pavel Borisov (#28)
#30Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Maxim Orlov (#29)
#31Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Fabien COELHO (#30)
#32Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Pavel Borisov (#31)
#33Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Fabien COELHO (#32)
#34Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Pavel Borisov (#33)
#35Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Fabien COELHO (#34)
#36Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Pavel Borisov (#35)
#37Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Fabien COELHO (#36)
#38Thomas Munro
thomas.munro@gmail.com
In reply to: Anastasia Lubennikova (#23)
#39Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Thomas Munro (#38)
#40Justin Pryzby
pryzby@telsasoft.com
In reply to: Pavel Borisov (#39)
#41Nitin Jadhav
nitinjadhavpostgres@gmail.com
In reply to: Justin Pryzby (#40)
#42Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Nitin Jadhav (#41)
#43John Naylor
john.naylor@enterprisedb.com
In reply to: Pavel Borisov (#42)
#44Pavel Borisov
pashkin.elfe@gmail.com
In reply to: John Naylor (#43)
#45Robert Haas
robertmhaas@gmail.com
In reply to: Justin Pryzby (#40)
#46Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Robert Haas (#45)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Borisov (#46)
#48Justin Pryzby
pryzby@telsasoft.com
In reply to: Robert Haas (#47)
#49Robert Haas
robertmhaas@gmail.com
In reply to: Justin Pryzby (#48)
#50Daniel Gustafsson
daniel@yesql.se
In reply to: Robert Haas (#49)
#51stephane tachoires
stephane.tachoires@gmail.com
In reply to: Daniel Gustafsson (#50)