Declarative partitioning

Started by Amit Langoteover 10 years ago161 messageshackers
Jump to latest
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp

Hi,

I would like propose $SUBJECT for this development cycle. Attached is a
WIP patch that implements most if not all of what's described below. Some
yet unaddressed parts are mentioned below, too. I'll add this to the CF-SEP.

Syntax
======

1. Creating a partitioned table

CREATE TABLE table_name
PARTITION BY {RANGE|LIST}
ON (column_list);

Where column_list consists of simple column names or expressions:

PARTITION BY LIST ON (name)
PARTITION BY RANGE ON (year, month)

PARTITION BY LIST ON ((lower(left(name, 2)))
PARTITION BY RANGE ON ((extract(year from d)), (extract(month from d)))

Note: LIST partition key supports only one column.

For each column, you could write operator class name:

PARTITION BY LIST/RANGE ON (colname [USING] opclass_name),

If not specified, the default btree operator class based on type of each
key column is used. If none of the available btree operator classes are
compatible with the partitioning strategy (list/range), error is thrown.
Built-in btree operator classes cover a good number of types for list and
range partitioning in practical scenarios.

A table created using this form is of proposed new relkind
RELKIND_PARTITIONED_REL. An entry in pg_partitioned_rel (see below) is
created to store partition key info.

Note: A table cannot be partitioned after-the-fact using ALTER TABLE.

Normal dependencies are created between the partitioned table and operator
classes, object in partition expressions like functions.

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

rangevalues: START (col1min, ... ) END (col1max, ... )
| START (col1min, ... )
| END (col1max, ... )

A table created using this form has proposed pg_class.relispartition set
to true. An entry in pg_partition (see below) is created to store the
partition bound info.

The values_spec should match the partitioning strategy of the partitioned
table. In case of a range partition, the values in START and/or END should
match columns in the partition key.

Defining a list partition is fairly straightforward - just spell out the
list of comma-separated values. Error is thrown if the list of values
overlaps with one of the existing partitions' list.

CREATE TABLE persons_by_state (name text, state text)
PARTITION BY LIST ON (state);

CREATE TABLE persons_IL
PARTITION OF persons_by_state
FOR VALUES IN ('IL');

CREATE TABLE persons_fail
PARTITION OF persons_by_state
FOR VALUES IN ('IL');
ERROR: cannot create partition that overlaps with an existing one

For a range partition, there are more than one way:

Specify both START and END bounds: resulting range should not overlap with
the range(s) covered by existing partitions. Error is thrown otherwise.
Although rare in practice, gaps between ranges are OK.

CREATE TABLE measurement(logdate date NOT NULL)
PARTITION BY RANGE ON (logdate);

CREATE TABLE measurement_y2006m02
PARTITION OF measurement
FOR VALUES START ('2006-02-01') END ('2006-03-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES START ('2006-02-15') END ('2006-03-01');
ERROR: cannot create partition that overlaps with an existing one

Specify only the START bound: add the partition on the left of some range
covered by existing partitions provided no overlap occurs (also
considering gaps between ranges, if any). If no such range exists, the new
partition will cover the range [START, +INFINITY) and become the rightmost
partition. Error is thrown if the specified START causes overlap.

CREATE TABLE measurement_y2006m01
PARTITION OF measurement
FOR VALUES START ('2006-01-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES START ('2006-02-01'); --overlaps with measurement_y2006m02
ERROR: cannot create partition that overlaps with an existing one

Specify only the END bound: add the partition on the right of some range
covered by existing partitions provided no overlap occurs (also
considering gaps between ranges, if any). If no such range exists, the new
partition would cover the range (-INFINITY, END) and become the leftmost
partition. Error is thrown if the specified END causes overlap.

CREATE TABLE measurement_y2006m03
PARTITION OF measurement
FOR VALUES END ('2006-04-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES END ('2006-03-01'); --overlaps with measurement_y2006m02
ERROR: cannot create partition that overlaps with an existing one

For each partition, START and END bound values are stored in the
catalog. Note that the lower bound is inclusive, whereas the upper bound
is exclusive.

Note: At most one range partition can have null min bound in which case it
covers the range (-INFINITY, END). Also, at most one range partition can
have null max bound in which case it covers the range [START, +INFINITY).

A normal dependency is created between the parent and the partition.

3. Multi-level partitioning

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec
PARTITION BY {RANGE|LIST} ON (columns_list)

This variant implements a form of so called composite or sub-partitioning
with arbitrarily deep partitioning structure. A table created using this
form has both the relkind RELKIND_PARTITIONED_REL and
pg_class.relispartition set to true.

4. (yet unimplemented) Attach partition (from existing table)

ALTER TABLE partitioned_table
ATTACH PARTITION partition_name
FOR VALUES values_spec
USING [TABLE] table_name;

ALTER TABLE table_name
SET VALID PARTITION OF <parent>;

The first of the above pair of commands would attach table_name as a (yet)
'invalid' partition of partitioned_table (after confirming that it matches
the schema and does not overlap with other partitions per FOR VALUES
spec). It would also record the FOR VALUES part in the partition catalog
and set pg_class.relispartition to true for table_name.

After the first command is done, the second command would take exclusive
lock on table_name, scan the table to check if it contains any values
outside the boundaries defined by FOR VALUES clause defined previously,
throw error if so, mark as valid partition of parent if not.

Does that make sense?

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table. The table
continues to exist with the same name or 'table_name', if specified.
pg_class.relispartition is set to false for the table, so it behaves like
a normal table.

System catalogs
===============

1. pg_partitioned_rel

CATALOG(pg_partitioned_rel,oid) BKI_WITHOUT_OIDS
{
Oid partrelid; /* partitioned table pg_class.oid */
char partstrategy; /* partitioning strategy 'l'/'r' */
int16 partnatts; /* number of partition columns */
int2vector partkey; /* column numbers of partition columns;
* 0 where specified column is an
* expresion */
oidvector partclass; /* operator class to compare keys */
pg_node_tree partexprs; /* expression trees for partition key
* members that are not simple column
* references; one for each zero entry
* in partkey[] */
};

2. pg_partition (omits partisvalid alluded to above)

CATALOG(pg_partition,oid) BKI_WITHOUT_OIDS
{
Oid partitionid; /* partition oid */
Oid partparent; /* parent oid */
anyarray partlistvalues; /* list of allowed values of the only
* partition column */
anyarray partrangebounds; /* list of bounds of ranges of
* allowed values per partition key
* column */
};

Further notes
=============

There are a number of restrictions on performing after-the-fact changes
using ALTER TABLE to partitions (ie, relispartition=true):

* Cannot add/drop column
* Cannot set/drop OIDs
* Cannot set/drop NOT NULL
* Cannot set/drop default
* Cannot alter column type
* Cannot add/drop alter constraint (table level)
* Cannot change persistence
* Cannot change inheritance
* Cannot link to a composite type

Such changes should be made to the topmost parent in the partitioning
hierarchy (hereafter referred to as just parent). These are recursively
applied to all the tables in the hierarchy. Although the last two items
cannot be performed on parent either.

Dropping a partition using DROP TABLE is not allowed. It needs to detached
using ALTER TABLE on parent before it can be dropped as a normal table.

Triggers on partitions are not allowed. They should be defined on the
parent. That said, I could not figure out a way to implement row-level
AFTER triggers on partitioned tables (more about that in a moment); so
they are currently not allowed:

CREATE TRIGGER audit_trig
AFTER INSERT ON persons
FOR EACH ROW EXECUTE PROCEDURE audit_func();
ERROR: Row-level AFTER triggers are not supported on partitioned tables

Column/table constraints on partitions are not allowed. They should be
defined on the parent. Foreign key constraints are not allowed due to
above limitation (no row-level after triggers).

A partitioning parent (RELKIND_PARTITIONED_REL) does not have storage.

Creating index on parent is not allowed. They should be defined on (leaf)
partitions. Because of this limitation, primary keys are not allowed on a
partitioned table. Perhaps, we should be able to just create a dummy
entry somewhere to represent an index on parent (which every partition
then copies.) Then by restricting primary key to contain all partition key
columns, we can implement unique constraint over the whole partitioned
table. That will in turn allow us to use partitioned tables as PK rels in
a foreign key constraint provided row-level AFTER trigger issue is resolved.

VACUUM/ANALYZE on individual partitions should work like normal tables.
I've not implemented something like inheritance tree sampling for
partitioning tree in this patch. Autovacuum has been taught to ignore
parent tables and vacuum/analyze partitions normally.

Dropping a partitioned table should (?) unconditionally drop all its
partitions probably but, currently the patch uses dependencies, so
requires to specify CASCADE to do the same.

What should TRUNCATE on partitioned table do?

Ownership, privileges/permissions, RLS should be managed through the
parent table although not comprehensively addressed in the patch.

There is no need to define tuple routing triggers. CopyFrom() and
ExecInsert() determine target partition just before performing
heap_insert() and ExecInsertIndexTuples(). IOW, any BR triggers and
constraints (on parent) are executed for tuple before being routed to a
partition. If no partition can be found, it's an error.

Because row-level AFTER triggers need to save ItemPointers in trigger
event data and defining triggers on partitions (which is where tuples
really go) is not allowed, I could not find a straightforward way to
implement them. So, perhaps we should allow (only) row-level AFTER
triggers on partitions or think of modifying trigger.c to know about this
twist explicitly.

Internal representations
========================

For a RELKIND_PARTITIONED_REL relations, RelationData gets a few new
fields to store the partitioning metadata. That includes partition key
tuple (pg_partitioned_rel) including some derived info (opfamily,
opcintype, compare proc FmgrInfos, partition expression trees).

Additionally, it also includes a PartitionInfo object which includes
partition OIDs array, partition bound arrays (if range partitioned,
rangemax is sorted in ascending order and OIDs are likewise ordered). It
is built from information in pg_partition catalog.

While RelationBuildDesc() initializes the basic key info, fields like
expression trees, PartitionInfo are built on demand and cached. For
example, InitResultRelInfo() builds the latter to populate the newly added
ri_PartitionKeyInfo and ri_Partitions fields, respectively.

PartitionInfo object is rebuilt on every cache invalidation of the rel
which includes when adding/attaching/detaching a new partition.

Planner and executor considerations
=====================================

The patch does not yet implement any planner changes for partitioned
tables, although I'm working on the same and post updates as soon as
possible. That means, it is not possible to run SELECT/UPDATE/DELETE
queries on partitioned tables without getting:

postgres=# SELECT * FROM persons;
ERROR: could not open file "base/13244/106975": No such file or directory

Given that there would be more direct ways of performing partition pruning
decisions with the proposed, it would be nice to utilize them.
Specifically, I would like to avoid having to rely on constraint exclusion
for partition pruning whereby subquery_planner() builds append_rel_list
and the later steps exclude useless partitions.

By extending RelOptInfo to include partitioning info for partitioned rels,
it might be possible to perform partition pruning directly without
previously having to expand them. Although, as things stand now, it's not
clear how that might work - when would partition RTEs be added to the
rtable? The rtable is assumed not to change after
setup_simple_rel_arrays() has done its job which is much earlier than when
it would be desirable for the partitioned table expansion (along with
partition pruning) to happen. Moreover, if that means we might not be able
to build RelOptInfo's for partitions, how to choose best paths for them
(index paths or not, etc.)?

I'm also hoping we don't require something like inheritance_planner() for
when partitioned tables are target rels. I assume considerations for why
the special processing is necessary for inheritance trees in that scenario
don't apply to partitioning trees. So, if grouping_planner() returns a
Append plan (among other options) for the partitioning tree, tacking a
ModifyTable node on top should do the trick?

Suggestions greatly welcome in this area.

Other items
===========

Will include the following once we start reaching consensus on main parts
of the proposed design/implementation:

* New regression tests
* Documentation updates
* pg_dump, psql, etc.

For reference, some immediately previous discussions:

* On partitioning *
/messages/by-id/20140829155607.GF7705@eldon.alvh.no-ip.org

* Partitioning WIP patch *
/messages/by-id/54EC32B6.9070605@lab.ntt.co.jp

Comments welcome!

Thanks,
Amit

Attachments:

declarative-partitioning-wip-1.patchtext/x-diff; name=declarative-partitioning-wip-1.patchDownload+4912-129
#2Thom Brown
thom@linux.com
In reply to: Amit Langote (#1)
Re: Declarative partitioning

On 18 August 2015 at 11:30, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:

Hi,

I would like propose $SUBJECT for this development cycle. Attached is a
WIP patch that implements most if not all of what's described below. Some
yet unaddressed parts are mentioned below, too. I'll add this to the
CF-SEP.

Syntax
======

1. Creating a partitioned table

CREATE TABLE table_name
PARTITION BY {RANGE|LIST}
ON (column_list);

Where column_list consists of simple column names or expressions:

PARTITION BY LIST ON (name)
PARTITION BY RANGE ON (year, month)

PARTITION BY LIST ON ((lower(left(name, 2)))
PARTITION BY RANGE ON ((extract(year from d)), (extract(month from d)))

Note: LIST partition key supports only one column.

For each column, you could write operator class name:

PARTITION BY LIST/RANGE ON (colname [USING] opclass_name),

If not specified, the default btree operator class based on type of each
key column is used. If none of the available btree operator classes are
compatible with the partitioning strategy (list/range), error is thrown.
Built-in btree operator classes cover a good number of types for list and
range partitioning in practical scenarios.

A table created using this form is of proposed new relkind
RELKIND_PARTITIONED_REL. An entry in pg_partitioned_rel (see below) is
created to store partition key info.

Note: A table cannot be partitioned after-the-fact using ALTER TABLE.

Normal dependencies are created between the partitioned table and operator
classes, object in partition expressions like functions.

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

rangevalues: START (col1min, ... ) END (col1max, ... )
| START (col1min, ... )
| END (col1max, ... )

A table created using this form has proposed pg_class.relispartition set
to true. An entry in pg_partition (see below) is created to store the
partition bound info.

The values_spec should match the partitioning strategy of the partitioned
table. In case of a range partition, the values in START and/or END should
match columns in the partition key.

Defining a list partition is fairly straightforward - just spell out the
list of comma-separated values. Error is thrown if the list of values
overlaps with one of the existing partitions' list.

CREATE TABLE persons_by_state (name text, state text)
PARTITION BY LIST ON (state);

CREATE TABLE persons_IL
PARTITION OF persons_by_state
FOR VALUES IN ('IL');

CREATE TABLE persons_fail
PARTITION OF persons_by_state
FOR VALUES IN ('IL');
ERROR: cannot create partition that overlaps with an existing one

For a range partition, there are more than one way:

Specify both START and END bounds: resulting range should not overlap with
the range(s) covered by existing partitions. Error is thrown otherwise.
Although rare in practice, gaps between ranges are OK.

CREATE TABLE measurement(logdate date NOT NULL)
PARTITION BY RANGE ON (logdate);

CREATE TABLE measurement_y2006m02
PARTITION OF measurement
FOR VALUES START ('2006-02-01') END ('2006-03-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES START ('2006-02-15') END ('2006-03-01');
ERROR: cannot create partition that overlaps with an existing one

Specify only the START bound: add the partition on the left of some range
covered by existing partitions provided no overlap occurs (also
considering gaps between ranges, if any). If no such range exists, the new
partition will cover the range [START, +INFINITY) and become the rightmost
partition. Error is thrown if the specified START causes overlap.

CREATE TABLE measurement_y2006m01
PARTITION OF measurement
FOR VALUES START ('2006-01-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES START ('2006-02-01'); --overlaps with measurement_y2006m02
ERROR: cannot create partition that overlaps with an existing one

Specify only the END bound: add the partition on the right of some range
covered by existing partitions provided no overlap occurs (also
considering gaps between ranges, if any). If no such range exists, the new
partition would cover the range (-INFINITY, END) and become the leftmost
partition. Error is thrown if the specified END causes overlap.

CREATE TABLE measurement_y2006m03
PARTITION OF measurement
FOR VALUES END ('2006-04-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES END ('2006-03-01'); --overlaps with measurement_y2006m02
ERROR: cannot create partition that overlaps with an existing one

For each partition, START and END bound values are stored in the
catalog. Note that the lower bound is inclusive, whereas the upper bound
is exclusive.

Note: At most one range partition can have null min bound in which case it
covers the range (-INFINITY, END). Also, at most one range partition can
have null max bound in which case it covers the range [START, +INFINITY).

A normal dependency is created between the parent and the partition.

3. Multi-level partitioning

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec
PARTITION BY {RANGE|LIST} ON (columns_list)

This variant implements a form of so called composite or sub-partitioning
with arbitrarily deep partitioning structure. A table created using this
form has both the relkind RELKIND_PARTITIONED_REL and
pg_class.relispartition set to true.

4. (yet unimplemented) Attach partition (from existing table)

ALTER TABLE partitioned_table
ATTACH PARTITION partition_name
FOR VALUES values_spec
USING [TABLE] table_name;

ALTER TABLE table_name
SET VALID PARTITION OF <parent>;

The first of the above pair of commands would attach table_name as a (yet)
'invalid' partition of partitioned_table (after confirming that it matches
the schema and does not overlap with other partitions per FOR VALUES
spec). It would also record the FOR VALUES part in the partition catalog
and set pg_class.relispartition to true for table_name.

After the first command is done, the second command would take exclusive
lock on table_name, scan the table to check if it contains any values
outside the boundaries defined by FOR VALUES clause defined previously,
throw error if so, mark as valid partition of parent if not.

Does that make sense?

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table. The table
continues to exist with the same name or 'table_name', if specified.
pg_class.relispartition is set to false for the table, so it behaves like
a normal table.

System catalogs
===============

1. pg_partitioned_rel

CATALOG(pg_partitioned_rel,oid) BKI_WITHOUT_OIDS
{
Oid partrelid; /* partitioned table pg_class.oid */
char partstrategy; /* partitioning strategy 'l'/'r' */
int16 partnatts; /* number of partition columns */
int2vector partkey; /* column numbers of partition columns;
* 0 where specified column is an
* expresion */
oidvector partclass; /* operator class to compare keys */
pg_node_tree partexprs; /* expression trees for partition key
* members that are not simple column
* references; one for each zero entry
* in partkey[] */
};

2. pg_partition (omits partisvalid alluded to above)

CATALOG(pg_partition,oid) BKI_WITHOUT_OIDS
{
Oid partitionid; /* partition oid */
Oid partparent; /* parent oid */
anyarray partlistvalues; /* list of allowed values of the only
* partition column */
anyarray partrangebounds; /* list of bounds of ranges of
* allowed values per partition key
* column */
};

Further notes
=============

There are a number of restrictions on performing after-the-fact changes
using ALTER TABLE to partitions (ie, relispartition=true):

* Cannot add/drop column
* Cannot set/drop OIDs
* Cannot set/drop NOT NULL
* Cannot set/drop default
* Cannot alter column type
* Cannot add/drop alter constraint (table level)
* Cannot change persistence
* Cannot change inheritance
* Cannot link to a composite type

Such changes should be made to the topmost parent in the partitioning
hierarchy (hereafter referred to as just parent). These are recursively
applied to all the tables in the hierarchy. Although the last two items
cannot be performed on parent either.

Dropping a partition using DROP TABLE is not allowed. It needs to detached
using ALTER TABLE on parent before it can be dropped as a normal table.

Triggers on partitions are not allowed. They should be defined on the
parent. That said, I could not figure out a way to implement row-level
AFTER triggers on partitioned tables (more about that in a moment); so
they are currently not allowed:

CREATE TRIGGER audit_trig
AFTER INSERT ON persons
FOR EACH ROW EXECUTE PROCEDURE audit_func();
ERROR: Row-level AFTER triggers are not supported on partitioned tables

Column/table constraints on partitions are not allowed. They should be
defined on the parent. Foreign key constraints are not allowed due to
above limitation (no row-level after triggers).

A partitioning parent (RELKIND_PARTITIONED_REL) does not have storage.

Creating index on parent is not allowed. They should be defined on (leaf)
partitions. Because of this limitation, primary keys are not allowed on a
partitioned table. Perhaps, we should be able to just create a dummy
entry somewhere to represent an index on parent (which every partition
then copies.) Then by restricting primary key to contain all partition key
columns, we can implement unique constraint over the whole partitioned
table. That will in turn allow us to use partitioned tables as PK rels in
a foreign key constraint provided row-level AFTER trigger issue is
resolved.

VACUUM/ANALYZE on individual partitions should work like normal tables.
I've not implemented something like inheritance tree sampling for
partitioning tree in this patch. Autovacuum has been taught to ignore
parent tables and vacuum/analyze partitions normally.

Dropping a partitioned table should (?) unconditionally drop all its
partitions probably but, currently the patch uses dependencies, so
requires to specify CASCADE to do the same.

What should TRUNCATE on partitioned table do?

Ownership, privileges/permissions, RLS should be managed through the
parent table although not comprehensively addressed in the patch.

There is no need to define tuple routing triggers. CopyFrom() and
ExecInsert() determine target partition just before performing
heap_insert() and ExecInsertIndexTuples(). IOW, any BR triggers and
constraints (on parent) are executed for tuple before being routed to a
partition. If no partition can be found, it's an error.

Because row-level AFTER triggers need to save ItemPointers in trigger
event data and defining triggers on partitions (which is where tuples
really go) is not allowed, I could not find a straightforward way to
implement them. So, perhaps we should allow (only) row-level AFTER
triggers on partitions or think of modifying trigger.c to know about this
twist explicitly.

Internal representations
========================

For a RELKIND_PARTITIONED_REL relations, RelationData gets a few new
fields to store the partitioning metadata. That includes partition key
tuple (pg_partitioned_rel) including some derived info (opfamily,
opcintype, compare proc FmgrInfos, partition expression trees).

Additionally, it also includes a PartitionInfo object which includes
partition OIDs array, partition bound arrays (if range partitioned,
rangemax is sorted in ascending order and OIDs are likewise ordered). It
is built from information in pg_partition catalog.

While RelationBuildDesc() initializes the basic key info, fields like
expression trees, PartitionInfo are built on demand and cached. For
example, InitResultRelInfo() builds the latter to populate the newly added
ri_PartitionKeyInfo and ri_Partitions fields, respectively.

PartitionInfo object is rebuilt on every cache invalidation of the rel
which includes when adding/attaching/detaching a new partition.

Planner and executor considerations
=====================================

The patch does not yet implement any planner changes for partitioned
tables, although I'm working on the same and post updates as soon as
possible. That means, it is not possible to run SELECT/UPDATE/DELETE
queries on partitioned tables without getting:

postgres=# SELECT * FROM persons;
ERROR: could not open file "base/13244/106975": No such file or directory

Given that there would be more direct ways of performing partition pruning
decisions with the proposed, it would be nice to utilize them.
Specifically, I would like to avoid having to rely on constraint exclusion
for partition pruning whereby subquery_planner() builds append_rel_list
and the later steps exclude useless partitions.

By extending RelOptInfo to include partitioning info for partitioned rels,
it might be possible to perform partition pruning directly without
previously having to expand them. Although, as things stand now, it's not
clear how that might work - when would partition RTEs be added to the
rtable? The rtable is assumed not to change after
setup_simple_rel_arrays() has done its job which is much earlier than when
it would be desirable for the partitioned table expansion (along with
partition pruning) to happen. Moreover, if that means we might not be able
to build RelOptInfo's for partitions, how to choose best paths for them
(index paths or not, etc.)?

I'm also hoping we don't require something like inheritance_planner() for
when partitioned tables are target rels. I assume considerations for why
the special processing is necessary for inheritance trees in that scenario
don't apply to partitioning trees. So, if grouping_planner() returns a
Append plan (among other options) for the partitioning tree, tacking a
ModifyTable node on top should do the trick?

Suggestions greatly welcome in this area.

Other items
===========

Will include the following once we start reaching consensus on main parts
of the proposed design/implementation:

* New regression tests
* Documentation updates
* pg_dump, psql, etc.

For reference, some immediately previous discussions:

* On partitioning *

/messages/by-id/20140829155607.GF7705@eldon.alvh.no-ip.org

* Partitioning WIP patch *
/messages/by-id/54EC32B6.9070605@lab.ntt.co.jp

Comments welcome!

Thanks,
Amit

Wow, didn't expect to see that email this morning.

A very quick test:

CREATE TABLE purchases (purchase_id serial, purchase_time timestamp, item
text) partition by range on ((extract(year from
purchase_time)),(extract(month from purchase_time)));
ERROR: referenced relation "purchases" is not a table or foreign table

Thom

#3Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Thom Brown (#2)
Re: Declarative partitioning

Hi Thom,

On Tue, Aug 18, 2015 at 8:02 PM, Thom Brown <thom@linux.com> wrote:

Wow, didn't expect to see that email this morning.

A very quick test:

CREATE TABLE purchases (purchase_id serial, purchase_time timestamp, item
text) partition by range on ((extract(year from
purchase_time)),(extract(month from purchase_time)));
ERROR: referenced relation "purchases" is not a table or foreign table

Thanks for the quick test.

Damn, I somehow missed adding the new relkind to a check in
process_owned_by(). Will fix this and look for any such oversights.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4David Fetter
david@fetter.org
In reply to: Amit Langote (#1)
Re: Declarative partitioning

On Tue, Aug 18, 2015 at 07:30:20PM +0900, Amit Langote wrote:

Hi,

I would like propose $SUBJECT for this development cycle. Attached is a
WIP patch that implements most if not all of what's described below. Some
yet unaddressed parts are mentioned below, too. I'll add this to the CF-SEP.

Thanks for pushing this forward! We've needed this done for at least
a decade.

4. (yet unimplemented) Attach partition (from existing table)

ALTER TABLE partitioned_table
ATTACH PARTITION partition_name
FOR VALUES values_spec
USING [TABLE] table_name;

ALTER TABLE table_name
SET VALID PARTITION OF <parent>;

The first of the above pair of commands would attach table_name as a (yet)
'invalid' partition of partitioned_table (after confirming that it matches
the schema and does not overlap with other partitions per FOR VALUES
spec). It would also record the FOR VALUES part in the partition catalog
and set pg_class.relispartition to true for table_name.

After the first command is done, the second command would take exclusive
lock on table_name, scan the table to check if it contains any values
outside the boundaries defined by FOR VALUES clause defined previously,
throw error if so, mark as valid partition of parent if not.

One small change to make this part more efficient:

1. Take the access exclusive lock on table_name.
2. Check for a matching constraint on it.
3. If it's there, mark it as a valid partition.
4. If not, check for values outside the boundaries as above.

Should the be a *valid* constraint? Perhaps that should be
parameterized, as I'm not yet seeing a compelling argument either
direction. I'm picturing something like:

ALTER TABLE table_name SET VALID PARTITION OF <parent> [TRUST]

where TRUST would mean that an existing constraint need not be VALID.

Does that make sense?

Yep.

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table. The table
continues to exist with the same name or 'table_name', if specified.
pg_class.relispartition is set to false for the table, so it behaves like
a normal table.

Could this take anything short of an access exclusive lock on the
parent?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Josh Berkus
josh@agliodbs.com
In reply to: Amit Langote (#1)
Re: Declarative partitioning

Amit,

I would like propose $SUBJECT for this development cycle. Attached is a
WIP patch that implements most if not all of what's described below. Some
yet unaddressed parts are mentioned below, too. I'll add this to the CF-SEP.

First of all, wow! Really happy to see this.

Syntax
======

1. Creating a partitioned table

CREATE TABLE table_name
PARTITION BY {RANGE|LIST}
ON (column_list);

Where column_list consists of simple column names or expressions:

PARTITION BY LIST ON (name)
PARTITION BY RANGE ON (year, month)

PARTITION BY LIST ON ((lower(left(name, 2)))
PARTITION BY RANGE ON ((extract(year from d)), (extract(month from d)))

So far so good. Have you given any thought as to how a user will
determine which partition corresponds to which values (for purposes of
dropping/maintaining a partition)?

Also, won't doing things like extract() for range partitions make it
much harder for you to develop the planner parts of this solution?

What about defining an interval instead, such as:

PARTITION BY RANGE USING ( interval ) ON ( column );

i.e.

PARTITION BY RANGE USING ( INTERVAL '1 month' ) ON ( submitted_date );
PARTITION BY RANGE USING ( 100000 ) ON ( user_id );

This would make it easy for you to construct range type values defining
the range of each partition, which would then make the planner work much
easier than calling a function would, no?

Or am I misunderstanding how you're using ranges here? It kind of seems
like you're still leaving specific range defintions up to the user,
which is (from my perspective) unsatisfactory (see below).

I'm assuming that all range partitions will be [ closed, open ) ranges.

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

rangevalues: START (col1min, ... ) END (col1max, ... )
| START (col1min, ... )
| END (col1max, ... )

So, one thing I missed in here is anything about automated partitioning
of tables; that is, creating new partitions based on incoming data or a
simple statement which doesn't require knowledge of the partitioning
scheme. It's possible (and entirely accceptable) that you're
considering automated partition creation outside of the scope of this
patch. However, for range partitions, it would be *really* useful to
have this syntax:

CREATE NEXT PARTITION ON parent_table;

Which would just create the "next" partition based on whatever the range
partitoning scheme is, instead of requiring the user to calculate start
and end values which might or might not match the parent partitioning
scheme, and might leave gaps. Also this would be useful for range
partitions:

CREATE PARTITION ON parent_table USING ( start_value );

... where start_value is the start range of the new partition. Again,
easier for users to get correct.

Both of these require the idea of regular intervals for range
partitions, that is, on a table partitioned by month on a timestamptz
column, each partition will have the range [ month:1, nextmonth:1 ).
This is the most common use-case for range partitions (like, 95% of all
partitioning cases I've seen), so a new partitioning scheme ought to
address it.

While there are certainly users who desire the ability to define
arbitrary ranges for each range partition, these are by far the minority
and could be accomodated by a different path with more complex syntax.
Further, I'd wager that most users who want to define arbitrary ranges
for range partitions aren't going to be satisfied with the other
restrictions on declarative partitioning (e.g. same constraints, columns
for all partitions) and are going to use inheritance partitioning anyway.

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table. The table
continues to exist with the same name or 'table_name', if specified.
pg_class.relispartition is set to false for the table, so it behaves like
a normal table.

What about DROPping partitions? Do they need to be detached first?

Creating index on parent is not allowed. They should be defined on (leaf)
partitions. Because of this limitation, primary keys are not allowed on a
partitioned table. Perhaps, we should be able to just create a dummy
entry somewhere to represent an index on parent (which every partition
then copies.)

This would be preferable, yes. Making users remember to manually create
indexes on each partition is undesirable.

What should TRUNCATE on partitioned table do?

On the master table? Truncate all individual partitions. Do not drop
the partitions.

On a partitition? Truncate just that partition.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Marc Mamin
M.Mamin@intershop.de
In reply to: Amit Langote (#1)
Re: Declarative partitioning

great :-)

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

Would it make sense to allow one complementary partition to the listvalues?

listvalues: [[NOT] IN] (val1, ...)

I've thought a few times about moving data with some most common values to dedicated partitions
and keeping the rest in a separate one...

best regards,

Marc Mamin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#1)
Re: Declarative partitioning

On Tue, Aug 18, 2015 at 6:30 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp

wrote:

Hi,

I would like propose $SUBJECT for this development cycle. Attached is a
WIP patch that implements most if not all of what's described below. Some
yet unaddressed parts are mentioned below, too. I'll add this to the
CF-SEP.

Syntax
======

1. Creating a partitioned table

CREATE TABLE table_name
PARTITION BY {RANGE|LIST}
ON (column_list);

Where column_list consists of simple column names or expressions:

PARTITION BY LIST ON (name)
PARTITION BY RANGE ON (year, month)

PARTITION BY LIST ON ((lower(left(name, 2)))
PARTITION BY RANGE ON ((extract(year from d)), (extract(month from d)))

Note: LIST partition key supports only one column.

For each column, you could write operator class name:

PARTITION BY LIST/RANGE ON (colname [USING] opclass_name),

If not specified, the default btree operator class based on type of each
key column is used. If none of the available btree operator classes are
compatible with the partitioning strategy (list/range), error is thrown.
Built-in btree operator classes cover a good number of types for list and
range partitioning in practical scenarios.

A table created using this form is of proposed new relkind
RELKIND_PARTITIONED_REL. An entry in pg_partitioned_rel (see below) is
created to store partition key info.

Note: A table cannot be partitioned after-the-fact using ALTER TABLE.

Normal dependencies are created between the partitioned table and operator
classes, object in partition expressions like functions.

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

rangevalues: START (col1min, ... ) END (col1max, ... )
| START (col1min, ... )
| END (col1max, ... )

A table created using this form has proposed pg_class.relispartition set
to true. An entry in pg_partition (see below) is created to store the
partition bound info.

The values_spec should match the partitioning strategy of the partitioned
table. In case of a range partition, the values in START and/or END should
match columns in the partition key.

Defining a list partition is fairly straightforward - just spell out the
list of comma-separated values. Error is thrown if the list of values
overlaps with one of the existing partitions' list.

CREATE TABLE persons_by_state (name text, state text)
PARTITION BY LIST ON (state);

CREATE TABLE persons_IL
PARTITION OF persons_by_state
FOR VALUES IN ('IL');

CREATE TABLE persons_fail
PARTITION OF persons_by_state
FOR VALUES IN ('IL');
ERROR: cannot create partition that overlaps with an existing one

For a range partition, there are more than one way:

Specify both START and END bounds: resulting range should not overlap with
the range(s) covered by existing partitions. Error is thrown otherwise.
Although rare in practice, gaps between ranges are OK.

CREATE TABLE measurement(logdate date NOT NULL)
PARTITION BY RANGE ON (logdate);

CREATE TABLE measurement_y2006m02
PARTITION OF measurement
FOR VALUES START ('2006-02-01') END ('2006-03-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES START ('2006-02-15') END ('2006-03-01');
ERROR: cannot create partition that overlaps with an existing one

Specify only the START bound: add the partition on the left of some range
covered by existing partitions provided no overlap occurs (also
considering gaps between ranges, if any). If no such range exists, the new
partition will cover the range [START, +INFINITY) and become the rightmost
partition. Error is thrown if the specified START causes overlap.

CREATE TABLE measurement_y2006m01
PARTITION OF measurement
FOR VALUES START ('2006-01-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES START ('2006-02-01'); --overlaps with measurement_y2006m02
ERROR: cannot create partition that overlaps with an existing one

Specify only the END bound: add the partition on the right of some range
covered by existing partitions provided no overlap occurs (also
considering gaps between ranges, if any). If no such range exists, the new
partition would cover the range (-INFINITY, END) and become the leftmost
partition. Error is thrown if the specified END causes overlap.

CREATE TABLE measurement_y2006m03
PARTITION OF measurement
FOR VALUES END ('2006-04-01'); --success

CREATE TABLE measurement_fail
PARTITION OF measurement
FOR VALUES END ('2006-03-01'); --overlaps with measurement_y2006m02
ERROR: cannot create partition that overlaps with an existing one

For each partition, START and END bound values are stored in the
catalog. Note that the lower bound is inclusive, whereas the upper bound
is exclusive.

Note: At most one range partition can have null min bound in which case it
covers the range (-INFINITY, END). Also, at most one range partition can
have null max bound in which case it covers the range [START, +INFINITY).

A normal dependency is created between the parent and the partition.

3. Multi-level partitioning

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec
PARTITION BY {RANGE|LIST} ON (columns_list)

This variant implements a form of so called composite or sub-partitioning
with arbitrarily deep partitioning structure. A table created using this
form has both the relkind RELKIND_PARTITIONED_REL and
pg_class.relispartition set to true.

4. (yet unimplemented) Attach partition (from existing table)

ALTER TABLE partitioned_table
ATTACH PARTITION partition_name
FOR VALUES values_spec
USING [TABLE] table_name;

ALTER TABLE table_name
SET VALID PARTITION OF <parent>;

The first of the above pair of commands would attach table_name as a (yet)
'invalid' partition of partitioned_table (after confirming that it matches
the schema and does not overlap with other partitions per FOR VALUES
spec). It would also record the FOR VALUES part in the partition catalog
and set pg_class.relispartition to true for table_name.

After the first command is done, the second command would take exclusive
lock on table_name, scan the table to check if it contains any values
outside the boundaries defined by FOR VALUES clause defined previously,
throw error if so, mark as valid partition of parent if not.

Does that make sense?

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table. The table
continues to exist with the same name or 'table_name', if specified.
pg_class.relispartition is set to false for the table, so it behaves like
a normal table.

System catalogs
===============

1. pg_partitioned_rel

CATALOG(pg_partitioned_rel,oid) BKI_WITHOUT_OIDS
{
Oid partrelid; /* partitioned table pg_class.oid */
char partstrategy; /* partitioning strategy 'l'/'r' */
int16 partnatts; /* number of partition columns */
int2vector partkey; /* column numbers of partition columns;
* 0 where specified column is an
* expresion */
oidvector partclass; /* operator class to compare keys */
pg_node_tree partexprs; /* expression trees for partition key
* members that are not simple column
* references; one for each zero entry
* in partkey[] */
};

2. pg_partition (omits partisvalid alluded to above)

CATALOG(pg_partition,oid) BKI_WITHOUT_OIDS
{
Oid partitionid; /* partition oid */
Oid partparent; /* parent oid */
anyarray partlistvalues; /* list of allowed values of the only
* partition column */
anyarray partrangebounds; /* list of bounds of ranges of
* allowed values per partition key
* column */
};

Further notes
=============

There are a number of restrictions on performing after-the-fact changes
using ALTER TABLE to partitions (ie, relispartition=true):

* Cannot add/drop column
* Cannot set/drop OIDs
* Cannot set/drop NOT NULL
* Cannot set/drop default
* Cannot alter column type
* Cannot add/drop alter constraint (table level)
* Cannot change persistence
* Cannot change inheritance
* Cannot link to a composite type

Such changes should be made to the topmost parent in the partitioning
hierarchy (hereafter referred to as just parent). These are recursively
applied to all the tables in the hierarchy. Although the last two items
cannot be performed on parent either.

Dropping a partition using DROP TABLE is not allowed. It needs to detached
using ALTER TABLE on parent before it can be dropped as a normal table.

Triggers on partitions are not allowed. They should be defined on the
parent. That said, I could not figure out a way to implement row-level
AFTER triggers on partitioned tables (more about that in a moment); so
they are currently not allowed:

CREATE TRIGGER audit_trig
AFTER INSERT ON persons
FOR EACH ROW EXECUTE PROCEDURE audit_func();
ERROR: Row-level AFTER triggers are not supported on partitioned tables

Column/table constraints on partitions are not allowed. They should be
defined on the parent. Foreign key constraints are not allowed due to
above limitation (no row-level after triggers).

A partitioning parent (RELKIND_PARTITIONED_REL) does not have storage.

Creating index on parent is not allowed. They should be defined on (leaf)
partitions. Because of this limitation, primary keys are not allowed on a
partitioned table. Perhaps, we should be able to just create a dummy
entry somewhere to represent an index on parent (which every partition
then copies.) Then by restricting primary key to contain all partition key
columns, we can implement unique constraint over the whole partitioned
table. That will in turn allow us to use partitioned tables as PK rels in
a foreign key constraint provided row-level AFTER trigger issue is
resolved.

VACUUM/ANALYZE on individual partitions should work like normal tables.
I've not implemented something like inheritance tree sampling for
partitioning tree in this patch. Autovacuum has been taught to ignore
parent tables and vacuum/analyze partitions normally.

Dropping a partitioned table should (?) unconditionally drop all its
partitions probably but, currently the patch uses dependencies, so
requires to specify CASCADE to do the same.

What should TRUNCATE on partitioned table do?

Ownership, privileges/permissions, RLS should be managed through the
parent table although not comprehensively addressed in the patch.

There is no need to define tuple routing triggers. CopyFrom() and
ExecInsert() determine target partition just before performing
heap_insert() and ExecInsertIndexTuples(). IOW, any BR triggers and
constraints (on parent) are executed for tuple before being routed to a
partition. If no partition can be found, it's an error.

Because row-level AFTER triggers need to save ItemPointers in trigger
event data and defining triggers on partitions (which is where tuples
really go) is not allowed, I could not find a straightforward way to
implement them. So, perhaps we should allow (only) row-level AFTER
triggers on partitions or think of modifying trigger.c to know about this
twist explicitly.

Internal representations
========================

For a RELKIND_PARTITIONED_REL relations, RelationData gets a few new
fields to store the partitioning metadata. That includes partition key
tuple (pg_partitioned_rel) including some derived info (opfamily,
opcintype, compare proc FmgrInfos, partition expression trees).

Additionally, it also includes a PartitionInfo object which includes
partition OIDs array, partition bound arrays (if range partitioned,
rangemax is sorted in ascending order and OIDs are likewise ordered). It
is built from information in pg_partition catalog.

While RelationBuildDesc() initializes the basic key info, fields like
expression trees, PartitionInfo are built on demand and cached. For
example, InitResultRelInfo() builds the latter to populate the newly added
ri_PartitionKeyInfo and ri_Partitions fields, respectively.

PartitionInfo object is rebuilt on every cache invalidation of the rel
which includes when adding/attaching/detaching a new partition.

Planner and executor considerations
=====================================

The patch does not yet implement any planner changes for partitioned
tables, although I'm working on the same and post updates as soon as
possible. That means, it is not possible to run SELECT/UPDATE/DELETE
queries on partitioned tables without getting:

postgres=# SELECT * FROM persons;
ERROR: could not open file "base/13244/106975": No such file or directory

Given that there would be more direct ways of performing partition pruning
decisions with the proposed, it would be nice to utilize them.
Specifically, I would like to avoid having to rely on constraint exclusion
for partition pruning whereby subquery_planner() builds append_rel_list
and the later steps exclude useless partitions.

By extending RelOptInfo to include partitioning info for partitioned rels,
it might be possible to perform partition pruning directly without
previously having to expand them. Although, as things stand now, it's not
clear how that might work - when would partition RTEs be added to the
rtable? The rtable is assumed not to change after
setup_simple_rel_arrays() has done its job which is much earlier than when
it would be desirable for the partitioned table expansion (along with
partition pruning) to happen. Moreover, if that means we might not be able
to build RelOptInfo's for partitions, how to choose best paths for them
(index paths or not, etc.)?

I'm also hoping we don't require something like inheritance_planner() for
when partitioned tables are target rels. I assume considerations for why
the special processing is necessary for inheritance trees in that scenario
don't apply to partitioning trees. So, if grouping_planner() returns a
Append plan (among other options) for the partitioning tree, tacking a
ModifyTable node on top should do the trick?

Suggestions greatly welcome in this area.

Other items
===========

Will include the following once we start reaching consensus on main parts
of the proposed design/implementation:

* New regression tests
* Documentation updates
* pg_dump, psql, etc.

For reference, some immediately previous discussions:

* On partitioning *

/messages/by-id/20140829155607.GF7705@eldon.alvh.no-ip.org

* Partitioning WIP patch *
/messages/by-id/54EC32B6.9070605@lab.ntt.co.jp

Comments welcome!

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Quick thoughts borne of years of slugging it out with partitions on Oracle:

- Finally!!!!!!!!!!!

- Your range partitioning will need to express exclusive/inclusive bounds,
or go to the Oracle model where every partition is a cascading "values less
than" test context dependent on the partitions defined before it. I would
suggest that leveraging existing range types (or allowing the user to
specify a range type, like for a specific collation of a text range) would
allow for the most flexible and postgres-ish range definition. You seem to
do this with the "[USING] opclass_name" bit, but don't follow through on
the START(...) and END(...). Something like FOR VALUES <@
'[''2014-01-01'',''2015-01-01)'::daterange would cover most needs
succinctly, though I admit the syntax for complex ranges could be
cumbersome, though something like FOR VALUES <@
'[(''a'',1),(''b'',1))'::letter_then_number_range is still readable.

- No partitioning scheme survives first contact with reality. So you will
need a facility for splitting and joining existing partitions. For
splitting partitions, it's sufficient to require that the new partition
share either a upper/lower bound (with the same inclusivity/exclusivity) of
an existing partition, thus uniquely identifying the partition to be split,
and require that the other bound be within the range of the partition to be
split. Similarly, it's fair to require that the partitions to be joined be
adjacent in range. In both cases, range operators make these tests simple.

- Your features 4 and 5 are implemented in Oracle with SWAP PARTITION,
which is really neat for doing ETLs and index rebuilds offline in a copy
table, and then swapping the data segment of that table with the partition
specified. Which could be considered cheating because none of the partition
metadata changed, just the pointers to the segments. We already do this
with adding removing INHERIT. I'm not saying they can't be separate
functionality, but keeping an atomic SWAP operation would be grand.

#8Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Fetter (#4)
Re: Declarative partitioning

On 2015-08-18 PM 10:43, David Fetter wrote:

After the first command is done, the second command would take exclusive
lock on table_name, scan the table to check if it contains any values
outside the boundaries defined by FOR VALUES clause defined previously,
throw error if so, mark as valid partition of parent if not.

One small change to make this part more efficient:

1. Take the access exclusive lock on table_name.
2. Check for a matching constraint on it.
3. If it's there, mark it as a valid partition.
4. If not, check for values outside the boundaries as above.

That's an interesting idea. Thanks!

By a matching constraint, I guess you mean a 'valid' constraint from which
the declared partition constraint can be proven to follow. For (a simple)
example, from a CHECK (a >= 100 AND a < 150) on table_name, the partition
constraint implied by FOR VALUES START (100) END (200) can be assumed to hold.

Should the be a *valid* constraint? Perhaps that should be
parameterized, as I'm not yet seeing a compelling argument either
direction. I'm picturing something like:

ALTER TABLE table_name SET VALID PARTITION OF <parent> [TRUST]

where TRUST would mean that an existing constraint need not be VALID.

Hmm, I'd think this step must be able to assert the partition constraint
beyond any doubt. If the DBA added the constraint and marked it invalid,
she should first VALIDATE the constraint to make it valid by performing
whatever steps necessary before. IOW, a full heap scan at least once is
inevitable (the reason why we might want to make this a two step process
at all). Am I missing something?

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table. The table
continues to exist with the same name or 'table_name', if specified.
pg_class.relispartition is set to false for the table, so it behaves like
a normal table.

Could this take anything short of an access exclusive lock on the
parent?

Yes, both the step 1 of ATTACH command and DETACH command take access
exclusive lock on the parent. They are rather quick metadata changes, so
should not stall others significantly, I think.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Marc Mamin (#6)
Re: Declarative partitioning

On 2015-08-19 AM 02:57, Marc Mamin wrote:

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

Would it make sense to allow one complementary partition to the listvalues?

listvalues: [[NOT] IN] (val1, ...)

I've thought a few times about moving data with some most common values to dedicated partitions
and keeping the rest in a separate one...

Thanks, that's definitely something to consider.

I have been thinking of a sort of default list partition for the "rest" of
values. Would you rather declare that with something like the below than
having to enumerate all the values in a NOT IN list? Or the NOT IN way is
more intuitive/friendly?

CREATE TABLE _rest PARTITION OF table_name FOR VALUES [ IN ] DEFAULT

Of course, at most one such partition would be allowed.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#5)
Re: Declarative partitioning

On 18 August 2015 at 18:31, Josh Berkus <josh@agliodbs.com> wrote:

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

rangevalues: START (col1min, ... ) END (col1max, ... )
| START (col1min, ... )
| END (col1max, ... )

So, one thing I missed in here is anything about automated partitioning
of tables; that is, creating new partitions based on incoming data or a
simple statement which doesn't require knowledge of the partitioning
scheme. It's possible (and entirely accceptable) that you're
considering automated partition creation outside of the scope of this
patch.

I would like to make automatic partitioning outside the scope of this first
patch.

However, for range partitions, it would be *really* useful to

have this syntax:

CREATE NEXT PARTITION ON parent_table;

Which would just create the "next" partition based on whatever the range
partitoning scheme is, instead of requiring the user to calculate start
and end values which might or might not match the parent partitioning
scheme, and might leave gaps. Also this would be useful for range
partitions:

CREATE PARTITION ON parent_table USING ( start_value );

... where start_value is the start range of the new partition. Again,
easier for users to get correct.

Both of these require the idea of regular intervals for range
partitions, that is, on a table partitioned by month on a timestamptz
column, each partition will have the range [ month:1, nextmonth:1 ).
This is the most common use-case for range partitions (like, 95% of all
partitioning cases I've seen), so a new partitioning scheme ought to
address it.

While there are certainly users who desire the ability to define
arbitrary ranges for each range partition, these are by far the minority
and could be accomodated by a different path with more complex syntax.
Further, I'd wager that most users who want to define arbitrary ranges
for range partitions aren't going to be satisfied with the other
restrictions on declarative partitioning (e.g. same constraints, columns
for all partitions) and are going to use inheritance partitioning anyway.

I like the idea of a regular partitioning step because it is how you design
such tables - "lets use monthly partitions".

This gives sanely terse syntax, rather than specifying pages and pages of
exact values in DDL....

PARTITION BY RANGE ON (columns) INCREMENT BY (INTERVAL '1 month' ) START
WITH value;

borrowing the same concepts from sequence syntax.

Creating index on parent is not allowed. They should be defined on (leaf)

partitions. Because of this limitation, primary keys are not allowed on a
partitioned table. Perhaps, we should be able to just create a dummy
entry somewhere to represent an index on parent (which every partition
then copies.)

This would be preferable, yes. Making users remember to manually create
indexes on each partition is undesirable.

I think it is useful to allow additional indexes on partitions, if desired,
but we should always automatically build the indexes that are defined on
the master when we create a new partition.

Presumably unique indexes will be allowed on partitions. So if the
partition key is unique, we can say the whole partitioned table is unique
and call that a Primary Key.

I would want individual partitions to be placed on separate tablespaces,
but not by default.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#1)
Re: Declarative partitioning

On 18 August 2015 at 11:30, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:

There is no need to define tuple routing triggers. CopyFrom() and
ExecInsert() determine target partition just before performing
heap_insert() and ExecInsertIndexTuples(). IOW, any BR triggers and
constraints (on parent) are executed for tuple before being routed to a
partition. If no partition can be found, it's an error.

Because row-level AFTER triggers need to save ItemPointers in trigger
event data and defining triggers on partitions (which is where tuples
really go) is not allowed, I could not find a straightforward way to
implement them. So, perhaps we should allow (only) row-level AFTER
triggers on partitions or think of modifying trigger.c to know about this
twist explicitly.

I think tables will eventually need FK support; its not sustainable as a
long term restriction, though perhaps its something we can do in a later
patch.

You haven't specified what would happen if an UPDATE would change a row's
partition. I'm happy to add this to the list of restrictions by saying that
the partition key cannot be updated.

We'll need regression tests that cover each restriction and docs that
match. This is not something we should leave until last. People read the
docs to understand the feature, helping them to reach consensus. So it is
for you to provide the docs before, not wait until later. I will begin a
code review once you tell me docs and tests are present. We all want the
feature, so its all about the details now.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#12David Fetter
david@fetter.org
In reply to: Amit Langote (#8)
Re: Declarative partitioning

On Wed, Aug 19, 2015 at 04:30:39PM +0900, Amit Langote wrote:

On 2015-08-18 PM 10:43, David Fetter wrote:

After the first command is done, the second command would take exclusive
lock on table_name, scan the table to check if it contains any values
outside the boundaries defined by FOR VALUES clause defined previously,
throw error if so, mark as valid partition of parent if not.

One small change to make this part more efficient:

1. Take the access exclusive lock on table_name.
2. Check for a matching constraint on it.
3. If it's there, mark it as a valid partition.
4. If not, check for values outside the boundaries as above.

That's an interesting idea. Thanks!

I hope I'm advancing this feature rather than bogging it down...

By a matching constraint, I guess you mean a 'valid' constraint from
which the declared partition constraint can be proven to follow. For
(a simple) example, from a CHECK (a >= 100 AND a < 150) on
table_name, the partition constraint implied by FOR VALUES START
(100) END (200) can be assumed to hold.

Well, I was assuming an exact match, but a stricter match seems like a
nice-to-have...possibly later.

Should the be a *valid* constraint? Perhaps that should be
parameterized, as I'm not yet seeing a compelling argument either
direction. I'm picturing something like:

ALTER TABLE table_name SET VALID PARTITION OF <parent> [TRUST]

where TRUST would mean that an existing constraint need not be VALID.

Hmm, I'd think this step must be able to assert the partition
constraint beyond any doubt. If the DBA added the constraint and
marked it invalid, she should first VALIDATE the constraint to make
it valid by performing whatever steps necessary before. IOW, a full
heap scan at least once is inevitable (the reason why we might want
to make this a two step process at all). Am I missing something?

There are use cases where we need to warn people that their assertions
need to be true, and if those assertions are not true, this will
explode, leaving them to pick the resulting shrapnel out of their
faces. There are other parts of the system where this is true, as
when people write UDFs in C.

As I understand it, NOT VALID means, "I assert that the tuples already
here fit the constraint. Any changes will be checked against the
constraint."

I've seen cases where a gigantic amount of data is coming out of some
distributed system which holds the constraint as an invariant. This
let a DBA decide to add a NOT VALID constraint in order not to take
the hit of a second full scan of the data, which might have made the
import, and possibly the entire project, untenable.

See above.

5. Detach partition

ALTER TABLE partitioned_table
DETACH PARTITION partition_name [USING table_name]

This removes partition_name as partition of partitioned_table.
The table continues to exist with the same name or 'table_name',
if specified. pg_class.relispartition is set to false for the
table, so it behaves like a normal table.

Could this take anything short of an access exclusive lock on the
parent?

Yes, both the step 1 of ATTACH command and DETACH command take
access exclusive lock on the parent. They are rather quick metadata
changes, so should not stall others significantly, I think.

So no. Weakening required locks has been something of an ongoing
project, project-wide, and need not be part of the first cut of this
long-needed feature.

Thanks so much for working on this!

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Marc Mamin
M.Mamin@intershop.de
In reply to: Amit Langote (#9)
Re: Declarative partitioning

On 2015-08-19 AM 02:57, Marc Mamin wrote:

2. Creating a partition of a partitioned table

CREATE TABLE table_name
PARTITION OF partitioned_table_name
FOR VALUES values_spec;

Where values_spec is:

listvalues: [IN] (val1, ...)

Would it make sense to allow one complementary partition to the listvalues?

listvalues: [[NOT] IN] (val1, ...)

I've thought a few times about moving data with some most common values to dedicated partitions
and keeping the rest in a separate one...

Thanks, that's definitely something to consider.

I have been thinking of a sort of default list partition for the "rest" of
values. Would you rather declare that with something like the below than
having to enumerate all the values in a NOT IN list? Or the NOT IN way is
more intuitive/friendly?

CREATE TABLE _rest PARTITION OF table_name FOR VALUES [ IN ] DEFAULT

Of course, at most one such partition would be allowed.

On the one hand I guess it will be easies to check for partition overlapping if their definitions all contain the exact allowed values.
But this could be generalized to range partitions too:

CREATE TABLE _rest FALLBACK PARTITION OF table_name

The need for it for range partitions seems very narrow at the first glimpse, but I remember bore administrative work in order to ensure that there always was a partition available for incoming data (from a very old time when I was still working with Oracle).

To have it comfortable and nevertheless allow to define new partitions, this would require to always check/move data from the default partition to new partitions at create time.

and 2 other thoughts:
- In your proposal, the parent table is not materialized at all. Could it be used for the fallback partition?
- what about always having a fallback partition? This would reduce the risk of unexpected failures and somewhat help Postgres stand out from the crowd :)

regards,
Marc Mamin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Josh Berkus
josh@agliodbs.com
In reply to: Amit Langote (#1)
Re: Declarative partitioning

On 08/19/2015 04:59 AM, Simon Riggs wrote:

I like the idea of a regular partitioning step because it is how you
design such tables - "lets use monthly partitions".

This gives sanely terse syntax, rather than specifying pages and pages
of exact values in DDL....

PARTITION BY RANGE ON (columns) INCREMENT BY (INTERVAL '1 month' )
START WITH value;

Oh, I like that syntax!

How would it work if there were multiple columns? Maybe we don't want
to allow that for this form?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Thom Brown
thom@linux.com
In reply to: Josh Berkus (#14)
Re: Declarative partitioning

On 19 August 2015 at 21:10, Josh Berkus <josh@agliodbs.com> wrote:

On 08/19/2015 04:59 AM, Simon Riggs wrote:

I like the idea of a regular partitioning step because it is how you
design such tables - "lets use monthly partitions".

This gives sanely terse syntax, rather than specifying pages and pages
of exact values in DDL....

PARTITION BY RANGE ON (columns) INCREMENT BY (INTERVAL '1 month' )
START WITH value;

Oh, I like that syntax!

How would it work if there were multiple columns? Maybe we don't want
to allow that for this form?

If we went with that, and had:

CREATE TABLE orders (order_id serial, order_date date, item text)
PARTITION BY RANGE ON (order_date) INCREMENT BY (INTERVAL '1 month')
START WITH '2015-01-01';

Where would the following go?

INSERT INTO orders (order_date, item) VALUES ('2014-11-12', 'Old item');

Would there automatically be an "others" partition? Or would it produce an
error and act like a constraint?

Thom

#16Josh Berkus
josh@agliodbs.com
In reply to: Amit Langote (#1)
Re: Declarative partitioning

On 08/19/2015 01:18 PM, Thom Brown wrote:

On 19 August 2015 at 21:10, Josh Berkus <josh@agliodbs.com
<mailto:josh@agliodbs.com>> wrote:

On 08/19/2015 04:59 AM, Simon Riggs wrote:

I like the idea of a regular partitioning step because it is how you
design such tables - "lets use monthly partitions".

This gives sanely terse syntax, rather than specifying pages and pages
of exact values in DDL....

PARTITION BY RANGE ON (columns) INCREMENT BY (INTERVAL '1 month' )
START WITH value;

Oh, I like that syntax!

How would it work if there were multiple columns? Maybe we don't want
to allow that for this form?

If we went with that, and had:

CREATE TABLE orders (order_id serial, order_date date, item text)
PARTITION BY RANGE ON (order_date) INCREMENT BY (INTERVAL '1 month')
START WITH '2015-01-01';

Where would the following go?

INSERT INTO orders (order_date, item) VALUES ('2014-11-12', 'Old item');

Would there automatically be an "others" partition? Or would it produce
an error and act like a constraint?

The "others" partition was brought up upthread, as an addition to the
original proposal. I really think that an "others" partition needs to
be up to the DBA; I've seen apps where they'd want to capture it, and
apps where they'd want such an insert to error.

I, for one, would be OK with a new partitioning which didn't address the
"others" partition issue until 9.7; I see it as a wholly separable
improvement.

Plus, you can always *manually* add high/low catchall partitions.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Simon Riggs (#11)
Re: Declarative partitioning

On 2015-08-19 PM 09:23, Simon Riggs wrote:

On 18 August 2015 at 11:30, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:

There is no need to define tuple routing triggers. CopyFrom() and
ExecInsert() determine target partition just before performing
heap_insert() and ExecInsertIndexTuples(). IOW, any BR triggers and
constraints (on parent) are executed for tuple before being routed to a
partition. If no partition can be found, it's an error.

Because row-level AFTER triggers need to save ItemPointers in trigger
event data and defining triggers on partitions (which is where tuples
really go) is not allowed, I could not find a straightforward way to
implement them. So, perhaps we should allow (only) row-level AFTER
triggers on partitions or think of modifying trigger.c to know about this
twist explicitly.

I think tables will eventually need FK support; its not sustainable as a
long term restriction, though perhaps its something we can do in a later
patch.

Sure. Solving the row-level AFTER trigger problem should hopefully open up
the possibility of partitioned-table-as-FK-rel implementation.

You haven't specified what would happen if an UPDATE would change a row's
partition. I'm happy to add this to the list of restrictions by saying that
the partition key cannot be updated.

UPDATEs that change a row's partition would cause error. I haven't
implemented that yet but will that way in the next patch.

By the last sentence, do you mean only UPDATEs to the partition key that
cause rows to jump partitions or simply any UPDATEs to the partition key?

We'll need regression tests that cover each restriction and docs that
match. This is not something we should leave until last. People read the
docs to understand the feature, helping them to reach consensus. So it is
for you to provide the docs before, not wait until later. I will begin a
code review once you tell me docs and tests are present. We all want the
feature, so its all about the details now.

Sorry, should have added tests and docs already. I will add them in the
next version of the patch. Thanks for willing to review.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#17)
Re: Declarative partitioning

On Thu, Aug 20, 2015 at 11:16 AM, Amit Langote <
Langote_Amit_f8@lab.ntt.co.jp> wrote:

On 2015-08-19 PM 09:23, Simon Riggs wrote:

We'll need regression tests that cover each restriction and docs that
match. This is not something we should leave until last. People read the
docs to understand the feature, helping them to reach consensus. So it is
for you to provide the docs before, not wait until later. I will begin a
code review once you tell me docs and tests are present. We all want the
feature, so its all about the details now.

Sorry, should have added tests and docs already. I will add them in the
next version of the patch.

Yes, those would be really good to have before any review so as it is
possible to grasp an understanding of what this patch does. I would like to
look at it as well more in depths.

Thanks for willing to review.

Really thanks for working on that! I am sure you are going to get a lot of
feedback.
Regards,
--
Michael

#19Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#17)
Re: Declarative partitioning

On 20 August 2015 at 03:16, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:

Sorry, should have added tests and docs already. I will add them in the
next version of the patch. Thanks for willing to review.

Thanks for picking up this challenge. It's easier if you have someone
interested all the way.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#20Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Josh Berkus (#14)
Re: Declarative partitioning

On 2015-08-20 AM 05:10, Josh Berkus wrote:

On 08/19/2015 04:59 AM, Simon Riggs wrote:

I like the idea of a regular partitioning step because it is how you
design such tables - "lets use monthly partitions".

This gives sanely terse syntax, rather than specifying pages and pages
of exact values in DDL....

PARTITION BY RANGE ON (columns) INCREMENT BY (INTERVAL '1 month' )
START WITH value;

Oh, I like that syntax!

How would it work if there were multiple columns? Maybe we don't want
to allow that for this form?

Yea, we could simply restrict it to the single column case, which does not
sound like a major restriction.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Amit Langote (#1)
#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#20)
#23Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Pavan Deolasee (#21)
#24Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Simon Riggs (#22)
#25Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#7)
#26Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Fetter (#12)
#27David Steele
david@pgmasters.net
In reply to: Amit Langote (#23)
#28David Fetter
david@fetter.org
In reply to: Amit Langote (#26)
#29Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#25)
#30Corey Huinker
corey.huinker@gmail.com
In reply to: David Steele (#27)
#31Josh Berkus
josh@agliodbs.com
In reply to: Amit Langote (#1)
#32David Fetter
david@fetter.org
In reply to: Amit Langote (#17)
#33Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Fetter (#28)
#34Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: David Fetter (#32)
#35Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Amit Langote (#34)
#36Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Josh Berkus (#5)
#37Josh Berkus
josh@agliodbs.com
In reply to: Amit Langote (#1)
#38Simon Riggs
simon@2ndQuadrant.com
In reply to: Josh Berkus (#37)
#39Andrew Dunstan
andrew@dunslane.net
In reply to: Simon Riggs (#38)
#40Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#1)
#41Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#1)
#42Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Simon Riggs (#40)
#43Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#42)
#44Thom Brown
thom@linux.com
In reply to: Amit Langote (#3)
#45Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Thom Brown (#44)
#46Thom Brown
thom@linux.com
In reply to: Amit Langote (#45)
#47Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Simon Riggs (#41)
#48Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#47)
#49Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#48)
#50Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Langote (#49)
#51Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Simon Riggs (#50)
#52Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Amit Langote (#51)
#53Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#52)
#54Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#53)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#54)
#56Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#55)
#57Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Michael Paquier (#56)
#58Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#57)
#59Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#53)
#60Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#59)
#61Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#60)
#62Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#61)
#63Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#62)
#64Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#63)
#65Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Amit Langote (#1)
#66Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#64)
#67Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#66)
#68Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tomas Vondra (#65)
#69Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#68)
#70Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#69)
#71Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#70)
#72Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#71)
#73Josh Berkus
josh@agliodbs.com
In reply to: Amit Langote (#1)
#74Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Josh Berkus (#73)
#75Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#62)
#76Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#65)
#77Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#69)
#78Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#75)
#79Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#78)
#80Peter Eisentraut
peter_e@gmx.net
In reply to: Amit Langote (#77)
#81Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Peter Eisentraut (#80)
#82Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#78)
#83Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#79)
#84Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#83)
#85Jean-Pierre Pelletier
jppelletier@e-djuster.com
In reply to: Corey Huinker (#84)
#86Jean-Pierre Pelletier
jppelletier@e-djuster.ca
In reply to: Jean-Pierre Pelletier (#85)
#87Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#82)
#88Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Jean-Pierre Pelletier (#86)
#89Ildar Musin
i.musin@postgrespro.ru
In reply to: Amit Langote (#74)
#90Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ildar Musin (#89)
#91Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#84)
#92Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#91)
#93Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Corey Huinker (#92)
#94Corey Huinker
corey.huinker@gmail.com
In reply to: Amit Langote (#93)
#95Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#91)
#96Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#87)
#97Corey Huinker
corey.huinker@gmail.com
In reply to: Robert Haas (#95)
#98Alexander Korotkov
aekorotkov@gmail.com
In reply to: Amit Langote (#93)
#99Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alexander Korotkov (#98)
#100Alexander Korotkov
aekorotkov@gmail.com
In reply to: Amit Langote (#99)
#101Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#100)
#102Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#101)
#103Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#101)
#104Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#103)
#105Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#104)
#106Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#105)
#107Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#105)
#108Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#106)
#109Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#107)
#110Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#106)
#111Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#108)
#112Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#109)
#113Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#110)
#114Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#113)
#115Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#114)
#116Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#112)
#117Ildar Musin
i.musin@postgrespro.ru
In reply to: Amit Langote (#105)
#118Alexander Korotkov
aekorotkov@gmail.com
In reply to: Ildar Musin (#117)
#119Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Alexander Korotkov (#118)
#120Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#119)
#121Ildar Musin
i.musin@postgrespro.ru
In reply to: Amit Langote (#120)
#122Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ildar Musin (#121)
#123Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#122)
#124Erik Rijkers
er@xs4all.nl
In reply to: Amit Langote (#105)
#125Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#123)
#126Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#125)
#127Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Erik Rijkers (#124)
#128Thakur, Sameer
Sameer.Thakur@nttdata.com
In reply to: Amit Langote (#126)
#129Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Thakur, Sameer (#128)
#130Ildar Musin
i.musin@postgrespro.ru
In reply to: Amit Langote (#123)
#131Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ildar Musin (#130)
#132Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#131)
#133Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#132)
#134Ildar Musin
i.musin@postgrespro.ru
In reply to: Amit Langote (#133)
#135Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#133)
#136Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#135)
#137Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ildar Musin (#134)
#138Ildar Musin
i.musin@postgrespro.ru
In reply to: Amit Langote (#137)
#139Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ildar Musin (#138)
#140Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#139)
#141Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#140)
#142Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#141)
#143Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#142)
#144Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#141)
#145Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#144)
#146Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#145)
#147Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#146)
#148Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#147)
#149Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#148)
#150Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#149)
#151Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#150)
#152Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#151)
#153Robert Haas
robertmhaas@gmail.com
In reply to: Ashutosh Bapat (#152)
#154Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Robert Haas (#153)
#155Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#154)
#156Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#155)
#157Robert Haas
robertmhaas@gmail.com
In reply to: Amit Langote (#155)
#158Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Robert Haas (#157)
#159Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Amit Langote (#158)
#160Ashutosh Bapat
ashutosh.bapat@enterprisedb.com
In reply to: Ashutosh Bapat (#159)
#161Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Ashutosh Bapat (#160)