Documentation improvements for partitioning
Here are some patches to improve the documentation about partitioned tables:
0001: Adds some details about partition_bound_spec to the CREATE TABLE
page, especially:
- a note about inclusivity of range partition bounds,
- a note about the UNBOUNDED literal in case of range partitioning,
- a note about the NULL literal in case of list partitioning,
I wonder if the above "note" should be added under the Notes section or
are they fine to be added as part of the description of PARTITION OF
clause. Also:
- in syntax synopsis, it appears now that any expression is OK to be used
for individual bound datum, but it's not true. Only literals are
allowed. So fixed that.
- added an example showing how to create partitions of a range
partitioned table with multiple columns in the partition key
- added PARTITION BY and PARTITION OF (FOR VALUES) as PostgreSQL
extensions in the compatibility section
0002: Adds details about partitioned tables to the DDL chapter (ddl.sgml)
- a new section named "Partitioned Tables" right next to the
Inheritance and Partitioning sections is created.
- examples are added to the existing Partitioning section using the new
partitioned tables. Old text about implementing table partitioning
using inheritance is kept, sort of as a still supported older
alternative.
0003: Add partitioning keywords to keywords.sgml
This is all I have for now. Any feedback is greatly appreciated. Adding
this to the next CF.
Thanks,
Amit
Attachments:
0001-Improve-CREATE-TABLE-documentation-of-partitioning.patchtext/x-diff; name=0001-Improve-CREATE-TABLE-documentation-of-partitioning.patchDownload
From 47c9edf3e46080d4bfb7d2a2908016c9039ee20f Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 26 Jan 2017 18:57:55 +0900
Subject: [PATCH 1/3] Improve CREATE TABLE documentation of partitioning
---
doc/src/sgml/ref/create_table.sgml | 103 ++++++++++++++++++++++++++++++++++---
1 file changed, 96 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 58f8bf6d6a..5596250aef 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -85,8 +85,8 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<phrase>and <replaceable class="PARAMETER">partition_bound_spec</replaceable> is:</phrase>
-{ IN ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) |
- FROM ( { <replaceable class="PARAMETER">expression</replaceable> | UNBOUNDED } [, ...] ) TO ( { <replaceable class="PARAMETER">expression</replaceable> | UNBOUNDED } [, ...] ) }
+{ IN ( { <replaceable class="PARAMETER">bound_literal</replaceable> | NULL } [, ...] ) |
+ FROM ( { <replaceable class="PARAMETER">bound_literal</replaceable> | UNBOUNDED } [, ...] ) TO ( { <replaceable class="PARAMETER">bound_literal</replaceable> | UNBOUNDED } [, ...] ) }
<phrase><replaceable class="PARAMETER">index_parameters</replaceable> in <literal>UNIQUE</literal>, <literal>PRIMARY KEY</literal>, and <literal>EXCLUDE</literal> constraints are:</phrase>
@@ -261,6 +261,44 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
any existing partition of that parent.
</para>
+ <note>
+ <para>
+ Each of the values specified in the partition bound specification is
+ a literal, <literal>NULL</literal>, or <literal>UNBOUNDED</literal>.
+ A literal is either a numeric constant or a string constant that can be
+ automatically coerced to the corresponding partition key column's type.
+ </para>
+
+ <para>
+ When creating a range partition, the lower bound specified with
+ <literal>FROM</literal> is an inclusive bound, whereas the upper bound
+ specified with <literal>TO</literal> is an exclusive bound. That is,
+ the values specified in the <literal>FROM</literal> list are accepted
+ values of the corresponding partition key columns in a given partition,
+ whereas those in the <literal>TO</literal> list are not. To be precise,
+ this applies only to the first of the partition key columns for which
+ the corresponding values in the <literal>FROM</literal> and
+ <literal>TO</literal> lists are not equal. All rows in a given
+ partition contain the same values for all preceding columns, equal to
+ those specified in <literal>FROM</literal> and <literal>TO</literal>
+ lists. On the other hand, any subsequent columns are insignificant
+ as far as implicit partition constraint is concerned.
+
+ Specifying <literal>UNBOUNDED</literal> in <literal>FROM</literal>
+ signifies <literal>-infinity</literal> as the lower bound of the
+ corresponding column, whereas it signifies <literal>+infinity</literal>
+ as the upper bound when specified in <literal>TO</literal>.
+ </para>
+
+ <para>
+ When creating a list partition, <literal>NULL</literal> can be specified
+ to signify that the partition allows the partition key column to be null.
+ However, there cannot be more than one such list partitions for a given
+ parent table. <literal>NULL</literal> cannot specified for range
+ partitions.
+ </para>
+ </note>
+
<para>
A partition cannot have columns other than those inherited from the
parent. That includes the <structfield>oid</> column, which can be
@@ -386,11 +424,12 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<firstterm>partitioned</firstterm> table. The parenthesized list of
columns or expressions forms the <firstterm>partition key</firstterm>
for the table. When using range partitioning, the partition key can
- include multiple columns or expressions, but for list partitioning, the
- partition key must consist of a single column or expression. If no
- btree operator class is specified when creating a partitioned table,
- the default btree operator class for the datatype will be used. If
- there is none, an error will be reported.
+ include multiple columns or expressions (up to 32, but this limit can
+ altered when building <productname>PostgreSQL</productname>.), but for
+ list partitioning, the partition key must consist of a single column or
+ expression. If no btree operator class is specified when creating a
+ partitioned table, the default btree operator class for the datatype will
+ be used. If there is none, an error will be reported.
</para>
<para>
@@ -1485,6 +1524,16 @@ CREATE TABLE measurement (
</programlisting></para>
<para>
+ Create a range partitioned table with multiple columns in the partition key:
+<programlisting>
+CREATE TABLE measurement_year_month (
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (EXTRACT(YEAR FROM DATE logdate), EXTRACT(MONTH FROM DATE logdate));
+</programlisting></para>
+
+ <para>
Create a list partitioned table:
<programlisting>
CREATE TABLE cities (
@@ -1504,6 +1553,27 @@ CREATE TABLE measurement_y2016m07
</programlisting></para>
<para>
+ Create a few partitions of a range partitioned table with multiple
+ columns in the partition key:
+<programlisting>
+CREATE TABLE measurement_ym_older
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (unbounded, unbounded) TO (2016, 11);
+
+CREATE TABLE measurement_ym_y2016m11
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (2016, 11) TO (2016, 12);
+
+CREATE TABLE measurement_ym_y2016m12
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (2016, 12) TO (2017, 01);
+
+CREATE TABLE measurement_ym_y2017m01
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (2017, 01) TO (2017, 02);
+</programlisting></para>
+
+ <para>
Create partition of a list partitioned table:
<programlisting>
CREATE TABLE cities_ab
@@ -1705,6 +1775,25 @@ CREATE TABLE cities_ab_10000_to_100000
effect can be had using the OID feature.
</para>
</refsect2>
+
+ <refsect2>
+ <title><literal>PARTITION BY</> Clause</title>
+
+ <para>
+ The <literal>PARTITION BY</> clause is a
+ <productname>PostgreSQL</productname> extension.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title><literal>PARTITION OF</> Clause</title>
+
+ <para>
+ The <literal>PARTITION OF</> clause is a
+ <productname>PostgreSQL</productname> extension.
+ </para>
+ </refsect2>
+
</refsect1>
--
2.11.0
0002-Update-ddl.sgml-for-declarative-partitioning.patchtext/x-diff; name=0002-Update-ddl.sgml-for-declarative-partitioning.patchDownload
From 06841c7c2bbce4e923d770ff8bb16a3bdc899a09 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Tue, 31 Jan 2017 10:57:41 +0900
Subject: [PATCH 2/3] Update ddl.sgml for declarative partitioning
Add a section titled "Partitioned Tables" to describe what are
partitioned tables, partition, their relation with inheritance,
differences, limitations, etc.
Then add examples to the partitioning chapter that show how to use
partitioned tables. In fact they implement the same partitioning
scheme that is currently shown using inheritance and highlights
some differences between the two methods.
---
doc/src/sgml/ddl.sgml | 431 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 412 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index aebe898466..491ce9b32c 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -12,7 +12,8 @@
Subsequently, we discuss how tables can be organized into
schemas, and how privileges can be assigned to tables. Finally,
we will briefly look at other features that affect the data storage,
- such as inheritance, views, functions, and triggers.
+ such as inheritance, table partitioning, views, functions, and
+ triggers.
</para>
<sect1 id="ddl-basics">
@@ -2771,6 +2772,132 @@ VALUES ('Albany', NULL, NULL, 'NY');
</sect2>
</sect1>
+ <sect1 id="ddl-partitioned-tables">
+ <title>Partitioned Tables</title>
+
+ <indexterm>
+ <primary>partitioned table</primary>
+ </indexterm>
+
+ <para>
+ PostgreSQL offers a way to specify how to divide a table into pieces
+ called partitions. The table that is divided is called
+ <firstterm>partitioned table</firstterm>. The specification consists
+ of the <firstterm>partitioning method</firstterm> and a list of columns
+ or expressions to be used as the <firstterm>partition key</firstterm>.
+ Any data inserted into a partitioned table must live in one of its
+ <firstterm>partitions</firstterm> based on the value of the partition
+ key. Each partition is assigned a subset of the data that is inserted
+ into the partitioned table, which is defined by its <firstterm>partition
+ bounds</firstterm>. Currently supported methods of partitioning include
+ range and list, wherein each partition is assigned a range of keys or
+ a list of keys, respectively. It is possible to implement
+ <firstterm>sub-partitioning</firstterm> by defining individual partitions
+ themselves to be partitioned table. See <xref linkend="sql-createtable">
+ for more details creating partitioned tables and partitions. It is not
+ currently possible to make a regular table into a partitioned table or
+ vice versa. However, it is possible to make a regular table containing
+ data a partition of a partitioned table and vice versa; see
+ <xref linkend="sql-altertable"> to learn more about the
+ <command>ATTACH PARTITION</> and <command>DETACH PARTITION</> commands.
+ </para>
+
+ <para>
+ Individual partitions are linked to the partitioned table with inheritance
+ behind-the-scenes, however it is not possible to use various inheritance
+ features discussed in the previous section with partitioned tables and
+ partitions. For example, partitions cannot have any other parents than
+ the partitioned table it is a partition of, nor can a regular table inherit
+ from a partitioned table making the latter its parent. That means
+ partitioned table and partitions do not participate in inheritance with
+ regular tables. Since a partition hierarchy consisting of the
+ partitioned table and its partitions is still an inheritance hierarchy,
+ all the normal rules of inheritance apply as described in the previous
+ section (<xref linkend="ddl-inherit">) with some exceptions, most notably:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
+ constraints of a partitioned table are always inherited by all its
+ partitions. There cannot be any <literal>CHECK</literal> constraints
+ that are marked <literal>NO INHERIT</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The <literal>ONLY</literal> notation used to exclude child tables
+ would either cause error or will be ignored in some cases for
+ partitioned tables. For example, specifying <literal>ONLY</literal>
+ when querying data from a partitioned table does not make much sense,
+ because all the data is contained in partitions, so it will be
+ ignored. Specifying <literal>ONLY</literal> when modifying schema is
+ not desirable in certain cases with partitioned tables where it may be
+ fine for regular inheritance parents (for example, dropping a column
+ from only the parent); an error will be thrown in that case.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Partitions cannot have columns that are not present in the parent.
+ It is neither possible to specify own columns when creating partitions
+ with <command>CREATE TABLE</> nor is it possible to add own columns
+ using <command>ALTER TABLE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ One cannot drop a <literal>NOT NULL</literal> constraint on a
+ partition's column, if the constraint is present in the parent table.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partitions can also be foreign tables (see <xref linkend="ddl-foreign-data">),
+ although certain limitations exist currently in their usage. For example,
+ data inserted into the partitioned table cannot be routed to foreign table
+ partitions.
+ </para>
+
+ <para>
+ There are currently following limitations of using partitioned tables:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It is currently not possible to define indexes on partitioned tables.
+ Consequently, it is not possible to create constraints that are realized
+ using an index such as <literal>UNIQUE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Foreign keys referencing partitioned tables are not supported, nor
+ are foreign key references from a partitioned table to some other table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Row triggers, if necessary, must be defined on individual partitions, not
+ the partitioned table as it is currently not supported.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ A detailed example that shows how to use partitioned tables is discussed in
+ the next chapter.
+ </para>
+
+ </sect1>
+
<sect1 id="ddl-partitioning">
<title>Partitioning</title>
@@ -2821,8 +2948,8 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
Bulk loads and deletes can be accomplished by adding or removing
partitions, if that requirement is planned into the partitioning design.
- <command>ALTER TABLE NO INHERIT</> and <command>DROP TABLE</> are
- both far faster than a bulk operation.
+ <command>ALTER TABLE NO INHERIT</> or <command>ALTER TABLE DETACH PARTITION</>
+ and <command>DROP TABLE</> are both far faster than a bulk operation.
These commands also entirely avoid the <command>VACUUM</command>
overhead caused by a bulk <command>DELETE</>.
</para>
@@ -2844,16 +2971,41 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
Currently, <productname>PostgreSQL</productname> supports partitioning
- via table inheritance. Each partition must be created as a child
- table of a single parent table. The parent table itself is normally
- empty; it exists just to represent the entire data set. You should be
- familiar with inheritance (see <xref linkend="ddl-inherit">) before
- attempting to set up partitioning.
+ using two methods:
+
+ <variablelist>
+ <varlistentry>
+ <term>Using Table Inheritance</term>
+
+ <listitem>
+ <para>
+ Each partition must be created as a child table of a single parent
+ table. The parent table itself is normally empty; it exists just to
+ represent the entire data set. You should be familiar with
+ inheritance (see <xref linkend="ddl-inherit">) before attempting to
+ set up partitioning with it. This was the only method to implement
+ partitioning in older versions.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Using Partitioned Tables</term>
+
+ <listitem>
+ <para>
+ See last section for some general information:
+ <xref linkend="ddl-partitioned-tables">
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
<para>
The following forms of partitioning can be implemented in
- <productname>PostgreSQL</productname>:
+ <productname>PostgreSQL</productname> using either of the above mentioned
+ methods, although the latter provides dedicated syntax for each:
<variablelist>
<varlistentry>
@@ -2888,7 +3040,7 @@ VALUES ('Albany', NULL, NULL, 'NY');
<title>Implementing Partitioning</title>
<para>
- To set up a partitioned table, do the following:
+ To set up a partitioned table using inheritance, do the following:
<orderedlist spacing="compact">
<listitem>
<para>
@@ -2979,6 +3131,88 @@ CHECK ( outletID BETWEEN 200 AND 300 )
</para>
<para>
+ To use partitioned tables, do the following:
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create <quote>master</quote> table as a partitioned table by
+ specifying the <literal>PARTITION BY</literal> clause, which includes
+ the partitioning method (<literal>RANGE</literal> or
+ <literal>LIST</literal>) and the list of column(s) to use as the
+ partition key. To be able to insert data into the table, one must
+ create partitions, as described below.
+ </para>
+
+ <note>
+ <para>
+ To decide when to use multiple columns in the partition key for range
+ partitioning, consider whether queries accessing the partitioned
+ in question will include conditions that involve multiple columns,
+ especially the columns being considered to be the partition key.
+ If so, the optimizer can create a plan that will scan fewer partitions
+ if a query's conditions are such that there is equality constraint on
+ leading partition key columns, because they limit the number of
+ partitions of interest. The first partition key column with
+ inequality constraint also further eliminates some partitions of
+ those chosen by equality constraints on earlier columns.
+ </para>
+ </note>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create partitions of the master partitioned table, with the partition
+ bounds specified for each partition matching the partitioning method
+ and partition key of the master table. Note that specifying partition
+ bounds such that the new partition's values will overlap with one or
+ more existing partitions will cause an error. It is only after
+ creating partitions that one is able to insert data into the master
+ partitioned table, provided it maps to one of the existing partitions.
+ If a data row does not map to any of the existing partitions, it will
+ cause an error.
+ </para>
+
+ <para>
+ Partitions thus created are also in every way normal
+ <productname>PostgreSQL</> tables (or, possibly, foreign tables),
+ whereas partitioned tables differ in a number of ways.
+ </para>
+
+ <para>
+ It is not necessary to create table constraints for partitions.
+ Instead, partition constraints are generated implicitly whenever
+ there is a need to refer to them. Also, since any data inserted into
+ the master partitioned table is automatically inserted into the
+ appropriate partition, it is not necessary to create triggers for the
+ same.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Just like with inheritance, create an index on the key column(s),
+ as well as any other indexes you might want for every partition.
+ Note that it is currently not supported to propagate index definition
+ from the master partitioned table to its partitions; in fact, it is
+ not possible to define indexes on partitioned tables in the first
+ place. This might change in future releases.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Currently, partitioned tables also depend on constraint exclusion
+ for query optimization, so ensure that the
+ <xref linkend="guc-constraint-exclusion"> configuration parameter is
+ not disabled in <filename>postgresql.conf</>. This might change in
+ future releases.
+ </para>
+ </listitem>
+
+ </orderedlist>
+ </para>
+
+ <para>
For example, suppose we are constructing a database for a large
ice cream company. The company measures peak temperatures every
day as well as ice cream sales in each region. Conceptually,
@@ -3004,7 +3238,8 @@ CREATE TABLE measurement (
<para>
In this situation we can use partitioning to help us meet all of our
different requirements for the measurements table. Following the
- steps outlined above, partitioning can be set up as follows:
+ steps outlined above for both methods, partitioning can be set up as
+ follows:
</para>
<para>
@@ -3172,10 +3407,85 @@ LANGUAGE plpgsql;
</para>
<para>
+ Steps when using a partitioned table are as follows:
+ </para>
+
+ <para>
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create the <structname>measurement</> table as a partitioned table:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (logdate);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Then create partitions as follows:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
+CREATE TABLE measurement_y2006m03 PARTITION OF measurement
+ FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
+...
+CREATE TABLE measurement_y2007m11 PARTITION OF measurement
+ FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
+CREATE TABLE measurement_y2007m12 PARTITION OF measurement
+ FOR VALUES FROM ('2007-12-01') TO ('2008-01-01');
+CREATE TABLE measurement_y2008m01 PARTITION OF measurement
+ FOR VALUES FROM ('2008-01-01') TO ('2008-02-01');
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create indexes on the key columns just like in case of inheritance
+ partitions.
+ </para>
+ </listitem>
+ </orderedlist>
+
+ <note>
+ <para>
+ To implement sub-partitioning, specify the
+ <literal>PARTITION BY</literal> clause in the commands used to create
+ individual partitions, for example:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+ PARTITION BY RANGE (peaktemp);
+</programlisting>
+
+ After creating partitions of <structname>measurement_y2006m02</>, any
+ data inserted into <structname>measurement</> that is mapped to
+ <structname>measurement_y2006m02</> will be further redirected to one
+ of its partitions based on the <structfield>peaktemp</> column.
+ Partition key specified may overlap with the parent's partition key,
+ although care must be taken when specifying the bounds of sub-partitions
+ such that the accepted set of data constitutes a subset of what a
+ partition's own bounds allows; the system does not try to check if
+ that's really the case.
+ </para>
+ </note>
+ </para>
+
+ <para>
As we can see, a complex partitioning scheme could require a
- substantial amount of DDL. In the above example we would be
- creating a new partition each month, so it might be wise to write a
- script that generates the required DDL automatically.
+ substantial amount of DDL, although significantly less when using
+ partitioned tables. In the above example we would be creating a new
+ partition each month, so it might be wise to write a script that
+ generates the required DDL automatically.
</para>
</sect2>
@@ -3195,8 +3505,15 @@ LANGUAGE plpgsql;
</para>
<para>
+ Both the inheritance-based and partitioned table methods allow this to
+ be done, although the latter requires taking an <literal>ACCESS EXCLUSIVE</literal>
+ lock on the master table for various commands mentioned below.
+ </para>
+
+ <para>
The simplest option for removing old data is simply to drop the partition
- that is no longer necessary:
+ that is no longer necessary, which works using both methods of
+ partitioning:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
@@ -3211,6 +3528,13 @@ DROP TABLE measurement_y2006m02;
<programlisting>
ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</programlisting>
+
+ When using a partitioned table:
+
+<programlisting>
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+</programlisting>
+
This allows further operations to be performed on the data before
it is dropped. For example, this is often a useful time to back up
the data using <command>COPY</>, <application>pg_dump</>, or
@@ -3230,6 +3554,13 @@ CREATE TABLE measurement_y2008m02 (
) INHERITS (measurement);
</programlisting>
+ When using a partitioned table:
+
+<programlisting>
+CREATE TABLE measurement_y2008m02 PARTITION OF measurement
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01');
+</programlisting>
+
As an alternative, it is sometimes more convenient to create the
new table outside the partition structure, and make it a proper
partition later. This allows the data to be loaded, checked, and
@@ -3244,7 +3575,28 @@ ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
-- possibly some other data preparation work
ALTER TABLE measurement_y2008m02 INHERIT measurement;
</programlisting>
+
+ The last of the above commands when using a partitioned table would be:
+
+<programlisting>
+ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+</programlisting>
</para>
+
+ <tip>
+ <para>
+ Before running the <command>ATTACH PARTITION</> command, it is
+ recommended to create a <literal>CHECK</> constraint on the table to
+ be attached describing the desired partition constraint. Using the
+ same, system is able to skip the scan to validate the implicit
+ partition constraint. Without such a constraint, the table will be
+ scanned to validate the partition constraint, while holding an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
+ One may want to drop the constraint after <command>ATTACH PARTITION</>
+ is finished, because it is no longer necessary.
+ </para>
+ </tip>
</sect2>
<sect2 id="ddl-partitioning-constraint-exclusion">
@@ -3340,6 +3692,15 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
are unlikely to benefit.
</para>
+ <note>
+ <para>
+ Currently, constraint exclusion is also used for partitioned tables.
+ However, we did not create any <literal>CHECK</literal> constraints
+ for individual partitions as seen above. In this case, the optimizer
+ uses internally generated constraint for every partition.
+ </para>
+ </note>
+
</sect2>
<sect2 id="ddl-partitioning-alternatives">
@@ -3348,7 +3709,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<para>
A different approach to redirecting inserts into the appropriate
partition table is to set up rules, instead of a trigger, on the
- master table. For example:
+ master table (unless it is a partitioned table). For example:
<programlisting>
CREATE RULE measurement_insert_y2006m02 AS
@@ -3408,7 +3769,8 @@ UNION ALL SELECT * FROM measurement_y2008m01;
<title>Caveats</title>
<para>
- The following caveats apply to partitioned tables:
+ The following caveats apply to partitioned tables implemented using either
+ method (unless noted otherwise):
<itemizedlist>
<listitem>
<para>
@@ -3418,6 +3780,13 @@ UNION ALL SELECT * FROM measurement_y2008m01;
partitions and creates and/or modifies associated objects than
to write each by hand.
</para>
+
+ <para>
+ This is not a problem with partitioned tables though, as trying to
+ create a partition that overlaps with one of the existing partitions
+ results in an error, so it is impossible to end up with partitions
+ that overlap one another.
+ </para>
</listitem>
<listitem>
@@ -3430,6 +3799,14 @@ UNION ALL SELECT * FROM measurement_y2008m01;
on the partition tables, but it makes management of the structure
much more complicated.
</para>
+
+ <para>
+ This problem exists even for partitioned tables. An <command>UPDATE</>
+ that causes a row to move from one partition to another fails, because
+ the new value of the row fails to satisfy the implicit partition
+ constraint of the original partition. This might change in future
+ releases.
+ </para>
</listitem>
<listitem>
@@ -3440,7 +3817,8 @@ UNION ALL SELECT * FROM measurement_y2008m01;
<programlisting>
ANALYZE measurement;
</programlisting>
- will only process the master table.
+ will only process the master table. This is true even for partitioned
+ tables.
</para>
</listitem>
@@ -3451,6 +3829,12 @@ ANALYZE measurement;
action is only taken in case of unique violations on the specified
target relation, not its child relations.
</para>
+
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clause are currently not allowed on partitioned tables, that is,
+ cause error when specified.
+ </para>
</listitem>
</itemizedlist>
@@ -3479,7 +3863,9 @@ ANALYZE measurement;
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
- using B-tree-indexable operators.
+ using B-tree-indexable operators, which applies even to partitioned
+ tables, because only B-tree-indexable column(s) are allowed in the
+ partition key.
</para>
</listitem>
@@ -3491,6 +3877,13 @@ ANALYZE measurement;
these techniques will work well with up to perhaps a hundred partitions;
don't try to use many thousands of partitions.
</para>
+
+ <para>
+ This restriction on the number of partitions currently applies even to
+ the partitioned tables, but it will be alleviated in future releases so
+ that the query planning time is not influenced much by the number of
+ partitions.
+ </para>
</listitem>
</itemizedlist>
--
2.11.0
0003-Add-partitioning-keywords-to-keywords.sgml.patchtext/x-diff; name=0003-Add-partitioning-keywords-to-keywords.sgml.patchDownload
From 684393c6c3e2b44f1e8fc4170f62508303a979c3 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 2 Feb 2017 14:01:02 +0900
Subject: [PATCH 3/3] Add partitioning keywords to keywords.sgml
---
doc/src/sgml/keywords.sgml | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index 0a8027e3a9..4407fb928b 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -336,6 +336,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>ATTACH</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>ATTRIBUTE</token></entry>
<entry>non-reserved</entry>
<entry>non-reserved</entry>
@@ -1365,6 +1372,13 @@
<entry>reserved</entry>
</row>
<row>
+ <entry><token>DETACH</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>DETERMINISTIC</token></entry>
<entry></entry>
<entry>reserved</entry>
@@ -2555,6 +2569,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>LIST</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>LISTEN</token></entry>
<entry>non-reserved</entry>
<entry></entry>
--
2.11.0
On Fri, Feb 3, 2017 at 4:15 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:
Here are some patches to improve the documentation about partitioned
tables:0001: Adds some details about partition_bound_spec to the CREATE TABLE
page, especially:- a note about inclusivity of range partition bounds,
- a note about the UNBOUNDED literal in case of range partitioning,
- a note about the NULL literal in case of list partitioning,I wonder if the above "note" should be added under the Notes section or
are they fine to be added as part of the description of PARTITION OF
clause. Also:- in syntax synopsis, it appears now that any expression is OK to be used
for individual bound datum, but it's not true. Only literals are
allowed. So fixed that.
- added an example showing how to create partitions of a range
partitioned table with multiple columns in the partition key
- added PARTITION BY and PARTITION OF (FOR VALUES) as PostgreSQL
extensions in the compatibility section0002: Adds details about partitioned tables to the DDL chapter (ddl.sgml)
- a new section named "Partitioned Tables" right next to the
Inheritance and Partitioning sections is created.
- examples are added to the existing Partitioning section using the new
partitioned tables. Old text about implementing table partitioning
using inheritance is kept, sort of as a still supported older
alternative.0003: Add partitioning keywords to keywords.sgml
This is all I have for now. Any feedback is greatly appreciated. Adding
this to the next CF.Thanks,
Amit
Patch applies.
Overall this looks really good. It goes a long way towards stating some of
the things I had to learn through experimentation.
I had to read a really long way into the patch before finding a blurb that
I felt wasn't completely clear:
+
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clause are currently not allowed on partitioned tables, that is,
+ cause error when specified.
+ </para>
Here's some other tries at saying the same thing, none of which are
completely satisfying:
...ON CONFLICT clause are currently not allowed on partitioned tables and
will cause an error?
...ON CONFLICT clause are currently not allowed on partitioned tables and
will instead cause an error?
...ON CONFLICT clause will currently cause an error if used on a
partitioned table?
As far as additional issues to cover, this bit:
+ <listitem>
+ <para>
+ One cannot drop a <literal>NOT NULL</literal> constraint on a
+ partition's column, if the constraint is present in the parent
table.
+ </para>
+ </listitem>
Maybe we should add something about how one would go about dropping a NOT
NULL constraint (parent first then partitions?)
In reviewing this patch, do all our target formats make word spacing
irrelevant? i.e. is there any point in looking at the number of spaces
after a period, etc?
A final note, because I'm really familiar with partitioning on Postgres and
other databases, documentation which is clear to me might not be to someone
less familiar with partitioning. Maybe we want another reviewer for that?
Hi Corey,
On 2017/02/09 6:14, Corey Huinker wrote:
On Fri, Feb 3, 2017 at 4:15 AM, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>
wrote:Here are some patches to improve the documentation about partitioned
tables:Patch applies.
Overall this looks really good. It goes a long way towards stating some of
the things I had to learn through experimentation.
Thanks a lot for taking a look at it.
I had to read a really long way into the patch before finding a blurb that
I felt wasn't completely clear:+ + <para> + <command>INSERT</command> statements with <literal>ON CONFLICT</> + clause are currently not allowed on partitioned tables, that is, + cause error when specified. + </para>Here's some other tries at saying the same thing, none of which are
completely satisfying:...ON CONFLICT clause are currently not allowed on partitioned tables and
will cause an error?
...ON CONFLICT clause are currently not allowed on partitioned tables and
will instead cause an error?
...ON CONFLICT clause will currently cause an error if used on a
partitioned table?
The last one sounds better.
As far as additional issues to cover, this bit:
+ <listitem> + <para> + One cannot drop a <literal>NOT NULL</literal> constraint on a + partition's column, if the constraint is present in the parent table. + </para> + </listitem>Maybe we should add something about how one would go about dropping a NOT
NULL constraint (parent first then partitions?)
Dropping it on the parent will cause it to be dropped on the partitions as
well. About your point whether we should add a note about how to go about
dropping it in the partition, it seems to me it would be out of place
here; it's just saying that dropping NOT NULL constraint has a different
behavior with partitioned tables than regular inheritance. That note most
likely belongs in the ALTER TABLE reference page in the DROP NOT NULL
description, so created a patch for that (patch 0004 of the attached patches).
In reviewing this patch, do all our target formats make word spacing
irrelevant? i.e. is there any point in looking at the number of spaces
after a period, etc?
It seems to be a convention in the sources to include 2 spaces after a
period, which I just try to follow (both in the code comments and SGML).
I don't see that spaces are relevant as far as how the targets such as
HTML are rendered.
A final note, because I'm really familiar with partitioning on Postgres and
other databases, documentation which is clear to me might not be to someone
less familiar with partitioning. Maybe we want another reviewer for that?
More eyeballs will only help make this better.
Thanks,
Amit
Attachments:
0001-Improve-CREATE-TABLE-documentation-of-partitioning.patchtext/x-diff; name=0001-Improve-CREATE-TABLE-documentation-of-partitioning.patchDownload
From 3074d1986afe0f3ce4710f38517f2e1929ff4c48 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 26 Jan 2017 18:57:55 +0900
Subject: [PATCH 1/4] Improve CREATE TABLE documentation of partitioning
---
doc/src/sgml/ref/create_table.sgml | 103 ++++++++++++++++++++++++++++++++++---
1 file changed, 96 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 58f8bf6d6a..5596250aef 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -85,8 +85,8 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<phrase>and <replaceable class="PARAMETER">partition_bound_spec</replaceable> is:</phrase>
-{ IN ( <replaceable class="PARAMETER">expression</replaceable> [, ...] ) |
- FROM ( { <replaceable class="PARAMETER">expression</replaceable> | UNBOUNDED } [, ...] ) TO ( { <replaceable class="PARAMETER">expression</replaceable> | UNBOUNDED } [, ...] ) }
+{ IN ( { <replaceable class="PARAMETER">bound_literal</replaceable> | NULL } [, ...] ) |
+ FROM ( { <replaceable class="PARAMETER">bound_literal</replaceable> | UNBOUNDED } [, ...] ) TO ( { <replaceable class="PARAMETER">bound_literal</replaceable> | UNBOUNDED } [, ...] ) }
<phrase><replaceable class="PARAMETER">index_parameters</replaceable> in <literal>UNIQUE</literal>, <literal>PRIMARY KEY</literal>, and <literal>EXCLUDE</literal> constraints are:</phrase>
@@ -261,6 +261,44 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
any existing partition of that parent.
</para>
+ <note>
+ <para>
+ Each of the values specified in the partition bound specification is
+ a literal, <literal>NULL</literal>, or <literal>UNBOUNDED</literal>.
+ A literal is either a numeric constant or a string constant that can be
+ automatically coerced to the corresponding partition key column's type.
+ </para>
+
+ <para>
+ When creating a range partition, the lower bound specified with
+ <literal>FROM</literal> is an inclusive bound, whereas the upper bound
+ specified with <literal>TO</literal> is an exclusive bound. That is,
+ the values specified in the <literal>FROM</literal> list are accepted
+ values of the corresponding partition key columns in a given partition,
+ whereas those in the <literal>TO</literal> list are not. To be precise,
+ this applies only to the first of the partition key columns for which
+ the corresponding values in the <literal>FROM</literal> and
+ <literal>TO</literal> lists are not equal. All rows in a given
+ partition contain the same values for all preceding columns, equal to
+ those specified in <literal>FROM</literal> and <literal>TO</literal>
+ lists. On the other hand, any subsequent columns are insignificant
+ as far as implicit partition constraint is concerned.
+
+ Specifying <literal>UNBOUNDED</literal> in <literal>FROM</literal>
+ signifies <literal>-infinity</literal> as the lower bound of the
+ corresponding column, whereas it signifies <literal>+infinity</literal>
+ as the upper bound when specified in <literal>TO</literal>.
+ </para>
+
+ <para>
+ When creating a list partition, <literal>NULL</literal> can be specified
+ to signify that the partition allows the partition key column to be null.
+ However, there cannot be more than one such list partitions for a given
+ parent table. <literal>NULL</literal> cannot specified for range
+ partitions.
+ </para>
+ </note>
+
<para>
A partition cannot have columns other than those inherited from the
parent. That includes the <structfield>oid</> column, which can be
@@ -386,11 +424,12 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<firstterm>partitioned</firstterm> table. The parenthesized list of
columns or expressions forms the <firstterm>partition key</firstterm>
for the table. When using range partitioning, the partition key can
- include multiple columns or expressions, but for list partitioning, the
- partition key must consist of a single column or expression. If no
- btree operator class is specified when creating a partitioned table,
- the default btree operator class for the datatype will be used. If
- there is none, an error will be reported.
+ include multiple columns or expressions (up to 32, but this limit can
+ altered when building <productname>PostgreSQL</productname>.), but for
+ list partitioning, the partition key must consist of a single column or
+ expression. If no btree operator class is specified when creating a
+ partitioned table, the default btree operator class for the datatype will
+ be used. If there is none, an error will be reported.
</para>
<para>
@@ -1485,6 +1524,16 @@ CREATE TABLE measurement (
</programlisting></para>
<para>
+ Create a range partitioned table with multiple columns in the partition key:
+<programlisting>
+CREATE TABLE measurement_year_month (
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (EXTRACT(YEAR FROM DATE logdate), EXTRACT(MONTH FROM DATE logdate));
+</programlisting></para>
+
+ <para>
Create a list partitioned table:
<programlisting>
CREATE TABLE cities (
@@ -1504,6 +1553,27 @@ CREATE TABLE measurement_y2016m07
</programlisting></para>
<para>
+ Create a few partitions of a range partitioned table with multiple
+ columns in the partition key:
+<programlisting>
+CREATE TABLE measurement_ym_older
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (unbounded, unbounded) TO (2016, 11);
+
+CREATE TABLE measurement_ym_y2016m11
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (2016, 11) TO (2016, 12);
+
+CREATE TABLE measurement_ym_y2016m12
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (2016, 12) TO (2017, 01);
+
+CREATE TABLE measurement_ym_y2017m01
+ PARTITION OF measurement_year_month
+ FOR VALUES FROM (2017, 01) TO (2017, 02);
+</programlisting></para>
+
+ <para>
Create partition of a list partitioned table:
<programlisting>
CREATE TABLE cities_ab
@@ -1705,6 +1775,25 @@ CREATE TABLE cities_ab_10000_to_100000
effect can be had using the OID feature.
</para>
</refsect2>
+
+ <refsect2>
+ <title><literal>PARTITION BY</> Clause</title>
+
+ <para>
+ The <literal>PARTITION BY</> clause is a
+ <productname>PostgreSQL</productname> extension.
+ </para>
+ </refsect2>
+
+ <refsect2>
+ <title><literal>PARTITION OF</> Clause</title>
+
+ <para>
+ The <literal>PARTITION OF</> clause is a
+ <productname>PostgreSQL</productname> extension.
+ </para>
+ </refsect2>
+
</refsect1>
--
2.11.0
0002-Update-ddl.sgml-for-declarative-partitioning.patchtext/x-diff; name=0002-Update-ddl.sgml-for-declarative-partitioning.patchDownload
From 21370b0bb06b6ab865c6791363f4a769ab0378ac Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Tue, 31 Jan 2017 10:57:41 +0900
Subject: [PATCH 2/4] Update ddl.sgml for declarative partitioning
Add a section titled "Partitioned Tables" to describe what are
partitioned tables, partition, their relation with inheritance,
differences, limitations, etc.
Then add examples to the partitioning chapter that show how to use
partitioned tables. In fact they implement the same partitioning
scheme that is currently shown using inheritance and highlights
some differences between the two methods.
---
doc/src/sgml/ddl.sgml | 430 +++++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 411 insertions(+), 19 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index aebe898466..f5c01928a4 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -12,7 +12,8 @@
Subsequently, we discuss how tables can be organized into
schemas, and how privileges can be assigned to tables. Finally,
we will briefly look at other features that affect the data storage,
- such as inheritance, views, functions, and triggers.
+ such as inheritance, table partitioning, views, functions, and
+ triggers.
</para>
<sect1 id="ddl-basics">
@@ -2771,6 +2772,132 @@ VALUES ('Albany', NULL, NULL, 'NY');
</sect2>
</sect1>
+ <sect1 id="ddl-partitioned-tables">
+ <title>Partitioned Tables</title>
+
+ <indexterm>
+ <primary>partitioned table</primary>
+ </indexterm>
+
+ <para>
+ <productname>PostgreSQL</productname> offers a way to specify how to
+ divide a table into pieces called partitions. The table that is divided
+ is called <firstterm>partitioned table</firstterm>. The specification
+ consists of the <firstterm>partitioning method</firstterm> and a list of
+ columns or expressions to be used as the <firstterm>partition key</firstterm>.
+ Any data inserted into a partitioned table must live in one of its key.
+ <firstterm>partitions</firstterm> based on the value of the partition key.
+ Each partition is assigned a subset of the data that is inserted into the
+ into the partitioned table, which is defined by its <firstterm>partition
+ bounds</firstterm>. Currently supported methods of partitioning include
+ range and list, wherein each partition is assigned a range of keys or
+ a list of keys, respectively. It is possible to implement
+ <firstterm>sub-partitioning</firstterm> by defining individual partitions
+ themselves to be partitioned table. See <xref linkend="sql-createtable">
+ for more details creating partitioned tables and partitions. It is not
+ currently possible to make a regular table into a partitioned table or
+ vice versa. However, it is possible to make a regular table containing
+ data a partition of a partitioned table and vice versa; see
+ <xref linkend="sql-altertable"> to learn more about the
+ <command>ATTACH PARTITION</> and <command>DETACH PARTITION</> commands.
+ </para>
+
+ <para>
+ Individual partitions are linked to the partitioned table with inheritance
+ behind-the-scenes, however it is not possible to use various inheritance
+ features discussed in the previous section with partitioned tables and
+ partitions. For example, partitions cannot have any other parents than
+ the partitioned table it is a partition of, nor can a regular table inherit
+ from a partitioned table making the latter its parent. That means
+ partitioned table and partitions do not participate in inheritance with
+ regular tables. Since a partition hierarchy consisting of the
+ partitioned table and its partitions is still an inheritance hierarchy,
+ all the normal rules of inheritance apply as described in the previous
+ section (<xref linkend="ddl-inherit">) with some exceptions, most notably:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
+ constraints of a partitioned table are always inherited by all its
+ partitions. There cannot be any <literal>CHECK</literal> constraints
+ that are marked <literal>NO INHERIT</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The <literal>ONLY</literal> notation used to exclude child tables
+ would either cause error or will be ignored in some cases for
+ partitioned tables. For example, specifying <literal>ONLY</literal>
+ when querying data from a partitioned table does not make much sense,
+ because all the data is contained in partitions, so it will be
+ ignored. Specifying <literal>ONLY</literal> when modifying schema is
+ not desirable in certain cases with partitioned tables where it may be
+ fine for regular inheritance parents (for example, dropping a column
+ from only the parent); an error will be thrown in that case.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Partitions cannot have columns that are not present in the parent.
+ It is neither possible to specify own columns when creating partitions
+ with <command>CREATE TABLE</> nor is it possible to add own columns
+ using <command>ALTER TABLE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ One cannot drop a <literal>NOT NULL</literal> constraint on a
+ partition's column, if the constraint is present in the parent table.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partitions can also be foreign tables (see <xref linkend="ddl-foreign-data">),
+ although certain limitations exist currently in their usage. For example,
+ data inserted into the partitioned table cannot be routed to foreign table
+ partitions.
+ </para>
+
+ <para>
+ There are currently following limitations of using partitioned tables:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It is currently not possible to define indexes on partitioned tables.
+ Consequently, it is not possible to create constraints that are realized
+ using an index such as <literal>UNIQUE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Foreign keys referencing partitioned tables are not supported, nor
+ are foreign key references from a partitioned table to some other table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Row triggers, if necessary, must be defined on individual partitions, not
+ the partitioned table as it is currently not supported.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ A detailed example that shows how to use partitioned tables is discussed in
+ the next chapter.
+ </para>
+
+ </sect1>
+
<sect1 id="ddl-partitioning">
<title>Partitioning</title>
@@ -2821,8 +2948,8 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
Bulk loads and deletes can be accomplished by adding or removing
partitions, if that requirement is planned into the partitioning design.
- <command>ALTER TABLE NO INHERIT</> and <command>DROP TABLE</> are
- both far faster than a bulk operation.
+ <command>ALTER TABLE NO INHERIT</> or <command>ALTER TABLE DETACH PARTITION</>
+ and <command>DROP TABLE</> are both far faster than a bulk operation.
These commands also entirely avoid the <command>VACUUM</command>
overhead caused by a bulk <command>DELETE</>.
</para>
@@ -2844,16 +2971,41 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
Currently, <productname>PostgreSQL</productname> supports partitioning
- via table inheritance. Each partition must be created as a child
- table of a single parent table. The parent table itself is normally
- empty; it exists just to represent the entire data set. You should be
- familiar with inheritance (see <xref linkend="ddl-inherit">) before
- attempting to set up partitioning.
+ using two methods:
+
+ <variablelist>
+ <varlistentry>
+ <term>Using Table Inheritance</term>
+
+ <listitem>
+ <para>
+ Each partition must be created as a child table of a single parent
+ table. The parent table itself is normally empty; it exists just to
+ represent the entire data set. You should be familiar with
+ inheritance (see <xref linkend="ddl-inherit">) before attempting to
+ set up partitioning with it. This was the only method to implement
+ partitioning in older versions.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Using Partitioned Tables</term>
+
+ <listitem>
+ <para>
+ See last section for some general information:
+ <xref linkend="ddl-partitioned-tables">
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
</para>
<para>
The following forms of partitioning can be implemented in
- <productname>PostgreSQL</productname>:
+ <productname>PostgreSQL</productname> using either of the above mentioned
+ methods, although the latter provides dedicated syntax for each:
<variablelist>
<varlistentry>
@@ -2888,7 +3040,7 @@ VALUES ('Albany', NULL, NULL, 'NY');
<title>Implementing Partitioning</title>
<para>
- To set up a partitioned table, do the following:
+ To set up a partitioned table using inheritance, do the following:
<orderedlist spacing="compact">
<listitem>
<para>
@@ -2979,6 +3131,88 @@ CHECK ( outletID BETWEEN 200 AND 300 )
</para>
<para>
+ To use partitioned tables, do the following:
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create <quote>master</quote> table as a partitioned table by
+ specifying the <literal>PARTITION BY</literal> clause, which includes
+ the partitioning method (<literal>RANGE</literal> or
+ <literal>LIST</literal>) and the list of column(s) to use as the
+ partition key. To be able to insert data into the table, one must
+ create partitions, as described below.
+ </para>
+
+ <note>
+ <para>
+ To decide when to use multiple columns in the partition key for range
+ partitioning, consider whether queries accessing the partitioned
+ in question will include conditions that involve multiple columns,
+ especially the columns being considered to be the partition key.
+ If so, the optimizer can create a plan that will scan fewer partitions
+ if a query's conditions are such that there is equality constraint on
+ leading partition key columns, because they limit the number of
+ partitions of interest. The first partition key column with
+ inequality constraint also further eliminates some partitions of
+ those chosen by equality constraints on earlier columns.
+ </para>
+ </note>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create partitions of the master partitioned table, with the partition
+ bounds specified for each partition matching the partitioning method
+ and partition key of the master table. Note that specifying partition
+ bounds such that the new partition's values will overlap with one or
+ more existing partitions will cause an error. It is only after
+ creating partitions that one is able to insert data into the master
+ partitioned table, provided it maps to one of the existing partitions.
+ If a data row does not map to any of the existing partitions, it will
+ cause an error.
+ </para>
+
+ <para>
+ Partitions thus created are also in every way normal
+ <productname>PostgreSQL</> tables (or, possibly, foreign tables),
+ whereas partitioned tables differ in a number of ways.
+ </para>
+
+ <para>
+ It is not necessary to create table constraints for partitions.
+ Instead, partition constraints are generated implicitly whenever
+ there is a need to refer to them. Also, since any data inserted into
+ the master partitioned table is automatically inserted into the
+ appropriate partition, it is not necessary to create triggers for the
+ same.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Just like with inheritance, create an index on the key column(s),
+ as well as any other indexes you might want for every partition.
+ Note that it is currently not supported to propagate index definition
+ from the master partitioned table to its partitions; in fact, it is
+ not possible to define indexes on partitioned tables in the first
+ place. This might change in future releases.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Currently, partitioned tables also depend on constraint exclusion
+ for query optimization, so ensure that the
+ <xref linkend="guc-constraint-exclusion"> configuration parameter is
+ not disabled in <filename>postgresql.conf</>. This might change in
+ future releases.
+ </para>
+ </listitem>
+
+ </orderedlist>
+ </para>
+
+ <para>
For example, suppose we are constructing a database for a large
ice cream company. The company measures peak temperatures every
day as well as ice cream sales in each region. Conceptually,
@@ -3004,7 +3238,8 @@ CREATE TABLE measurement (
<para>
In this situation we can use partitioning to help us meet all of our
different requirements for the measurements table. Following the
- steps outlined above, partitioning can be set up as follows:
+ steps outlined above for both methods, partitioning can be set up as
+ follows:
</para>
<para>
@@ -3172,10 +3407,85 @@ LANGUAGE plpgsql;
</para>
<para>
+ Steps when using a partitioned table are as follows:
+ </para>
+
+ <para>
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create the <structname>measurement</> table as a partitioned table:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (logdate);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Then create partitions as follows:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
+CREATE TABLE measurement_y2006m03 PARTITION OF measurement
+ FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
+...
+CREATE TABLE measurement_y2007m11 PARTITION OF measurement
+ FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
+CREATE TABLE measurement_y2007m12 PARTITION OF measurement
+ FOR VALUES FROM ('2007-12-01') TO ('2008-01-01');
+CREATE TABLE measurement_y2008m01 PARTITION OF measurement
+ FOR VALUES FROM ('2008-01-01') TO ('2008-02-01');
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create indexes on the key columns just like in case of inheritance
+ partitions.
+ </para>
+ </listitem>
+ </orderedlist>
+
+ <note>
+ <para>
+ To implement sub-partitioning, specify the
+ <literal>PARTITION BY</literal> clause in the commands used to create
+ individual partitions, for example:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+ PARTITION BY RANGE (peaktemp);
+</programlisting>
+
+ After creating partitions of <structname>measurement_y2006m02</>, any
+ data inserted into <structname>measurement</> that is mapped to
+ <structname>measurement_y2006m02</> will be further redirected to one
+ of its partitions based on the <structfield>peaktemp</> column.
+ Partition key specified may overlap with the parent's partition key,
+ although care must be taken when specifying the bounds of sub-partitions
+ such that the accepted set of data constitutes a subset of what a
+ partition's own bounds allows; the system does not try to check if
+ that's really the case.
+ </para>
+ </note>
+ </para>
+
+ <para>
As we can see, a complex partitioning scheme could require a
- substantial amount of DDL. In the above example we would be
- creating a new partition each month, so it might be wise to write a
- script that generates the required DDL automatically.
+ substantial amount of DDL, although significantly less when using
+ partitioned tables. In the above example we would be creating a new
+ partition each month, so it might be wise to write a script that
+ generates the required DDL automatically.
</para>
</sect2>
@@ -3195,8 +3505,15 @@ LANGUAGE plpgsql;
</para>
<para>
+ Both the inheritance-based and partitioned table methods allow this to
+ be done, although the latter requires taking an <literal>ACCESS EXCLUSIVE</literal>
+ lock on the master table for various commands mentioned below.
+ </para>
+
+ <para>
The simplest option for removing old data is simply to drop the partition
- that is no longer necessary:
+ that is no longer necessary, which works using both methods of
+ partitioning:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
@@ -3211,6 +3528,13 @@ DROP TABLE measurement_y2006m02;
<programlisting>
ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</programlisting>
+
+ When using a partitioned table:
+
+<programlisting>
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+</programlisting>
+
This allows further operations to be performed on the data before
it is dropped. For example, this is often a useful time to back up
the data using <command>COPY</>, <application>pg_dump</>, or
@@ -3230,6 +3554,13 @@ CREATE TABLE measurement_y2008m02 (
) INHERITS (measurement);
</programlisting>
+ When using a partitioned table:
+
+<programlisting>
+CREATE TABLE measurement_y2008m02 PARTITION OF measurement
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01');
+</programlisting>
+
As an alternative, it is sometimes more convenient to create the
new table outside the partition structure, and make it a proper
partition later. This allows the data to be loaded, checked, and
@@ -3244,7 +3575,28 @@ ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
-- possibly some other data preparation work
ALTER TABLE measurement_y2008m02 INHERIT measurement;
</programlisting>
+
+ The last of the above commands when using a partitioned table would be:
+
+<programlisting>
+ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+</programlisting>
</para>
+
+ <tip>
+ <para>
+ Before running the <command>ATTACH PARTITION</> command, it is
+ recommended to create a <literal>CHECK</> constraint on the table to
+ be attached describing the desired partition constraint. Using the
+ same, system is able to skip the scan to validate the implicit
+ partition constraint. Without such a constraint, the table will be
+ scanned to validate the partition constraint, while holding an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
+ One may want to drop the constraint after <command>ATTACH PARTITION</>
+ is finished, because it is no longer necessary.
+ </para>
+ </tip>
</sect2>
<sect2 id="ddl-partitioning-constraint-exclusion">
@@ -3340,6 +3692,15 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
are unlikely to benefit.
</para>
+ <note>
+ <para>
+ Currently, constraint exclusion is also used for partitioned tables.
+ However, we did not create any <literal>CHECK</literal> constraints
+ for individual partitions as seen above. In this case, the optimizer
+ uses internally generated constraint for every partition.
+ </para>
+ </note>
+
</sect2>
<sect2 id="ddl-partitioning-alternatives">
@@ -3348,7 +3709,7 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
<para>
A different approach to redirecting inserts into the appropriate
partition table is to set up rules, instead of a trigger, on the
- master table. For example:
+ master table (unless it is a partitioned table). For example:
<programlisting>
CREATE RULE measurement_insert_y2006m02 AS
@@ -3408,7 +3769,8 @@ UNION ALL SELECT * FROM measurement_y2008m01;
<title>Caveats</title>
<para>
- The following caveats apply to partitioned tables:
+ The following caveats apply to partitioned tables implemented using either
+ method (unless noted otherwise):
<itemizedlist>
<listitem>
<para>
@@ -3418,6 +3780,13 @@ UNION ALL SELECT * FROM measurement_y2008m01;
partitions and creates and/or modifies associated objects than
to write each by hand.
</para>
+
+ <para>
+ This is not a problem with partitioned tables though, as trying to
+ create a partition that overlaps with one of the existing partitions
+ results in an error, so it is impossible to end up with partitions
+ that overlap one another.
+ </para>
</listitem>
<listitem>
@@ -3430,6 +3799,14 @@ UNION ALL SELECT * FROM measurement_y2008m01;
on the partition tables, but it makes management of the structure
much more complicated.
</para>
+
+ <para>
+ This problem exists even for partitioned tables. An <command>UPDATE</>
+ that causes a row to move from one partition to another fails, because
+ the new value of the row fails to satisfy the implicit partition
+ constraint of the original partition. This might change in future
+ releases.
+ </para>
</listitem>
<listitem>
@@ -3440,7 +3817,8 @@ UNION ALL SELECT * FROM measurement_y2008m01;
<programlisting>
ANALYZE measurement;
</programlisting>
- will only process the master table.
+ will only process the master table. This is true even for partitioned
+ tables.
</para>
</listitem>
@@ -3451,6 +3829,11 @@ ANALYZE measurement;
action is only taken in case of unique violations on the specified
target relation, not its child relations.
</para>
+
+ <para>
+ <literal>ON CONFLICT</literal> clause will currently cause an error if
+ used on a partitioned table.
+ </para>
</listitem>
</itemizedlist>
@@ -3479,7 +3862,9 @@ ANALYZE measurement;
range tests for range partitioning, as illustrated in the preceding
examples. A good rule of thumb is that partitioning constraints should
contain only comparisons of the partitioning column(s) to constants
- using B-tree-indexable operators.
+ using B-tree-indexable operators, which applies even to partitioned
+ tables, because only B-tree-indexable column(s) are allowed in the
+ partition key.
</para>
</listitem>
@@ -3491,6 +3876,13 @@ ANALYZE measurement;
these techniques will work well with up to perhaps a hundred partitions;
don't try to use many thousands of partitions.
</para>
+
+ <para>
+ This restriction on the number of partitions currently applies even to
+ the partitioned tables, but it will be alleviated in future releases so
+ that the query planning time is not influenced much by the number of
+ partitions.
+ </para>
</listitem>
</itemizedlist>
--
2.11.0
0003-Add-partitioning-keywords-to-keywords.sgml.patchtext/x-diff; name=0003-Add-partitioning-keywords-to-keywords.sgml.patchDownload
From d031422edbd07c59ffd00aed30d933af0c5c3576 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 2 Feb 2017 14:01:02 +0900
Subject: [PATCH 3/4] Add partitioning keywords to keywords.sgml
---
doc/src/sgml/keywords.sgml | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index 0a8027e3a9..4407fb928b 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -336,6 +336,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>ATTACH</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>ATTRIBUTE</token></entry>
<entry>non-reserved</entry>
<entry>non-reserved</entry>
@@ -1365,6 +1372,13 @@
<entry>reserved</entry>
</row>
<row>
+ <entry><token>DETACH</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>DETERMINISTIC</token></entry>
<entry></entry>
<entry>reserved</entry>
@@ -2555,6 +2569,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>LIST</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>LISTEN</token></entry>
<entry>non-reserved</entry>
<entry></entry>
--
2.11.0
0004-Add-a-note-about-DROP-NOT-NULL-and-partitions.patchtext/x-diff; name=0004-Add-a-note-about-DROP-NOT-NULL-and-partitions.patchDownload
From 3cdc14f40120a45d1950329df4e8226fe8c26810 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 10 Feb 2017 15:03:45 +0900
Subject: [PATCH 4/4] Add a note about DROP NOT NULL and partitions
On the ALTER TABLE refpage, it seems better to mention how to drop
drop the not null constraint of a partition's column. Per suggestion
from Corey Huinker.
---
doc/src/sgml/ref/alter_table.sgml | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index da431f8369..be857882bb 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -172,9 +172,13 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
- If this table is a partition, one cannot perform <literal>DROP NOT NULL</>
+ If this table is a partition, one cannot perform <literal>DROP NOT NULL</literal>
on a column if it is marked <literal>NOT NULL</literal> in the parent
- table.
+ table. To drop the <literal>NOT NULL</literal> constraint from all the
+ partitions, perform <literal>DROP NOT NULL</literal> on the parent
+ table. One might however want to set it for only some partitions,
+ which is possible by doing <literal>SET NOT NULL</literal> on individual
+ partitions.
</para>
</listitem>
</varlistentry>
--
2.11.0
On 10 February 2017 at 07:35, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
A final note, because I'm really familiar with partitioning on Postgres and
other databases, documentation which is clear to me might not be to someone
less familiar with partitioning. Maybe we want another reviewer for that?More eyeballs will only help make this better.
Given that we already have partitioning feature committed, we really
need to have the docs committed as well.
Without claiming I'm happy about this, I think the best way to improve
the number of eyeballs on this is to commit these docs as is.
For me, the most important thing is understanding the feature, not
(yet) discussing what the docs should look like. This is especially
true if other patches reference the way partitioning works and nobody
can comment on those patches because they don't understand
Any issues with that?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/10 17:00, Simon Riggs wrote:
On 10 February 2017 at 07:35, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:A final note, because I'm really familiar with partitioning on Postgres and
other databases, documentation which is clear to me might not be to someone
less familiar with partitioning. Maybe we want another reviewer for that?More eyeballs will only help make this better.
Given that we already have partitioning feature committed, we really
need to have the docs committed as well.Without claiming I'm happy about this, I think the best way to improve
the number of eyeballs on this is to commit these docs as is.For me, the most important thing is understanding the feature, not
(yet) discussing what the docs should look like. This is especially
true if other patches reference the way partitioning works and nobody
can comment on those patches because they don't understandAny issues with that?
I agree that getting the proposed documentation changes committed would be
a step ahead. I saw in the logical replication thread that dealing with
partitioned tables without the docs explaining what they are has been
difficult. Hopefully the proposed documentation improvements help make
progress in that regard. Partitioned tables, at this point, have certain
limitations which affect its interaction with other features (old or new);
documenting those limitations will be helpful not only to the users
deciding whether to start using the new partitioned tables right away, but
also to the developers of other features who want to understand what
partitioned tables are.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10 February 2017 at 08:18, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
I agree that getting the proposed documentation changes committed would be
a step ahead.
Committed. I tested how it works and added documentation that matched
my experiences, correcting what you'd said and adding more information
for clarity as I went.
Few points from tests
* "ERROR: cannot create index on partitioned table "measurement_year_month""
is misleading because you can create indexes on partitions
* You can create additional CHECK (column is NOT NULL) as a check
constraint, even if you can't create a not null constraint
* The OID inheritance needs work - you shouldn't need to specify a
partition needs OIDS if the parent has it already.
* There's no tab completion, which prevents people from testing this
(maybe better now with some docs)
* ERROR: no partition of relation "measurement_year_month" found for row
DETAIL: Failing row contains (2016-12-02, 1, 1).
should not return the whole row, just the partition keys
* It's very weird you can't DROP a partitioned table. I think you need
to add dependencies.
* ERROR: TO must specify exactly one value per partitioning column
should say something more like "you specified one column value,
whereas the partitioning key requires two columns"
Good work so far, but there is still a very significant amount of work
to do. And as this feature evolves it must now contain full
documentation at every step, otherwise it won't be able to receive
full and fair review. So please make sure each new patch contains docs
changes for that patch.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/10 19:19, Simon Riggs wrote:
On 10 February 2017 at 08:18, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:I agree that getting the proposed documentation changes committed would be
a step ahead.Committed. I tested how it works and added documentation that matched
my experiences, correcting what you'd said and adding more information
for clarity as I went.
Thanks for making the improvements to and committing the patch!
Few points from tests
* "ERROR: cannot create index on partitioned table "measurement_year_month""
is misleading because you can create indexes on partitions
Do you mean that this should not cause an error at all, but create the
specified index on partitions as part of running the command? Should the
code to handle that be part of this release?
Or do you simply mean that we should rewrite the error message and/or add
a HINT asking to create the index on partitions instead?
* You can create additional CHECK (column is NOT NULL) as a check
constraint, even if you can't create a not null constraint
Sorry but I am not exactly sure which "cannot create a not null
constraint" you are referring to here.
There are no restrictions on *creating* constraints on partitions, but
there are on dropping. Regular inheritance rules prevent dropping
inherited constraints (just the same for partitioned tables), of which
there are only named CHECK constraints at the moment. A new rule
introduced for partitions prevents dropping a column's NOT NULL constraint
if it's been "inherited" (i.e., the same constraint is present in the
parent partitioned table), although it's not in the same sense as for
CHECK constraints, because NOT NULL constraints are not tracked with
pg_constraints.
* The OID inheritance needs work - you shouldn't need to specify a
partition needs OIDS if the parent has it already.
That sounds right. It's better to keep the behavior same as for regular
inheritance. Will post a patch to get rid of the unnecessary check.
FWIW, the check was added to prevent the command from succeeding in the
case where WITHOUT OIDS has been specified for a partition and the parent
partitioned table has the OID column. Regular inheritance simply
*overrides* the WITHOUT OIDS specification, which might be seen as surprising.
* There's no tab completion, which prevents people from testing this
(maybe better now with some docs)
Will post a patch as well.
* ERROR: no partition of relation "measurement_year_month" found for row
DETAIL: Failing row contains (2016-12-02, 1, 1).
should not return the whole row, just the partition keys
I think that makes sense. Something along the lines of
BuildIndexValueDescription() for partition keys will be necessary. Will
post a patch.
* It's very weird you can't DROP a partitioned table. I think you need
to add dependencies.
Do you mean it should be possible to DROP a partitioned table without
needing to specify CASCADE? Currently, same thing happens for a
partitioned table as will for a inheritance parent.
* ERROR: TO must specify exactly one value per partitioning column
should say something more like "you specified one column value,
whereas the partitioning key requires two columns"
Should that be a DETAIL or HINT message?
Good work so far, but there is still a very significant amount of work
to do. And as this feature evolves it must now contain full
documentation at every step, otherwise it won't be able to receive
full and fair review. So please make sure each new patch contains docs
changes for that patch.
Agreed that comprehensive documentation of any new feature is crucial both
during development and after the feature is released.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/13 14:21, Amit Langote wrote:
On 2017/02/10 19:19, Simon Riggs wrote:
* The OID inheritance needs work - you shouldn't need to specify a
partition needs OIDS if the parent has it already.That sounds right. It's better to keep the behavior same as for regular
inheritance. Will post a patch to get rid of the unnecessary check.FWIW, the check was added to prevent the command from succeeding in the
case where WITHOUT OIDS has been specified for a partition and the parent
partitioned table has the OID column. Regular inheritance simply
*overrides* the WITHOUT OIDS specification, which might be seen as surprising.
0001 of the attached patches takes care of this.
* There's no tab completion, which prevents people from testing this
(maybe better now with some docs)Will post a patch as well.
...and 0002 for this.
* ERROR: no partition of relation "measurement_year_month" found for row
DETAIL: Failing row contains (2016-12-02, 1, 1).
should not return the whole row, just the partition keysI think that makes sense. Something along the lines of
BuildIndexValueDescription() for partition keys will be necessary. Will
post a patch.
Let me spend a bit more time on this one.
Thanks,
Amit
Attachments:
0001-Inherit-OID-system-column-automatically-for-partitio.patchtext/x-diff; name=0001-Inherit-OID-system-column-automatically-for-partitio.patchDownload
From 42682394d3d40aeaa5b2565a4f0ca23828a0dda0 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 13 Feb 2017 13:32:18 +0900
Subject: [PATCH 1/3] Inherit OID system column automatically for partitions
Currently, WITH OIDS must be explicitly specified when creating a
partition if the parent table has the OID system column. Instead,
inherit it automatically, possibly overriding any explicit WITHOUT
OIDS specification. Per review comment from Simon Riggs
---
src/backend/commands/tablecmds.c | 21 ++++++++-------------
src/test/regress/expected/create_table.out | 18 +++++++++++++-----
src/test/regress/sql/create_table.sql | 10 ++++++----
3 files changed, 27 insertions(+), 22 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 37a4c4a3d6..96650e69df 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -634,19 +634,14 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
relkind == RELKIND_PARTITIONED_TABLE));
descriptor->tdhasoid = (localHasOids || parentOidCount > 0);
- if (stmt->partbound)
- {
- /* If the parent has OIDs, partitions must have them too. */
- if (parentOidCount > 0 && !localHasOids)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("cannot create table without OIDs as partition of table with OIDs")));
- /* If the parent doesn't, partitions must not have them. */
- if (parentOidCount == 0 && localHasOids)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("cannot create table with OIDs as partition of table without OIDs")));
- }
+ /*
+ * If a partitioned table doesn't have the system OID column, then none
+ * of its partitions should have it.
+ */
+ if (stmt->partbound && parentOidCount == 0 && localHasOids)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot create table with OIDs as partition of table without OIDs")));
/*
* Find columns with default values and prepare for insertion of the
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index fc92cd92dd..20eb3d35f9 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -524,16 +524,24 @@ DROP TABLE temp_parted;
CREATE TABLE no_oids_parted (
a int
) PARTITION BY RANGE (a) WITHOUT OIDS;
-CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10 )WITH OIDS;
+CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10) WITH OIDS;
ERROR: cannot create table with OIDs as partition of table without OIDs
DROP TABLE no_oids_parted;
--- likewise, the reverse if also true
+-- If the partitioned table has oids, then the partition must have them.
+-- If the WITHOUT OIDS option is specified for partition, it is overridden.
CREATE TABLE oids_parted (
a int
) PARTITION BY RANGE (a) WITH OIDS;
-CREATE TABLE fail_part PARTITION OF oids_parted FOR VALUES FROM (1) TO (10 ) WITHOUT OIDS;
-ERROR: cannot create table without OIDs as partition of table with OIDs
-DROP TABLE oids_parted;
+CREATE TABLE part_forced_oids PARTITION OF oids_parted FOR VALUES FROM (1) TO (10) WITHOUT OIDS;
+\d+ part_forced_oids
+ Table "public.part_forced_oids"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+Partition of: oids_parted FOR VALUES FROM (1) TO (10)
+Has OIDs: yes
+
+DROP TABLE oids_parted, part_forced_oids;
-- check for partition bound overlap and other invalid specifications
CREATE TABLE list_parted2 (
a varchar
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 5f25c436ee..f41dd71475 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -493,15 +493,17 @@ DROP TABLE temp_parted;
CREATE TABLE no_oids_parted (
a int
) PARTITION BY RANGE (a) WITHOUT OIDS;
-CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10 )WITH OIDS;
+CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10) WITH OIDS;
DROP TABLE no_oids_parted;
--- likewise, the reverse if also true
+-- If the partitioned table has oids, then the partition must have them.
+-- If the WITHOUT OIDS option is specified for partition, it is overridden.
CREATE TABLE oids_parted (
a int
) PARTITION BY RANGE (a) WITH OIDS;
-CREATE TABLE fail_part PARTITION OF oids_parted FOR VALUES FROM (1) TO (10 ) WITHOUT OIDS;
-DROP TABLE oids_parted;
+CREATE TABLE part_forced_oids PARTITION OF oids_parted FOR VALUES FROM (1) TO (10) WITHOUT OIDS;
+\d+ part_forced_oids
+DROP TABLE oids_parted, part_forced_oids;
-- check for partition bound overlap and other invalid specifications
--
2.11.0
0002-Tab-completion-for-the-new-partitioning-syntax.patchtext/x-diff; name=0002-Tab-completion-for-the-new-partitioning-syntax.patchDownload
From c09837eff10725bc33265cf6d91e8e8e81dd2c55 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 13 Feb 2017 18:18:48 +0900
Subject: [PATCH 2/3] Tab completion for the new partitioning syntax
---
src/bin/psql/tab-complete.c | 59 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 58 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 6e759d0b76..73958cdebf 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -463,6 +463,21 @@ static const SchemaQuery Query_for_list_of_tables = {
NULL
};
+static const SchemaQuery Query_for_list_of_partitioned_tables = {
+ /* catname */
+ "pg_catalog.pg_class c",
+ /* selcondition */
+ "c.relkind IN ('P')",
+ /* viscondition */
+ "pg_catalog.pg_table_is_visible(c.oid)",
+ /* namespace */
+ "c.relnamespace",
+ /* result */
+ "pg_catalog.quote_ident(c.relname)",
+ /* qualresult */
+ NULL
+};
+
static const SchemaQuery Query_for_list_of_constraints_with_schema = {
/* catname */
"pg_catalog.pg_constraint c",
@@ -913,6 +928,16 @@ static const SchemaQuery Query_for_list_of_matviews = {
" SELECT 'DEFAULT' ) ss "\
" WHERE pg_catalog.substring(name,1,%%d)='%%s'"
+/* the silly-looking length condition is just to eat up the current word */
+#define Query_for_partition_of_table \
+"SELECT pg_catalog.quote_ident(c2.relname) "\
+" FROM pg_catalog.pg_class c1, pg_catalog.pg_class c2, pg_catalog.pg_inherits i"\
+" WHERE c1.oid=i.inhparent and i.inhrelid=c2.oid"\
+" and (%d = pg_catalog.length('%s'))"\
+" and pg_catalog.quote_ident(c1.relname)='%s'"\
+" and pg_catalog.pg_table_is_visible(c2.oid)"\
+" and c2.relispartition = 'true'"
+
/*
* This is a list of all "things" in Pgsql, which can show up after CREATE or
* DROP; and there is also a query to get a list of them.
@@ -1741,7 +1766,8 @@ psql_completion(const char *text, int start, int end)
static const char *const list_ALTER2[] =
{"ADD", "ALTER", "CLUSTER ON", "DISABLE", "DROP", "ENABLE", "INHERIT",
"NO INHERIT", "RENAME", "RESET", "OWNER TO", "SET",
- "VALIDATE CONSTRAINT", "REPLICA IDENTITY", NULL};
+ "VALIDATE CONSTRAINT", "REPLICA IDENTITY", "ATTACH PARTITION",
+ "DETACH PARTITION", NULL};
COMPLETE_WITH_LIST(list_ALTER2);
}
@@ -1922,6 +1948,26 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH_LIST4("FULL", "NOTHING", "DEFAULT", "USING");
else if (Matches4("ALTER", "TABLE", MatchAny, "REPLICA"))
COMPLETE_WITH_CONST("IDENTITY");
+ /*
+ * If we have ALTER TABLE <foo> ATTACH PARTITION, provide a list of
+ * tables.
+ */
+ else if (Matches5("ALTER", "TABLE", MatchAny, "ATTACH", "PARTITION"))
+ COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
+ /* Limited completion support for partition bound specification */
+ else if (TailMatches3("ATTACH", "PARTITION", MatchAny))
+ COMPLETE_WITH_CONST("FOR VALUES");
+ else if (TailMatches5("ATTACH", "PARTITION", MatchAny, "FOR", "VALUES"))
+ COMPLETE_WITH_LIST2("FROM (", "IN (");
+ /*
+ * If we have ALTER TABLE <foo> DETACH PARTITION, provide a list of
+ * partitions of <foo>.
+ */
+ else if (Matches5("ALTER", "TABLE", MatchAny, "DETACH", "PARTITION"))
+ {
+ completion_info_charp = prev3_wd;
+ COMPLETE_WITH_QUERY(Query_for_partition_of_table);
+ }
/* ALTER TABLESPACE <foo> with RENAME TO, OWNER TO, SET, RESET */
else if (Matches3("ALTER", "TABLESPACE", MatchAny))
@@ -2299,6 +2345,17 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE UNLOGGED" with TABLE or MATVIEW */
else if (TailMatches2("CREATE", "UNLOGGED"))
COMPLETE_WITH_LIST2("TABLE", "MATERIALIZED VIEW");
+ /* Complete PARTITION BY with RANGE ( or LIST ( or ... */
+ else if (TailMatches2("PARTITION", "BY"))
+ COMPLETE_WITH_LIST2("RANGE (", "LIST (");
+ /* If we have xxx PARTITION OF, provide a list of partitioned tables */
+ else if (TailMatches2("PARTITION", "OF"))
+ COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_partitioned_tables, "");
+ /* Limited completion support for partition bound specification */
+ else if (TailMatches3("PARTITION", "OF", MatchAny))
+ COMPLETE_WITH_CONST("FOR VALUES");
+ else if (TailMatches5("PARTITION", "OF", MatchAny, "FOR", "VALUES"))
+ COMPLETE_WITH_LIST2("FROM (", "IN (");
/* CREATE TABLESPACE */
else if (Matches3("CREATE", "TABLESPACE", MatchAny))
--
2.11.0
On 2017/02/13 14:21, Amit Langote wrote:
On 2017/02/10 19:19, Simon Riggs wrote:
On 10 February 2017 at 08:18, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:I agree that getting the proposed documentation changes committed would be
a step ahead.Committed. I tested how it works and added documentation that matched
my experiences, correcting what you'd said and adding more information
for clarity as I went.Thanks for making the improvements to and committing the patch!
Since I had added this to CF 2017-03, I have marked it as committed.
https://commitfest.postgresql.org/13/983/
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 10, 2017 at 3:00 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
Given that we already have partitioning feature committed, we really
need to have the docs committed as well.
Just for the record, it's not like there were no documentation changes
in the originally committed patch. In fact there were over 400 lines
of documentation:
doc/src/sgml/catalogs.sgml | 129 +++++++++++++++++++++++-
doc/src/sgml/ref/alter_table.sgml | 117 +++++++++++++++++++++-
doc/src/sgml/ref/create_foreign_table.sgml | 26 +++++
doc/src/sgml/ref/create_table.sgml | 154 +++++++++++++++++++++++++++++
The patch you committed approximately doubles the amount of
documentation for this feature, which is fine as far as it goes, but
the key points were all explained in the original commit. I have been
known to leave out documentation from commits from time to time and
fill it in after-the-fact, but that's not really what happened here.
Without claiming I'm happy about this, I think the best way to improve
the number of eyeballs on this is to commit these docs as is.For me, the most important thing is understanding the feature, not
(yet) discussing what the docs should look like. This is especially
true if other patches reference the way partitioning works and nobody
can comment on those patches because they don't understandAny issues with that?
There are a number of things that I think are awkward about the patch
as committed:
+ <listitem>
+ <para>
+ See last section for some general information:
+ <xref linkend="ddl-partitioned-tables">
+ </para>
+ </listitem>
I think we generally try to write the documentation in such a way as
to minimize backward references, and a backward reference to the
previous section seems particularly odd. We've now got section "5.10
Partitioned Tables" followed immediately by section "5.11
Partitioning", where the latter seems to think that you haven't read
the former.
I think that section 5.11 needs a much heavier rewrite than what it
got as part of this patch. It's a hodgepodge of the old content
(which explained how to fake partitioning when we didn't have an
explicit concept of partitioning) and new content that talks about how
the new way is different from the old way. But now that we have the
new way, I'm guessing that most people are going to use that and not
care about the old way any more. I'm not that it's even appropriate
to keep the lengthy explanation of how to fake table partitioning
using table inheritance and non-overlapping CHECK constraints, but if
we still want that stuff it should be de-emphasized more than it is
here. Probably the section should be retitled: in previous releases
we called this "Partitioning" because we had no explicit notion of
partitioning, but now that we do, it's confusing to have a section
called "Partitioning" that explains how to avoid using the
partitioning feature, which is more or less what this does. Or maybe
the section title should stay the same (or this should be merged into
the previous section?) but rewritten to change the emphasis.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/02/13 14:21, Amit Langote wrote:
On 2017/02/10 19:19, Simon Riggs wrote:
* The OID inheritance needs work - you shouldn't need to specify a
partition needs OIDS if the parent has it already.That sounds right. It's better to keep the behavior same as for regular
inheritance. Will post a patch to get rid of the unnecessary check.FWIW, the check was added to prevent the command from succeeding in the
case where WITHOUT OIDS has been specified for a partition and the parent
partitioned table has the OID column. Regular inheritance simply
*overrides* the WITHOUT OIDS specification, which might be seen as surprising.0001 of the attached patches takes care of this.
I think 0001 needs to remove this hunk of documentation:
<listitem>
<para>
If the partitioned table specified <literal>WITH OIDS</literal> then
each partition must also specify <literal>WITH OIDS</literal>. Oids
are not automatically inherited by partitions.
</para>
</listitem>
I think 0001 is better than the status quo, but I'm wondering whether
we should try to do something slightly different. Maybe it should
always work for the child table to specify neither WITH OIDS nor
WITHOUT OIDS, but if you do specify one of them then it has to be the
one that matches the parent partitioned table? With this patch, IIUC,
WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS
is allowed (but ignored) regardless of the parent setting.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Noticed some typos in the documentation. Here's patch to correct
those. Sorry, if it has been already taken care of.
On Wed, Feb 15, 2017 at 10:01 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2017/02/13 14:21, Amit Langote wrote:
On 2017/02/10 19:19, Simon Riggs wrote:
* The OID inheritance needs work - you shouldn't need to specify a
partition needs OIDS if the parent has it already.That sounds right. It's better to keep the behavior same as for regular
inheritance. Will post a patch to get rid of the unnecessary check.FWIW, the check was added to prevent the command from succeeding in the
case where WITHOUT OIDS has been specified for a partition and the parent
partitioned table has the OID column. Regular inheritance simply
*overrides* the WITHOUT OIDS specification, which might be seen as surprising.0001 of the attached patches takes care of this.
I think 0001 needs to remove this hunk of documentation:
<listitem>
<para>
If the partitioned table specified <literal>WITH OIDS</literal> then
each partition must also specify <literal>WITH OIDS</literal>. Oids
are not automatically inherited by partitions.
</para>
</listitem>I think 0001 is better than the status quo, but I'm wondering whether
we should try to do something slightly different. Maybe it should
always work for the child table to specify neither WITH OIDS nor
WITHOUT OIDS, but if you do specify one of them then it has to be the
one that matches the parent partitioned table? With this patch, IIUC,
WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS
is allowed (but ignored) regardless of the parent setting.--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
pg_part_doc.patchapplication/octet-stream; name=pg_part_doc.patchDownload
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 39e4446..f909242 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2907,8 +2907,8 @@ VALUES ('Albany', NULL, NULL, 'NY');
<listitem>
<para>
- Since Primary Keys are not supprtable on partitioned tables
- Foreign keys referencing partitioned tables are not supported, nor
+ Since primary keys are not supported on partitioned tables
+ foreign keys referencing partitioned tables are not supported, nor
are foreign key references from a partitioned table to some other table.
</para>
</listitem>
On Wed, Feb 15, 2017 at 4:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
Noticed some typos in the documentation. Here's patch to correct
those. Sorry, if it has been already taken care of.
Thanks. That is indeed nonstandard capitalization. Committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 13 February 2017 at 05:21, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
Few points from tests
* "ERROR: cannot create index on partitioned table "measurement_year_month""
is misleading because you can create indexes on partitionsDo you mean that this should not cause an error at all, but create the
specified index on partitions as part of running the command? Should the
code to handle that be part of this release?
Sounds fairly basic to me. If you don't support this, then presumably
every ORM, pgAdmin etc will all be broken.
And 1000 people will need to write a script that does what we could do
easily in a loop internally.
At present you haven't even documented how you'd do this.
I see you might want to create an index on a subset of partitions, but
I expect that you've thought of that and have a proposal for syntax
that allows it. Though the default should be that CREATE INDEX just
works.
Or do you simply mean that we should rewrite the error message and/or add
a HINT asking to create the index on partitions instead?
You could
* It's very weird you can't DROP a partitioned table. I think you need
to add dependencies.Do you mean it should be possible to DROP a partitioned table without
needing to specify CASCADE? Currently, same thing happens for a
partitioned table as will for a inheritance parent.
If we wanted them to act identically we wouldn't have any need for a
new feature at all, so clearly that doesn't make sense as an argument.
If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it
has indexes, sequences etc on it. So why should it just because it has
partitions? Most especially if they were created automatically for me
in the first place. I might just understand that running ATTACH TABLE
might change that viewpoint.
I'm pretty sure DROP TABLE and CREATE INDEX are fairly basic
expectations from users about how tables should work, partitioned or
otherwise.
It leaves me asking what else is missing.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 02/15/2017 06:10 AM, Simon Riggs wrote:
On 13 February 2017 at 05:21, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it
has indexes, sequences etc on it. So why should it just because it has
partitions?
Because partitions may have data.
JD
--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Joshua D. Drake wrote:
On 02/15/2017 06:10 AM, Simon Riggs wrote:
On 13 February 2017 at 05:21, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it
has indexes, sequences etc on it. So why should it just because it has
partitions?Because partitions may have data.
So would the table, were it not partitioned.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Feb 15, 2017 at 9:37 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Joshua D. Drake wrote:
On 02/15/2017 06:10 AM, Simon Riggs wrote:
On 13 February 2017 at 05:21, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:If I issue DROP TABLE elsewhere, it doesn't refuse to drop because it
has indexes, sequences etc on it. So why should it just because it has
partitions?Because partitions may have data.
So would the table, were it not partitioned.
True. I think the question here is: do we want to view the dependency
between a partitioned table and a partition of that table as
DEPENDENCY_NORMAL or as DEPENDENCY_AUTO? With table inheritance, it's
always been "normal" and I'm not sure there's any good reason for
partitioning to make the opposite decision. The new partitioning
implementation provides a user experience that is overall smoother
than doing the same thing with inheritance, but it's not as if you can
ignore the fact that your partitioned tables have sub-objects that are
also tables.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Feb 15, 2017 at 9:10 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
* "ERROR: cannot create index on partitioned table "measurement_year_month""
is misleading because you can create indexes on partitionsDo you mean that this should not cause an error at all, but create the
specified index on partitions as part of running the command? Should the
code to handle that be part of this release?Sounds fairly basic to me. If you don't support this, then presumably
every ORM, pgAdmin etc will all be broken.
I don't see why that should be the case.
And 1000 people will need to write a script that does what we could do
easily in a loop internally.
Now that is probably true.
At present you haven't even documented how you'd do this.
It's not that hard to figure it out, though. A HINT wouldn't be a bad
idea, certainly.
There are some really thorny problems with making index creation
cascade to all of the partitions. I think it's worth doing, but
there's a lot of stuff to think about before you go and start writing
code. Most obviously, if you can use a single CREATE INDEX statement
to create indexes on all of the partitions, then probably you ought to
also be able to use DROP INDEX to get rid of all of those indexes. In
other words, it should probably work a lot like what already happens
with constraints: constraints cascade from the parent down to the
children, but we still know which child object goes with which parent
object, so if the parent object is dropped we can get rid of all of
the children. I think we need something similar here, although if we
restrict it to the partitioning case and don't make it work with table
inheritance then it can be simpler since table partitioning doesn't
allow multiple inheritance. Presumably we'd want other index commands
like REINDEX to cascade similarly.
Also, it's not entirely clear what the semantics should be. If the
partitioning key is (a) and you ask for an index on (a, b), you could
conceivably omit a from the indexes created on partitions that only
cover a single value of a. (That case is easy to detect when list
partitioning is in use.) Should we try do that elimination, or just
do what the user asked for? Will users be unhappy if we try to do
this sort of column elimination but it only works in simple cases?
Think about the possibility that there are partitioning expressions
rather than partitioning columns before concluding we can make it work
in all cases. On the other hand, if you ask for a UNIQUE index on
(b), should we go ahead and create such an index on each partition,
ensuring uniqueness within each partition, or should we refuse to
proceed on the grounds that we can't be sure that such an index will
ensure global uniqueness? If you do the former, someone might find
the behavior surprising, but if you do the latter, you might annoy
people who know what they're asking for and want that thing but can't
get it. I suspect we want to eventually allow a user to ask for
either one, because eventually we'll probably have global indexes, and
then you really need a way to say whether you want a global index or a
partitioned non-global index. But that requires agreeing on syntax,
which is complicated and will probably involve a lot of bikeshedding
(as well it should - these are big decisions).
I think it would be a bad idea to try to fix this problem for v10.
One of the earlier versions of the patch allowed indexes on the parent
table as if it were just a regular empty table, which did not seem
useful. I asked him to disallow that so as to keep our options open
for the future. I see no reason why v11 or v12 can't fill in the
functionality in this area. Right now we're 2 weeks away from the
start of the last CommitFest, and that's not the time to go start
writing a complex patch for a feature that isn't even particularly
well-defined. If somebody really cared about this
make-an-index-for-everything-in-the-hierarchy problem, they could've
written a patch for that at any time in the last 5 years; it's not
strictly dependent on the new partitioning stuff. Nobody's done that,
and trying to throw together something now in the last couple of weeks
could easily end with us getting it wrong and then having to face the
unpleasant prospect of either leaving it broken or breaking backward
compatibility to fix it.
It leaves me asking what else is missing.
There is certainly a lot of room for improvement here but I don't
understand your persistent negativity about what's been done thus far.
I think it's pretty clearly a huge step forward, and I think Amit
deserves a ton of credit for making it happen. The improvements in
bulk loading performance alone are stupendous. You apparently have
the idea that somebody could have written an even larger patch that
solved even more problems at once, but this was already a really big
patch, and IMHO quite a good one.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas wrote:
On Wed, Feb 15, 2017 at 9:37 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:Joshua D. Drake wrote:
Because partitions may have data.
So would the table, were it not partitioned.
True. I think the question here is: do we want to view the dependency
between a partitioned table and a partition of that table as
DEPENDENCY_NORMAL or as DEPENDENCY_AUTO? With table inheritance, it's
always been "normal" and I'm not sure there's any good reason for
partitioning to make the opposite decision.
I think new-style partitioning is supposed to consider each partition as
an implementation detail of the table; the fact that you can manipulate
partitions separately does not really mean that they are their own
independent object. You don't stop to think "do I really want to drop
the TOAST table attached to this main table?" and attach a CASCADE
clause if so. You just drop the main table, and the toast one is
dropped automatically. I think new-style partitions should behave
equivalently.
You can make the partition an independent entity, but if you don't
explicitly take that step beforehand, I don't see why we should see it
that way implicitly.
The new partitioning
implementation provides a user experience that is overall smoother
than doing the same thing with inheritance, but it's not as if you can
ignore the fact that your partitioned tables have sub-objects that are
also tables.
Now that partitions are declarative, the underlying implementation could
change away from inheritance. It's now just an implementation artifact.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
I think new-style partitioning is supposed to consider each partition as
an implementation detail of the table; the fact that you can manipulate
partitions separately does not really mean that they are their own
independent object. You don't stop to think "do I really want to drop
the TOAST table attached to this main table?" and attach a CASCADE
clause if so. You just drop the main table, and the toast one is
dropped automatically. I think new-style partitions should behave
equivalently.
That's a reasonable point of view. I'd like to get some more opinions
on this topic. I'm happy to have us do whatever most people want, but
I'm worried that having table inheritance and table partitioning work
differently will be create confusion. I'm also suspicious that there
may be some implementation difficulties. On the hand, it does seem a
little silly to say that DROP TABLE partitioned_table should always
fail except in the degenerate case where there are no partitions, so
maybe changing it is for the best.
Now that partitions are declarative, the underlying implementation could
change away from inheritance. It's now just an implementation artifact.
I don't really agree with that. It's true that, for example, we could
decide to store the inheritance information for partitions someplace
other than pg_inherits, but from a user perspective these are always
going to be closely-related features. Also, there are quite a number
of ways in which it's very noticeable that the objects are separate.
They are dumped separately. They have separate indexes, and even if
we provide some facility to create a common indexing scheme across all
partitions automatically, you'll still be able to REINDEX or CLUSTER
one of those tables individually. They can have different storage
properties, like one can be UNLOGGED while another is not. They show
up in EXPLAIN plans. The partitioning structure affects what you can
and can't do with foreign keys. All of those are user-visible things
that make this look and feel like a collection of tables, not just a
single table that happens to have several relfilenodes under the hood.
I think that's actually a really important feature, not a design
defect.
As you may or may not know, EDB has had a proprietary implementation
of table partitioning since Advanced Server 9.1, and one of the things
we've learned from that implementation is that users really like to be
able to fiddle with the individual partitions. They want to things
like make them individually unlogged, rename them, vacuum them, add
contraints, add triggers, put them in different tablespaces, vary
indexing strategies, all the stuff that you normally do with
standalone tables. Any time one of those things didn't work quite
like it would for a standalone table, we got complaints. One of the
things that has become really clear over the five years that feature
has been out of the field is that users value the ability to do
different things with different child tables. Now that of course does
not mean that they don't want to be able to operate on the hierarchy
as a whole; we have numerous feature requests in that direction, too.
But, at least among the base of customers that have talked to us about
the proprietary partitioning feature in Advanced Server, wanting to
treat a partition as a table and do some arbitrary thing to it that
can be done to a table is extremely common. Of course, I can't
promise (and am not trying to argue) that the reaction among
PostgreSQL users generally will necessarily be similar to the
experience we've had with our Advanced Server customers, but this
experience has definitely caused me to lean in the direction of
wanting partitions to be first-class objects.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
Robert Haas wrote:
True. I think the question here is: do we want to view the dependency
between a partitioned table and a partition of that table as
DEPENDENCY_NORMAL or as DEPENDENCY_AUTO? With table inheritance, it's
always been "normal" and I'm not sure there's any good reason for
partitioning to make the opposite decision.
I think new-style partitioning is supposed to consider each partition as
an implementation detail of the table; the fact that you can manipulate
partitions separately does not really mean that they are their own
independent object. You don't stop to think "do I really want to drop
the TOAST table attached to this main table?" and attach a CASCADE
clause if so. You just drop the main table, and the toast one is
dropped automatically. I think new-style partitions should behave
equivalently.
I agree with Alvaro's position. If you need CASCADE to get rid of the
individual partitions, that's going to be a serious usability fail.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/15 13:31, Robert Haas wrote:
On Mon, Feb 13, 2017 at 5:57 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:On 2017/02/13 14:21, Amit Langote wrote:
On 2017/02/10 19:19, Simon Riggs wrote:
* The OID inheritance needs work - you shouldn't need to specify a
partition needs OIDS if the parent has it already.That sounds right. It's better to keep the behavior same as for regular
inheritance. Will post a patch to get rid of the unnecessary check.FWIW, the check was added to prevent the command from succeeding in the
case where WITHOUT OIDS has been specified for a partition and the parent
partitioned table has the OID column. Regular inheritance simply
*overrides* the WITHOUT OIDS specification, which might be seen as surprising.0001 of the attached patches takes care of this.
I think 0001 needs to remove this hunk of documentation:
<listitem>
<para>
If the partitioned table specified <literal>WITH OIDS</literal> then
each partition must also specify <literal>WITH OIDS</literal>. Oids
are not automatically inherited by partitions.
</para>
</listitem>
Attached updated 0001 which does that.
I think 0001 is better than the status quo, but I'm wondering whether
we should try to do something slightly different. Maybe it should
always work for the child table to specify neither WITH OIDS nor
WITHOUT OIDS, but if you do specify one of them then it has to be the
one that matches the parent partitioned table? With this patch, IIUC,
WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS
is allowed (but ignored) regardless of the parent setting.
With the patch, one can always specify (or not) WITH/WITHOUT OIDS when
creating partitions. If WITH OIDS is specified and the parent doesn't
have OIDs, then an error is raised. Then just like with normal
inheritance, WITHOUT OIDS specification for a partition will be
*overridden* if the parent has OIDs. By the way, CREATE TABLE page says
this about inheritance and OIDS:
(If the new table inherits from any tables that have OIDs, then
<literal>OIDS=TRUE</> is forced even if the command says
<literal>OIDS=FALSE</>.
Hopefully it's clear to someone reading "If the table inherits from any
tables ..." that it also refers to creating partition of a partitioned table.
Also attaching 0002 (unchanged) for tab-completion support for the new
partitioning syntax.
0003 changes how ExecFindPartition() shows the row for which
get_partition_for_tuple() failed to find a partition. As Simon commented
upthread, we should show just the partition key, not the whole row in the
error DETAIL. So the DETAIL now looks like what's shown by
_bt_check_unique() upon uniqueness violation:
DETAIL: Partition key of the failing row contains (key1, key2, ...)=(val1,
val2, ...)
The rules about which columns to show or whether to show the DETAIL at all
are similar to those in BuildIndexValueDescription():
- if user has SELECT privilege on the whole table, simply go ahead
- if user doesn't have SELECT privilege on the table, check that they
can see all the columns in the key (no point in showing partial key);
however abort on finding an expression for which we don't try finding
out privilege situation of whatever columns may be in the expression
Thanks,
Amit
Attachments:
0001-Inherit-OID-system-column-automatically-for-partitio.patchtext/x-diff; name=0001-Inherit-OID-system-column-automatically-for-partitio.patchDownload
From 1f249c0a395d18532c470ebe4912e33aa61409fd Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 13 Feb 2017 13:32:18 +0900
Subject: [PATCH 1/3] Inherit OID system column automatically for partitions
Currently, WITH OIDS must be explicitly specified when creating a
partition if the parent table has the OID system column. Instead,
inherit it automatically, possibly overriding any explicit WITHOUT
OIDS specification. Per review comment from Simon Riggs
---
doc/src/sgml/ddl.sgml | 8 --------
doc/src/sgml/ref/create_table.sgml | 12 +++++-------
src/backend/commands/tablecmds.c | 21 ++++++++-------------
src/test/regress/expected/create_table.out | 18 +++++++++++++-----
src/test/regress/sql/create_table.sql | 10 ++++++----
5 files changed, 32 insertions(+), 37 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index f909242e4c..5779eac43d 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2863,14 +2863,6 @@ VALUES ('Albany', NULL, NULL, 'NY');
<listitem>
<para>
- If the partitioned table specified <literal>WITH OIDS</literal> then
- each partition must also specify <literal>WITH OIDS</literal>. Oids
- are not automatically inherited by partitions.
- </para>
- </listitem>
-
- <listitem>
- <para>
One cannot drop a <literal>NOT NULL</literal> constraint on a
partition's column, if the constraint is present in the parent table.
</para>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index e0f7cd9b93..87a3443ee2 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -301,13 +301,11 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<para>
A partition cannot have columns other than those inherited from the
- parent. If the parent is specified <literal>WITH OIDS</literal> then
- the partitions must also explicitly specify <literal>WITH OIDS</literal>.
- Defaults and constraints can optionally be specified for each of the
- inherited columns. One can also specify table constraints in addition
- to those inherited from the parent. If a check constraint with the name
- matching one of the parent's constraint is specified, it is merged with
- the latter, provided the specified condition is same.
+ parent. Defaults and constraints can optionally be specified for each
+ of the inherited columns. One can also specify table constraints in
+ addition to those inherited from the parent. If a check constraint with
+ the name matching one of the parent's constraint is specified, it is
+ merged with the latter, provided the specified condition is same.
</para>
<para>
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f33aa70da6..3cea220421 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -634,19 +634,14 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
relkind == RELKIND_PARTITIONED_TABLE));
descriptor->tdhasoid = (localHasOids || parentOidCount > 0);
- if (stmt->partbound)
- {
- /* If the parent has OIDs, partitions must have them too. */
- if (parentOidCount > 0 && !localHasOids)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("cannot create table without OIDs as partition of table with OIDs")));
- /* If the parent doesn't, partitions must not have them. */
- if (parentOidCount == 0 && localHasOids)
- ereport(ERROR,
- (errcode(ERRCODE_WRONG_OBJECT_TYPE),
- errmsg("cannot create table with OIDs as partition of table without OIDs")));
- }
+ /*
+ * If a partitioned table doesn't have the system OID column, then none
+ * of its partitions should have it.
+ */
+ if (stmt->partbound && parentOidCount == 0 && localHasOids)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot create table with OIDs as partition of table without OIDs")));
/*
* Find columns with default values and prepare for insertion of the
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index fc92cd92dd..20eb3d35f9 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -524,16 +524,24 @@ DROP TABLE temp_parted;
CREATE TABLE no_oids_parted (
a int
) PARTITION BY RANGE (a) WITHOUT OIDS;
-CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10 )WITH OIDS;
+CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10) WITH OIDS;
ERROR: cannot create table with OIDs as partition of table without OIDs
DROP TABLE no_oids_parted;
--- likewise, the reverse if also true
+-- If the partitioned table has oids, then the partition must have them.
+-- If the WITHOUT OIDS option is specified for partition, it is overridden.
CREATE TABLE oids_parted (
a int
) PARTITION BY RANGE (a) WITH OIDS;
-CREATE TABLE fail_part PARTITION OF oids_parted FOR VALUES FROM (1) TO (10 ) WITHOUT OIDS;
-ERROR: cannot create table without OIDs as partition of table with OIDs
-DROP TABLE oids_parted;
+CREATE TABLE part_forced_oids PARTITION OF oids_parted FOR VALUES FROM (1) TO (10) WITHOUT OIDS;
+\d+ part_forced_oids
+ Table "public.part_forced_oids"
+ Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
+--------+---------+-----------+----------+---------+---------+--------------+-------------
+ a | integer | | not null | | plain | |
+Partition of: oids_parted FOR VALUES FROM (1) TO (10)
+Has OIDs: yes
+
+DROP TABLE oids_parted, part_forced_oids;
-- check for partition bound overlap and other invalid specifications
CREATE TABLE list_parted2 (
a varchar
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 5f25c436ee..f41dd71475 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -493,15 +493,17 @@ DROP TABLE temp_parted;
CREATE TABLE no_oids_parted (
a int
) PARTITION BY RANGE (a) WITHOUT OIDS;
-CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10 )WITH OIDS;
+CREATE TABLE fail_part PARTITION OF no_oids_parted FOR VALUES FROM (1) TO (10) WITH OIDS;
DROP TABLE no_oids_parted;
--- likewise, the reverse if also true
+-- If the partitioned table has oids, then the partition must have them.
+-- If the WITHOUT OIDS option is specified for partition, it is overridden.
CREATE TABLE oids_parted (
a int
) PARTITION BY RANGE (a) WITH OIDS;
-CREATE TABLE fail_part PARTITION OF oids_parted FOR VALUES FROM (1) TO (10 ) WITHOUT OIDS;
-DROP TABLE oids_parted;
+CREATE TABLE part_forced_oids PARTITION OF oids_parted FOR VALUES FROM (1) TO (10) WITHOUT OIDS;
+\d+ part_forced_oids
+DROP TABLE oids_parted, part_forced_oids;
-- check for partition bound overlap and other invalid specifications
--
2.11.0
0002-Tab-completion-for-the-new-partitioning-syntax.patchtext/x-diff; name=0002-Tab-completion-for-the-new-partitioning-syntax.patchDownload
From 09d5e3de9e0e45f593ac9801ec9fff580584969c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 13 Feb 2017 18:18:48 +0900
Subject: [PATCH 2/3] Tab completion for the new partitioning syntax
---
src/bin/psql/tab-complete.c | 59 ++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 58 insertions(+), 1 deletion(-)
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 94814c20d0..715f7e46a4 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -463,6 +463,21 @@ static const SchemaQuery Query_for_list_of_tables = {
NULL
};
+static const SchemaQuery Query_for_list_of_partitioned_tables = {
+ /* catname */
+ "pg_catalog.pg_class c",
+ /* selcondition */
+ "c.relkind IN ('P')",
+ /* viscondition */
+ "pg_catalog.pg_table_is_visible(c.oid)",
+ /* namespace */
+ "c.relnamespace",
+ /* result */
+ "pg_catalog.quote_ident(c.relname)",
+ /* qualresult */
+ NULL
+};
+
static const SchemaQuery Query_for_list_of_constraints_with_schema = {
/* catname */
"pg_catalog.pg_constraint c",
@@ -913,6 +928,16 @@ static const SchemaQuery Query_for_list_of_matviews = {
" SELECT 'DEFAULT' ) ss "\
" WHERE pg_catalog.substring(name,1,%%d)='%%s'"
+/* the silly-looking length condition is just to eat up the current word */
+#define Query_for_partition_of_table \
+"SELECT pg_catalog.quote_ident(c2.relname) "\
+" FROM pg_catalog.pg_class c1, pg_catalog.pg_class c2, pg_catalog.pg_inherits i"\
+" WHERE c1.oid=i.inhparent and i.inhrelid=c2.oid"\
+" and (%d = pg_catalog.length('%s'))"\
+" and pg_catalog.quote_ident(c1.relname)='%s'"\
+" and pg_catalog.pg_table_is_visible(c2.oid)"\
+" and c2.relispartition = 'true'"
+
/*
* This is a list of all "things" in Pgsql, which can show up after CREATE or
* DROP; and there is also a query to get a list of them.
@@ -1742,7 +1767,8 @@ psql_completion(const char *text, int start, int end)
static const char *const list_ALTER2[] =
{"ADD", "ALTER", "CLUSTER ON", "DISABLE", "DROP", "ENABLE", "INHERIT",
"NO INHERIT", "RENAME", "RESET", "OWNER TO", "SET",
- "VALIDATE CONSTRAINT", "REPLICA IDENTITY", NULL};
+ "VALIDATE CONSTRAINT", "REPLICA IDENTITY", "ATTACH PARTITION",
+ "DETACH PARTITION", NULL};
COMPLETE_WITH_LIST(list_ALTER2);
}
@@ -1923,6 +1949,26 @@ psql_completion(const char *text, int start, int end)
COMPLETE_WITH_LIST4("FULL", "NOTHING", "DEFAULT", "USING");
else if (Matches4("ALTER", "TABLE", MatchAny, "REPLICA"))
COMPLETE_WITH_CONST("IDENTITY");
+ /*
+ * If we have ALTER TABLE <foo> ATTACH PARTITION, provide a list of
+ * tables.
+ */
+ else if (Matches5("ALTER", "TABLE", MatchAny, "ATTACH", "PARTITION"))
+ COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
+ /* Limited completion support for partition bound specification */
+ else if (TailMatches3("ATTACH", "PARTITION", MatchAny))
+ COMPLETE_WITH_CONST("FOR VALUES");
+ else if (TailMatches5("ATTACH", "PARTITION", MatchAny, "FOR", "VALUES"))
+ COMPLETE_WITH_LIST2("FROM (", "IN (");
+ /*
+ * If we have ALTER TABLE <foo> DETACH PARTITION, provide a list of
+ * partitions of <foo>.
+ */
+ else if (Matches5("ALTER", "TABLE", MatchAny, "DETACH", "PARTITION"))
+ {
+ completion_info_charp = prev3_wd;
+ COMPLETE_WITH_QUERY(Query_for_partition_of_table);
+ }
/* ALTER TABLESPACE <foo> with RENAME TO, OWNER TO, SET, RESET */
else if (Matches3("ALTER", "TABLESPACE", MatchAny))
@@ -2300,6 +2346,17 @@ psql_completion(const char *text, int start, int end)
/* Complete "CREATE UNLOGGED" with TABLE or MATVIEW */
else if (TailMatches2("CREATE", "UNLOGGED"))
COMPLETE_WITH_LIST2("TABLE", "MATERIALIZED VIEW");
+ /* Complete PARTITION BY with RANGE ( or LIST ( or ... */
+ else if (TailMatches2("PARTITION", "BY"))
+ COMPLETE_WITH_LIST2("RANGE (", "LIST (");
+ /* If we have xxx PARTITION OF, provide a list of partitioned tables */
+ else if (TailMatches2("PARTITION", "OF"))
+ COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_partitioned_tables, "");
+ /* Limited completion support for partition bound specification */
+ else if (TailMatches3("PARTITION", "OF", MatchAny))
+ COMPLETE_WITH_CONST("FOR VALUES");
+ else if (TailMatches5("PARTITION", "OF", MatchAny, "FOR", "VALUES"))
+ COMPLETE_WITH_LIST2("FROM (", "IN (");
/* CREATE TABLESPACE */
else if (Matches3("CREATE", "TABLESPACE", MatchAny))
--
2.11.0
0003-Show-only-the-partition-key-upon-failing-to-find-a-p.patchtext/x-diff; name=0003-Show-only-the-partition-key-upon-failing-to-find-a-p.patchDownload
From 898c7191e50a6179716b1c7867f292c3722afdb8 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 13 Feb 2017 19:52:19 +0900
Subject: [PATCH 3/3] Show only the partition key upon failing to find a
partition
Currently, the whole row is shown without column names. Instead,
adopt a style similar to _bt_check_unique() in ExecFindPartition()
and show the failing key in format (key1, ...)=(val1, ...).
The key description shown in the error message is now built by the
new ExecBuildSlotPartitionKeyDescription(), which works along the
lines of BuildIndexValueDescription(), using similar rules about
what columns of the partition key to include depending on the user's
privileges to view the same.
A bunch of relevant tests are added.
---
src/backend/catalog/partition.c | 29 +++----
src/backend/executor/execMain.c | 147 ++++++++++++++++++++++++++++++-----
src/backend/utils/adt/ruleutils.c | 37 ++++++---
src/include/catalog/partition.h | 13 +++-
src/include/utils/ruleutils.h | 2 +
src/test/regress/expected/insert.out | 38 ++++++++-
src/test/regress/sql/insert.sql | 30 +++++++
7 files changed, 246 insertions(+), 50 deletions(-)
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 4bcef58763..710ce07a6f 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -140,13 +140,6 @@ static int partition_bound_bsearch(PartitionKey key,
PartitionBoundInfo boundinfo,
void *probe, bool probe_is_bound, bool *is_equal);
-/* Support get_partition_for_tuple() */
-static void FormPartitionKeyDatum(PartitionDispatch pd,
- TupleTableSlot *slot,
- EState *estate,
- Datum *values,
- bool *isnull);
-
/*
* RelationBuildPartitionDesc
* Form rel's partition descriptor
@@ -1608,7 +1601,7 @@ generate_partition_qual(Relation rel)
* the heap tuple passed in.
* ----------------
*/
-static void
+void
FormPartitionKeyDatum(PartitionDispatch pd,
TupleTableSlot *slot,
EState *estate,
@@ -1672,7 +1665,7 @@ int
get_partition_for_tuple(PartitionDispatch *pd,
TupleTableSlot *slot,
EState *estate,
- Oid *failed_at)
+ GetPartitionFailureData *gpfd)
{
PartitionDispatch parent;
Datum values[PARTITION_MAX_KEYS];
@@ -1693,13 +1686,6 @@ get_partition_for_tuple(PartitionDispatch *pd,
TupleTableSlot *myslot = parent->tupslot;
TupleConversionMap *map = parent->tupmap;
- /* Quick exit */
- if (partdesc->nparts == 0)
- {
- *failed_at = RelationGetRelid(parent->reldesc);
- return -1;
- }
-
if (myslot != NULL && map != NULL)
{
HeapTuple tuple = ExecFetchSlotTuple(slot);
@@ -1710,6 +1696,14 @@ get_partition_for_tuple(PartitionDispatch *pd,
slot = myslot;
}
+ /* Quick exit */
+ if (partdesc->nparts == 0)
+ {
+ gpfd->failed_at = parent;
+ gpfd->failed_slot = slot;
+ return -1;
+ }
+
/*
* Extract partition key from tuple. Expression evaluation machinery
* that FormPartitionKeyDatum() invokes expects ecxt_scantuple to
@@ -1774,7 +1768,8 @@ get_partition_for_tuple(PartitionDispatch *pd,
if (cur_index < 0)
{
result = -1;
- *failed_at = RelationGetRelid(parent->reldesc);
+ gpfd->failed_at = parent;
+ gpfd->failed_slot = slot;
break;
}
else if (parent->indexes[cur_index] >= 0)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index a66639178a..9f3b10e364 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -60,6 +60,7 @@
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rls.h"
+#include "utils/ruleutils.h"
#include "utils/snapmgr.h"
#include "utils/tqual.h"
@@ -95,6 +96,10 @@ static char *ExecBuildSlotValueDescription(Oid reloid,
TupleDesc tupdesc,
Bitmapset *modifiedCols,
int maxfieldlen);
+static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
+ Datum *values,
+ bool *isnull,
+ int maxfieldlen);
static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
Plan *planTree);
@@ -3187,33 +3192,137 @@ ExecFindPartition(ResultRelInfo *resultRelInfo, PartitionDispatch *pd,
TupleTableSlot *slot, EState *estate)
{
int result;
- Oid failed_at;
+ GetPartitionFailureData gpfd;
- result = get_partition_for_tuple(pd, slot, estate, &failed_at);
+ result = get_partition_for_tuple(pd, slot, estate, &gpfd);
if (result < 0)
{
- Relation rel = resultRelInfo->ri_RelationDesc;
+ Relation failed_rel;
+ Datum key_values[PARTITION_MAX_KEYS];
+ bool key_isnull[PARTITION_MAX_KEYS];
char *val_desc;
- Bitmapset *insertedCols,
- *updatedCols,
- *modifiedCols;
- TupleDesc tupDesc = RelationGetDescr(rel);
-
- insertedCols = GetInsertedColumns(resultRelInfo, estate);
- updatedCols = GetUpdatedColumns(resultRelInfo, estate);
- modifiedCols = bms_union(insertedCols, updatedCols);
- val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
- slot,
- tupDesc,
- modifiedCols,
- 64);
- Assert(OidIsValid(failed_at));
+ ExprContext *ecxt = GetPerTupleExprContext(estate);
+
+ failed_rel = gpfd.failed_at->reldesc;
+ ecxt->ecxt_scantuple = slot;
+ FormPartitionKeyDatum(gpfd.failed_at, gpfd.failed_slot, estate,
+ key_values, key_isnull);
+ val_desc = ExecBuildSlotPartitionKeyDescription(failed_rel,
+ key_values,
+ key_isnull,
+ 64);
+ Assert(OidIsValid(RelationGetRelid(failed_rel)));
ereport(ERROR,
(errcode(ERRCODE_CHECK_VIOLATION),
errmsg("no partition of relation \"%s\" found for row",
- get_rel_name(failed_at)),
- val_desc ? errdetail("Failing row contains %s.", val_desc) : 0));
+ RelationGetRelationName(failed_rel)),
+ val_desc ? errdetail("Partition key of the failing row contains %s.", val_desc) : 0));
}
return result;
}
+
+/*
+ * BuildSlotPartitinKeyDescription
+ *
+ * Construct a string describing the contents of the partition key in the
+ * form "(key_name, ...)=(key_value, ...)". This is currently used for
+ * building error messages when ExecFindPartition() fails to find partition
+ * for a row.
+ *
+ * Note that if the user does not have permissions to view all of the
+ * columns involved then a NULL is returned.
+ */
+static char *
+ExecBuildSlotPartitionKeyDescription(Relation rel,
+ Datum *values,
+ bool *isnull,
+ int maxfieldlen)
+{
+ StringInfoData buf;
+ PartitionKey key = RelationGetPartitionKey(rel);
+ int partnatts = get_partition_natts(key);
+ int i;
+ Oid relid = RelationGetRelid(rel);
+ AclResult aclresult;
+
+ /*
+ * Check permissions- if the user does not have access to view all of the
+ * key columns then return NULL to avoid leaking data.
+ *
+ * First check if RLS is enabled for the relation. If so, return NULL to
+ * avoid leaking data.
+ *
+ * Next we need to check table-level SELECT access and then, if there is
+ * no access there, check column-level permissions.
+ */
+
+ /* RLS check- if RLS is enabled then we don't return anything. */
+ if (check_enable_rls(relid, InvalidOid, true) == RLS_ENABLED)
+ return NULL;
+
+ /* Table-level SELECT is enough, if the user has it */
+ aclresult = pg_class_aclcheck(relid, GetUserId(), ACL_SELECT);
+ if (aclresult != ACLCHECK_OK)
+ {
+ /*
+ * No table-level access, so step through the columns in the partition
+ * key and make sure the user has SELECT rights on all of them.
+ */
+ for (i = 0; i < partnatts; i++)
+ {
+ AttrNumber attnum = get_partition_col_attnum(key, i);
+
+ /*
+ * Note that if attnum == InvalidAttrNumber, then this partition
+ * key column is an expression and we return no detail rather than
+ * try to figure out what column(s) the expression includes and
+ * if the user has SELECT rights on them.
+ */
+ if (attnum == InvalidAttrNumber ||
+ pg_attribute_aclcheck(relid, attnum, GetUserId(),
+ ACL_SELECT) != ACLCHECK_OK)
+ return NULL;
+ }
+ }
+
+ initStringInfo(&buf);
+ appendStringInfo(&buf, "(%s)=(",
+ pg_get_partkeydef_columns(relid, true));
+
+ for (i = 0; i < partnatts; i++)
+ {
+ char *val;
+ int vallen;
+
+ if (isnull[i])
+ val = "null";
+ else
+ {
+ Oid foutoid;
+ bool typisvarlena;
+
+ getTypeOutputInfo(get_partition_col_typid(key, i),
+ &foutoid, &typisvarlena);
+ val = OidOutputFunctionCall(foutoid, values[i]);
+ }
+
+ if (i > 0)
+ appendStringInfoString(&buf, ", ");
+
+ /* truncate if needed */
+ vallen = strlen(val);
+ if (vallen <= maxfieldlen)
+ appendStringInfoString(&buf, val);
+ else
+ {
+ vallen = pg_mbcliplen(val, vallen, maxfieldlen);
+ appendBinaryStringInfo(&buf, val, vallen);
+ appendStringInfoString(&buf, "...");
+ }
+ }
+
+ appendStringInfoChar(&buf, ')');
+
+ return buf.data;
+}
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index f355954b53..0c4e28fd9b 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -317,7 +317,8 @@ static char *pg_get_indexdef_worker(Oid indexrelid, int colno,
const Oid *excludeOps,
bool attrsOnly, bool showTblSpc,
int prettyFlags, bool missing_ok);
-static char *pg_get_partkeydef_worker(Oid relid, int prettyFlags);
+static char *pg_get_partkeydef_worker(Oid relid, int prettyFlags,
+ bool attrsOnly);
static char *pg_get_constraintdef_worker(Oid constraintId, bool fullCommand,
int prettyFlags, bool missing_ok);
static text *pg_get_expr_worker(text *expr, Oid relid, const char *relname,
@@ -1431,14 +1432,26 @@ pg_get_partkeydef(PG_FUNCTION_ARGS)
Oid relid = PG_GETARG_OID(0);
PG_RETURN_TEXT_P(string_to_text(pg_get_partkeydef_worker(relid,
- PRETTYFLAG_INDENT)));
+ PRETTYFLAG_INDENT,
+ false)));
+}
+
+/* Internal version that just reports the column definitions */
+char *
+pg_get_partkeydef_columns(Oid relid, bool pretty)
+{
+ int prettyFlags;
+
+ prettyFlags = pretty ? PRETTYFLAG_PAREN | PRETTYFLAG_INDENT : PRETTYFLAG_INDENT;
+ return pg_get_partkeydef_worker(relid, prettyFlags, true);
}
/*
* Internal workhorse to decompile a partition key definition.
*/
static char *
-pg_get_partkeydef_worker(Oid relid, int prettyFlags)
+pg_get_partkeydef_worker(Oid relid, int prettyFlags,
+ bool attrsOnly)
{
Form_pg_partitioned_table form;
HeapTuple tuple;
@@ -1508,17 +1521,20 @@ pg_get_partkeydef_worker(Oid relid, int prettyFlags)
switch (form->partstrat)
{
case PARTITION_STRATEGY_LIST:
- appendStringInfo(&buf, "LIST");
+ if (!attrsOnly)
+ appendStringInfo(&buf, "LIST");
break;
case PARTITION_STRATEGY_RANGE:
- appendStringInfo(&buf, "RANGE");
+ if (!attrsOnly)
+ appendStringInfo(&buf, "RANGE");
break;
default:
elog(ERROR, "unexpected partition strategy: %d",
(int) form->partstrat);
}
- appendStringInfo(&buf, " (");
+ if (!attrsOnly)
+ appendStringInfo(&buf, " (");
sep = "";
for (keyno = 0; keyno < form->partnatts; keyno++)
{
@@ -1561,14 +1577,17 @@ pg_get_partkeydef_worker(Oid relid, int prettyFlags)
/* Add collation, if not default for column */
partcoll = partcollation->values[keyno];
- if (OidIsValid(partcoll) && partcoll != keycolcollation)
+ if (!attrsOnly && OidIsValid(partcoll) && partcoll != keycolcollation)
appendStringInfo(&buf, " COLLATE %s",
generate_collation_name((partcoll)));
/* Add the operator class name, if not default */
- get_opclass_name(partclass->values[keyno], keycoltype, &buf);
+ if (!attrsOnly)
+ get_opclass_name(partclass->values[keyno], keycoltype, &buf);
}
- appendStringInfoChar(&buf, ')');
+
+ if (!attrsOnly)
+ appendStringInfoChar(&buf, ')');
/* Clean up */
ReleaseSysCache(tuple);
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index b195d1a5ab..698b388c46 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -70,6 +70,12 @@ typedef struct PartitionDispatchData
typedef struct PartitionDispatchData *PartitionDispatch;
+typedef struct GetPartitionFailureData
+{
+ PartitionDispatch failed_at;
+ TupleTableSlot *failed_slot;
+} GetPartitionFailureData;
+
extern void RelationBuildPartitionDesc(Relation relation);
extern bool partition_bounds_equal(PartitionKey key,
PartitionBoundInfo p1, PartitionBoundInfo p2);
@@ -85,8 +91,13 @@ extern List *RelationGetPartitionQual(Relation rel);
extern PartitionDispatch *RelationGetPartitionDispatchInfo(Relation rel,
int lockmode, int *num_parted,
List **leaf_part_oids);
+extern void FormPartitionKeyDatum(PartitionDispatch pd,
+ TupleTableSlot *slot,
+ EState *estate,
+ Datum *values,
+ bool *isnull);
extern int get_partition_for_tuple(PartitionDispatch *pd,
TupleTableSlot *slot,
EState *estate,
- Oid *failed_at);
+ GetPartitionFailureData *gpfd);
#endif /* PARTITION_H */
diff --git a/src/include/utils/ruleutils.h b/src/include/utils/ruleutils.h
index 3e8aad97e2..42fc872c4a 100644
--- a/src/include/utils/ruleutils.h
+++ b/src/include/utils/ruleutils.h
@@ -21,6 +21,8 @@
extern char *pg_get_indexdef_string(Oid indexrelid);
extern char *pg_get_indexdef_columns(Oid indexrelid, bool pretty);
+extern char *pg_get_partkeydef_columns(Oid relid, bool pretty);
+
extern char *pg_get_constraintdef_command(Oid constraintId);
extern char *deparse_expression(Node *expr, List *dpcontext,
bool forceprefix, bool showimplicit);
diff --git a/src/test/regress/expected/insert.out b/src/test/regress/expected/insert.out
index 81af3ef497..083d977a95 100644
--- a/src/test/regress/expected/insert.out
+++ b/src/test/regress/expected/insert.out
@@ -234,14 +234,14 @@ insert into part_ee_ff2 values ('ff', 11);
-- fail
insert into range_parted values ('a', 0);
ERROR: no partition of relation "range_parted" found for row
-DETAIL: Failing row contains (a, 0).
+DETAIL: Partition key of the failing row contains (a, (b + 0))=(a, 0).
-- ok
insert into range_parted values ('a', 1);
insert into range_parted values ('a', 10);
-- fail
insert into range_parted values ('a', 20);
ERROR: no partition of relation "range_parted" found for row
-DETAIL: Failing row contains (a, 20).
+DETAIL: Partition key of the failing row contains (a, (b + 0))=(a, 20).
-- ok
insert into range_parted values ('b', 1);
insert into range_parted values ('b', 10);
@@ -265,10 +265,10 @@ insert into list_parted (a) values ('aA');
-- fail (partition of part_ee_ff not found in both cases)
insert into list_parted values ('EE', 0);
ERROR: no partition of relation "part_ee_ff" found for row
-DETAIL: Failing row contains (EE, 0).
+DETAIL: Partition key of the failing row contains (b)=(0).
insert into part_ee_ff values ('EE', 0);
ERROR: no partition of relation "part_ee_ff" found for row
-DETAIL: Failing row contains (EE, 0).
+DETAIL: Partition key of the failing row contains (b)=(0).
-- ok
insert into list_parted values ('EE', 1);
insert into part_ee_ff values ('EE', 10);
@@ -351,6 +351,10 @@ select tableoid::regclass, * from p;
p11 | 1 | 2
(1 row)
+-- check that proper message is shown after failure to route through p1
+insert into p values (1, 5);
+ERROR: no partition of relation "p1" found for row
+DETAIL: Partition key of the failing row contains ((b + 0))=(1).
truncate p;
alter table p add constraint check_b check (b = 3);
-- check that correct input row is shown when constraint check_b fails on p11
@@ -386,5 +390,31 @@ with ins (a, b, c) as
p4 | 1 | 30 | 39
(5 rows)
+-- check that message shown after failure to find a partition shows the
+-- appropriate key description (or none) in various situations
+create table key_desc (a int, b int) partition by list ((a+0));
+create table key_desc_1 partition of key_desc for values in (1) partition by range (b);
+create user someone_else;
+grant select (a) on key_desc_1 to someone_else;
+grant insert on key_desc to someone_else;
+set role someone_else;
+-- no key description is shown
+insert into key_desc values (1, 1);
+ERROR: no partition of relation "key_desc_1" found for row
+reset role;
+grant select (b) on key_desc_1 to someone_else;
+set role someone_else;
+-- key description (b)=(1) is now shown
+insert into key_desc values (1, 1);
+ERROR: no partition of relation "key_desc_1" found for row
+DETAIL: Partition key of the failing row contains (b)=(1).
+-- key description is not shown if key contains expression
+insert into key_desc values (2, 1);
+ERROR: no partition of relation "key_desc" found for row
+reset role;
+revoke all on key_desc from someone_else;
+revoke all on key_desc_1 from someone_else;
+drop role someone_else;
+drop table key_desc, key_desc_1;
-- cleanup
drop table p, p1, p11, p12, p2, p3, p4;
diff --git a/src/test/regress/sql/insert.sql b/src/test/regress/sql/insert.sql
index 454e1ce2e7..3f982b6449 100644
--- a/src/test/regress/sql/insert.sql
+++ b/src/test/regress/sql/insert.sql
@@ -215,6 +215,9 @@ alter table p attach partition p1 for values from (1, 2) to (1, 10);
insert into p values (1, 2);
select tableoid::regclass, * from p;
+-- check that proper message is shown after failure to route through p1
+insert into p values (1, 5);
+
truncate p;
alter table p add constraint check_b check (b = 3);
-- check that correct input row is shown when constraint check_b fails on p11
@@ -240,5 +243,32 @@ with ins (a, b, c) as
(insert into p (b, a) select s.a, 1 from generate_series(2, 39) s(a) returning tableoid::regclass, *)
select a, b, min(c), max(c) from ins group by a, b order by 1;
+-- check that message shown after failure to find a partition shows the
+-- appropriate key description (or none) in various situations
+create table key_desc (a int, b int) partition by list ((a+0));
+create table key_desc_1 partition of key_desc for values in (1) partition by range (b);
+
+create user someone_else;
+grant select (a) on key_desc_1 to someone_else;
+grant insert on key_desc to someone_else;
+
+set role someone_else;
+-- no key description is shown
+insert into key_desc values (1, 1);
+
+reset role;
+grant select (b) on key_desc_1 to someone_else;
+set role someone_else;
+-- key description (b)=(1) is now shown
+insert into key_desc values (1, 1);
+
+-- key description is not shown if key contains expression
+insert into key_desc values (2, 1);
+reset role;
+revoke all on key_desc from someone_else;
+revoke all on key_desc_1 from someone_else;
+drop role someone_else;
+drop table key_desc, key_desc_1;
+
-- cleanup
drop table p, p1, p11, p12, p2, p3, p4;
--
2.11.0
On 2017/02/16 2:08, Robert Haas wrote:
On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:I think new-style partitioning is supposed to consider each partition as
an implementation detail of the table; the fact that you can manipulate
partitions separately does not really mean that they are their own
independent object. You don't stop to think "do I really want to drop
the TOAST table attached to this main table?" and attach a CASCADE
clause if so. You just drop the main table, and the toast one is
dropped automatically. I think new-style partitions should behave
equivalently.That's a reasonable point of view. I'd like to get some more opinions
on this topic. I'm happy to have us do whatever most people want, but
I'm worried that having table inheritance and table partitioning work
differently will be create confusion. I'm also suspicious that there
may be some implementation difficulties. On the hand, it does seem a
little silly to say that DROP TABLE partitioned_table should always
fail except in the degenerate case where there are no partitions, so
maybe changing it is for the best.
So I count more than a few votes saying that we should be able to DROP
partitioned tables without specifying CASCADE.
I tried to implement that using the attached patch by having
StoreCatalogInheritance1() create DEPENDENCY_AUTO dependency between
parent and child if the child is a partition, instead of DEPENDENCY_NORMAL
that would otherwise be created. Now it seems that that is one way of
making sure that partitions are dropped when the root partitioned table is
dropped, not sure if the best; why create the pg_depend entries at all one
might ask. I chose it for now because that's the one with fewest lines of
change. Adjusted regression tests as well, since we recently tweaked
tests [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c397814 to work around the irregularities of test output when using CASCADE.
Thanks,
Amit
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c397814
Attachments:
0001-Allow-dropping-partitioned-table-without-CASCADE.patchtext/x-diff; name=0001-Allow-dropping-partitioned-table-without-CASCADE.patchDownload
From 3b75f82b4f47e840074d0209323d8ab1f39ed676 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Thu, 16 Feb 2017 15:56:44 +0900
Subject: [PATCH] Allow dropping partitioned table without CASCADE
Currently, a normal dependency is created between a inheritance
parent and child when creating the child. That means one must
specify CASCADE to drop the parent table if a child table exists.
When creating partitions as inheritance children, create auto
dependency instead, so that partitions are dropped automatically
when the parent is dropped i.e., without specifying CASCADE.
---
src/backend/commands/tablecmds.c | 26 ++++++++++++++++++--------
src/test/regress/expected/alter_table.out | 10 ++++------
src/test/regress/expected/create_table.out | 9 ++-------
src/test/regress/expected/inherit.out | 18 ------------------
src/test/regress/expected/insert.out | 7 ++-----
src/test/regress/expected/update.out | 5 -----
src/test/regress/sql/alter_table.sql | 10 ++++------
src/test/regress/sql/create_table.sql | 9 ++-------
src/test/regress/sql/insert.sql | 7 ++-----
9 files changed, 34 insertions(+), 67 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index f33aa70da6..27b6556a71 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -289,9 +289,11 @@ static List *MergeAttributes(List *schema, List *supers, char relpersistence,
static bool MergeCheckConstraint(List *constraints, char *name, Node *expr);
static void MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel);
static void MergeConstraintsIntoExisting(Relation child_rel, Relation parent_rel);
-static void StoreCatalogInheritance(Oid relationId, List *supers);
+static void StoreCatalogInheritance(Oid relationId, List *supers,
+ bool child_is_partition);
static void StoreCatalogInheritance1(Oid relationId, Oid parentOid,
- int16 seqNumber, Relation inhRelation);
+ int16 seqNumber, Relation inhRelation,
+ bool child_is_partition);
static int findAttrByName(const char *attributeName, List *schema);
static void AlterIndexNamespaces(Relation classRel, Relation rel,
Oid oldNspOid, Oid newNspOid, ObjectAddresses *objsMoved);
@@ -730,7 +732,7 @@ DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId,
typaddress);
/* Store inheritance information for new rel. */
- StoreCatalogInheritance(relationId, inheritOids);
+ StoreCatalogInheritance(relationId, inheritOids, stmt->partbound != NULL);
/*
* We must bump the command counter to make the newly-created relation
@@ -2245,7 +2247,8 @@ MergeCheckConstraint(List *constraints, char *name, Node *expr)
* supers is a list of the OIDs of the new relation's direct ancestors.
*/
static void
-StoreCatalogInheritance(Oid relationId, List *supers)
+StoreCatalogInheritance(Oid relationId, List *supers,
+ bool child_is_partition)
{
Relation relation;
int16 seqNumber;
@@ -2275,7 +2278,8 @@ StoreCatalogInheritance(Oid relationId, List *supers)
{
Oid parentOid = lfirst_oid(entry);
- StoreCatalogInheritance1(relationId, parentOid, seqNumber, relation);
+ StoreCatalogInheritance1(relationId, parentOid, seqNumber, relation,
+ child_is_partition);
seqNumber++;
}
@@ -2288,7 +2292,8 @@ StoreCatalogInheritance(Oid relationId, List *supers)
*/
static void
StoreCatalogInheritance1(Oid relationId, Oid parentOid,
- int16 seqNumber, Relation inhRelation)
+ int16 seqNumber, Relation inhRelation,
+ bool child_is_partition)
{
TupleDesc desc = RelationGetDescr(inhRelation);
Datum values[Natts_pg_inherits];
@@ -2322,7 +2327,10 @@ StoreCatalogInheritance1(Oid relationId, Oid parentOid,
childobject.objectId = relationId;
childobject.objectSubId = 0;
- recordDependencyOn(&childobject, &parentobject, DEPENDENCY_NORMAL);
+ if (!child_is_partition)
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_NORMAL);
+ else
+ recordDependencyOn(&childobject, &parentobject, DEPENDENCY_AUTO);
/*
* Post creation hook of this inheritance. Since object_access_hook
@@ -10749,7 +10757,9 @@ CreateInheritance(Relation child_rel, Relation parent_rel)
StoreCatalogInheritance1(RelationGetRelid(child_rel),
RelationGetRelid(parent_rel),
inhseqno + 1,
- catalogRelation);
+ catalogRelation,
+ parent_rel->rd_rel->relkind ==
+ RELKIND_PARTITIONED_TABLE);
/* Now we're done with pg_inherits */
heap_close(catalogRelation, RowExclusiveLock);
diff --git a/src/test/regress/expected/alter_table.out b/src/test/regress/expected/alter_table.out
index b0e80a7788..0d1554a73e 100644
--- a/src/test/regress/expected/alter_table.out
+++ b/src/test/regress/expected/alter_table.out
@@ -3334,10 +3334,8 @@ ALTER TABLE list_parted2 DROP COLUMN b;
ERROR: cannot drop column named in partition key
ALTER TABLE list_parted2 ALTER COLUMN b TYPE text;
ERROR: cannot alter type of column named in partition key
--- cleanup: avoid using CASCADE
-DROP TABLE list_parted, part_1;
-DROP TABLE list_parted2, part_2, part_5, part_5_a;
-DROP TABLE range_parted, part1, part2;
+-- cleanup
+DROP TABLE list_parted, list_parted2, range_parted;
-- more tests for certain multi-level partitioning scenarios
create table p (a int, b int) partition by range (a, b);
create table p1 (b int, a int not null) partition by range (b);
@@ -3366,5 +3364,5 @@ insert into p1 (a, b) values (2, 3);
-- check that partition validation scan correctly detects violating rows
alter table p attach partition p1 for values from (1, 2) to (1, 10);
ERROR: partition constraint is violated by some row
--- cleanup: avoid using CASCADE
-drop table p, p1, p11;
+-- cleanup
+drop table p;
diff --git a/src/test/regress/expected/create_table.out b/src/test/regress/expected/create_table.out
index fc92cd92dd..bfad755e32 100644
--- a/src/test/regress/expected/create_table.out
+++ b/src/test/regress/expected/create_table.out
@@ -659,10 +659,5 @@ Check constraints:
"check_a" CHECK (length(a) > 0)
Number of partitions: 3 (Use \d+ to list them.)
--- cleanup: avoid using CASCADE
-DROP TABLE parted, part_a, part_b, part_c, part_c_1_10;
-DROP TABLE list_parted, part_1, part_2, part_null;
-DROP TABLE range_parted;
-DROP TABLE list_parted2, part_ab, part_null_z;
-DROP TABLE range_parted2, part0, part1, part2, part3;
-DROP TABLE range_parted3, part00, part10, part11, part12;
+-- cleanup
+DROP TABLE parted, list_parted, range_parted, list_parted2, range_parted2, range_parted3;
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
index a8c8b28a75..623aa1db93 100644
--- a/src/test/regress/expected/inherit.out
+++ b/src/test/regress/expected/inherit.out
@@ -1844,22 +1844,4 @@ explain (costs off) select * from range_list_parted where a >= 30;
(11 rows)
drop table list_parted cascade;
-NOTICE: drop cascades to 3 other objects
-DETAIL: drop cascades to table part_ab_cd
-drop cascades to table part_ef_gh
-drop cascades to table part_null_xy
drop table range_list_parted cascade;
-NOTICE: drop cascades to 13 other objects
-DETAIL: drop cascades to table part_1_10
-drop cascades to table part_1_10_ab
-drop cascades to table part_1_10_cd
-drop cascades to table part_10_20
-drop cascades to table part_10_20_ab
-drop cascades to table part_10_20_cd
-drop cascades to table part_21_30
-drop cascades to table part_21_30_ab
-drop cascades to table part_21_30_cd
-drop cascades to table part_40_inf
-drop cascades to table part_40_inf_ab
-drop cascades to table part_40_inf_cd
-drop cascades to table part_40_inf_null
diff --git a/src/test/regress/expected/insert.out b/src/test/regress/expected/insert.out
index 81af3ef497..31cfa4e76e 100644
--- a/src/test/regress/expected/insert.out
+++ b/src/test/regress/expected/insert.out
@@ -314,10 +314,7 @@ select tableoid::regclass::text, a, min(b) as min_b, max(b) as max_b from list_p
(9 rows)
-- cleanup
-drop table part1, part2, part3, part4, range_parted;
-drop table part_ee_ff3_1, part_ee_ff3_2, part_ee_ff1, part_ee_ff2, part_ee_ff3;
-drop table part_ee_ff, part_gg2_2, part_gg2_1, part_gg2, part_gg1, part_gg;
-drop table part_aa_bb, part_cc_dd, part_null, list_parted;
+drop table range_parted, list_parted;
-- more tests for certain multi-level partitioning scenarios
create table p (a int, b int) partition by range (a, b);
create table p1 (b int not null, a int not null) partition by range ((b+0));
@@ -387,4 +384,4 @@ with ins (a, b, c) as
(5 rows)
-- cleanup
-drop table p, p1, p11, p12, p2, p3, p4;
+drop table p;
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
index a1e9255450..af0d5bfffe 100644
--- a/src/test/regress/expected/update.out
+++ b/src/test/regress/expected/update.out
@@ -220,8 +220,3 @@ DETAIL: Failing row contains (b, 9).
update range_parted set b = b + 1 where b = 10;
-- cleanup
drop table range_parted cascade;
-NOTICE: drop cascades to 4 other objects
-DETAIL: drop cascades to table part_a_1_a_10
-drop cascades to table part_a_10_a_20
-drop cascades to table part_b_1_b_10
-drop cascades to table part_b_10_b_20
diff --git a/src/test/regress/sql/alter_table.sql b/src/test/regress/sql/alter_table.sql
index 7513769359..0787e785ce 100644
--- a/src/test/regress/sql/alter_table.sql
+++ b/src/test/regress/sql/alter_table.sql
@@ -2194,10 +2194,8 @@ ALTER TABLE part_2 INHERIT inh_test;
ALTER TABLE list_parted2 DROP COLUMN b;
ALTER TABLE list_parted2 ALTER COLUMN b TYPE text;
--- cleanup: avoid using CASCADE
-DROP TABLE list_parted, part_1;
-DROP TABLE list_parted2, part_2, part_5, part_5_a;
-DROP TABLE range_parted, part1, part2;
+-- cleanup
+DROP TABLE list_parted, list_parted2, range_parted;
-- more tests for certain multi-level partitioning scenarios
create table p (a int, b int) partition by range (a, b);
@@ -2222,5 +2220,5 @@ insert into p1 (a, b) values (2, 3);
-- check that partition validation scan correctly detects violating rows
alter table p attach partition p1 for values from (1, 2) to (1, 10);
--- cleanup: avoid using CASCADE
-drop table p, p1, p11;
+-- cleanup
+drop table p;
diff --git a/src/test/regress/sql/create_table.sql b/src/test/regress/sql/create_table.sql
index 5f25c436ee..553c084a10 100644
--- a/src/test/regress/sql/create_table.sql
+++ b/src/test/regress/sql/create_table.sql
@@ -593,10 +593,5 @@ CREATE TABLE part_c_1_10 PARTITION OF part_c FOR VALUES FROM (1) TO (10);
-- returned.
\d parted
--- cleanup: avoid using CASCADE
-DROP TABLE parted, part_a, part_b, part_c, part_c_1_10;
-DROP TABLE list_parted, part_1, part_2, part_null;
-DROP TABLE range_parted;
-DROP TABLE list_parted2, part_ab, part_null_z;
-DROP TABLE range_parted2, part0, part1, part2, part3;
-DROP TABLE range_parted3, part00, part10, part11, part12;
+-- cleanup
+DROP TABLE parted, list_parted, range_parted, list_parted2, range_parted2, range_parted3;
diff --git a/src/test/regress/sql/insert.sql b/src/test/regress/sql/insert.sql
index 454e1ce2e7..dfdc24eba8 100644
--- a/src/test/regress/sql/insert.sql
+++ b/src/test/regress/sql/insert.sql
@@ -186,10 +186,7 @@ insert into list_parted (b) values (1);
select tableoid::regclass::text, a, min(b) as min_b, max(b) as max_b from list_parted group by 1, 2 order by 1;
-- cleanup
-drop table part1, part2, part3, part4, range_parted;
-drop table part_ee_ff3_1, part_ee_ff3_2, part_ee_ff1, part_ee_ff2, part_ee_ff3;
-drop table part_ee_ff, part_gg2_2, part_gg2_1, part_gg2, part_gg1, part_gg;
-drop table part_aa_bb, part_cc_dd, part_null, list_parted;
+drop table range_parted, list_parted;
-- more tests for certain multi-level partitioning scenarios
create table p (a int, b int) partition by range (a, b);
@@ -241,4 +238,4 @@ with ins (a, b, c) as
select a, b, min(c), max(c) from ins group by a, b order by 1;
-- cleanup
-drop table p, p1, p11, p12, p2, p3, p4;
+drop table p;
--
2.11.0
In the documentation of ALTER TABLE ... ATTACH PARTITION its implicit
that partition name specified should be the name of the existing table
being attached. Same is the case with DETACH PARTITION where its
implicit that the detached partition becomes a stand-alone table with
same name as the partition being detached. I think this needs to be
more explicit. PFA patch on those lines.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Forgot to attach the patch. Thanks Rajkumar for notifying me.
On Fri, Feb 17, 2017 at 11:18 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
In the documentation of ALTER TABLE ... ATTACH PARTITION its implicit
that partition name specified should be the name of the existing table
being attached. Same is the case with DETACH PARTITION where its
implicit that the detached partition becomes a stand-alone table with
same name as the partition being detached. I think this needs to be
more explicit. PFA patch on those lines.--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Attachments:
alter_table_part_doc.patchapplication/octet-stream; name=alter_table_part_doc.patchDownload
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index da431f8..6d10f54 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -718,8 +718,10 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
<term><literal>ATTACH PARTITION</literal> <replaceable class="PARAMETER">partition_name</replaceable> FOR VALUES <replaceable class="PARAMETER">partition_bound_spec</replaceable></term>
<listitem>
<para>
- This form attaches an existing table (which might itself be partitioned)
- as a partition of the target table using the same syntax for
+ This form attaches an existing table with name
+ <replaceable class="PARAMETER">partition_name</replaceable>
+ (which might itself be partitioned) as a partition of the target table
+ using the same syntax for
<replaceable class="PARAMETER">partition_bound_spec</replaceable> as
<xref linkend="sql-createtable">. The partition bound specification
must correspond to the partitioning strategy and partition key of the
@@ -757,8 +759,9 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
<listitem>
<para>
This form detaches specified partition of the target table. The detached
- partition continues to exist as a standalone table, but no longer has any
- ties to the table from which it was detached.
+ partition continues to exist as a standalone table with name
+ <replaceable class="PARAMETER">partition_name</replaceable>, but no
+ longer has any ties to the partitioned table from which it was detached.
</para>
</listitem>
</varlistentry>
On Fri, Feb 17, 2017 at 11:25 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
Forgot to attach the patch. Thanks Rajkumar for notifying me.
I think this is overexplaining what is anyway obvious.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 16, 2017 at 7:15 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
I think 0001 is better than the status quo, but I'm wondering whether
we should try to do something slightly different. Maybe it should
always work for the child table to specify neither WITH OIDS nor
WITHOUT OIDS, but if you do specify one of them then it has to be the
one that matches the parent partitioned table? With this patch, IIUC,
WITH OIDS is allowed only if the parent has the same, but WITHOUT OIDS
is allowed (but ignored) regardless of the parent setting.With the patch, one can always specify (or not) WITH/WITHOUT OIDS when
creating partitions. If WITH OIDS is specified and the parent doesn't
have OIDs, then an error is raised. Then just like with normal
inheritance, WITHOUT OIDS specification for a partition will be
*overridden* if the parent has OIDs. By the way, CREATE TABLE page says
this about inheritance and OIDS:(If the new table inherits from any tables that have OIDs, then
<literal>OIDS=TRUE</> is forced even if the command says
<literal>OIDS=FALSE</>.Hopefully it's clear to someone reading "If the table inherits from any
tables ..." that it also refers to creating partition of a partitioned table.
I rewrote this to be a bit more explicit and committed it. I noticed
that there was some duplication here: the old text said both this:
A partition cannot have columns other than those inherited from the parent.
And also this just a little later in the same page:
A partition must have the same column names and types as the table of
which it is a partition.
The second wording seemed better, so I moved that statement a little
higher up and rejiggered the wording to be super-clear about OIDs.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 16, 2017 at 12:43 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/02/16 2:08, Robert Haas wrote:
On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:I think new-style partitioning is supposed to consider each partition as
an implementation detail of the table; the fact that you can manipulate
partitions separately does not really mean that they are their own
independent object. You don't stop to think "do I really want to drop
the TOAST table attached to this main table?" and attach a CASCADE
clause if so. You just drop the main table, and the toast one is
dropped automatically. I think new-style partitions should behave
equivalently.That's a reasonable point of view. I'd like to get some more opinions
on this topic. I'm happy to have us do whatever most people want, but
I'm worried that having table inheritance and table partitioning work
differently will be create confusion. I'm also suspicious that there
may be some implementation difficulties. On the hand, it does seem a
little silly to say that DROP TABLE partitioned_table should always
fail except in the degenerate case where there are no partitions, so
maybe changing it is for the best.So I count more than a few votes saying that we should be able to DROP
partitioned tables without specifying CASCADE.I tried to implement that using the attached patch by having
StoreCatalogInheritance1() create DEPENDENCY_AUTO dependency between
parent and child if the child is a partition, instead of DEPENDENCY_NORMAL
that would otherwise be created. Now it seems that that is one way of
making sure that partitions are dropped when the root partitioned table is
dropped, not sure if the best; why create the pg_depend entries at all one
might ask. I chose it for now because that's the one with fewest lines of
change. Adjusted regression tests as well, since we recently tweaked
tests [1] to work around the irregularities of test output when using CASCADE.
Could you possibly post this on a new thread with a reference back to
this one? The number of patches on this one is getting a bit hard to
track, and some people may be under the misimpression that this one is
just about documentation.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote:
It leaves me asking what else is missing.
There is certainly a lot of room for improvement here but I don't
understand your persistent negativity about what's been done thus far.
I think it's pretty clearly a huge step forward, and I think Amit
deserves a ton of credit for making it happen. The improvements in
bulk loading performance alone are stupendous. You apparently have
the idea that somebody could have written an even larger patch that
solved even more problems at once, but this was already a really big
patch, and IMHO quite a good one.
Please explain these personal comments against me.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/20 1:04, Robert Haas wrote:
On Thu, Feb 16, 2017 at 12:43 PM, Amit Langote wrote:
So I count more than a few votes saying that we should be able to DROP
partitioned tables without specifying CASCADE.I tried to implement that using the attached patch by having
StoreCatalogInheritance1() create DEPENDENCY_AUTO dependency between
parent and child if the child is a partition, instead of DEPENDENCY_NORMAL
that would otherwise be created. Now it seems that that is one way of
making sure that partitions are dropped when the root partitioned table is
dropped, not sure if the best; why create the pg_depend entries at all one
might ask. I chose it for now because that's the one with fewest lines of
change. Adjusted regression tests as well, since we recently tweaked
tests [1] to work around the irregularities of test output when using CASCADE.Could you possibly post this on a new thread with a reference back to
this one? The number of patches on this one is getting a bit hard to
track, and some people may be under the misimpression that this one is
just about documentation.
Sorry about that. Sent the above message and the patch in a new thread
titled "dropping partitioned tables without CASCADE" [1]/messages/by-id/6c420206-45d7-3f56-8325-4bd7b76483ba@lab.ntt.co.jp.
Thanks,
Amit
[1]: /messages/by-id/6c420206-45d7-3f56-8325-4bd7b76483ba@lab.ntt.co.jp
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Feb 20, 2017 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote:
It leaves me asking what else is missing.
There is certainly a lot of room for improvement here but I don't
understand your persistent negativity about what's been done thus far.
I think it's pretty clearly a huge step forward, and I think Amit
deserves a ton of credit for making it happen. The improvements in
bulk loading performance alone are stupendous. You apparently have
the idea that somebody could have written an even larger patch that
solved even more problems at once, but this was already a really big
patch, and IMHO quite a good one.Please explain these personal comments against me.
Several of your emails, including your first post to this thread,
seemed to me to be quite negative about the state of this feature. I
don't think that's warranted, though perhaps I am misreading your
tone, as I have been known to do. I also don't think that expressing
the opinion that the feature is better than you're giving it credit
for is a personal comment against you. Where exactly do you see a
personal comment against you in what I wrote?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Feb 15, 2017 at 12:08:19PM -0500, Robert Haas wrote:
On Wed, Feb 15, 2017 at 11:34 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:I think new-style partitioning is supposed to consider each partition as
an implementation detail of the table; the fact that you can manipulate
partitions separately does not really mean that they are their own
independent object. You don't stop to think "do I really want to drop
the TOAST table attached to this main table?" and attach a CASCADE
clause if so. You just drop the main table, and the toast one is
dropped automatically. I think new-style partitions should behave
equivalently.That's a reasonable point of view. I'd like to get some more opinions
on this topic. I'm happy to have us do whatever most people want, but
I'm worried that having table inheritance and table partitioning work
differently will be create confusion. I'm also suspicious that there
may be some implementation difficulties. On the hand, it does seem a
little silly to say that DROP TABLE partitioned_table should always
fail except in the degenerate case where there are no partitions, so
maybe changing it is for the best.
I think we have a precedent for this, and that is the SERIAL data type,
which is really just a macro on top of CREATE SEQUENCE and DEFAULT
nextval() using the sequence. You can manipulate the sequence and the
DEFAULT separately, but if you drop the table the sequence is dropped to
automatically.
This seems like an instructive example of how we have bundled behavior
together in the past in a logical and easy-to-understand way. Of
course, their might be some technical limitations that prevent us from
using this approach, and I would be interested in hearing those details.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Feb 20, 2017 at 02:37:44PM +0530, Robert Haas wrote:
On Mon, Feb 20, 2017 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote:
It leaves me asking what else is missing.
There is certainly a lot of room for improvement here but I don't
understand your persistent negativity about what's been done thus far.
I think it's pretty clearly a huge step forward, and I think Amit
deserves a ton of credit for making it happen. The improvements in
bulk loading performance alone are stupendous. You apparently have
the idea that somebody could have written an even larger patch that
solved even more problems at once, but this was already a really big
patch, and IMHO quite a good one.Please explain these personal comments against me.
Several of your emails, including your first post to this thread,
seemed to me to be quite negative about the state of this feature. I
don't think that's warranted, though perhaps I am misreading your
tone, as I have been known to do. I also don't think that expressing
the opinion that the feature is better than you're giving it credit
for is a personal comment against you. Where exactly do you see a
personal comment against you in what I wrote?
I have to admit my reaction was similar to Simon's, meaning that the
lack of docs is a problem, and that the limitations are kind of a
surprise, and I wonder what other surprises there are.
I am thinking this is a result of small teams, often from the same
company, working on a features in isolation and then making them public.
It is often not clear what decisions were made and why. The idea that
unique indexes on a parent table can't guarantee uniqueness across child
tables is both a surprise, and obvious once stated.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/16 10:45, Amit Langote wrote:
Also attaching 0002 (unchanged) for tab-completion support for the new
partitioning syntax.
Robert already spawned a new thread titled "tab completion for
partitioning" for this [0]/messages/by-id/CA+TgmobYOj=A8GesiEs_V2Wq46-_w0+7MOwPiNWC+iuzJ-uWjA@mail.gmail.com.
0003 changes how ExecFindPartition() shows the row for which
get_partition_for_tuple() failed to find a partition. As Simon commented
upthread, we should show just the partition key, not the whole row in the
error DETAIL. So the DETAIL now looks like what's shown by
_bt_check_unique() upon uniqueness violation:DETAIL: Partition key of the failing row contains (key1, key2, ...)=(val1,
val2, ...)The rules about which columns to show or whether to show the DETAIL at all
are similar to those in BuildIndexValueDescription():- if user has SELECT privilege on the whole table, simply go ahead
- if user doesn't have SELECT privilege on the table, check that they
can see all the columns in the key (no point in showing partial key);
however abort on finding an expression for which we don't try finding
out privilege situation of whatever columns may be in the expression
I posted this patch in a new thread titled "error detail when partition
not found" [1]/messages/by-id/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp.
Thanks,
Amit
[0]: /messages/by-id/CA+TgmobYOj=A8GesiEs_V2Wq46-_w0+7MOwPiNWC+iuzJ-uWjA@mail.gmail.com
/messages/by-id/CA+TgmobYOj=A8GesiEs_V2Wq46-_w0+7MOwPiNWC+iuzJ-uWjA@mail.gmail.com
[1]: /messages/by-id/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp
/messages/by-id/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Feb 20, 2017 at 02:37:44PM +0530, Robert Haas wrote:
On Mon, Feb 20, 2017 at 2:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On 15 February 2017 at 15:46, Robert Haas <robertmhaas@gmail.com> wrote:
It leaves me asking what else is missing.
There is certainly a lot of room for improvement here but I don't
understand your persistent negativity about what's been done thus far.
I think it's pretty clearly a huge step forward, and I think Amit
deserves a ton of credit for making it happen. The improvements in
bulk loading performance alone are stupendous. You apparently have
the idea that somebody could have written an even larger patch that
solved even more problems at once, but this was already a really big
patch, and IMHO quite a good one.Please explain these personal comments against me.
Several of your emails, including your first post to this thread,
seemed to me to be quite negative about the state of this feature. I
don't think that's warranted, though perhaps I am misreading your
tone, as I have been known to do. I also don't think that expressing
the opinion that the feature is better than you're giving it credit
for is a personal comment against you. Where exactly do you see a
personal comment against you in what I wrote?I have to admit my reaction was similar to Simon's, meaning that the
lack of docs is a problem, and that the limitations are kind of a
surprise, and I wonder what other surprises there are.I am thinking this is a result of small teams, often from the same
company, working on a features in isolation and then making them public.
I agree that this is result of small teams. The partitioning feature
encompasses features like global indexes, which is large in
themselves. Usually, in a company many teams would be working on
different sub-features in the same release schedule. But that's not
the case here. We have to add sub-features in multiple releases. That
might be the reason behind some of the current limitations.
It is often not clear what decisions were made and why.
Amit Langote submitted the patch sometime in August 2015, which
certainly didn't look like a well-thought design, certainly not a
product of 'long cooking' within his company. It was more
experimental. (Obviously he had background of many earlier discussions
on partitioning, that all happened on hackers.) Since then all the
discussion is on hackers; all decisions were made during the
discussion. While what you are saying may be true with other patches,
I am not sure whether it's true with this work.
The idea that
unique indexes on a parent table can't guarantee uniqueness across child
tables is both a surprise, and obvious once stated.
I think, that's a limitation till we implement global indexes. But
nothing in the current implementation stops us from implementing it.
In fact, I remember, a reply from Robert to another thread on
partitioning started in parallel to Amit's thread had explained some
of these design decisions. I am unable to find link to that exact
reply though.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/21 12:10, Ashutosh Bapat wrote:
I think, that's a limitation till we implement global indexes. But
nothing in the current implementation stops us from implementing it.
In fact, I remember, a reply from Robert to another thread on
partitioning started in parallel to Amit's thread had explained some
of these design decisions. I am unable to find link to that exact
reply though.
Are you perhaps thinking of the message titled "design for a partitioning
feature (was: inheritance)" a few months back:
/messages/by-id/CA+TgmoZeV0nryvw9cNB81xdOJg4XtpJMKDif0zTo-GdLcOCgcQ@mail.gmail.com
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Feb 21, 2017 at 9:57 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/02/21 12:10, Ashutosh Bapat wrote:
I think, that's a limitation till we implement global indexes. But
nothing in the current implementation stops us from implementing it.
In fact, I remember, a reply from Robert to another thread on
partitioning started in parallel to Amit's thread had explained some
of these design decisions. I am unable to find link to that exact
reply though.Are you perhaps thinking of the message titled "design for a partitioning
feature (was: inheritance)" a few months back:/messages/by-id/CA+TgmoZeV0nryvw9cNB81xdOJg4XtpJMKDif0zTo-GdLcOCgcQ@mail.gmail.com
Thanks a lot, that's the one I was searching for. And now that I
confirm it, there's NO reply to Robert's mail.
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
There is a very small typo that a comma is missing.
Attached is the patch to fix it.
On Wed, 15 Feb 2017 07:57:53 -0500
Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Feb 15, 2017 at 4:26 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:Noticed some typos in the documentation. Here's patch to correct
those. Sorry, if it has been already taken care of.Thanks. That is indeed nonstandard capitalization. Committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Yugo Nagata <nagata@sraoss.co.jp>
Attachments:
ddl_typo.patchtext/x-diff; name=ddl_typo.patchDownload
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 5779eac..ef0f7cf 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2899,7 +2899,7 @@ VALUES ('Albany', NULL, NULL, 'NY');
<listitem>
<para>
- Since primary keys are not supported on partitioned tables
+ Since primary keys are not supported on partitioned tables,
foreign keys referencing partitioned tables are not supported, nor
are foreign key references from a partitioned table to some other table.
</para>
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote:
I have to admit my reaction was similar to Simon's, meaning that the
lack of docs is a problem, and that the limitations are kind of a
surprise, and I wonder what other surprises there are.
Did you read my message upthread pointing out that the initial commit
contained hundreds of lines of documentation? I agree that it would
be bad if table partitioning got committed with no documentation, but
that did not happen.
I am thinking this is a result of small teams, often from the same
company, working on a features in isolation and then making them public.
It is often not clear what decisions were made and why.
That also did not happen, or at least certainly not with this patch.
All of the discussion was public and on the mailing list. I never
communicated with Amit Langote off-list about this work, except
shortly before I committed his patches I added him on Skype and gave
him a heads up that I was planning to do so real soon. At no time
have the two of us worked for the same company. Also, the patch had 7
other reviewers credited in the commit messages spread across, I
think, 4 different companies.
I think the issue here might be that with this feature, as with some
other features I've committed over the last few years, the email
discussion got very long. On the one hand, that does make it hard for
others to keep up, but not because anything is secret, only because
reading hundreds of email messages takes a lot of time. However, the
large number of emails on a public mailing list makes it absolutely
clear that this wasn't developed in isolation and presented as a done
deal. It was written and rewritten multiple times in response to
feedback, not only from me but from other people who did take the time
to keep up with the discussion. As Ashutosh Bapat and Amit Langote
already pointed out, I even posted (on a separate thread with a clear
subject line) some thoughts about the overall design of this feature,
in response to concerns articulated on an unrelated thread by Tom and
Alvaro. I did that in an attempt to give people a separate thread on
which to discuss those issues - without having to dive into the main
thread where things were being hashed out in detail - but it got no
responses, either because the logic was unarguable or because nobody
took the time to write back.
I think there's certainly room to criticize cases where a feature is
designed, developed, and committed almost entirely by people who work
at a single company, or where a significant amount of discussion
happens off-list, but it would be difficult to find a more clear-cut
case of where that DIDN'T happen than this patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Feb 21, 2017 at 10:27 PM, Yugo Nagata <nagata@sraoss.co.jp> wrote:
There is a very small typo that a comma is missing.
Attached is the patch to fix it.
Thank you. I have committed your patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Feb 21, 2017 at 6:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote:
I have to admit my reaction was similar to Simon's, meaning that the
lack of docs is a problem, and that the limitations are kind of a
surprise, and I wonder what other surprises there are.Did you read my message upthread pointing out that the initial commit
contained hundreds of lines of documentation? I agree that it would
be bad if table partitioning got committed with no documentation, but
that did not happen.I am thinking this is a result of small teams, often from the same
company, working on a features in isolation and then making them public.
It is often not clear what decisions were made and why.That also did not happen, or at least certainly not with this patch.
All of the discussion was public and on the mailing list.
FWIW, I agree that some of what has been claimed about what
contributors failed to do with this patch is exaggerated, and not in a
way that I'd understand as hyperbole that drives home a deeper point.
I'm not the slightest bit surprised at the limitations that this
feature has, even if Bruce and Simon are. The documentation needs
work, and perhaps the feature itself needs a small tweak here or
there. Just not to a particularly notable degree, given the point we
are in in the release cycle.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 22 February 2017 at 19:57, Peter Geoghegan <pg@bowt.ie> wrote:
FWIW, I agree that some of what has been claimed about what
contributors failed to do with this patch is exaggerated, and not in a
way that I'd understand as hyperbole that drives home a deeper point.
What claims are you talking about? Which things have been exaggerated,
and by whom?
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 22 February 2017 at 02:14, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote:
I have to admit my reaction was similar to Simon's, meaning that the
lack of docs is a problem, and that the limitations are kind of a
surprise, and I wonder what other surprises there are.Did you read my message upthread pointing out that the initial commit
contained hundreds of lines of documentation? I agree that it would
be bad if table partitioning got committed with no documentation, but
that did not happen.
You seem a little defensive about some reasonable review comments.
While its true that the patch had syntax documentation, there was no
user design documentation which explained how it worked to allow
objective review. Had I been able to provide input without reading
every email message, I would have done so earlier.
Amit,
The features I consider very important in the first release are
1. Declarative syntax (we have this!)
2. Tuple routing on INSERT/COPY (we have this!)
3. O(1) partition elimination for simple = queries
4. PRIMARY KEY defined using a) distinct set of UNIQUE constraints on
each partition, b) partition key
2 and 3 are intimately connected because they would both use the same
in-memory data for bsearch, so the code should be almost identical.
4 is important for Foreign Keys and Logical Replication
As missing items, features 3 and 4 seem achievable in this release,
potentially in restricted form. I think we should probably avoid
trying to UPDATE rows from one partition to another in this release,
since that seems likely to be buggy and seems like would only be
needed in relatively few cases.
Let me know if I can help with these.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 23, 2017 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
What claims are you talking about? Which things have been exaggerated,
and by whom?
* The specious argument that we should "just" have CREATE INDEX create
equivalent indexes across partitions, to save all those people from
writing all those scripts.
The reality is that there are several ways that that would not comport
with the existing design of things. Most obviously: where does that
leave the design of global indexes that we eventually come up with?
Now, maybe there is a good way to make this easier for users, worth
doing sooner rather than later, but that will be subtle, and you
should at least acknowledge that.
* "It leaves me asking what else is missing"... "If we wanted them to
act identically we wouldn't have any need for a new feature at all, so
clearly that doesn't make sense as an argument."
These remarks sound imperious to me. I think that this could be quite
demoralizing to someone in Amit's position, and you ought to give some
consideration to that. I think that several of your remarks on the
patch are facile and/or needlessly ambiguous, which is what makes this
lack of tact seem unfair to me.
* "Good work so far, but there is still a very significant amount of
work to do."
There is always more work to do, so why say so? I think that the
implication is that this isn't complete as a feature that goes into
the next release, which I disagree with. There is nothing
disappointing to me about this feature, and, as I said, I am
unsurprised that it doesn't support certain things.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 02/23/2017 09:27 AM, Peter Geoghegan wrote:
On Thu, Feb 23, 2017 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
* "Good work so far, but there is still a very significant amount of
work to do."There is always more work to do, so why say so? I think that the
implication is that this isn't complete as a feature that goes into
the next release, which I disagree with. There is nothing
disappointing to me about this feature, and, as I said, I am
unsurprised that it doesn't support certain things.
I don't think we need to start going down the avenue of "you could be
nicer". We can all be nicer and we all have our good and bad days.
If we start worrying about egos to this degree, we will never get
anything done.
JD
--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 23 February 2017 at 17:27, Peter Geoghegan <pg@bowt.ie> wrote:
On Thu, Feb 23, 2017 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
What claims are you talking about? Which things have been exaggerated,
and by whom?* The specious argument that we should "just" have CREATE INDEX create
equivalent indexes across partitions, to save all those people from
writing all those scripts.The reality is that there are several ways that that would not comport
with the existing design of things. Most obviously: where does that
leave the design of global indexes that we eventually come up with?
Now, maybe there is a good way to make this easier for users, worth
doing sooner rather than later, but that will be subtle, and you
should at least acknowledge that.
My argument was that CREATE INDEX is expected to just work on tables
at present, so should also just work on partitioned tables. Without
that, the reality is people will need to write scripts.
I don't see how that relates to the desire for multiple index options,
since one of them would need to be the default and we could provide
one in this release, one in the next etc..
* "It leaves me asking what else is missing"... "If we wanted them to
act identically we wouldn't have any need for a new feature at all, so
clearly that doesn't make sense as an argument."These remarks sound imperious to me. I think that this could be quite
demoralizing to someone in Amit's position, and you ought to give some
consideration to that. I think that several of your remarks on the
patch are facile and/or needlessly ambiguous, which is what makes this
lack of tact seem unfair to me.
The current design has assumed many things, leading me to question
what else has been assumed.
Challenging those assumptions is important and has been upheld.
I agree my review comments could well be demoralizing, which is why I
said "Good work so far". It takes a while to realise that review
comments are given to patch authors with the intent to help improve
the product, not as personal attacks. I thought you would know that by
now.
Imperious? No, definitely a Jedi.
* "Good work so far, but there is still a very significant amount of
work to do."There is always more work to do, so why say so? I think that the
implication is that this isn't complete as a feature that goes into
the next release, which I disagree with.
I've seen many patches rejected because they do not contain the
desired feature set, yet.
ISTM my opinion on when that is reached is as valid as yours or anyone
else's, so I'm unclear as to your issue.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 23, 2017 at 11:13 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
My argument was that CREATE INDEX is expected to just work on tables
at present, so should also just work on partitioned tables. Without
that, the reality is people will need to write scripts.
Really? What about postgres_fdw?
Even if that counter-example didn't exist, I'd still disagree. People
may expect that CREATE INDEX should just work, but that expectation is
not as general as you suggest. Otherwise, I doubt we'd be talking
about it at all.
I don't see how that relates to the desire for multiple index options,
since one of them would need to be the default and we could provide
one in this release, one in the next etc..
You didn't say any of that until now. And besides, I think that global
indexes make a lot more sense as a default.
You seem to be saying that a simple CREATE INDEX could be interpreted
as meaning one or the other of those two behaviors just as easily
(global index vs. create an index on each partition). I don't think
it's a good idea to try to meet any general "just works" expectation
if what you actually get does not fit the intuition of almost all
users. "Just don't show me an error" seems like a bad design goal,
especially for a utility statement.
The current design has assumed many things, leading me to question
what else has been assumed.Challenging those assumptions is important and has been upheld.
I agree. The irony is that in so doing, you yourself make your own
assumptions, confusing everyone, and making it harder to act on your
feedback. You did make some reasonable points, IMV.
I've seen many patches rejected because they do not contain the
desired feature set, yet.
Obviously that general principle is not under discussion. My point, of
course, was that it seems pretty clear to me that this is on the right
side of that fence.
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Feb 23, 2017 at 10:00 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
You seem a little defensive about some reasonable review comments.
I am prone to that from time to time, and this may be an instance of it.
While its true that the patch had syntax documentation, there was no
user design documentation which explained how it worked to allow
objective review. Had I been able to provide input without reading
every email message, I would have done so earlier.
But I don't agree that it was impossible for you to provide input
earlier without reading every email message, nor do I agree that it is
unreasonable to expect to people who want to provide input to read the
relevant threads.
The features I consider very important in the first release are
1. Declarative syntax (we have this!)
2. Tuple routing on INSERT/COPY (we have this!)
3. O(1) partition elimination for simple = queries
4. PRIMARY KEY defined using a) distinct set of UNIQUE constraints on
each partition, b) partition key2 and 3 are intimately connected because they would both use the same
in-memory data for bsearch, so the code should be almost identical.4 is important for Foreign Keys and Logical Replication
As missing items, features 3 and 4 seem achievable in this release,
potentially in restricted form.
Simon, this is ridiculous. We're 4 or 5 days away from the start of
the last CommitFest. We have time to fix bugs and improve
documentation and maybe tweak things here and there, but 3 and 4 are
significant development projects. There isn't time to do that stuff
right now and get it right. You don't get to show up more than two
months after the feature is committed and start complaining that it
doesn't include all the things you want. Which things ought to be
included in the initial patch was under active discussion about a year
ago this time.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2/23/17 8:36 PM, Robert Haas wrote:
We're 4 or 5 days away from the start of
the last CommitFest. We have time to fix bugs and improve
documentation and maybe tweak things here and there, but 3 and 4 are
significant development projects. There isn't time to do that stuff
right now and get it right.
It might be possible to provide some temporary work-arounds for some of
this, which would be nice. But I agree that there's definitely not
enough time to implement *good* solutions to even just automatic index
creation, let alone somehow handling uniqueness.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 24, 2017 at 8:06 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Simon, this is ridiculous. We're 4 or 5 days away from the start of
the last CommitFest. We have time to fix bugs and improve
documentation and maybe tweak things here and there, but 3 and 4 are
significant development projects. There isn't time to do that stuff
right now and get it right. You don't get to show up more than two
months after the feature is committed and start complaining that it
doesn't include all the things you want. Which things ought to be
included in the initial patch was under active discussion about a year
ago this time.
I take that back. You can complain as much as you like; everybody has
a right to complain. But it's not reasonable to expect Amit (or
anyone else) to go fix the things you're complaining about in time for
v10, or really ever. He doesn't have to write any more partitioning
patches ever, and if he does decide to do so, he doesn't have to write
the ones you or I or anyone other than his employer wants him to write
(and he only has to listen to his employer if he doesn't want to get
fired).
Also, I'm very much against against any major tinkering with this
feature in this release. We've got enough work to do stabilizing
what's already been committed in this area, and the last things we
need is a bunch of patches that significant change it showing up at
the eleventh hour without time for adequate reflection and discussion.
Most if not all significant patches for this release should already
have been submitted; again, the last CommitFest will be starting
shortly, and we should have seen those patches in the previous
CommitFest. We should be focusing on getting all the good patches
that have already been written committed, not creating new ones at the
last minute.
Contrary to what you may think, neither changing the way partition
pruning works nor inventing a system for indexes to roll down to
partition children is a minor fix. Even if you restrict the scope to
simple cases, there's still got to be a level of design agreement so
that we know we're not boxing ourselves into a corner for the future,
and the patch quality still has to be good. That's not going to
happen in the next couple of days, barring a dramatic reversal of how
the development process in this community has always worked before.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 February 2017 at 02:36, Robert Haas <robertmhaas@gmail.com> wrote:
While its true that the patch had syntax documentation, there was no
user design documentation which explained how it worked to allow
objective review. Had I been able to provide input without reading
every email message, I would have done so earlier.But I don't agree that it was impossible for you to provide input
earlier without reading every email message, nor do I agree that it is
unreasonable to expect to people who want to provide input to read the
relevant threads.
The features I consider very important in the first release are
1. Declarative syntax (we have this!)
2. Tuple routing on INSERT/COPY (we have this!)
3. O(1) partition elimination for simple = queries
4. PRIMARY KEY defined using a) distinct set of UNIQUE constraints on
each partition, b) partition key2 and 3 are intimately connected because they would both use the same
in-memory data for bsearch, so the code should be almost identical.4 is important for Foreign Keys and Logical Replication
As missing items, features 3 and 4 seem achievable in this release,
potentially in restricted form.Simon, this is ridiculous. We're 4 or 5 days away from the start of
the last CommitFest. We have time to fix bugs and improve
documentation and maybe tweak things here and there, but 3 and 4 are
significant development projects. There isn't time to do that stuff
right now and get it right.
Agreed, you beat me to it. I've spent the last few hours trying to see
what I could personally pull out of the fire to assist. Those things
aren't that big, but they are bigger than we have time for now. Had I
known about that a few months back... well, we are where we are.
The good news is that logical replication DOES work with partitioning,
but only for a Publication with PUBLISH INSERT, pushing from a normal
table to a partitioned one. Which is useful enough for first release.
The work on having UPDATE work between partitions can be used to make
updates and deletes work in logical replication. That might yet be
made to work in this release, and I think we should look into it, but
I think it can be left til next release if we try.
You don't get to show up more than two
months after the feature is committed and start complaining that it
doesn't include all the things you want. Which things ought to be
included in the initial patch was under active discussion about a year
ago this time.
I've provided lots of input across many years, for example my
explanation of how to do tuple routing is very clearly the basis of
the design of the current feature. You *knew* I would have feedback
and if it it hadn't appeared yet you knew it would come.
The "two months after the feature is committed" is a direct result of
the lack of docs. Note that within hours of reading the docs I'd given
feedback. How could I give feedback earlier? Well, I tried. I've flown
to Japan to ensure I could talk to Amit in person about this feature.
Your absence at two consecutive developer meetings hasn't helped
there. It is you that needs to show up more. I'm sure we both have
family and work reasons not to attend them.
The need for design feedback is exactly why the docs for logical
replication were published in June, six months before logical
replication was committed.
With repect, you are making a few mistakes. The first is to imagine
that review comments are negative or complaints; with the right
viewpoint they could easily be seen as helping people to see what has
been missed, yet you repeatedly see them as personal attacks and throw
words back. Oh sure, I've done that myself in earlier days. The second
is to imagine that discussing things on Hackers in multiple threads,
spanning many months with long, detailed emails and rapid replies is
something that anybody could have followed if they wished. And the
third is to imagine I have no right to review; I will watch and see if
you deploy this same "You don't get to show up.." argument against Tom
or Noah when they point out holes in late reviews, though we already
both know you won't. I see you using that yourself, objecting
frequently against patches large and small if they do not meet your
exacting standards, yet you have spoken multiple times against my
right to do that.
Perhaps we'll have time to scowl at each other in India. I'll look
forward to that. ;-)
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
The good news is that logical replication DOES work with partitioning,
but only for a Publication with PUBLISH INSERT, pushing from a normal
table to a partitioned one. Which is useful enough for first release.The work on having UPDATE work between partitions can be used to make
updates and deletes work in logical replication. That might yet be
made to work in this release, and I think we should look into it, but
I think it can be left til next release if we try.
Are you speaking of the case where you want to replicate from an
unpartitioned table to a partitioned table? I would imagine that if
you are replicating between two identical partitioning hierarchies, it
would just work. (If not, that's probably a bug, though I don't know
whose bug.)
You don't get to show up more than two
months after the feature is committed and start complaining that it
doesn't include all the things you want. Which things ought to be
included in the initial patch was under active discussion about a year
ago this time.I've provided lots of input across many years, for example my
explanation of how to do tuple routing is very clearly the basis of
the design of the current feature. You *knew* I would have feedback
and if it it hadn't appeared yet you knew it would come.
Not really. I can't always predict who will take an interest in a
patch. I'm not surprised that you have feedback on it, and if that
feedback were phrased as "let's try to do X, Y, and Z in future
releases" I'd have no problem with it. What I'm reacting against is
really the idea that any of this should be done in v10 (and also the
idea, which may be an overly defensive reading of your emails, that my
original commit was somehow not up to the mark).
The "two months after the feature is committed" is a direct result of
the lack of docs. Note that within hours of reading the docs I'd given
feedback. How could I give feedback earlier? Well, I tried.
Quite a few other people managed to give feedback earlier, so it
evidently wasn't totally impossible.
I've flown
to Japan to ensure I could talk to Amit in person about this feature.
Your absence at two consecutive developer meetings hasn't helped
there. It is you that needs to show up more. I'm sure we both have
family and work reasons not to attend them.
We used to have one developer meeting a year and I have attended it
every year I was invited. Then it went up to two per year and I kept
attending one of them per year. Now it seems to be three per year and
I plan to keep attending one of them per year, unless one of the
others happens to be scheduled together with an event I'm planning to
attend anyway. As you say, for both family and work reasons, it's
hard to commit to doing multiple such events per year.
The need for design feedback is exactly why the docs for logical
replication were published in June, six months before logical
replication was committed.
Sure, I think that's great, but there's never been an obligation to
write the documentation before the code in the PostgreSQL community.
It's not a bad idea and I'm happy if people do it, but I'm not going
to start refusing to commit the patches of people who don't do it.
Probably 25% of patches have no documentation at all in the first
version and that gets added later once some consensus is reached and
the feature gets closer to commit; I think that's just fine. For
larger features, the documentation often gets expanded after the
initial commit; I think that's also fine. Unlike you, I actually
don't think it's very hard to follow the discussion about the design
of these features in most cases, and it speeds up development if every
change in the proposed design doesn't require an adjustment to both
the documentation and the code. I think the way logical replication
was done was reasonable, and I think the way that partitioning was
done was also reasonable.
With repect, you are making a few mistakes. The first is to imagine
that review comments are negative or complaints; with the right
viewpoint they could easily be seen as helping people to see what has
been missed, yet you repeatedly see them as personal attacks and throw
words back. Oh sure, I've done that myself in earlier days.
Sure.
The second
is to imagine that discussing things on Hackers in multiple threads,
spanning many months with long, detailed emails and rapid replies is
something that anybody could have followed if they wished.
I don't really see how that's a mistake. It might take more time than
someone's willing to put in -- I have that problem too, on some
threads -- but if someone has the time and is willing to spend it,
then they can follow that discussion.
And the
third is to imagine I have no right to review; I will watch and see if
you deploy this same "You don't get to show up.." argument against Tom
or Noah when they point out holes in late reviews, though we already
both know you won't. I see you using that yourself, objecting
frequently against patches large and small if they do not meet your
exacting standards, yet you have spoken multiple times against my
right to do that.
I don't think this primarily is about how I react to you vs. how I
react to other people, although anybody will have noticed by now that
you and I don't agree as often as I agree with some other people, and
perhaps I'm being more negative than is deserved. That having been
said, Tom and Noah typically complain about bugs in what got committed
rather than complaining that the scope of the project was all wrong.
I have heard them complain that the scope of the project is all wrong,
but before things get committed, not after. What I hear you doing is
saying that there are features missing that ought to have been there
in the original commit or at least in the same release, and I wouldn't
agree with that argument no matter who made it. Nice to have? Yes.
Required before 10 goes out the door? Nope. It's not like the scope
of this project wasn't extensively discussed long before anything to
committed, and the only things we can reasonably do at this release
cycle are ship it more or less as-is, with some small fixes here and
there, or rip the whole thing out and try again later. I'm pretty
convicted that the first option is better, but obviously people can
disagree about that. What we can't do is insist on a whole bunch of
additional development in the time remaining before feature freeze;
the only way that can happen is if we move the whole release timetable
out.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Feb 22, 2017 at 07:44:41AM +0530, Robert Haas wrote:
On Tue, Feb 21, 2017 at 2:03 AM, Bruce Momjian <bruce@momjian.us> wrote:
I have to admit my reaction was similar to Simon's, meaning that the
lack of docs is a problem, and that the limitations are kind of a
surprise, and I wonder what other surprises there are.Did you read my message upthread pointing out that the initial commit
contained hundreds of lines of documentation? I agree that it would
be bad if table partitioning got committed with no documentation, but
that did not happen.I am thinking this is a result of small teams, often from the same
company, working on a features in isolation and then making them public.
It is often not clear what decisions were made and why.That also did not happen, or at least certainly not with this patch.
All of the discussion was public and on the mailing list. I never
communicated with Amit Langote off-list about this work, except
shortly before I committed his patches I added him on Skype and gave
him a heads up that I was planning to do so real soon. At no time
have the two of us worked for the same company. Also, the patch had 7
other reviewers credited in the commit messages spread across, I
think, 4 different companies.
I think you are right. I was only guessing on a possible cause of
Simon's reaction since I had the same reaction. When traveling, it is
hard to get excited about reading a 100+ post thread that has reached a
conclusion. I found Simon's summary of the 4 sub-features to be
helpful.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Feb 26, 2017 at 4:31 AM, Bruce Momjian <bruce@momjian.us> wrote:
I think you are right. I was only guessing on a possible cause of
Simon's reaction since I had the same reaction. When traveling, it is
hard to get excited about reading a 100+ post thread that has reached a
conclusion. I found Simon's summary of the 4 sub-features to be
helpful.
OK, no problem. Basically, I think it's a bad plan to redesign this -
or add any large amount of incremental change to what's already been
done - at this point in the release cycle. Unless we're prepared to
rip it all back out, we've got to ship more or less what we have and
improve it later. I always viewed the mission of this patch as to set
the stage for future improvements in this area, not to solve all of
the problems by itself. I'm sorry if anyone was under a contrary
impression, and I'm also sorry that the discussion seems to have left
some people behind, but I did try my best.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24/02/17 07:15, Robert Haas wrote:
On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
The good news is that logical replication DOES work with partitioning,
but only for a Publication with PUBLISH INSERT, pushing from a normal
table to a partitioned one. Which is useful enough for first release.The work on having UPDATE work between partitions can be used to make
updates and deletes work in logical replication. That might yet be
made to work in this release, and I think we should look into it, but
I think it can be left til next release if we try.Are you speaking of the case where you want to replicate from an
unpartitioned table to a partitioned table? I would imagine that if
you are replicating between two identical partitioning hierarchies, it
would just work. (If not, that's probably a bug, though I don't know
whose bug.)
Yes same hierarchies will work but mainly because one has to add
partitions themselves to publication currently. I guess that's the
limitation we might have to live with in 10 as adding the whole
partitioned table should probably work for different hierarchies when we
enable it and I am not quite sure that's doable before start of the CF
at this point.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/27 3:18, Petr Jelinek wrote:
On 24/02/17 07:15, Robert Haas wrote:
On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
The good news is that logical replication DOES work with partitioning,
but only for a Publication with PUBLISH INSERT, pushing from a normal
table to a partitioned one. Which is useful enough for first release.The work on having UPDATE work between partitions can be used to make
updates and deletes work in logical replication. That might yet be
made to work in this release, and I think we should look into it, but
I think it can be left til next release if we try.Are you speaking of the case where you want to replicate from an
unpartitioned table to a partitioned table? I would imagine that if
you are replicating between two identical partitioning hierarchies, it
would just work. (If not, that's probably a bug, though I don't know
whose bug.)Yes same hierarchies will work but mainly because one has to add
partitions themselves to publication currently. I guess that's the
limitation we might have to live with in 10 as adding the whole
partitioned table should probably work for different hierarchies when we
enable it and I am not quite sure that's doable before start of the CF
at this point.
If and when we add support to add partitioned tables to publications, I
think it will work by recursing to include the partitions to the same
publication (I see that OpenTableList() in publicationcmds.c calls
find_all_inheritors if recursion is requested by not specifying ONLY).
When a subscription replicates from this publication, it's going to match
changes for individual leaf partitions, not the root parent table. IOW,
none of the changes applied to a partitioned table are replicated as
changes to the table itself. So, it seems that adding a partitioned table
to a publication or subscription would simply be a *shorthand* for adding
all the (leaf) partitions that will actually emit and receive changes.
I'm not sure but should adding/removing partitions after the fact cause
their addition/removal from the publication (& subscription)? Maybe we'll
discuss these issues later.
By the way, we currently get the following error due to the relkind check
in check_publication_add_relation():
create publication p_pub for table p;
ERROR: "p" is not a table
DETAIL: Only tables can be added to publications.
Would it be better to produce an error message that explicitly states that
adding partitioned tables to publications is not supported. Something like:
create publication p_pub for table p;
ERROR: table "p" cannot be replicated
DETAIL: Adding partitioned tables to publications is not supported.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/15 12:00, Robert Haas wrote:
On Fri, Feb 10, 2017 at 3:00 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
Without claiming I'm happy about this, I think the best way to improve
the number of eyeballs on this is to commit these docs as is.For me, the most important thing is understanding the feature, not
(yet) discussing what the docs should look like. This is especially
true if other patches reference the way partitioning works and nobody
can comment on those patches because they don't understandAny issues with that?
There are a number of things that I think are awkward about the patch
as committed:+ <listitem> + <para> + See last section for some general information: + <xref linkend="ddl-partitioned-tables"> + </para> + </listitem>I think we generally try to write the documentation in such a way as
to minimize backward references, and a backward reference to the
previous section seems particularly odd. We've now got section "5.10
Partitioned Tables" followed immediately by section "5.11
Partitioning", where the latter seems to think that you haven't read
the former.I think that section 5.11 needs a much heavier rewrite than what it
got as part of this patch. It's a hodgepodge of the old content
(which explained how to fake partitioning when we didn't have an
explicit concept of partitioning) and new content that talks about how
the new way is different from the old way. But now that we have the
new way, I'm guessing that most people are going to use that and not
care about the old way any more. I'm not that it's even appropriate
to keep the lengthy explanation of how to fake table partitioning
using table inheritance and non-overlapping CHECK constraints, but if
we still want that stuff it should be de-emphasized more than it is
here. Probably the section should be retitled: in previous releases
we called this "Partitioning" because we had no explicit notion of
partitioning, but now that we do, it's confusing to have a section
called "Partitioning" that explains how to avoid using the
partitioning feature, which is more or less what this does. Or maybe
the section title should stay the same (or this should be merged into
the previous section?) but rewritten to change the emphasis.
I agree that my patch failed to de-emphasize the old partitioning method
enough. The examples in 5.11 Partitioning chapter also did not highlight
the new partitioning feature as much as it should have been, so it indeed
reads like a description of how to avoid using the new partitioning
feature. Should we completely remove details about the older partitioning
methods?
I like the idea of merging what are now two chapters into one and call it
Partitioned Tables, retaining the text that describes concepts while
getting rid of the texts detailing inheritance implementation. Perhaps
with the following sub-sections:
5.10 Partitioned Tables
5.10.1 When To Use Partitioning? (what is now 5.11.1 Overview)
5.10.2 Example (what is now 5.11.2. Implementing Partitioning)
5.10.3 Managing Partitions (same title as 5.11.3)
5.10.4. Partitioning and Constraint Exclusion (same title as 5.11.4)
About this, do we want to emphasize the fact that the new partitioned
tables *currently* depend on constraint exclusion? I guess not. So the
sub-section is retained most as is, with some tweaks.
5.10.5 Caveats (that still are)
As mentioned above, the above sub-sections will retain the old text that
talks about concepts and not the particular implementation details. For
example, in 5.10.3 Managing Partitions, the following text will be retained:
<sect2 id="ddl-partitioning-managing-partitions">
<title>Managing Partitions</title>
<para>
Normally the set of partitions established when initially
defining the table are not intended to remain static. It is
common to want to remove old partitions of data and periodically
add new partitions for new data. One of the most important
advantages of partitioning is precisely that it allows this
otherwise painful task to be executed nearly instantaneously by
manipulating the partition structure, rather than physically moving large
amounts of data around.
</para>
I will go make a patch for this if there are no objections. Suggestions
are welcome.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 27 February 2017 at 10:12, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
I agree that my patch failed to de-emphasize the old partitioning method
enough. The examples in 5.11 Partitioning chapter also did not highlight
the new partitioning feature as much as it should have been, so it indeed
reads like a description of how to avoid using the new partitioning
feature.
Your patch was incredibly useful; I just wanted it earlier.
I think we're probably all agreed that we should highlight benefits of
the new approach more, though the list of caveats should stay
somewhere, just as we did for the original inheritance feature, and
other things such as replication.
Should we completely remove details about the older partitioning
methods?
No, because there is much code out there using it for last 12 years
that needs to be explained and there are still some use cases where it
is useful that aren't on the roadmap for partitioning.
I like the idea of merging what are now two chapters into one and call it
Partitioned Tables, retaining the text that describes concepts
+1
...but how?
5.10 Partitioned Tables and Related Solutions
5.10.1 Declarative Partitioning (this new feature)
5.10.2 Managing Partitions using Inheritance
5.10.3 Managing Partitions using Union All Views
5.10.4 Accessing tables using BRIN indexes
So first and foremost we highlight the new feature and explain all its
strengths with examples.
We then explain the other possible ways of implementing something
similar. This allows us to explain how to handle cases such as when
partitions have different set of columns etc..
I'm happy to put my name down to write the sections on Union All
Views, which is useful but only mentioned in passing, and the section
on BRIN indexes, all of which would have their own independent sets of
caveats.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 27/02/17 07:59, Amit Langote wrote:
On 2017/02/27 3:18, Petr Jelinek wrote:
On 24/02/17 07:15, Robert Haas wrote:
On Fri, Feb 24, 2017 at 9:53 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
The good news is that logical replication DOES work with partitioning,
but only for a Publication with PUBLISH INSERT, pushing from a normal
table to a partitioned one. Which is useful enough for first release.The work on having UPDATE work between partitions can be used to make
updates and deletes work in logical replication. That might yet be
made to work in this release, and I think we should look into it, but
I think it can be left til next release if we try.Are you speaking of the case where you want to replicate from an
unpartitioned table to a partitioned table? I would imagine that if
you are replicating between two identical partitioning hierarchies, it
would just work. (If not, that's probably a bug, though I don't know
whose bug.)Yes same hierarchies will work but mainly because one has to add
partitions themselves to publication currently. I guess that's the
limitation we might have to live with in 10 as adding the whole
partitioned table should probably work for different hierarchies when we
enable it and I am not quite sure that's doable before start of the CF
at this point.If and when we add support to add partitioned tables to publications, I
think it will work by recursing to include the partitions to the same
publication (I see that OpenTableList() in publicationcmds.c calls
find_all_inheritors if recursion is requested by not specifying ONLY).
When a subscription replicates from this publication, it's going to match
changes for individual leaf partitions, not the root parent table. IOW,
none of the changes applied to a partitioned table are replicated as
changes to the table itself. So, it seems that adding a partitioned table
to a publication or subscription would simply be a *shorthand* for adding
all the (leaf) partitions that will actually emit and receive changes.
This was my first thought as well. However we need to also take into
account the use-case with different partitioning topology on publisher
and subscriber (at least in a sense that the initial design will not
paint us in the corner).
Now I see two ways of achieving this.
a) Either going with more or less what you wrote above and in the future
have some smarts where we can specify that it should replicate with
different name. We'll eventually want to be able to replicate table to
differently named table anyway so it's not stretch by any means to do that.
b) Or just replicate changes to leaf partitions as changes to
partitioned table. That's also not that hard to do, we just need fast
lookup of partitioned table from leaf table.
I guess a) looks simpler given that we eventually need the rename
anyway, but I'd like opinions of other people as well.
I'm not sure but should adding/removing partitions after the fact cause
their addition/removal from the publication (& subscription)? Maybe we'll
discuss these issues later.
That's something I've been also wondering as there is many corner cases
here. For example if table is in some publication and then is added as
partition to partitioned table what should happen? What should happen
when the partitioned table is then removed from same publication to
which partition was added explicitly? Should we allow different
publication configuration for different partitions within same
partitioned table, etc?
This somewhat brings bigger question about where we want to go in
general with partitioning. And that is, should partition be a separate
entity that is on the same level of table, or should it be part of the
partitioned table without it's own "identity".
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/27 20:44, Simon Riggs wrote:
On 27 February 2017 at 10:12, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:I agree that my patch failed to de-emphasize the old partitioning method
enough. The examples in 5.11 Partitioning chapter also did not highlight
the new partitioning feature as much as it should have been, so it indeed
reads like a description of how to avoid using the new partitioning
feature.Your patch was incredibly useful; I just wanted it earlier.
Thanks. Anyway, we agree that there is room for improvement.
I think we're probably all agreed that we should highlight benefits of
the new approach more, though the list of caveats should stay
somewhere, just as we did for the original inheritance feature, and
other things such as replication.
Yes.
Should we completely remove details about the older partitioning
methods?No, because there is much code out there using it for last 12 years
that needs to be explained and there are still some use cases where it
is useful that aren't on the roadmap for partitioning.
Sure, it's a matter of where we place it in the rewritten section about
partitioning.
I like the idea of merging what are now two chapters into one and call it
Partitioned Tables, retaining the text that describes concepts+1
...but how?
5.10 Partitioned Tables and Related Solutions
Presumably, this is where we put the "why" of using partitioning as part
of database design, that is, some of the text at the beginning of the 5.11
Partitioning section.
5.10.1 Declarative Partitioning (this new feature)
5.10.2 Managing Partitions using Inheritance
5.10.3 Managing Partitions using Union All Views
5.10.4 Accessing tables using BRIN indexes
Each of these sub-sections explain the "how", using that method.
We point out here when it's better to use one method over another,
limitations when using a particular method, etc.
So first and foremost we highlight the new feature and explain all its
strengths with examples.We then explain the other possible ways of implementing something
similar. This allows us to explain how to handle cases such as when
partitions have different set of columns etc..
Yeah, we don't allow that with declarative partitioned tables. A user
reading this section should be able to choose the best solution for their
needs.
I'm happy to put my name down to write the sections on Union All
Views, which is useful but only mentioned in passing, and the section
on BRIN indexes, all of which would have their own independent sets of
caveats.
OK. So, I will start writing the patch with above general skeleton and
hopefully post it within this week and you can improve it as fit.
Thanks,
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 28 February 2017 at 08:14, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
OK. So, I will start writing the patch with above general skeleton and
hopefully post it within this week and you can improve it as fit.
Will do, thanks.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Feb 27, 2017 at 5:14 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
I like the idea of merging what are now two chapters into one and call it
Partitioned Tables, retaining the text that describes concepts+1
...but how?
5.10 Partitioned Tables and Related Solutions
5.10.1 Declarative Partitioning (this new feature)
5.10.2 Managing Partitions using Inheritance
5.10.3 Managing Partitions using Union All Views
5.10.4 Accessing tables using BRIN indexesSo first and foremost we highlight the new feature and explain all its
strengths with examples.We then explain the other possible ways of implementing something
similar. This allows us to explain how to handle cases such as when
partitions have different set of columns etc..I'm happy to put my name down to write the sections on Union All
Views, which is useful but only mentioned in passing, and the section
on BRIN indexes, all of which would have their own independent sets of
caveats.
I like the proposed 5.10.1 and 5.10.2 organization. I am not sure
whether 5.10.3 and 5.10.4 make sense because I can't quite imagine
what content would go there. We don't document UNION ALL views as a
method today, and I'm not sure we really need to start. Also I don't
see what BRIN indexes have to do with partitioning. But I may be
missing something.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/02/28 17:25, Simon Riggs wrote:
On 28 February 2017 at 08:14, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:OK. So, I will start writing the patch with above general skeleton and
hopefully post it within this week and you can improve it as fit.Will do, thanks.
Attached patch 0001 is what I managed so far. Please take a look and let
me know if there is more I can do. I guess you might want to expand the
parts related to UNION ALL views and BRIN indexes.
Also for consideration,
0002: some cosmetic fixes to create_table.sgml
0003: add clarification about NOT NULL constraint on partition columns in
alter_table.sgml
Thoughts?
Thanks,
Amit
Attachments:
0003-Listify-a-note-on-the-CREATE-TABLE-page.patchtext/x-diff; name=0003-Listify-a-note-on-the-CREATE-TABLE-page.patchDownload
From 02c8753b3a81589c5f6679bcc4d50f452d0c658d Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 27 Feb 2017 19:00:08 +0900
Subject: [PATCH 3/3] Listify a note on the CREATE TABLE page
---
doc/src/sgml/ref/create_table.sgml | 82 ++++++++++++++++++++++----------------
1 file changed, 47 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index bb081ff86f..161f7d1280 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -262,41 +262,53 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
</para>
<note>
- <para>
- Each of the values specified in the partition bound specification is
- a literal, <literal>NULL</literal>, or <literal>UNBOUNDED</literal>.
- A literal is either a numeric constant or a string constant that is
- coercable to the corresponding partition key column's type.
- </para>
-
- <para>
- When creating a range partition, the lower bound specified with
- <literal>FROM</literal> is an inclusive bound, whereas the upper bound
- specified with <literal>TO</literal> is an exclusive bound. That is,
- the values specified in the <literal>FROM</literal> list are accepted
- values of the corresponding partition key columns in a given partition,
- whereas those in the <literal>TO</literal> list are not. To be precise,
- this applies only to the first of the partition key columns for which
- the corresponding values in the <literal>FROM</literal> and
- <literal>TO</literal> lists are not equal. All rows in a given
- partition contain the same values for all preceding columns, equal to
- those specified in <literal>FROM</literal> and <literal>TO</literal>
- lists. On the other hand, any subsequent columns are insignificant
- as far as implicit partition constraint is concerned.
-
- Specifying <literal>UNBOUNDED</literal> in <literal>FROM</literal>
- signifies <literal>-infinity</literal> as the lower bound of the
- corresponding column, whereas it signifies <literal>+infinity</literal>
- as the upper bound when specified in <literal>TO</literal>.
- </para>
-
- <para>
- When creating a list partition, <literal>NULL</literal> can be specified
- to signify that the partition allows the partition key column to be null.
- However, there cannot be more than one such list partitions for a given
- parent table. <literal>NULL</literal> cannot specified for range
- partitions.
- </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ Each of the values specified in the partition bound specification is
+ a literal, <literal>NULL</literal>, or <literal>UNBOUNDED</literal>.
+ A literal is either a numeric constant or a string constant that is
+ coercable to the corresponding partition key column's type.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When creating a range partition, the lower bound specified with
+ <literal>FROM</literal> is an inclusive bound, whereas the upper
+ bound specified with <literal>TO</literal> is an exclusive bound.
+ That is, the values specified in the <literal>FROM</literal> list
+ are accepted values of the corresponding partition key columns in a
+ given partition, whereas those in the <literal>TO</literal> list are
+ not. To be precise, this applies only to the first of the partition
+ key columns for which the corresponding values in the <literal>FROM</literal>
+ and <literal>TO</literal> lists are not equal. All rows in a given
+ partition contain the same values for all preceding columns, equal to
+ those specified in <literal>FROM</literal> and <literal>TO</literal>
+ lists. On the other hand, any subsequent columns are insignificant
+ as far as implicit partition constraint is concerned.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Specifying <literal>UNBOUNDED</literal> in <literal>FROM</literal>
+ signifies <literal>-infinity</literal> as the lower bound of the
+ corresponding column, whereas it signifies <literal>+infinity</literal>
+ as the upper bound when specified in <literal>TO</literal>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When creating a list partition, <literal>NULL</literal> can be
+ specified to signify that the partition allows the partition key
+ column to be null. However, there cannot be more than one such
+ list partitions for a given parent table. <literal>NULL</literal>
+ cannot specified for range partitions.
+ </para>
+ </listitem>
+ </itemizedlist>
</note>
<para>
--
2.11.0
0001-Rewrite-sections-in-ddl.sgml-related-to-partitioning.patchtext/x-diff; name=0001-Rewrite-sections-in-ddl.sgml-related-to-partitioning.patchDownload
From 97c104054310c9361623c182497d708ed65bf65f Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 3 Mar 2017 16:39:24 +0900
Subject: [PATCH 1/3] Rewrite sections in ddl.sgml related to partitioning
Merge sections Partitioned Tables and Partitioning into one section
called Table Partitioning and Related Solutions.
---
doc/src/sgml/ddl.sgml | 1359 +++++++++++++++++++++++++------------------------
1 file changed, 707 insertions(+), 652 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 09b5b3ff70..a2dd39df54 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2772,14 +2772,181 @@ VALUES ('Albany', NULL, NULL, 'NY');
</sect2>
</sect1>
- <sect1 id="ddl-partitioned-tables">
- <title>Partitioned Tables</title>
+ <sect1 id="ddl-partitioning">
+ <title>Table Partitioning and Related Solutions</title>
+
+ <indexterm>
+ <primary>partitioning</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>partitioning</secondary>
+ </indexterm>
<indexterm>
<primary>partitioned table</primary>
</indexterm>
<para>
+ <productname>PostgreSQL</productname> supports basic table
+ partitioning. This section describes why and how to implement
+ partitioning as part of your database design.
+ </para>
+
+ <sect2 id="ddl-partitioning-overview">
+ <title>Overview</title>
+
+ <para>
+ Partitioning refers to splitting what is logically one large table into
+ smaller physical pieces. Partitioning can provide several benefits:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Query performance can be improved dramatically in certain situations,
+ particularly when most of the heavily accessed rows of the table are in a
+ single partition or a small number of partitions. The partitioning
+ substitutes for leading columns of indexes, reducing index size and
+ making it more likely that the heavily-used parts of the indexes
+ fit in memory.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When queries or updates access a large percentage of a single
+ partition, performance can be improved by taking advantage
+ of sequential scan of that partition instead of using an
+ index and random access reads scattered across the whole table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Bulk loads and deletes can be accomplished by adding or removing
+ partitions, if that requirement is planned into the partitioning design.
+ Doing <command>ALTER TABLE DETACH PARTITION</> followed by
+ <command>DROP TABLE</> is far faster than a bulk operation. These
+ commands also entirely avoid the <command>VACUUM</command> overhead
+ caused by a bulk <command>DELETE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Seldom-used data can be migrated to cheaper and slower storage media.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ The benefits will normally be worthwhile only when a table would
+ otherwise be very large. The exact point at which a table will
+ benefit from partitioning depends on the application, although a
+ rule of thumb is that the size of the table should exceed the physical
+ memory of the database server.
+ </para>
+
+ <para>
+ The following forms of partitioning can be implemented in
+ <productname>PostgreSQL</productname>:
+
+ <variablelist>
+ <varlistentry>
+ <term>Range Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned into <quote>ranges</quote> defined
+ by a key column or set of columns, with no overlap between
+ the ranges of values assigned to different partitions. For
+ example one might partition by date ranges, or by ranges of
+ identifiers for particular business objects.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>List Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned by explicitly listing which key values
+ appear in each partition.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ The following partitioning methods are currently supported:
+
+ <variablelist>
+ <varlistentry>
+ <term>Declarative Partitioning</term>
+
+ <listitem>
+ <para>
+ One creates a <firstterm>partitioned table</firstterm> by specifying
+ the partitioning method and a set of columns as the partition key.
+ <firstterm>Partitions</firstterm>, which contain actual data inserted
+ into the table, are created by specifying what subset of the data it
+ accepts.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Using inheritance</term>
+
+ <listitem>
+ <para>
+ Each partition must be created as a child table of a single parent
+ table. The parent table itself is normally empty; it exists just to
+ represent the entire data set.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Using UNION ALL views</term>
+
+ <listitem>
+ <para>
+ One can define a <literal>UNION ALL</literal> view over
+ <literal>SELECT</literal> on individual tables, each of which
+ contains a partition of data. Partitions are added or removed
+ by updating the view definition.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Accessing Tables using BRIN Indexes</term>
+
+ <listitem>
+ <para>
+ <acronym>BRIN</acronym>, which stands for Block Range Index is,
+ designed for handling very large tables in which certain columns
+ have some natural physical location within the table. Scanning
+ a large table using a <acronym>BRIN</acronym> index results in
+ reading only a portion of the table, which is often why partitioning
+ is implemented.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ Each of the above mentioned methods is described below.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-declarative">
+ <title>Declarative Partitioning</title>
+
+ <para>
PostgreSQL offers a way to specify how to divide a table into pieces
called partitions. The table that is divided is referred to as a
<firstterm>partitioned table</firstterm>. The specification consists
@@ -2790,25 +2957,29 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
All rows inserted into a partitioned table will be routed to one of the
<firstterm>partitions</firstterm> based on the value of the partition
- key. Each partition has a subset defined by its <firstterm>partition
- bounds</firstterm>. Currently supported partitioning methods include
- range and list, wherein each partition is assigned a range of keys or
- a list of keys, respectively.
+ key. Each partition has a subset of the data defined by its
+ <firstterm>partition bounds</firstterm>. Currently supported
+ partitioning methods include range and list, where each partition is
+ assigned a range of keys and a list of keys, respectively.
</para>
<para>
Partitions may have their own indexes, constraints and default values,
- distinct from other partitions. Partitions do not inherit indexes from
- the partitioned table.
+ distinct from other partitions. Partitions do not currently inherit
+ indexes from the partitioned table.
+ </para>
+
+ <para>
+ See <xref linkend="sql-createtable"> for more details creating partitioned
+ tables and partitions.
</para>
<para>
Partitions may themselves be defined as partitioned tables, referred to as
- <firstterm>sub-partitioning</firstterm>. See <xref linkend="sql-createtable">
- for more details creating partitioned tables and partitions. It is not
- currently possible to alter a regular table into a partitioned table or
- vice versa. However, it is possible to add a regular table containing
- data into a partition of a partitioned table, or remove a partition; see
+ <firstterm>sub-partitioning</firstterm>. It is not currently possible to
+ alter a regular table into a partitioned table or vice versa. However,
+ it is possible to add a regular or partitioned table containing data into
+ a partition of a partitioned table, or remove a partition; see
<xref linkend="sql-altertable"> to learn more about the
<command>ATTACH PARTITION</> and <command>DETACH PARTITION</> sub-commands.
</para>
@@ -2823,8 +2994,8 @@ VALUES ('Albany', NULL, NULL, 'NY');
partitioned table and partitions do not participate in inheritance with
regular tables. Since a partition hierarchy consisting of the
partitioned table and its partitions is still an inheritance hierarchy,
- all the normal rules of inheritance apply as described in the previous
- section (<xref linkend="ddl-inherit">) with some exceptions, most notably:
+ all the normal rules of inheritance apply as described in
+ <xref linkend="ddl-inherit"> with some exceptions, most notably:
<itemizedlist>
<listitem>
@@ -2840,13 +3011,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
The <literal>ONLY</literal> notation used to exclude child tables
would either cause error or will be ignored in some cases for
- partitioned tables. For example, specifying <literal>ONLY</literal>
- when querying data from a partitioned table would not make much sense,
- because all the data is contained in partitions, so this raises an
- error. Specifying <literal>ONLY</literal> when modifying schema is
- not desirable in certain cases with partitioned tables where it may be
- fine for regular inheritance parents (for example, dropping a column
- from only the parent); an error will be thrown in that case.
+ partitioned tables. Specifying <literal>ONLY</literal> when modifying
+ schema is not desirable in certain cases with partitioned tables
+ whereas it may be fine for regular inheritance parents (for example,
+ dropping a column from only the parent); an error will be thrown in
+ that case.
</para>
</listitem>
@@ -2855,9 +3024,9 @@ VALUES ('Albany', NULL, NULL, 'NY');
Partitions cannot have columns that are not present in the parent.
It is neither possible to specify columns when creating partitions
with <command>CREATE TABLE</> nor is it possible to add columns to
- partitions using <command>ALTER TABLE</>. Tables may be added with
- <command>ALTER TABLE ... ATTACH PARTITION</> if their columns exactly
- match the parent, including oids.
+ partitions using <command>ALTER TABLE</>. Tables may be added as a
+ partition with <command>ALTER TABLE ... ATTACH PARTITION</> only if
+ their columns exactly match the parent, including oids.
</para>
</listitem>
@@ -2871,199 +3040,353 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
- Partitions can also be foreign tables (see <xref linkend="ddl-foreign-data">),
+ Partitions can also be foreign tables (see <xref linkend="sql-createforeigntable">),
although certain limitations exist currently in their usage. For example,
- data inserted into the partitioned table cannot be routed to foreign table
- partitions.
+ data inserted into the partitioned table is currently not routed to foreign
+ table partitions.
</para>
+ <sect3 id="ddl-partitioning-declarative-example">
+ <title>Example</title>
+
<para>
- There are currently the following limitations of using partitioned tables:
- <itemizedlist>
+ Suppose we are constructing a database for a large ice cream company.
+ The company measures peak temperatures every day as well as ice cream
+ sales in each region. Conceptually, we want a table like:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+);
+</programlisting>
+
+ We know that most queries will access just the last week's, month's or
+ quarter's data, since the main use of this table will be to prepare
+ online reports for management. To reduce the amount of old data that
+ needs to be stored, we decide to only keep the most recent 3 years
+ worth of data. At the beginning of each month we will remove the oldest
+ month's data. In this situation we can use partitioning to help us meet
+ all of our different requirements for the measurements table.
+ </para>
+
+ <para>
+ To use declarative partitioning in this case, use the following steps:
+
+ <orderedlist spacing="compact">
<listitem>
<para>
- It is currently not possible to add same set of indexes on all partitions
- automatically. Indexes must be added to each partition with separate
- commands.
+ Create <structname>measurement</structname> table as a partitioned
+ table by specifying the <literal>PARTITION BY</literal> clause, which
+ includes the partitioning method (<literal>RANGE</literal> or
+ <literal>LIST</literal>) and the list of column(s) to use as the
+ partition key.
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (logdate);
+</programlisting>
</para>
- </listitem>
- <listitem>
+ <note>
+ <para>
+ To decide when to use multiple columns in the partition key for range
+ partitioning, consider whether queries accessing the partitioned
+ in question will include conditions that involve multiple columns,
+ especially the columns being considered to be the partition key.
+ If so, the optimizer can create a plan that will scan fewer partitions
+ if a query's conditions are such that there is equality constraint on
+ leading partition key columns, because they limit the number of
+ partitions of interest. The first partition key column with
+ inequality constraint also further eliminates some partitions of
+ those chosen by equality constraints on earlier columns.
+ </para>
+ </note>
+
<para>
- It is currently not possible to define indexes on partitioned tables
- that include all rows from all partitions in one global index.
- Consequently, it is not possible to create constraints that are realized
- using an index such as <literal>UNIQUE</>.
+ To be able to insert data into this table, one must create partitions,
+ as described below.
</para>
</listitem>
<listitem>
<para>
- Since primary keys are not supported on partitioned tables,
- foreign keys referencing partitioned tables are not supported, nor
- are foreign key references from a partitioned table to some other table.
+ Create partitions. Each partition's definition must specify the bounds
+ that correspond to the partitioning method and partition key of the
+ parent. Note that specifying bounds such that the new partition's
+ values will overlap with those in one or more existing partitions will
+ cause an error. Inserting data into into the parent table that does
+ not map to one of the existing partitions will cause an error;
+ appropriate partition must be added manually.
+ </para>
+
+ <para>
+ Partitions thus created are in every way normal <productname>PostgreSQL</>
+ tables (or, possibly, foreign tables). It is possible, for example, to
+ specify tablespace, storage parameters for each partition separately.
+ </para>
+
+ <para>
+ It is not necessary to create table constraints describing partition
+ boundary condition for partitions. Instead, partition constraints are
+ generated implicitly from the partition bound specification whenever
+ there is need to refer to them. Also, since any data inserted into the
+ parent table is automatically inserted into the appropriate partition,
+ it is not necessary to create triggers for the same.
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+
+CREATE TABLE measurement_y2006m03 PARTITION OF measurement
+ FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
+
+...
+CREATE TABLE measurement_y2007m11 PARTITION OF measurement
+ FOR VALUES FROM ('2007-11-01') TO ('2007-12-01')
+
+CREATE TABLE measurement_y2007m12 PARTITION OF measurement
+ FOR VALUES FROM ('2007-12-01') TO ('2008-01-01')
+ TABLESPACE fasttablespace;
+
+CREATE TABLE measurement_y2008m01 PARTITION OF measurement
+ FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
+ TABLESPACE fasttablespace
+ WITH (parallel_workers = 4);
+</programlisting>
</para>
+
+ <note>
+ <para>
+ To implement sub-partitioning, specify the
+ <literal>PARTITION BY</literal> clause in the commands used to create
+ individual partitions, for example:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+ PARTITION BY RANGE (peaktemp);
+</programlisting>
+
+ After creating partitions of <structname>measurement_y2006m02</>,
+ any data inserted into <structname>measurement</> that is mapped to
+ <structname>measurement_y2006m02</> will be further redirected to one
+ of its partitions based on the <structfield>peaktemp</> column.
+ Partition key specified may overlap with the parent's partition key,
+ although care must be taken when specifying the bounds of
+ sub-partitions such that the accepted set of data constitutes a
+ subset of what a partition's own bounds allows; the system does not
+ try to check if that's really the case.
+ </para>
+ </note>
</listitem>
<listitem>
<para>
- Row triggers, if necessary, must be defined on individual partitions, not
- the partitioned table as it is currently not supported.
+ Create an index on the key column(s),
+ as well as any other indexes you might want for every partition.
+ Note that it is currently not supported to propagate index definition
+ from the master partitioned table to its partitions; in fact, it is
+ not possible to define indexes on partitioned tables in the first
+ place. This might change in future releases.
+
+<programlisting>
+CREATE INDEX ON measurement_y2006m02 (logdate);
+CREATE INDEX ON measurement_y2006m03 (logdate);
+...
+CREATE INDEX ON measurement_y2007m11 (logdate);
+CREATE INDEX ON measurement_y2007m12 (logdate);
+CREATE INDEX ON measurement_y2008m01 (logdate);
+</programlisting>
</para>
</listitem>
- </itemizedlist>
+
+ <listitem>
+ <para>
+ Ensure that the <xref linkend="guc-constraint-exclusion">
+ configuration parameter is not disabled in <filename>postgresql.conf</>.
+ If it is, queries will not be optimized as desired.
+ </para>
+ </listitem>
+ </orderedlist>
</para>
<para>
- A detailed example that shows how to use partitioned tables is discussed in
- the next chapter.
+ In the above example we would be creating a new partition each month, so
+ it might be wise to write a script that generates the required DDL
+ automatically.
</para>
-
- </sect1>
+ </sect3>
- <sect1 id="ddl-partitioning">
- <title>Partitioning</title>
+ <sect3 id="ddl-partitioning-declarative-maintenance">
+ <title>Partition Maintenance</title>
- <indexterm>
- <primary>partitioning</primary>
- </indexterm>
+ <para>
+ Normally the set of partitions established when initially defining the
+ the table are not intended to remain static. It is common to want to
+ remove old partitions of data and periodically add new partitions for
+ new data. One of the most important advantages of partitioning is
+ precisely that it allows this otherwise painful task to be executed
+ nearly instantaneously by manipulating the partition structure, rather
+ than physically moving large amounts of data around.
+ </para>
- <indexterm>
- <primary>table</primary>
- <secondary>partitioning</secondary>
- </indexterm>
+ <para>
+ The simplest option for removing old data is simply to drop the partition
+ that is no longer necessary:
+<programlisting>
+DROP TABLE measurement_y2006m02;
+</programlisting>
+ This can very quickly delete millions of records because it doesn't have
+ to individually delete every record. Note however that the above command
+ requires taking an <literal>ACCESS EXCLUSIVE</literal> lock on the parent
+ table.
+ </para>
<para>
- <productname>PostgreSQL</productname> supports basic table
- partitioning. This section describes why and how to implement
- partitioning as part of your database design.
- </para>
+ Another option that is often preferable is to remove the partition from
+ the partitioned table but retain access to it as a table in its own
+ right:
- <sect2 id="ddl-partitioning-overview">
- <title>Overview</title>
+<programlisting>
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+</programlisting>
+
+ This allows further operations to be performed on the data before
+ it is dropped. For example, this is often a useful time to back up
+ the data using <command>COPY</>, <application>pg_dump</>, or
+ similar tools. It might also be a useful time to aggregate data
+ into smaller formats, perform other data manipulations, or run
+ reports.
+ </para>
<para>
- Partitioning refers to splitting what is logically one large table
- into smaller physical pieces.
- Partitioning can provide several benefits:
- <itemizedlist>
- <listitem>
- <para>
- Query performance can be improved dramatically in certain situations,
- particularly when most of the heavily accessed rows of the table are in a
- single partition or a small number of partitions. The partitioning
- substitutes for leading columns of indexes, reducing index size and
- making it more likely that the heavily-used parts of the indexes
- fit in memory.
- </para>
- </listitem>
+ Similarly we can add a new partition to handle new data. We can create an
+ empty partition in the partitioned table just as the original partitions
+ were created above:
- <listitem>
- <para>
- When queries or updates access a large percentage of a single
- partition, performance can be improved by taking advantage
- of sequential scan of that partition instead of using an
- index and random access reads scattered across the whole table.
- </para>
- </listitem>
+<programlisting>
+CREATE TABLE measurement_y2008m02 PARTITION OF measurement
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01')
+ TABLESPACE fasttablespace;
+</programlisting>
- <listitem>
- <para>
- Bulk loads and deletes can be accomplished by adding or removing
- partitions, if that requirement is planned into the partitioning design.
- <command>ALTER TABLE NO INHERIT</> or <command>ALTER TABLE DETACH PARTITION</>
- and <command>DROP TABLE</> are both far faster than a bulk operation.
- These commands also entirely avoid the <command>VACUUM</command>
- overhead caused by a bulk <command>DELETE</>.
- </para>
- </listitem>
+ As an alternative, it is sometimes more convenient to create the
+ new table outside the partition structure, and make it a proper
+ partition later. This allows the data to be loaded, checked, and
+ transformed prior to it appearing in the partitioned table:
- <listitem>
- <para>
- Seldom-used data can be migrated to cheaper and slower storage media.
- </para>
- </listitem>
- </itemizedlist>
+<programlisting>
+CREATE TABLE measurement_y2008m02
+ (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
+ TABLESPACE fasttablespace;
- The benefits will normally be worthwhile only when a table would
- otherwise be very large. The exact point at which a table will
- benefit from partitioning depends on the application, although a
- rule of thumb is that the size of the table should exceed the physical
- memory of the database server.
- </para>
+ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
+ CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );
- <para>
- Currently, <productname>PostgreSQL</productname> supports partitioning
- using two methods:
+\copy measurement_y2008m02 from 'measurement_y2008m02'
+-- possibly some other data preparation work
- <variablelist>
- <varlistentry>
- <term>Using Table Inheritance</term>
+ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+</programlisting>
+ </para>
- <listitem>
- <para>
- Each partition must be created as a child table of a single parent
- table. The parent table itself is normally empty; it exists just to
- represent the entire data set. You should be familiar with
- inheritance (see <xref linkend="ddl-inherit">) before attempting to
- set up partitioning with it. This was the only method to implement
- partitioning in older versions.
- </para>
- </listitem>
- </varlistentry>
+ <tip>
+ <para>
+ Before running the <command>ATTACH PARTITION</> command, it is
+ recommended to create a <literal>CHECK</> constraint on the table to
+ be attached describing the desired partition constraint. Using the
+ same, system is able to skip the scan to validate the implicit
+ partition constraint. Without such a constraint, the table will be
+ scanned to validate the partition constraint, while holding an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
+ One may want to drop the constraint after <command>ATTACH PARTITION</>
+ is finished, because it is no longer necessary.
+ </para>
+ </tip>
+ </sect3>
- <varlistentry>
- <term>Using Partitioned Tables</term>
+ <sect3 id="ddl-partitioning-declarative-limitations">
+ <title>Limitations</title>
- <listitem>
- <para>
- See last section for some general information:
- <xref linkend="ddl-partitioned-tables">
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
+ <para>
+ There are currently the following limitations of using partitioned tables:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It is currently not possible to add same set of indexes on all partitions
+ automatically. Indexes must be added to each partition with separate
+ commands.
+ </para>
+ </listitem>
- <para>
- The following forms of partitioning can be implemented in
- <productname>PostgreSQL</productname> using either of the above mentioned
- methods, although the latter provides dedicated syntax for each:
+ <listitem>
+ <para>
+ It is currently not possible to define indexes on partitioned tables
+ that include all rows from all partitions in one global index.
+ Consequently, it is not possible to create constraints that are realized
+ using an index such as <literal>UNIQUE</>.
+ </para>
+ </listitem>
- <variablelist>
- <varlistentry>
- <term>Range Partitioning</term>
+ <listitem>
+ <para>
+ Since primary keys are not supported on partitioned tables,
+ foreign keys referencing partitioned tables are not supported, nor
+ are foreign key references from a partitioned table to some other table.
+ </para>
+ </listitem>
- <listitem>
- <para>
- The table is partitioned into <quote>ranges</quote> defined
- by a key column or set of columns, with no overlap between
- the ranges of values assigned to different partitions. For
- example one might partition by date ranges, or by ranges of
- identifiers for particular business objects.
- </para>
- </listitem>
- </varlistentry>
+ <listitem>
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clause are currently not allowed on partitioned tables.
+ </para>
+ </listitem>
- <varlistentry>
- <term>List Partitioning</term>
+ <listitem>
+ <para>
+ An <command>UPDATE</> that causes a row to move from one partition to
+ another fails, because the new value of the row fails to satisfy the
+ implicit partition constraint of the original partition. This might
+ change in future releases.
+ </para>
+ </listitem>
- <listitem>
- <para>
- The table is partitioned by explicitly listing which key values
- appear in each partition.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
+ <listitem>
+ <para>
+ Row triggers, if necessary, must be defined on individual partitions, not
+ the partitioned table as it is currently not supported.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
</sect2>
- <sect2 id="ddl-partitioning-implementation">
- <title>Implementing Partitioning</title>
+ <sect2 id="ddl-partitioning-implementation-inheritance">
+ <title>Implementation Using Inheritance</title>
+ <para>
+ In some cases, one may want to add columns to partitions that are not
+ present in the parent table which is not possible to do with the above
+ method. For such cases, partitioning can be implemented using
+ inheritance (see <xref linkend="ddl-inherit">).
+ </para>
+
+ <sect3 id="ddl-partitioning-inheritance-example">
+ <title>Example</title>
<para>
- To set up a partitioned table using inheritance, do the following:
+ We use the same <structname>measurement</structname> table we used
+ above. To implement it as a partitioned table using inheritance, do the
+ following:
<orderedlist spacing="compact">
<listitem>
<para>
@@ -3076,6 +3399,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
be applied equally to all partitions. There is no point
in defining any indexes or unique constraints on it, either.
</para>
+
+ <para>
+ In case of our example, master table is the original
+ <structname>measurement</structname> as originally defined.
+ </para>
</listitem>
<listitem>
@@ -3090,12 +3418,27 @@ VALUES ('Albany', NULL, NULL, 'NY');
are in every way normal <productname>PostgreSQL</> tables
(or, possibly, foreign tables).
</para>
+
+ <para>
+ This solves one of our problems: deleting old data. Each
+ month, all we will need to do is perform a <command>DROP
+ TABLE</command> on the oldest child table and create a new
+ child table for the new month's data.
+<programlisting>
+CREATE TABLE measurement_y2006m02 () INHERITS (measurement);
+CREATE TABLE measurement_y2006m03 () INHERITS (measurement);
+...
+CREATE TABLE measurement_y2007m11 () INHERITS (measurement);
+CREATE TABLE measurement_y2007m12 () INHERITS (measurement);
+CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
+</programlisting>
+ </para>
</listitem>
<listitem>
<para>
- Add table constraints to the partition tables to define the
- allowed key values in each partition.
+ Add non-overlapping table constraints to the partition tables to
+ define the allowed key values in each partition.
</para>
<para>
@@ -3117,230 +3460,53 @@ CHECK ( outletID BETWEEN 200 AND 300 )
</para>
<para>
- Note that there is no difference in
- syntax between range and list partitioning; those terms are
- descriptive only.
- </para>
- </listitem>
-
- <listitem>
- <para>
- For each partition, create an index on the key column(s),
- as well as any other indexes you might want. (The key index is
- not strictly necessary, but in most scenarios it is helpful.
- If you intend the key values to be unique then you should
- always create a unique or primary-key constraint for each
- partition.)
- </para>
- </listitem>
-
- <listitem>
- <para>
- Optionally, define a trigger or rule to redirect data inserted into
- the master table to the appropriate partition.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Ensure that the <xref linkend="guc-constraint-exclusion">
- configuration parameter is not disabled in
- <filename>postgresql.conf</>.
- If it is, queries will not be optimized as desired.
- </para>
- </listitem>
-
- </orderedlist>
- </para>
-
- <para>
- To use partitioned tables, do the following:
- <orderedlist spacing="compact">
- <listitem>
- <para>
- Create <quote>master</quote> table as a partitioned table by
- specifying the <literal>PARTITION BY</literal> clause, which includes
- the partitioning method (<literal>RANGE</literal> or
- <literal>LIST</literal>) and the list of column(s) to use as the
- partition key. To be able to insert data into the table, one must
- create partitions, as described below.
- </para>
-
- <note>
- <para>
- To decide when to use multiple columns in the partition key for range
- partitioning, consider whether queries accessing the partitioned
- in question will include conditions that involve multiple columns,
- especially the columns being considered to be the partition key.
- If so, the optimizer can create a plan that will scan fewer partitions
- if a query's conditions are such that there is equality constraint on
- leading partition key columns, because they limit the number of
- partitions of interest. The first partition key column with
- inequality constraint also further eliminates some partitions of
- those chosen by equality constraints on earlier columns.
- </para>
- </note>
- </listitem>
-
- <listitem>
- <para>
- Create partitions of the master partitioned table, with the partition
- bounds specified for each partition matching the partitioning method
- and partition key of the master table. Note that specifying partition
- bounds such that the new partition's values will overlap with one or
- more existing partitions will cause an error. It is only after
- creating partitions that one is able to insert data into the master
- partitioned table, provided it maps to one of the existing partitions.
- If a data row does not map to any of the existing partitions, it will
- cause an error.
- </para>
-
- <para>
- Partitions thus created are also in every way normal
- <productname>PostgreSQL</> tables (or, possibly, foreign tables),
- whereas partitioned tables differ in a number of ways.
- </para>
-
- <para>
- It is not necessary to create table constraints for partitions.
- Instead, partition constraints are generated implicitly whenever
- there is a need to refer to them. Also, since any data inserted into
- the master partitioned table is automatically inserted into the
- appropriate partition, it is not necessary to create triggers for the
- same.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Just like with inheritance, create an index on the key column(s),
- as well as any other indexes you might want for every partition.
- Note that it is currently not supported to propagate index definition
- from the master partitioned table to its partitions; in fact, it is
- not possible to define indexes on partitioned tables in the first
- place. This might change in future releases.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Currently, partitioned tables also depend on constraint exclusion
- for query optimization, so ensure that the
- <xref linkend="guc-constraint-exclusion"> configuration parameter is
- not disabled in <filename>postgresql.conf</>. This might change in
- future releases.
- </para>
- </listitem>
-
- </orderedlist>
- </para>
-
- <para>
- For example, suppose we are constructing a database for a large
- ice cream company. The company measures peak temperatures every
- day as well as ice cream sales in each region. Conceptually,
- we want a table like:
-
-<programlisting>
-CREATE TABLE measurement (
- city_id int not null,
- logdate date not null,
- peaktemp int,
- unitsales int
-);
-</programlisting>
-
- We know that most queries will access just the last week's, month's or
- quarter's data, since the main use of this table will be to prepare
- online reports for management.
- To reduce the amount of old data that needs to be stored, we
- decide to only keep the most recent 3 years worth of data. At the
- beginning of each month we will remove the oldest month's data.
- </para>
-
- <para>
- In this situation we can use partitioning to help us meet all of our
- different requirements for the measurements table. Following the
- steps outlined above for both methods, partitioning can be set up as
- follows:
- </para>
-
- <para>
- <orderedlist spacing="compact">
- <listitem>
- <para>
- The master table is the <structname>measurement</> table, declared
- exactly as above.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Next we create one partition for each active month:
-
-<programlisting>
-CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
-...
-CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
-</programlisting>
-
- Each of the partitions are complete tables in their own right,
- but they inherit their definitions from the
- <structname>measurement</> table.
- </para>
-
- <para>
- This solves one of our problems: deleting old data. Each
- month, all we will need to do is perform a <command>DROP
- TABLE</command> on the oldest child table and create a new
- child table for the new month's data.
- </para>
- </listitem>
-
- <listitem>
- <para>
- We must provide non-overlapping table constraints. Rather than
- just creating the partition tables as above, the table creation
- script should really be:
+ It would be better to instead create partitions as follows:
<programlisting>
CREATE TABLE measurement_y2006m02 (
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2006m03 (
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
) INHERITS (measurement);
+
...
CREATE TABLE measurement_y2007m11 (
CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2007m12 (
CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2008m01 (
CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
) INHERITS (measurement);
</programlisting>
</para>
+
+ <para>
+ Note that there is no difference in syntax between range and list
+ partitioning; those terms are descriptive only.
+ </para>
</listitem>
<listitem>
<para>
- We probably need indexes on the key columns too:
-
+ For each partition, create an index on the key column(s),
+ as well as any other indexes you might want. (The key index is
+ not strictly necessary, but in most scenarios it is helpful.
+ If you intend the key values to be unique then you should
+ always create a unique or primary-key constraint for each
+ partition.)
<programlisting>
CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
-...
CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
</programlisting>
-
- We choose not to add further indexes at this time.
</para>
</listitem>
@@ -3363,7 +3529,9 @@ END;
$$
LANGUAGE plpgsql;
</programlisting>
+ </para>
+ <para>
After creating the function, we create a trigger which
calls the trigger function:
@@ -3425,151 +3593,88 @@ LANGUAGE plpgsql;
of this example.
</para>
</note>
- </listitem>
- </orderedlist>
- </para>
-
- <para>
- Steps when using a partitioned table are as follows:
- </para>
- <para>
- <orderedlist spacing="compact">
- <listitem>
<para>
- Create the <structname>measurement</> table as a partitioned table:
+ A different approach to redirecting inserts into the appropriate
+ partition table is to set up rules, instead of a trigger, on the
+ master table. For example:
<programlisting>
-CREATE TABLE measurement (
- city_id int not null,
- logdate date not null,
- peaktemp int,
- unitsales int
-) PARTITION BY RANGE (logdate);
+CREATE RULE measurement_insert_y2006m02 AS
+ON INSERT TO measurement WHERE
+ ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2006m02 VALUES (NEW.*);
+...
+CREATE RULE measurement_insert_y2008m01 AS
+ON INSERT TO measurement WHERE
+ ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
</programlisting>
+
+ A rule has significantly more overhead than a trigger, but the overhead
+ is paid once per query rather than once per row, so this method might
+ be advantageous for bulk-insert situations. In most cases, however,
+ the trigger method will offer better performance.
</para>
- </listitem>
- <listitem>
<para>
- Then create partitions as follows:
+ Be aware that <command>COPY</> ignores rules. If you want to
+ use <command>COPY</> to insert data, you'll need to copy into the
+ correct partition table rather than into the master. <command>COPY</>
+ does fire triggers, so you can use it normally if you use the trigger
+ approach.
+ </para>
-<programlisting>
-CREATE TABLE measurement_y2006m02 PARTITION OF measurement
- FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
-CREATE TABLE measurement_y2006m03 PARTITION OF measurement
- FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
-...
-CREATE TABLE measurement_y2007m11 PARTITION OF measurement
- FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
-CREATE TABLE measurement_y2007m12 PARTITION OF measurement
- FOR VALUES FROM ('2007-12-01') TO ('2008-01-01');
-CREATE TABLE measurement_y2008m01 PARTITION OF measurement
- FOR VALUES FROM ('2008-01-01') TO ('2008-02-01');
-</programlisting>
+ <para>
+ Another disadvantage of the rule approach is that there is no simple
+ way to force an error if the set of rules doesn't cover the insertion
+ date; the data will silently go into the master table instead.
</para>
</listitem>
<listitem>
<para>
- Create indexes on the key columns just like in case of inheritance
- partitions.
+ Ensure that the <xref linkend="guc-constraint-exclusion">
+ configuration parameter is not disabled in
+ <filename>postgresql.conf</>.
+ If it is, queries will not be optimized as desired.
</para>
</listitem>
</orderedlist>
-
- <note>
- <para>
- To implement sub-partitioning, specify the
- <literal>PARTITION BY</literal> clause in the commands used to create
- individual partitions, for example:
-
-<programlisting>
-CREATE TABLE measurement_y2006m02 PARTITION OF measurement
- FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
- PARTITION BY RANGE (peaktemp);
-</programlisting>
-
- After creating partitions of <structname>measurement_y2006m02</>, any
- data inserted into <structname>measurement</> that is mapped to
- <structname>measurement_y2006m02</> will be further redirected to one
- of its partitions based on the <structfield>peaktemp</> column.
- Partition key specified may overlap with the parent's partition key,
- although care must be taken when specifying the bounds of sub-partitions
- such that the accepted set of data constitutes a subset of what a
- partition's own bounds allows; the system does not try to check if
- that's really the case.
- </para>
- </note>
</para>
<para>
- As we can see, a complex partitioning scheme could require a
- substantial amount of DDL, although significantly less when using
- partitioned tables. In the above example we would be creating a new
- partition each month, so it might be wise to write a script that
- generates the required DDL automatically.
- </para>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-managing-partitions">
- <title>Managing Partitions</title>
-
- <para>
- Normally the set of partitions established when initially
- defining the table are not intended to remain static. It is
- common to want to remove old partitions of data and periodically
- add new partitions for new data. One of the most important
- advantages of partitioning is precisely that it allows this
- otherwise painful task to be executed nearly instantaneously by
- manipulating the partition structure, rather than physically moving large
- amounts of data around.
- </para>
-
- <para>
- Both the inheritance-based and partitioned table methods allow this to
- be done, although the latter requires taking an <literal>ACCESS EXCLUSIVE</literal>
- lock on the master table for various commands mentioned below.
- </para>
+ As we can see, a complex partitioning scheme could require a
+ substantial amount of DDL. In the above example we would be creating
+ a new partition each month, so it might be wise to write a script that
+ generates the required DDL automatically.
+ </para>
+ </sect3>
- <para>
- The simplest option for removing old data is simply to drop the partition
- that is no longer necessary, which works using both methods of
- partitioning:
+ <sect3 id="ddl-partitioning-inheritance-maintenance">
+ <title>Partition Maintenance</title>
+ <para>
+ To remove old data quickly, simply to drop the partition that is no
+ longer necessary:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
- This can very quickly delete millions of records because it doesn't have
- to individually delete every record.
- </para>
+ </para>
<para>
- Another option that is often preferable is to remove the partition from
- the partitioned table but retain access to it as a table in its own
- right:
-<programlisting>
-ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
-</programlisting>
-
- When using a partitioned table:
+ To remove the partition from the partitioned table but retain access to
+ it as a table in its own right:
<programlisting>
-ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</programlisting>
-
- This allows further operations to be performed on the data before
- it is dropped. For example, this is often a useful time to back up
- the data using <command>COPY</>, <application>pg_dump</>, or
- similar tools. It might also be a useful time to aggregate data
- into smaller formats, perform other data manipulations, or run
- reports.
</para>
<para>
- Similarly we can add a new partition to handle new data. We can create an
- empty partition in the partitioned table just as the original partitions
- were created above:
+ To add a new partition to handle new data, create an empty partition just
+ as the original partitions were created above:
<programlisting>
CREATE TABLE measurement_y2008m02 (
@@ -3577,52 +3682,80 @@ CREATE TABLE measurement_y2008m02 (
) INHERITS (measurement);
</programlisting>
- When using a partitioned table:
-
-<programlisting>
-CREATE TABLE measurement_y2008m02 PARTITION OF measurement
- FOR VALUES FROM ('2008-02-01') TO ('2008-03-01');
-</programlisting>
-
- As an alternative, it is sometimes more convenient to create the
- new table outside the partition structure, and make it a proper
- partition later. This allows the data to be loaded, checked, and
- transformed prior to it appearing in the partitioned table:
+ Alternatively, one may created the new table outside the partition
+ structure, and make it a partition after data is loaded, checked,
+ and transformed.
<programlisting>
CREATE TABLE measurement_y2008m02
(LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
+
ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );
+
\copy measurement_y2008m02 from 'measurement_y2008m02'
-- possibly some other data preparation work
+
ALTER TABLE measurement_y2008m02 INHERIT measurement;
</programlisting>
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-inheritance-caveats">
+ <title>Caveats</title>
+
+ <para>
+ The following caveats apply to partitioned tables implemented using
+ inheritance:
+ <itemizedlist>
+ <listitem>
+ <para>
+ There is no automatic way to verify that all of the
+ <literal>CHECK</literal> constraints are mutually
+ exclusive. It is safer to create code that generates
+ partitions and creates and/or modifies associated objects than
+ to write each by hand.
+ </para>
+ </listitem>
- The last of the above commands when using a partitioned table would be:
+ <listitem>
+ <para>
+ The schemes shown here assume that the partition key column(s)
+ of a row never change, or at least do not change enough to require
+ it to move to another partition. An <command>UPDATE</> that attempts
+ to do that will fail because of the <literal>CHECK</> constraints.
+ If you need to handle such cases, you can put suitable update triggers
+ on the partition tables, but it makes management of the structure
+ much more complicated.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If you are using manual <command>VACUUM</command> or
+ <command>ANALYZE</command> commands, don't forget that
+ you need to run them on each partition individually. A command like:
<programlisting>
-ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
- FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+ANALYZE measurement;
</programlisting>
- </para>
+ will only process the master table.
+ </para>
+ </listitem>
- <tip>
+ <listitem>
<para>
- Before running the <command>ATTACH PARTITION</> command, it is
- recommended to create a <literal>CHECK</> constraint on the table to
- be attached describing the desired partition constraint. Using the
- same, system is able to skip the scan to validate the implicit
- partition constraint. Without such a constraint, the table will be
- scanned to validate the partition constraint, while holding an
- <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
- One may want to drop the constraint after <command>ATTACH PARTITION</>
- is finished, because it is no longer necessary.
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clauses are unlikely to work as expected, as the <literal>ON CONFLICT</>
+ action is only taken in case of unique violations on the specified
+ target relation, not its child relations.
</para>
- </tip>
- </sect2>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
+ </sect2>
- <sect2 id="ddl-partitioning-constraint-exclusion">
+ <sect2 id="ddl-partitioning-constraint-exclusion">
<title>Partitioning and Constraint Exclusion</title>
<indexterm>
@@ -3632,7 +3765,8 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
<para>
<firstterm>Constraint exclusion</> is a query optimization technique
that improves performance for partitioned tables defined in the
- fashion described above. As an example:
+ fashion described above (both declarative partitioned tables and those
+ implemented using inheritance). As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3715,153 +3849,6 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
are unlikely to benefit.
</para>
- <note>
- <para>
- Currently, constraint exclusion is also used for partitioned tables.
- However, we did not create any <literal>CHECK</literal> constraints
- for individual partitions as seen above. In this case, the optimizer
- uses internally generated constraint for every partition.
- </para>
- </note>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-alternatives">
- <title>Alternative Partitioning Methods</title>
-
- <para>
- A different approach to redirecting inserts into the appropriate
- partition table is to set up rules, instead of a trigger, on the
- master table (unless it is a partitioned table). For example:
-
-<programlisting>
-CREATE RULE measurement_insert_y2006m02 AS
-ON INSERT TO measurement WHERE
- ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
-DO INSTEAD
- INSERT INTO measurement_y2006m02 VALUES (NEW.*);
-...
-CREATE RULE measurement_insert_y2008m01 AS
-ON INSERT TO measurement WHERE
- ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
-DO INSTEAD
- INSERT INTO measurement_y2008m01 VALUES (NEW.*);
-</programlisting>
-
- A rule has significantly more overhead than a trigger, but the overhead
- is paid once per query rather than once per row, so this method might be
- advantageous for bulk-insert situations. In most cases, however, the
- trigger method will offer better performance.
- </para>
-
- <para>
- Be aware that <command>COPY</> ignores rules. If you want to
- use <command>COPY</> to insert data, you'll need to copy into the correct
- partition table rather than into the master. <command>COPY</> does fire
- triggers, so you can use it normally if you use the trigger approach.
- </para>
-
- <para>
- Another disadvantage of the rule approach is that there is no simple
- way to force an error if the set of rules doesn't cover the insertion
- date; the data will silently go into the master table instead.
- </para>
-
- <para>
- Partitioning can also be arranged using a <literal>UNION ALL</literal>
- view, instead of table inheritance. For example,
-
-<programlisting>
-CREATE VIEW measurement AS
- SELECT * FROM measurement_y2006m02
-UNION ALL SELECT * FROM measurement_y2006m03
-...
-UNION ALL SELECT * FROM measurement_y2007m11
-UNION ALL SELECT * FROM measurement_y2007m12
-UNION ALL SELECT * FROM measurement_y2008m01;
-</programlisting>
-
- However, the need to recreate the view adds an extra step to adding and
- dropping individual partitions of the data set. In practice this
- method has little to recommend it compared to using inheritance.
- </para>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-caveats">
- <title>Caveats</title>
-
- <para>
- The following caveats apply to using inheritance to implement partitioning:
- <itemizedlist>
- <listitem>
- <para>
- There is no automatic way to verify that all of the
- <literal>CHECK</literal> constraints are mutually
- exclusive. It is safer to create code that generates
- partitions and creates and/or modifies associated objects than
- to write each by hand.
- </para>
- </listitem>
-
- <listitem>
- <para>
- The schemes shown here assume that the partition key column(s)
- of a row never change, or at least do not change enough to require
- it to move to another partition. An <command>UPDATE</> that attempts
- to do that will fail because of the <literal>CHECK</> constraints.
- If you need to handle such cases, you can put suitable update triggers
- on the partition tables, but it makes management of the structure
- much more complicated.
- </para>
- </listitem>
-
- <listitem>
- <para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each partition individually. A command like:
-<programlisting>
-ANALYZE measurement;
-</programlisting>
- will only process the master table.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <command>INSERT</command> statements with <literal>ON CONFLICT</>
- clauses are unlikely to work as expected, as the <literal>ON CONFLICT</>
- action is only taken in case of unique violations on the specified
- target relation, not its child relations.
- </para>
- </listitem>
- </itemizedlist>
- </para>
-
- <para>
- The following caveats apply to partitioned tables created with the
- explicit syntax:
- <itemizedlist>
- <listitem>
- <para>
- An <command>UPDATE</> that causes a row to move from one partition to
- another fails, because the new value of the row fails to satisfy the
- implicit partition constraint of the original partition. This might
- change in future releases.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <command>INSERT</command> statements with <literal>ON CONFLICT</>
- clause are currently not allowed on partitioned tables.
- </para>
- </listitem>
-
- </itemizedlist>
- </para>
-
<para>
The following caveats apply to constraint exclusion, which is currently
used by both inheritance and partitioned tables:
@@ -3901,10 +3888,78 @@ ANALYZE measurement;
don't try to use many thousands of partitions.
</para>
</listitem>
-
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partitioning-alternatives">
+ <title>Alternative Partitioning Methods</title>
+
+ <sect3 id="ddl-partitioning-alternatives-union-all">
+ <title>Using UNION ALL view</title>
+ <para>
+ Partitioning can also be arranged using a <literal>UNION ALL</literal>
+ view, instead of table inheritance. For example,
+
+<programlisting>
+CREATE VIEW measurement AS
+ SELECT * FROM measurement_y2006m02
+UNION ALL SELECT * FROM measurement_y2006m03
+...
+UNION ALL SELECT * FROM measurement_y2007m11
+UNION ALL SELECT * FROM measurement_y2007m12
+UNION ALL SELECT * FROM measurement_y2008m01;
+</programlisting>
+
+ However, the need to recreate the view adds an extra step to adding and
+ dropping individual partitions of the data set. In practice this
+ method has little to recommend it compared to using inheritance.
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-alternatives-brin-index">
+ <title>Accessing Tables Using BRIN Index</title>
+ <para>
+ <acronym>BRIN</acronym>, which stands for Block Range Index, is
+ designed for handling very large tables in which certain columns
+ have some natural physical location within the table. For example,
+ in the <structname>measurement</structname> table, the entries for
+ earlier times (<structfield>logdate</structfield> column) will appear
+ earlier in the table most of the time. A table storing a ZIP code
+ column might have all codes for a city grouped together naturally.
+ </para>
+
+ <para>
+ In case of <structname>measurement</structname> table, one may consider
+ adding a minmax <acronym>BRIN</acronym> index on the
+ <structfield>logdate</structfield> column.
+
+<programlisting>
+CREATE INDEX ON measurement USING brin (logdate date_minmax_ops);
+</programlisting>
+
+ In this case, specifying <literal>date_minmax_ops</literal> is not
+ necessary; it is shown for clarity.
+ </para>
+
+ <para>
+ <acronym>BRIN</acronym> indexes leverage this locality of data and
+ store summary information for a range of consecutive pages and keep
+ it updated as the data is added or removed. Because a
+ <acronym>BRIN</acronym> index is very small, scanning the index adds
+ adds little overhead compared to a sequential scan, but may avoid
+ scanning large parts of the table that are known not to contain
+ matching tuples. That is often why table partitioning is used. Thus,
+ <acronym>BRIN</acronym> indexes provide a subset of benefits that
+ parttioning provides with much less upfront setup.
+ </para>
+
+ <para>
+ See <xref linkend="brin"> for more details.
+ </para>
+ </sect3>
+
+ </sect2>
</sect1>
<sect1 id="ddl-foreign-data">
--
2.11.0
0002-Add-a-note-about-DROP-NOT-NULL-and-partitions.patchtext/x-diff; name=0002-Add-a-note-about-DROP-NOT-NULL-and-partitions.patchDownload
From 591fd2eb7a9089d68926bff1532879240be0aec6 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 10 Feb 2017 15:03:45 +0900
Subject: [PATCH 2/3] Add a note about DROP NOT NULL and partitions
On the ALTER TABLE refpage, it seems better to mention how to drop
drop the not null constraint of a partition's column. Per suggestion
from Corey Huinker.
---
doc/src/sgml/ref/alter_table.sgml | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index da431f8369..be857882bb 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -172,9 +172,13 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
- If this table is a partition, one cannot perform <literal>DROP NOT NULL</>
+ If this table is a partition, one cannot perform <literal>DROP NOT NULL</literal>
on a column if it is marked <literal>NOT NULL</literal> in the parent
- table.
+ table. To drop the <literal>NOT NULL</literal> constraint from all the
+ partitions, perform <literal>DROP NOT NULL</literal> on the parent
+ table. One might however want to set it for only some partitions,
+ which is possible by doing <literal>SET NOT NULL</literal> on individual
+ partitions.
</para>
</listitem>
</varlistentry>
--
2.11.0
I think you might have the titles for 0002 and 0003 backwards.
On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
0002: some cosmetic fixes to create_table.sgml
I think this sentence may be unclear to some readers:
+ One might however want to set it for only some partitions,
+ which is possible by doing <literal>SET NOT NULL</literal> on individual
+ partitions.
I think you could replace this with something like: Even if there is
no <literal>NOT NULL</> constraint on the parent, such a constraint
can still be added to individual partitions, if desired; that is, the
children can disallow nulls even if the parent allows them, but not
the other way around.
0003: add clarification about NOT NULL constraint on partition columns in
alter_table.sgml
This is about list-ifying a note, but I think we should try to
de-note-ify it. It's a giant block of text that is not obviously more
noteworthy than the surrounding text; I think <note> should be saved
for things that particularly need to be called out.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
This is about list-ifying a note, but I think we should try to
de-note-ify it. It's a giant block of text that is not obviously more
noteworthy than the surrounding text; I think <note> should be saved
for things that particularly need to be called out.
Yeah. A big problem with the markup we use, imo, is that <note> segments
are displayed in a way that makes them more prominent than the surrounding
text, not less so. That doesn't really square with my intuitive view of
what a <note> ought to be used for; it forces it to be considered as
something only slightly less dangerous than a <caution> or <warning>, not
as a parenthetical remark. But that's what we have to deal with so
we should mark up our text accordingly.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/03/10 3:26, Robert Haas wrote:
I think you might have the titles for 0002 and 0003 backwards.
Oops, you're right.
On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote wrote:
0002: some cosmetic fixes to create_table.sgml
I think this sentence may be unclear to some readers:
+ One might however want to set it for only some partitions, + which is possible by doing <literal>SET NOT NULL</literal> on individual + partitions.I think you could replace this with something like: Even if there is
no <literal>NOT NULL</> constraint on the parent, such a constraint
can still be added to individual partitions, if desired; that is, the
children can disallow nulls even if the parent allows them, but not
the other way around.
Reads much better, done that way. Thanks.
0003: add clarification about NOT NULL constraint on partition columns in
alter_table.sgmlThis is about list-ifying a note, but I think we should try to
de-note-ify it. It's a giant block of text that is not obviously more
noteworthy than the surrounding text; I think <note> should be saved
for things that particularly need to be called out.
OK. The patch is now just about de-note-ifying the block of text. Since
I don't see any other lists in the Parameters portion of the page, I also
take back my list-ifying proposal.
Attached updated patches.
Thanks,
Amit
Attachments:
0001-Rewrite-sections-in-ddl.sgml-related-to-partitioning.patchtext/x-diff; name=0001-Rewrite-sections-in-ddl.sgml-related-to-partitioning.patchDownload
From a159c9aa3ee7f2c51084f94243be16a30242d7a6 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 3 Mar 2017 16:39:24 +0900
Subject: [PATCH 1/3] Rewrite sections in ddl.sgml related to partitioning
Merge sections Partitioned Tables and Partitioning into one section
called Table Partitioning and Related Solutions.
---
doc/src/sgml/ddl.sgml | 1359 +++++++++++++++++++++++++------------------------
1 file changed, 707 insertions(+), 652 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 09b5b3ff70..a2dd39df54 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2772,14 +2772,181 @@ VALUES ('Albany', NULL, NULL, 'NY');
</sect2>
</sect1>
- <sect1 id="ddl-partitioned-tables">
- <title>Partitioned Tables</title>
+ <sect1 id="ddl-partitioning">
+ <title>Table Partitioning and Related Solutions</title>
+
+ <indexterm>
+ <primary>partitioning</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>partitioning</secondary>
+ </indexterm>
<indexterm>
<primary>partitioned table</primary>
</indexterm>
<para>
+ <productname>PostgreSQL</productname> supports basic table
+ partitioning. This section describes why and how to implement
+ partitioning as part of your database design.
+ </para>
+
+ <sect2 id="ddl-partitioning-overview">
+ <title>Overview</title>
+
+ <para>
+ Partitioning refers to splitting what is logically one large table into
+ smaller physical pieces. Partitioning can provide several benefits:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Query performance can be improved dramatically in certain situations,
+ particularly when most of the heavily accessed rows of the table are in a
+ single partition or a small number of partitions. The partitioning
+ substitutes for leading columns of indexes, reducing index size and
+ making it more likely that the heavily-used parts of the indexes
+ fit in memory.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When queries or updates access a large percentage of a single
+ partition, performance can be improved by taking advantage
+ of sequential scan of that partition instead of using an
+ index and random access reads scattered across the whole table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Bulk loads and deletes can be accomplished by adding or removing
+ partitions, if that requirement is planned into the partitioning design.
+ Doing <command>ALTER TABLE DETACH PARTITION</> followed by
+ <command>DROP TABLE</> is far faster than a bulk operation. These
+ commands also entirely avoid the <command>VACUUM</command> overhead
+ caused by a bulk <command>DELETE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Seldom-used data can be migrated to cheaper and slower storage media.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ The benefits will normally be worthwhile only when a table would
+ otherwise be very large. The exact point at which a table will
+ benefit from partitioning depends on the application, although a
+ rule of thumb is that the size of the table should exceed the physical
+ memory of the database server.
+ </para>
+
+ <para>
+ The following forms of partitioning can be implemented in
+ <productname>PostgreSQL</productname>:
+
+ <variablelist>
+ <varlistentry>
+ <term>Range Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned into <quote>ranges</quote> defined
+ by a key column or set of columns, with no overlap between
+ the ranges of values assigned to different partitions. For
+ example one might partition by date ranges, or by ranges of
+ identifiers for particular business objects.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>List Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned by explicitly listing which key values
+ appear in each partition.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ The following partitioning methods are currently supported:
+
+ <variablelist>
+ <varlistentry>
+ <term>Declarative Partitioning</term>
+
+ <listitem>
+ <para>
+ One creates a <firstterm>partitioned table</firstterm> by specifying
+ the partitioning method and a set of columns as the partition key.
+ <firstterm>Partitions</firstterm>, which contain actual data inserted
+ into the table, are created by specifying what subset of the data it
+ accepts.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Using inheritance</term>
+
+ <listitem>
+ <para>
+ Each partition must be created as a child table of a single parent
+ table. The parent table itself is normally empty; it exists just to
+ represent the entire data set.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Using UNION ALL views</term>
+
+ <listitem>
+ <para>
+ One can define a <literal>UNION ALL</literal> view over
+ <literal>SELECT</literal> on individual tables, each of which
+ contains a partition of data. Partitions are added or removed
+ by updating the view definition.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Accessing Tables using BRIN Indexes</term>
+
+ <listitem>
+ <para>
+ <acronym>BRIN</acronym>, which stands for Block Range Index is,
+ designed for handling very large tables in which certain columns
+ have some natural physical location within the table. Scanning
+ a large table using a <acronym>BRIN</acronym> index results in
+ reading only a portion of the table, which is often why partitioning
+ is implemented.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </para>
+
+ <para>
+ Each of the above mentioned methods is described below.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-declarative">
+ <title>Declarative Partitioning</title>
+
+ <para>
PostgreSQL offers a way to specify how to divide a table into pieces
called partitions. The table that is divided is referred to as a
<firstterm>partitioned table</firstterm>. The specification consists
@@ -2790,25 +2957,29 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
All rows inserted into a partitioned table will be routed to one of the
<firstterm>partitions</firstterm> based on the value of the partition
- key. Each partition has a subset defined by its <firstterm>partition
- bounds</firstterm>. Currently supported partitioning methods include
- range and list, wherein each partition is assigned a range of keys or
- a list of keys, respectively.
+ key. Each partition has a subset of the data defined by its
+ <firstterm>partition bounds</firstterm>. Currently supported
+ partitioning methods include range and list, where each partition is
+ assigned a range of keys and a list of keys, respectively.
</para>
<para>
Partitions may have their own indexes, constraints and default values,
- distinct from other partitions. Partitions do not inherit indexes from
- the partitioned table.
+ distinct from other partitions. Partitions do not currently inherit
+ indexes from the partitioned table.
+ </para>
+
+ <para>
+ See <xref linkend="sql-createtable"> for more details creating partitioned
+ tables and partitions.
</para>
<para>
Partitions may themselves be defined as partitioned tables, referred to as
- <firstterm>sub-partitioning</firstterm>. See <xref linkend="sql-createtable">
- for more details creating partitioned tables and partitions. It is not
- currently possible to alter a regular table into a partitioned table or
- vice versa. However, it is possible to add a regular table containing
- data into a partition of a partitioned table, or remove a partition; see
+ <firstterm>sub-partitioning</firstterm>. It is not currently possible to
+ alter a regular table into a partitioned table or vice versa. However,
+ it is possible to add a regular or partitioned table containing data into
+ a partition of a partitioned table, or remove a partition; see
<xref linkend="sql-altertable"> to learn more about the
<command>ATTACH PARTITION</> and <command>DETACH PARTITION</> sub-commands.
</para>
@@ -2823,8 +2994,8 @@ VALUES ('Albany', NULL, NULL, 'NY');
partitioned table and partitions do not participate in inheritance with
regular tables. Since a partition hierarchy consisting of the
partitioned table and its partitions is still an inheritance hierarchy,
- all the normal rules of inheritance apply as described in the previous
- section (<xref linkend="ddl-inherit">) with some exceptions, most notably:
+ all the normal rules of inheritance apply as described in
+ <xref linkend="ddl-inherit"> with some exceptions, most notably:
<itemizedlist>
<listitem>
@@ -2840,13 +3011,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
The <literal>ONLY</literal> notation used to exclude child tables
would either cause error or will be ignored in some cases for
- partitioned tables. For example, specifying <literal>ONLY</literal>
- when querying data from a partitioned table would not make much sense,
- because all the data is contained in partitions, so this raises an
- error. Specifying <literal>ONLY</literal> when modifying schema is
- not desirable in certain cases with partitioned tables where it may be
- fine for regular inheritance parents (for example, dropping a column
- from only the parent); an error will be thrown in that case.
+ partitioned tables. Specifying <literal>ONLY</literal> when modifying
+ schema is not desirable in certain cases with partitioned tables
+ whereas it may be fine for regular inheritance parents (for example,
+ dropping a column from only the parent); an error will be thrown in
+ that case.
</para>
</listitem>
@@ -2855,9 +3024,9 @@ VALUES ('Albany', NULL, NULL, 'NY');
Partitions cannot have columns that are not present in the parent.
It is neither possible to specify columns when creating partitions
with <command>CREATE TABLE</> nor is it possible to add columns to
- partitions using <command>ALTER TABLE</>. Tables may be added with
- <command>ALTER TABLE ... ATTACH PARTITION</> if their columns exactly
- match the parent, including oids.
+ partitions using <command>ALTER TABLE</>. Tables may be added as a
+ partition with <command>ALTER TABLE ... ATTACH PARTITION</> only if
+ their columns exactly match the parent, including oids.
</para>
</listitem>
@@ -2871,199 +3040,353 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
- Partitions can also be foreign tables (see <xref linkend="ddl-foreign-data">),
+ Partitions can also be foreign tables (see <xref linkend="sql-createforeigntable">),
although certain limitations exist currently in their usage. For example,
- data inserted into the partitioned table cannot be routed to foreign table
- partitions.
+ data inserted into the partitioned table is currently not routed to foreign
+ table partitions.
</para>
+ <sect3 id="ddl-partitioning-declarative-example">
+ <title>Example</title>
+
<para>
- There are currently the following limitations of using partitioned tables:
- <itemizedlist>
+ Suppose we are constructing a database for a large ice cream company.
+ The company measures peak temperatures every day as well as ice cream
+ sales in each region. Conceptually, we want a table like:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+);
+</programlisting>
+
+ We know that most queries will access just the last week's, month's or
+ quarter's data, since the main use of this table will be to prepare
+ online reports for management. To reduce the amount of old data that
+ needs to be stored, we decide to only keep the most recent 3 years
+ worth of data. At the beginning of each month we will remove the oldest
+ month's data. In this situation we can use partitioning to help us meet
+ all of our different requirements for the measurements table.
+ </para>
+
+ <para>
+ To use declarative partitioning in this case, use the following steps:
+
+ <orderedlist spacing="compact">
<listitem>
<para>
- It is currently not possible to add same set of indexes on all partitions
- automatically. Indexes must be added to each partition with separate
- commands.
+ Create <structname>measurement</structname> table as a partitioned
+ table by specifying the <literal>PARTITION BY</literal> clause, which
+ includes the partitioning method (<literal>RANGE</literal> or
+ <literal>LIST</literal>) and the list of column(s) to use as the
+ partition key.
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (logdate);
+</programlisting>
</para>
- </listitem>
- <listitem>
+ <note>
+ <para>
+ To decide when to use multiple columns in the partition key for range
+ partitioning, consider whether queries accessing the partitioned
+ in question will include conditions that involve multiple columns,
+ especially the columns being considered to be the partition key.
+ If so, the optimizer can create a plan that will scan fewer partitions
+ if a query's conditions are such that there is equality constraint on
+ leading partition key columns, because they limit the number of
+ partitions of interest. The first partition key column with
+ inequality constraint also further eliminates some partitions of
+ those chosen by equality constraints on earlier columns.
+ </para>
+ </note>
+
<para>
- It is currently not possible to define indexes on partitioned tables
- that include all rows from all partitions in one global index.
- Consequently, it is not possible to create constraints that are realized
- using an index such as <literal>UNIQUE</>.
+ To be able to insert data into this table, one must create partitions,
+ as described below.
</para>
</listitem>
<listitem>
<para>
- Since primary keys are not supported on partitioned tables,
- foreign keys referencing partitioned tables are not supported, nor
- are foreign key references from a partitioned table to some other table.
+ Create partitions. Each partition's definition must specify the bounds
+ that correspond to the partitioning method and partition key of the
+ parent. Note that specifying bounds such that the new partition's
+ values will overlap with those in one or more existing partitions will
+ cause an error. Inserting data into into the parent table that does
+ not map to one of the existing partitions will cause an error;
+ appropriate partition must be added manually.
+ </para>
+
+ <para>
+ Partitions thus created are in every way normal <productname>PostgreSQL</>
+ tables (or, possibly, foreign tables). It is possible, for example, to
+ specify tablespace, storage parameters for each partition separately.
+ </para>
+
+ <para>
+ It is not necessary to create table constraints describing partition
+ boundary condition for partitions. Instead, partition constraints are
+ generated implicitly from the partition bound specification whenever
+ there is need to refer to them. Also, since any data inserted into the
+ parent table is automatically inserted into the appropriate partition,
+ it is not necessary to create triggers for the same.
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+
+CREATE TABLE measurement_y2006m03 PARTITION OF measurement
+ FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
+
+...
+CREATE TABLE measurement_y2007m11 PARTITION OF measurement
+ FOR VALUES FROM ('2007-11-01') TO ('2007-12-01')
+
+CREATE TABLE measurement_y2007m12 PARTITION OF measurement
+ FOR VALUES FROM ('2007-12-01') TO ('2008-01-01')
+ TABLESPACE fasttablespace;
+
+CREATE TABLE measurement_y2008m01 PARTITION OF measurement
+ FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
+ TABLESPACE fasttablespace
+ WITH (parallel_workers = 4);
+</programlisting>
</para>
+
+ <note>
+ <para>
+ To implement sub-partitioning, specify the
+ <literal>PARTITION BY</literal> clause in the commands used to create
+ individual partitions, for example:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+ PARTITION BY RANGE (peaktemp);
+</programlisting>
+
+ After creating partitions of <structname>measurement_y2006m02</>,
+ any data inserted into <structname>measurement</> that is mapped to
+ <structname>measurement_y2006m02</> will be further redirected to one
+ of its partitions based on the <structfield>peaktemp</> column.
+ Partition key specified may overlap with the parent's partition key,
+ although care must be taken when specifying the bounds of
+ sub-partitions such that the accepted set of data constitutes a
+ subset of what a partition's own bounds allows; the system does not
+ try to check if that's really the case.
+ </para>
+ </note>
</listitem>
<listitem>
<para>
- Row triggers, if necessary, must be defined on individual partitions, not
- the partitioned table as it is currently not supported.
+ Create an index on the key column(s),
+ as well as any other indexes you might want for every partition.
+ Note that it is currently not supported to propagate index definition
+ from the master partitioned table to its partitions; in fact, it is
+ not possible to define indexes on partitioned tables in the first
+ place. This might change in future releases.
+
+<programlisting>
+CREATE INDEX ON measurement_y2006m02 (logdate);
+CREATE INDEX ON measurement_y2006m03 (logdate);
+...
+CREATE INDEX ON measurement_y2007m11 (logdate);
+CREATE INDEX ON measurement_y2007m12 (logdate);
+CREATE INDEX ON measurement_y2008m01 (logdate);
+</programlisting>
</para>
</listitem>
- </itemizedlist>
+
+ <listitem>
+ <para>
+ Ensure that the <xref linkend="guc-constraint-exclusion">
+ configuration parameter is not disabled in <filename>postgresql.conf</>.
+ If it is, queries will not be optimized as desired.
+ </para>
+ </listitem>
+ </orderedlist>
</para>
<para>
- A detailed example that shows how to use partitioned tables is discussed in
- the next chapter.
+ In the above example we would be creating a new partition each month, so
+ it might be wise to write a script that generates the required DDL
+ automatically.
</para>
-
- </sect1>
+ </sect3>
- <sect1 id="ddl-partitioning">
- <title>Partitioning</title>
+ <sect3 id="ddl-partitioning-declarative-maintenance">
+ <title>Partition Maintenance</title>
- <indexterm>
- <primary>partitioning</primary>
- </indexterm>
+ <para>
+ Normally the set of partitions established when initially defining the
+ the table are not intended to remain static. It is common to want to
+ remove old partitions of data and periodically add new partitions for
+ new data. One of the most important advantages of partitioning is
+ precisely that it allows this otherwise painful task to be executed
+ nearly instantaneously by manipulating the partition structure, rather
+ than physically moving large amounts of data around.
+ </para>
- <indexterm>
- <primary>table</primary>
- <secondary>partitioning</secondary>
- </indexterm>
+ <para>
+ The simplest option for removing old data is simply to drop the partition
+ that is no longer necessary:
+<programlisting>
+DROP TABLE measurement_y2006m02;
+</programlisting>
+ This can very quickly delete millions of records because it doesn't have
+ to individually delete every record. Note however that the above command
+ requires taking an <literal>ACCESS EXCLUSIVE</literal> lock on the parent
+ table.
+ </para>
<para>
- <productname>PostgreSQL</productname> supports basic table
- partitioning. This section describes why and how to implement
- partitioning as part of your database design.
- </para>
+ Another option that is often preferable is to remove the partition from
+ the partitioned table but retain access to it as a table in its own
+ right:
- <sect2 id="ddl-partitioning-overview">
- <title>Overview</title>
+<programlisting>
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+</programlisting>
+
+ This allows further operations to be performed on the data before
+ it is dropped. For example, this is often a useful time to back up
+ the data using <command>COPY</>, <application>pg_dump</>, or
+ similar tools. It might also be a useful time to aggregate data
+ into smaller formats, perform other data manipulations, or run
+ reports.
+ </para>
<para>
- Partitioning refers to splitting what is logically one large table
- into smaller physical pieces.
- Partitioning can provide several benefits:
- <itemizedlist>
- <listitem>
- <para>
- Query performance can be improved dramatically in certain situations,
- particularly when most of the heavily accessed rows of the table are in a
- single partition or a small number of partitions. The partitioning
- substitutes for leading columns of indexes, reducing index size and
- making it more likely that the heavily-used parts of the indexes
- fit in memory.
- </para>
- </listitem>
+ Similarly we can add a new partition to handle new data. We can create an
+ empty partition in the partitioned table just as the original partitions
+ were created above:
- <listitem>
- <para>
- When queries or updates access a large percentage of a single
- partition, performance can be improved by taking advantage
- of sequential scan of that partition instead of using an
- index and random access reads scattered across the whole table.
- </para>
- </listitem>
+<programlisting>
+CREATE TABLE measurement_y2008m02 PARTITION OF measurement
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01')
+ TABLESPACE fasttablespace;
+</programlisting>
- <listitem>
- <para>
- Bulk loads and deletes can be accomplished by adding or removing
- partitions, if that requirement is planned into the partitioning design.
- <command>ALTER TABLE NO INHERIT</> or <command>ALTER TABLE DETACH PARTITION</>
- and <command>DROP TABLE</> are both far faster than a bulk operation.
- These commands also entirely avoid the <command>VACUUM</command>
- overhead caused by a bulk <command>DELETE</>.
- </para>
- </listitem>
+ As an alternative, it is sometimes more convenient to create the
+ new table outside the partition structure, and make it a proper
+ partition later. This allows the data to be loaded, checked, and
+ transformed prior to it appearing in the partitioned table:
- <listitem>
- <para>
- Seldom-used data can be migrated to cheaper and slower storage media.
- </para>
- </listitem>
- </itemizedlist>
+<programlisting>
+CREATE TABLE measurement_y2008m02
+ (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
+ TABLESPACE fasttablespace;
- The benefits will normally be worthwhile only when a table would
- otherwise be very large. The exact point at which a table will
- benefit from partitioning depends on the application, although a
- rule of thumb is that the size of the table should exceed the physical
- memory of the database server.
- </para>
+ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
+ CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );
- <para>
- Currently, <productname>PostgreSQL</productname> supports partitioning
- using two methods:
+\copy measurement_y2008m02 from 'measurement_y2008m02'
+-- possibly some other data preparation work
- <variablelist>
- <varlistentry>
- <term>Using Table Inheritance</term>
+ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+</programlisting>
+ </para>
- <listitem>
- <para>
- Each partition must be created as a child table of a single parent
- table. The parent table itself is normally empty; it exists just to
- represent the entire data set. You should be familiar with
- inheritance (see <xref linkend="ddl-inherit">) before attempting to
- set up partitioning with it. This was the only method to implement
- partitioning in older versions.
- </para>
- </listitem>
- </varlistentry>
+ <tip>
+ <para>
+ Before running the <command>ATTACH PARTITION</> command, it is
+ recommended to create a <literal>CHECK</> constraint on the table to
+ be attached describing the desired partition constraint. Using the
+ same, system is able to skip the scan to validate the implicit
+ partition constraint. Without such a constraint, the table will be
+ scanned to validate the partition constraint, while holding an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
+ One may want to drop the constraint after <command>ATTACH PARTITION</>
+ is finished, because it is no longer necessary.
+ </para>
+ </tip>
+ </sect3>
- <varlistentry>
- <term>Using Partitioned Tables</term>
+ <sect3 id="ddl-partitioning-declarative-limitations">
+ <title>Limitations</title>
- <listitem>
- <para>
- See last section for some general information:
- <xref linkend="ddl-partitioned-tables">
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
+ <para>
+ There are currently the following limitations of using partitioned tables:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It is currently not possible to add same set of indexes on all partitions
+ automatically. Indexes must be added to each partition with separate
+ commands.
+ </para>
+ </listitem>
- <para>
- The following forms of partitioning can be implemented in
- <productname>PostgreSQL</productname> using either of the above mentioned
- methods, although the latter provides dedicated syntax for each:
+ <listitem>
+ <para>
+ It is currently not possible to define indexes on partitioned tables
+ that include all rows from all partitions in one global index.
+ Consequently, it is not possible to create constraints that are realized
+ using an index such as <literal>UNIQUE</>.
+ </para>
+ </listitem>
- <variablelist>
- <varlistentry>
- <term>Range Partitioning</term>
+ <listitem>
+ <para>
+ Since primary keys are not supported on partitioned tables,
+ foreign keys referencing partitioned tables are not supported, nor
+ are foreign key references from a partitioned table to some other table.
+ </para>
+ </listitem>
- <listitem>
- <para>
- The table is partitioned into <quote>ranges</quote> defined
- by a key column or set of columns, with no overlap between
- the ranges of values assigned to different partitions. For
- example one might partition by date ranges, or by ranges of
- identifiers for particular business objects.
- </para>
- </listitem>
- </varlistentry>
+ <listitem>
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clause are currently not allowed on partitioned tables.
+ </para>
+ </listitem>
- <varlistentry>
- <term>List Partitioning</term>
+ <listitem>
+ <para>
+ An <command>UPDATE</> that causes a row to move from one partition to
+ another fails, because the new value of the row fails to satisfy the
+ implicit partition constraint of the original partition. This might
+ change in future releases.
+ </para>
+ </listitem>
- <listitem>
- <para>
- The table is partitioned by explicitly listing which key values
- appear in each partition.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
- </para>
+ <listitem>
+ <para>
+ Row triggers, if necessary, must be defined on individual partitions, not
+ the partitioned table as it is currently not supported.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
</sect2>
- <sect2 id="ddl-partitioning-implementation">
- <title>Implementing Partitioning</title>
+ <sect2 id="ddl-partitioning-implementation-inheritance">
+ <title>Implementation Using Inheritance</title>
+ <para>
+ In some cases, one may want to add columns to partitions that are not
+ present in the parent table which is not possible to do with the above
+ method. For such cases, partitioning can be implemented using
+ inheritance (see <xref linkend="ddl-inherit">).
+ </para>
+
+ <sect3 id="ddl-partitioning-inheritance-example">
+ <title>Example</title>
<para>
- To set up a partitioned table using inheritance, do the following:
+ We use the same <structname>measurement</structname> table we used
+ above. To implement it as a partitioned table using inheritance, do the
+ following:
<orderedlist spacing="compact">
<listitem>
<para>
@@ -3076,6 +3399,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
be applied equally to all partitions. There is no point
in defining any indexes or unique constraints on it, either.
</para>
+
+ <para>
+ In case of our example, master table is the original
+ <structname>measurement</structname> as originally defined.
+ </para>
</listitem>
<listitem>
@@ -3090,12 +3418,27 @@ VALUES ('Albany', NULL, NULL, 'NY');
are in every way normal <productname>PostgreSQL</> tables
(or, possibly, foreign tables).
</para>
+
+ <para>
+ This solves one of our problems: deleting old data. Each
+ month, all we will need to do is perform a <command>DROP
+ TABLE</command> on the oldest child table and create a new
+ child table for the new month's data.
+<programlisting>
+CREATE TABLE measurement_y2006m02 () INHERITS (measurement);
+CREATE TABLE measurement_y2006m03 () INHERITS (measurement);
+...
+CREATE TABLE measurement_y2007m11 () INHERITS (measurement);
+CREATE TABLE measurement_y2007m12 () INHERITS (measurement);
+CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
+</programlisting>
+ </para>
</listitem>
<listitem>
<para>
- Add table constraints to the partition tables to define the
- allowed key values in each partition.
+ Add non-overlapping table constraints to the partition tables to
+ define the allowed key values in each partition.
</para>
<para>
@@ -3117,230 +3460,53 @@ CHECK ( outletID BETWEEN 200 AND 300 )
</para>
<para>
- Note that there is no difference in
- syntax between range and list partitioning; those terms are
- descriptive only.
- </para>
- </listitem>
-
- <listitem>
- <para>
- For each partition, create an index on the key column(s),
- as well as any other indexes you might want. (The key index is
- not strictly necessary, but in most scenarios it is helpful.
- If you intend the key values to be unique then you should
- always create a unique or primary-key constraint for each
- partition.)
- </para>
- </listitem>
-
- <listitem>
- <para>
- Optionally, define a trigger or rule to redirect data inserted into
- the master table to the appropriate partition.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Ensure that the <xref linkend="guc-constraint-exclusion">
- configuration parameter is not disabled in
- <filename>postgresql.conf</>.
- If it is, queries will not be optimized as desired.
- </para>
- </listitem>
-
- </orderedlist>
- </para>
-
- <para>
- To use partitioned tables, do the following:
- <orderedlist spacing="compact">
- <listitem>
- <para>
- Create <quote>master</quote> table as a partitioned table by
- specifying the <literal>PARTITION BY</literal> clause, which includes
- the partitioning method (<literal>RANGE</literal> or
- <literal>LIST</literal>) and the list of column(s) to use as the
- partition key. To be able to insert data into the table, one must
- create partitions, as described below.
- </para>
-
- <note>
- <para>
- To decide when to use multiple columns in the partition key for range
- partitioning, consider whether queries accessing the partitioned
- in question will include conditions that involve multiple columns,
- especially the columns being considered to be the partition key.
- If so, the optimizer can create a plan that will scan fewer partitions
- if a query's conditions are such that there is equality constraint on
- leading partition key columns, because they limit the number of
- partitions of interest. The first partition key column with
- inequality constraint also further eliminates some partitions of
- those chosen by equality constraints on earlier columns.
- </para>
- </note>
- </listitem>
-
- <listitem>
- <para>
- Create partitions of the master partitioned table, with the partition
- bounds specified for each partition matching the partitioning method
- and partition key of the master table. Note that specifying partition
- bounds such that the new partition's values will overlap with one or
- more existing partitions will cause an error. It is only after
- creating partitions that one is able to insert data into the master
- partitioned table, provided it maps to one of the existing partitions.
- If a data row does not map to any of the existing partitions, it will
- cause an error.
- </para>
-
- <para>
- Partitions thus created are also in every way normal
- <productname>PostgreSQL</> tables (or, possibly, foreign tables),
- whereas partitioned tables differ in a number of ways.
- </para>
-
- <para>
- It is not necessary to create table constraints for partitions.
- Instead, partition constraints are generated implicitly whenever
- there is a need to refer to them. Also, since any data inserted into
- the master partitioned table is automatically inserted into the
- appropriate partition, it is not necessary to create triggers for the
- same.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Just like with inheritance, create an index on the key column(s),
- as well as any other indexes you might want for every partition.
- Note that it is currently not supported to propagate index definition
- from the master partitioned table to its partitions; in fact, it is
- not possible to define indexes on partitioned tables in the first
- place. This might change in future releases.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Currently, partitioned tables also depend on constraint exclusion
- for query optimization, so ensure that the
- <xref linkend="guc-constraint-exclusion"> configuration parameter is
- not disabled in <filename>postgresql.conf</>. This might change in
- future releases.
- </para>
- </listitem>
-
- </orderedlist>
- </para>
-
- <para>
- For example, suppose we are constructing a database for a large
- ice cream company. The company measures peak temperatures every
- day as well as ice cream sales in each region. Conceptually,
- we want a table like:
-
-<programlisting>
-CREATE TABLE measurement (
- city_id int not null,
- logdate date not null,
- peaktemp int,
- unitsales int
-);
-</programlisting>
-
- We know that most queries will access just the last week's, month's or
- quarter's data, since the main use of this table will be to prepare
- online reports for management.
- To reduce the amount of old data that needs to be stored, we
- decide to only keep the most recent 3 years worth of data. At the
- beginning of each month we will remove the oldest month's data.
- </para>
-
- <para>
- In this situation we can use partitioning to help us meet all of our
- different requirements for the measurements table. Following the
- steps outlined above for both methods, partitioning can be set up as
- follows:
- </para>
-
- <para>
- <orderedlist spacing="compact">
- <listitem>
- <para>
- The master table is the <structname>measurement</> table, declared
- exactly as above.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Next we create one partition for each active month:
-
-<programlisting>
-CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
-...
-CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
-</programlisting>
-
- Each of the partitions are complete tables in their own right,
- but they inherit their definitions from the
- <structname>measurement</> table.
- </para>
-
- <para>
- This solves one of our problems: deleting old data. Each
- month, all we will need to do is perform a <command>DROP
- TABLE</command> on the oldest child table and create a new
- child table for the new month's data.
- </para>
- </listitem>
-
- <listitem>
- <para>
- We must provide non-overlapping table constraints. Rather than
- just creating the partition tables as above, the table creation
- script should really be:
+ It would be better to instead create partitions as follows:
<programlisting>
CREATE TABLE measurement_y2006m02 (
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2006m03 (
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
) INHERITS (measurement);
+
...
CREATE TABLE measurement_y2007m11 (
CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2007m12 (
CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2008m01 (
CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
) INHERITS (measurement);
</programlisting>
</para>
+
+ <para>
+ Note that there is no difference in syntax between range and list
+ partitioning; those terms are descriptive only.
+ </para>
</listitem>
<listitem>
<para>
- We probably need indexes on the key columns too:
-
+ For each partition, create an index on the key column(s),
+ as well as any other indexes you might want. (The key index is
+ not strictly necessary, but in most scenarios it is helpful.
+ If you intend the key values to be unique then you should
+ always create a unique or primary-key constraint for each
+ partition.)
<programlisting>
CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
-...
CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
</programlisting>
-
- We choose not to add further indexes at this time.
</para>
</listitem>
@@ -3363,7 +3529,9 @@ END;
$$
LANGUAGE plpgsql;
</programlisting>
+ </para>
+ <para>
After creating the function, we create a trigger which
calls the trigger function:
@@ -3425,151 +3593,88 @@ LANGUAGE plpgsql;
of this example.
</para>
</note>
- </listitem>
- </orderedlist>
- </para>
-
- <para>
- Steps when using a partitioned table are as follows:
- </para>
- <para>
- <orderedlist spacing="compact">
- <listitem>
<para>
- Create the <structname>measurement</> table as a partitioned table:
+ A different approach to redirecting inserts into the appropriate
+ partition table is to set up rules, instead of a trigger, on the
+ master table. For example:
<programlisting>
-CREATE TABLE measurement (
- city_id int not null,
- logdate date not null,
- peaktemp int,
- unitsales int
-) PARTITION BY RANGE (logdate);
+CREATE RULE measurement_insert_y2006m02 AS
+ON INSERT TO measurement WHERE
+ ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2006m02 VALUES (NEW.*);
+...
+CREATE RULE measurement_insert_y2008m01 AS
+ON INSERT TO measurement WHERE
+ ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
</programlisting>
+
+ A rule has significantly more overhead than a trigger, but the overhead
+ is paid once per query rather than once per row, so this method might
+ be advantageous for bulk-insert situations. In most cases, however,
+ the trigger method will offer better performance.
</para>
- </listitem>
- <listitem>
<para>
- Then create partitions as follows:
+ Be aware that <command>COPY</> ignores rules. If you want to
+ use <command>COPY</> to insert data, you'll need to copy into the
+ correct partition table rather than into the master. <command>COPY</>
+ does fire triggers, so you can use it normally if you use the trigger
+ approach.
+ </para>
-<programlisting>
-CREATE TABLE measurement_y2006m02 PARTITION OF measurement
- FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
-CREATE TABLE measurement_y2006m03 PARTITION OF measurement
- FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
-...
-CREATE TABLE measurement_y2007m11 PARTITION OF measurement
- FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
-CREATE TABLE measurement_y2007m12 PARTITION OF measurement
- FOR VALUES FROM ('2007-12-01') TO ('2008-01-01');
-CREATE TABLE measurement_y2008m01 PARTITION OF measurement
- FOR VALUES FROM ('2008-01-01') TO ('2008-02-01');
-</programlisting>
+ <para>
+ Another disadvantage of the rule approach is that there is no simple
+ way to force an error if the set of rules doesn't cover the insertion
+ date; the data will silently go into the master table instead.
</para>
</listitem>
<listitem>
<para>
- Create indexes on the key columns just like in case of inheritance
- partitions.
+ Ensure that the <xref linkend="guc-constraint-exclusion">
+ configuration parameter is not disabled in
+ <filename>postgresql.conf</>.
+ If it is, queries will not be optimized as desired.
</para>
</listitem>
</orderedlist>
-
- <note>
- <para>
- To implement sub-partitioning, specify the
- <literal>PARTITION BY</literal> clause in the commands used to create
- individual partitions, for example:
-
-<programlisting>
-CREATE TABLE measurement_y2006m02 PARTITION OF measurement
- FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
- PARTITION BY RANGE (peaktemp);
-</programlisting>
-
- After creating partitions of <structname>measurement_y2006m02</>, any
- data inserted into <structname>measurement</> that is mapped to
- <structname>measurement_y2006m02</> will be further redirected to one
- of its partitions based on the <structfield>peaktemp</> column.
- Partition key specified may overlap with the parent's partition key,
- although care must be taken when specifying the bounds of sub-partitions
- such that the accepted set of data constitutes a subset of what a
- partition's own bounds allows; the system does not try to check if
- that's really the case.
- </para>
- </note>
</para>
<para>
- As we can see, a complex partitioning scheme could require a
- substantial amount of DDL, although significantly less when using
- partitioned tables. In the above example we would be creating a new
- partition each month, so it might be wise to write a script that
- generates the required DDL automatically.
- </para>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-managing-partitions">
- <title>Managing Partitions</title>
-
- <para>
- Normally the set of partitions established when initially
- defining the table are not intended to remain static. It is
- common to want to remove old partitions of data and periodically
- add new partitions for new data. One of the most important
- advantages of partitioning is precisely that it allows this
- otherwise painful task to be executed nearly instantaneously by
- manipulating the partition structure, rather than physically moving large
- amounts of data around.
- </para>
-
- <para>
- Both the inheritance-based and partitioned table methods allow this to
- be done, although the latter requires taking an <literal>ACCESS EXCLUSIVE</literal>
- lock on the master table for various commands mentioned below.
- </para>
+ As we can see, a complex partitioning scheme could require a
+ substantial amount of DDL. In the above example we would be creating
+ a new partition each month, so it might be wise to write a script that
+ generates the required DDL automatically.
+ </para>
+ </sect3>
- <para>
- The simplest option for removing old data is simply to drop the partition
- that is no longer necessary, which works using both methods of
- partitioning:
+ <sect3 id="ddl-partitioning-inheritance-maintenance">
+ <title>Partition Maintenance</title>
+ <para>
+ To remove old data quickly, simply to drop the partition that is no
+ longer necessary:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
- This can very quickly delete millions of records because it doesn't have
- to individually delete every record.
- </para>
+ </para>
<para>
- Another option that is often preferable is to remove the partition from
- the partitioned table but retain access to it as a table in its own
- right:
-<programlisting>
-ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
-</programlisting>
-
- When using a partitioned table:
+ To remove the partition from the partitioned table but retain access to
+ it as a table in its own right:
<programlisting>
-ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</programlisting>
-
- This allows further operations to be performed on the data before
- it is dropped. For example, this is often a useful time to back up
- the data using <command>COPY</>, <application>pg_dump</>, or
- similar tools. It might also be a useful time to aggregate data
- into smaller formats, perform other data manipulations, or run
- reports.
</para>
<para>
- Similarly we can add a new partition to handle new data. We can create an
- empty partition in the partitioned table just as the original partitions
- were created above:
+ To add a new partition to handle new data, create an empty partition just
+ as the original partitions were created above:
<programlisting>
CREATE TABLE measurement_y2008m02 (
@@ -3577,52 +3682,80 @@ CREATE TABLE measurement_y2008m02 (
) INHERITS (measurement);
</programlisting>
- When using a partitioned table:
-
-<programlisting>
-CREATE TABLE measurement_y2008m02 PARTITION OF measurement
- FOR VALUES FROM ('2008-02-01') TO ('2008-03-01');
-</programlisting>
-
- As an alternative, it is sometimes more convenient to create the
- new table outside the partition structure, and make it a proper
- partition later. This allows the data to be loaded, checked, and
- transformed prior to it appearing in the partitioned table:
+ Alternatively, one may created the new table outside the partition
+ structure, and make it a partition after data is loaded, checked,
+ and transformed.
<programlisting>
CREATE TABLE measurement_y2008m02
(LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
+
ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );
+
\copy measurement_y2008m02 from 'measurement_y2008m02'
-- possibly some other data preparation work
+
ALTER TABLE measurement_y2008m02 INHERIT measurement;
</programlisting>
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-inheritance-caveats">
+ <title>Caveats</title>
+
+ <para>
+ The following caveats apply to partitioned tables implemented using
+ inheritance:
+ <itemizedlist>
+ <listitem>
+ <para>
+ There is no automatic way to verify that all of the
+ <literal>CHECK</literal> constraints are mutually
+ exclusive. It is safer to create code that generates
+ partitions and creates and/or modifies associated objects than
+ to write each by hand.
+ </para>
+ </listitem>
- The last of the above commands when using a partitioned table would be:
+ <listitem>
+ <para>
+ The schemes shown here assume that the partition key column(s)
+ of a row never change, or at least do not change enough to require
+ it to move to another partition. An <command>UPDATE</> that attempts
+ to do that will fail because of the <literal>CHECK</> constraints.
+ If you need to handle such cases, you can put suitable update triggers
+ on the partition tables, but it makes management of the structure
+ much more complicated.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If you are using manual <command>VACUUM</command> or
+ <command>ANALYZE</command> commands, don't forget that
+ you need to run them on each partition individually. A command like:
<programlisting>
-ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
- FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+ANALYZE measurement;
</programlisting>
- </para>
+ will only process the master table.
+ </para>
+ </listitem>
- <tip>
+ <listitem>
<para>
- Before running the <command>ATTACH PARTITION</> command, it is
- recommended to create a <literal>CHECK</> constraint on the table to
- be attached describing the desired partition constraint. Using the
- same, system is able to skip the scan to validate the implicit
- partition constraint. Without such a constraint, the table will be
- scanned to validate the partition constraint, while holding an
- <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
- One may want to drop the constraint after <command>ATTACH PARTITION</>
- is finished, because it is no longer necessary.
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clauses are unlikely to work as expected, as the <literal>ON CONFLICT</>
+ action is only taken in case of unique violations on the specified
+ target relation, not its child relations.
</para>
- </tip>
- </sect2>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
+ </sect2>
- <sect2 id="ddl-partitioning-constraint-exclusion">
+ <sect2 id="ddl-partitioning-constraint-exclusion">
<title>Partitioning and Constraint Exclusion</title>
<indexterm>
@@ -3632,7 +3765,8 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
<para>
<firstterm>Constraint exclusion</> is a query optimization technique
that improves performance for partitioned tables defined in the
- fashion described above. As an example:
+ fashion described above (both declarative partitioned tables and those
+ implemented using inheritance). As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3715,153 +3849,6 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
are unlikely to benefit.
</para>
- <note>
- <para>
- Currently, constraint exclusion is also used for partitioned tables.
- However, we did not create any <literal>CHECK</literal> constraints
- for individual partitions as seen above. In this case, the optimizer
- uses internally generated constraint for every partition.
- </para>
- </note>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-alternatives">
- <title>Alternative Partitioning Methods</title>
-
- <para>
- A different approach to redirecting inserts into the appropriate
- partition table is to set up rules, instead of a trigger, on the
- master table (unless it is a partitioned table). For example:
-
-<programlisting>
-CREATE RULE measurement_insert_y2006m02 AS
-ON INSERT TO measurement WHERE
- ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
-DO INSTEAD
- INSERT INTO measurement_y2006m02 VALUES (NEW.*);
-...
-CREATE RULE measurement_insert_y2008m01 AS
-ON INSERT TO measurement WHERE
- ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
-DO INSTEAD
- INSERT INTO measurement_y2008m01 VALUES (NEW.*);
-</programlisting>
-
- A rule has significantly more overhead than a trigger, but the overhead
- is paid once per query rather than once per row, so this method might be
- advantageous for bulk-insert situations. In most cases, however, the
- trigger method will offer better performance.
- </para>
-
- <para>
- Be aware that <command>COPY</> ignores rules. If you want to
- use <command>COPY</> to insert data, you'll need to copy into the correct
- partition table rather than into the master. <command>COPY</> does fire
- triggers, so you can use it normally if you use the trigger approach.
- </para>
-
- <para>
- Another disadvantage of the rule approach is that there is no simple
- way to force an error if the set of rules doesn't cover the insertion
- date; the data will silently go into the master table instead.
- </para>
-
- <para>
- Partitioning can also be arranged using a <literal>UNION ALL</literal>
- view, instead of table inheritance. For example,
-
-<programlisting>
-CREATE VIEW measurement AS
- SELECT * FROM measurement_y2006m02
-UNION ALL SELECT * FROM measurement_y2006m03
-...
-UNION ALL SELECT * FROM measurement_y2007m11
-UNION ALL SELECT * FROM measurement_y2007m12
-UNION ALL SELECT * FROM measurement_y2008m01;
-</programlisting>
-
- However, the need to recreate the view adds an extra step to adding and
- dropping individual partitions of the data set. In practice this
- method has little to recommend it compared to using inheritance.
- </para>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-caveats">
- <title>Caveats</title>
-
- <para>
- The following caveats apply to using inheritance to implement partitioning:
- <itemizedlist>
- <listitem>
- <para>
- There is no automatic way to verify that all of the
- <literal>CHECK</literal> constraints are mutually
- exclusive. It is safer to create code that generates
- partitions and creates and/or modifies associated objects than
- to write each by hand.
- </para>
- </listitem>
-
- <listitem>
- <para>
- The schemes shown here assume that the partition key column(s)
- of a row never change, or at least do not change enough to require
- it to move to another partition. An <command>UPDATE</> that attempts
- to do that will fail because of the <literal>CHECK</> constraints.
- If you need to handle such cases, you can put suitable update triggers
- on the partition tables, but it makes management of the structure
- much more complicated.
- </para>
- </listitem>
-
- <listitem>
- <para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each partition individually. A command like:
-<programlisting>
-ANALYZE measurement;
-</programlisting>
- will only process the master table.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <command>INSERT</command> statements with <literal>ON CONFLICT</>
- clauses are unlikely to work as expected, as the <literal>ON CONFLICT</>
- action is only taken in case of unique violations on the specified
- target relation, not its child relations.
- </para>
- </listitem>
- </itemizedlist>
- </para>
-
- <para>
- The following caveats apply to partitioned tables created with the
- explicit syntax:
- <itemizedlist>
- <listitem>
- <para>
- An <command>UPDATE</> that causes a row to move from one partition to
- another fails, because the new value of the row fails to satisfy the
- implicit partition constraint of the original partition. This might
- change in future releases.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <command>INSERT</command> statements with <literal>ON CONFLICT</>
- clause are currently not allowed on partitioned tables.
- </para>
- </listitem>
-
- </itemizedlist>
- </para>
-
<para>
The following caveats apply to constraint exclusion, which is currently
used by both inheritance and partitioned tables:
@@ -3901,10 +3888,78 @@ ANALYZE measurement;
don't try to use many thousands of partitions.
</para>
</listitem>
-
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partitioning-alternatives">
+ <title>Alternative Partitioning Methods</title>
+
+ <sect3 id="ddl-partitioning-alternatives-union-all">
+ <title>Using UNION ALL view</title>
+ <para>
+ Partitioning can also be arranged using a <literal>UNION ALL</literal>
+ view, instead of table inheritance. For example,
+
+<programlisting>
+CREATE VIEW measurement AS
+ SELECT * FROM measurement_y2006m02
+UNION ALL SELECT * FROM measurement_y2006m03
+...
+UNION ALL SELECT * FROM measurement_y2007m11
+UNION ALL SELECT * FROM measurement_y2007m12
+UNION ALL SELECT * FROM measurement_y2008m01;
+</programlisting>
+
+ However, the need to recreate the view adds an extra step to adding and
+ dropping individual partitions of the data set. In practice this
+ method has little to recommend it compared to using inheritance.
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-alternatives-brin-index">
+ <title>Accessing Tables Using BRIN Index</title>
+ <para>
+ <acronym>BRIN</acronym>, which stands for Block Range Index, is
+ designed for handling very large tables in which certain columns
+ have some natural physical location within the table. For example,
+ in the <structname>measurement</structname> table, the entries for
+ earlier times (<structfield>logdate</structfield> column) will appear
+ earlier in the table most of the time. A table storing a ZIP code
+ column might have all codes for a city grouped together naturally.
+ </para>
+
+ <para>
+ In case of <structname>measurement</structname> table, one may consider
+ adding a minmax <acronym>BRIN</acronym> index on the
+ <structfield>logdate</structfield> column.
+
+<programlisting>
+CREATE INDEX ON measurement USING brin (logdate date_minmax_ops);
+</programlisting>
+
+ In this case, specifying <literal>date_minmax_ops</literal> is not
+ necessary; it is shown for clarity.
+ </para>
+
+ <para>
+ <acronym>BRIN</acronym> indexes leverage this locality of data and
+ store summary information for a range of consecutive pages and keep
+ it updated as the data is added or removed. Because a
+ <acronym>BRIN</acronym> index is very small, scanning the index adds
+ adds little overhead compared to a sequential scan, but may avoid
+ scanning large parts of the table that are known not to contain
+ matching tuples. That is often why table partitioning is used. Thus,
+ <acronym>BRIN</acronym> indexes provide a subset of benefits that
+ parttioning provides with much less upfront setup.
+ </para>
+
+ <para>
+ See <xref linkend="brin"> for more details.
+ </para>
+ </sect3>
+
+ </sect2>
</sect1>
<sect1 id="ddl-foreign-data">
--
2.11.0
0002-Add-a-note-about-DROP-NOT-NULL-and-partitions.patchtext/x-diff; name=0002-Add-a-note-about-DROP-NOT-NULL-and-partitions.patchDownload
From a80af3ad7e637684faa39d2b412fa8fe8b884a6c Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 10 Feb 2017 15:03:45 +0900
Subject: [PATCH 2/3] Add a note about DROP NOT NULL and partitions
On the ALTER TABLE refpage, it seems better to mention how to drop
drop the not null constraint of a partition's column. Per suggestion
from Corey Huinker.
---
doc/src/sgml/ref/alter_table.sgml | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 077c00373d..b6be3a0dcb 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -172,9 +172,14 @@ ALTER TABLE [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
</para>
<para>
- If this table is a partition, one cannot perform <literal>DROP NOT NULL</>
+ If this table is a partition, one cannot perform <literal>DROP NOT NULL</literal>
on a column if it is marked <literal>NOT NULL</literal> in the parent
- table.
+ table. To drop the <literal>NOT NULL</literal> constraint from all the
+ partitions, perform <literal>DROP NOT NULL</literal> on the parent
+ table. Even if there is no <literal>NOT NULL</> constraint on the
+ parent, such a constraint can still be added to individual partitions,
+ if desired; that is, the children can disallow nulls even if the parent
+ allows them, but not the other way around.
</para>
</listitem>
</varlistentry>
--
2.11.0
0003-Listify-a-note-on-the-CREATE-TABLE-page.patchtext/x-diff; name=0003-Listify-a-note-on-the-CREATE-TABLE-page.patchDownload
From dbae289161e5230227a33f90b922b77bc2605257 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Mon, 27 Feb 2017 19:00:08 +0900
Subject: [PATCH 3/3] Listify a note on the CREATE TABLE page
---
doc/src/sgml/ref/create_table.sgml | 74 +++++++++++++++++++-------------------
1 file changed, 37 insertions(+), 37 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index bb081ff86f..f08433c667 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -261,43 +261,43 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
any existing partition of that parent.
</para>
- <note>
- <para>
- Each of the values specified in the partition bound specification is
- a literal, <literal>NULL</literal>, or <literal>UNBOUNDED</literal>.
- A literal is either a numeric constant or a string constant that is
- coercable to the corresponding partition key column's type.
- </para>
-
- <para>
- When creating a range partition, the lower bound specified with
- <literal>FROM</literal> is an inclusive bound, whereas the upper bound
- specified with <literal>TO</literal> is an exclusive bound. That is,
- the values specified in the <literal>FROM</literal> list are accepted
- values of the corresponding partition key columns in a given partition,
- whereas those in the <literal>TO</literal> list are not. To be precise,
- this applies only to the first of the partition key columns for which
- the corresponding values in the <literal>FROM</literal> and
- <literal>TO</literal> lists are not equal. All rows in a given
- partition contain the same values for all preceding columns, equal to
- those specified in <literal>FROM</literal> and <literal>TO</literal>
- lists. On the other hand, any subsequent columns are insignificant
- as far as implicit partition constraint is concerned.
-
- Specifying <literal>UNBOUNDED</literal> in <literal>FROM</literal>
- signifies <literal>-infinity</literal> as the lower bound of the
- corresponding column, whereas it signifies <literal>+infinity</literal>
- as the upper bound when specified in <literal>TO</literal>.
- </para>
-
- <para>
- When creating a list partition, <literal>NULL</literal> can be specified
- to signify that the partition allows the partition key column to be null.
- However, there cannot be more than one such list partitions for a given
- parent table. <literal>NULL</literal> cannot specified for range
- partitions.
- </para>
- </note>
+ <para>
+ Each of the values specified in the partition bound specification is
+ a literal, <literal>NULL</literal>, or <literal>UNBOUNDED</literal>.
+ A literal is either a numeric constant or a string constant that is
+ coercible to the corresponding partition key column's type.
+ </para>
+
+ <para>
+ When creating a range partition, the lower bound specified with
+ <literal>FROM</literal> is an inclusive bound, whereas the upper
+ bound specified with <literal>TO</literal> is an exclusive bound.
+ That is, the values specified in the <literal>FROM</literal> list
+ are accepted values of the corresponding partition key columns in a
+ given partition, whereas those in the <literal>TO</literal> list are
+ not. To be precise, this applies only to the first of the partition
+ key columns for which the corresponding values in the <literal>FROM</literal>
+ and <literal>TO</literal> lists are not equal. All rows in a given
+ partition contain the same values for all preceding columns, equal to
+ those specified in <literal>FROM</literal> and <literal>TO</literal>
+ lists. On the other hand, any subsequent columns are insignificant
+ as far as implicit partition constraint is concerned.
+ </para>
+
+ <para>
+ Specifying <literal>UNBOUNDED</literal> in <literal>FROM</literal>
+ signifies <literal>-infinity</literal> as the lower bound of the
+ corresponding column, whereas it signifies <literal>+infinity</literal>
+ as the upper bound when specified in <literal>TO</literal>.
+ </para>
+
+ <para>
+ When creating a list partition, <literal>NULL</literal> can be
+ specified to signify that the partition allows the partition key
+ column to be null. However, there cannot be more than one such
+ list partitions for a given parent table. <literal>NULL</literal>
+ cannot specified for range partitions.
+ </para>
<para>
A partition must have the same column names and types as the partitioned
--
2.11.0
On Thu, Mar 9, 2017 at 8:23 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
On 2017/03/10 3:26, Robert Haas wrote:
I think you might have the titles for 0002 and 0003 backwards.
Oops, you're right.
On Fri, Mar 3, 2017 at 2:51 AM, Amit Langote wrote:
0002: some cosmetic fixes to create_table.sgml
I think this sentence may be unclear to some readers:
+ One might however want to set it for only some partitions, + which is possible by doing <literal>SET NOT NULL</literal> on individual + partitions.I think you could replace this with something like: Even if there is
no <literal>NOT NULL</> constraint on the parent, such a constraint
can still be added to individual partitions, if desired; that is, the
children can disallow nulls even if the parent allows them, but not
the other way around.Reads much better, done that way. Thanks.
0003: add clarification about NOT NULL constraint on partition columns in
alter_table.sgmlThis is about list-ifying a note, but I think we should try to
de-note-ify it. It's a giant block of text that is not obviously more
noteworthy than the surrounding text; I think <note> should be saved
for things that particularly need to be called out.OK. The patch is now just about de-note-ifying the block of text. Since
I don't see any other lists in the Parameters portion of the page, I also
take back my list-ifying proposal.Attached updated patches.
Committed 0002, 0003.
I think the section on BRIN in 0001 is just silly. BRIN is a very
useful index type, possibly more useful than anything except btree,
but I think documenting it as an alternative method of partitioning is
over the top.
+ The following forms of partitioning can be implemented in
+ <productname>PostgreSQL</productname>:
Any form of partitioning can be implemented, at least to some degree,
using inheritance or UNION ALL views. I think what this should say is
that PostgreSQL has native support for list and range partitioning,
and then it can go on top say that if this built-in support is not
suitable for a particular use case (either because you need some other
partitioning scheme or due to some other limitation), inheritance or
UNION ALL views can be used instead, adding flexibility but losing
some of the performance benefits of built-in declarative partitioning.
<para>
Partitions may have their own indexes, constraints and default values,
- distinct from other partitions. Partitions do not inherit indexes from
- the partitioned table.
+ distinct from other partitions. Partitions do not currently inherit
+ indexes from the partitioned table.
+ </para>
+
+ <para>
+ See <xref linkend="sql-createtable"> for more details creating partitioned
+ tables and partitions.
</para>
I don't think we should add "currently"; that amounts to speculation
about what will happen in future versions. Also, I favor collapsing
these into one paragraph. A single-sentence paragraph tends to look
OK when you're reading the SGML directly, but it looks funny in the
rendered version.
+ <firstterm>sub-partitioning</firstterm>. It is not currently possible to
+ alter a regular table into a partitioned table or vice versa. However,
+ it is possible to add a regular or partitioned table containing data into
+ a partition of a partitioned table, or remove a partition; see
I think we should say "as a partition" rather than "into a partition",
assuming you're talking about ATTACH PARTITION here.
- partitioned tables. For example, specifying <literal>ONLY</literal>
- when querying data from a partitioned table would not make much sense,
- because all the data is contained in partitions, so this raises an
- error. Specifying <literal>ONLY</literal> when modifying schema is
- not desirable in certain cases with partitioned tables where it may be
- fine for regular inheritance parents (for example, dropping a column
- from only the parent); an error will be thrown in that case.
+ partitioned tables. Specifying <literal>ONLY</literal> when modifying
+ schema is not desirable in certain cases with partitioned tables
+ whereas it may be fine for regular inheritance parents (for example,
+ dropping a column from only the parent); an error will be thrown in
+ that case.
I don't see why this is an improvement.
- data inserted into the partitioned table cannot be routed to foreign table
- partitions.
+ data inserted into the partitioned table is currently not routed to foreign
+ table partitions.
Again, let's not speculate about the future.
+ Note that it is currently not supported to propagate index definition
+ from the master partitioned table to its partitions; in fact, it is
+ not possible to define indexes on partitioned tables in the first
+ place. This might change in future releases.
Same here.
+ There are currently the following limitations of using partitioned tables:
And here. Better to write "The following limitations apply to
partitioned tables:"
+ It is currently not possible to define indexes on partitioned tables
+ that include all rows from all partitions in one global index.
+ Consequently, it is not possible to create constraints that are realized
+ using an index such as <literal>UNIQUE</>.
This doesn't seem very grammatical, and it kind of overlaps with the
previous point, and the following point. How about just adding a
sentence to the previous paragraph: This also means that there is no
way to create a primary key, unique constraint, or exclusion
constraint spanning all partitions; it is only possible to constrain
each leaf partition individually.
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clause are currently not allowed on partitioned tables.
Obsolete.
+ implicit partition constraint of the original partition. This might
+ change in future releases.
Remove speculation.
+ In some cases, one may want to add columns to partitions that are not
+ present in the parent table which is not possible to do with the above
+ method. For such cases, partitioning can be implemented using
+ inheritance (see <xref linkend="ddl-inherit">).
Hmm, I bet that's not the only advantage. And it doesn't seem like
the way to lead.
e.g.
While the built-in declarative partitioning is suitable for most
common use cases, there are some circumstances where a more flexible
approach may be useful. Partitioning can be implemented using table
inheritance, which allows for several features which are not supported
by declarative partitioning, such as:
- Partitioning enforces a rule that all partitions must have exactly
the same set of columns as the parent, but table inheritance allows
children to have extra columns not present in the parent.
- Table inheritance allows for multiple inheritance.
- Declarative partitioning only supports list and range partitioning,
whereas table inheritance allows data to be divided in a manner of the
user's choosing. (Note, however, that if constraint exclusion is
unable to prune partitions effectively, query performance will be very
poor.)
- Some operations require a stronger lock when using declarative
partitioning than when using table inheritance. (list these)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/03/28 0:23, Robert Haas wrote:
On Thu, Mar 9, 2017 at 8:23 PM, Amit Langote wrote:
Attached updated patches.
Committed 0002, 0003.
Thanks a lot for committing these and reviewing 0001.
I think the section on BRIN in 0001 is just silly. BRIN is a very
useful index type, possibly more useful than anything except btree,
but I think documenting it as an alternative method of partitioning is
over the top.
Okay, removing the BRIN part from this patch for now.
+ The following forms of partitioning can be implemented in + <productname>PostgreSQL</productname>:Any form of partitioning can be implemented, at least to some degree,
using inheritance or UNION ALL views. I think what this should say is
that PostgreSQL has native support for list and range partitioning,
and then it can go on top say that if this built-in support is not
suitable for a particular use case (either because you need some other
partitioning scheme or due to some other limitation), inheritance or
UNION ALL views can be used instead, adding flexibility but losing
some of the performance benefits of built-in declarative partitioning.
You're right. I've updated the text to sound like what you said here.
<para> Partitions may have their own indexes, constraints and default values, - distinct from other partitions. Partitions do not inherit indexes from - the partitioned table. + distinct from other partitions. Partitions do not currently inherit + indexes from the partitioned table. + </para> + + <para> + See <xref linkend="sql-createtable"> for more details creating partitioned + tables and partitions. </para>I don't think we should add "currently"; that amounts to speculation
about what will happen in future versions. Also, I favor collapsing
these into one paragraph. A single-sentence paragraph tends to look
OK when you're reading the SGML directly, but it looks funny in the
rendered version.
Done.
+ <firstterm>sub-partitioning</firstterm>. It is not currently possible to + alter a regular table into a partitioned table or vice versa. However, + it is possible to add a regular or partitioned table containing data into + a partition of a partitioned table, or remove a partition; seeI think we should say "as a partition" rather than "into a partition",
assuming you're talking about ATTACH PARTITION here.
Right, fixed.
- partitioned tables. For example, specifying <literal>ONLY</literal> - when querying data from a partitioned table would not make much sense, - because all the data is contained in partitions, so this raises an - error. Specifying <literal>ONLY</literal> when modifying schema is - not desirable in certain cases with partitioned tables where it may be - fine for regular inheritance parents (for example, dropping a column - from only the parent); an error will be thrown in that case. + partitioned tables. Specifying <literal>ONLY</literal> when modifying + schema is not desirable in certain cases with partitioned tables + whereas it may be fine for regular inheritance parents (for example, + dropping a column from only the parent); an error will be thrown in + that case.I don't see why this is an improvement.
Because we neither raise an error nor ignore it if ONLY is specified when
querying data from a partitioned table.
create table p (a int, b char) partition by list (a);
create table p1 partition of p for values in (1);
insert into p values (1);
select * from only p;
a | b
---+---
(0 rows)
explain select * from only p;
QUERY PLAN
-------------------------------------------
Result (cost=0.00..0.00 rows=0 width=12)
One-Time Filter: false
(2 rows)
IOW, querying behavior is same as regular inheritance. I rewrote the
paragraph as follows:
<para>
The <literal>ONLY</literal> notation used to exclude child tables
will cause an error for partitioned tables in the case of
schema-modifying commands such as most <literal>ALTER TABLE</literal>
commands. For example, dropping a column from only the parent does
not make sense for partitioned tables.
</para>
- data inserted into the partitioned table cannot be routed to foreign table - partitions. + data inserted into the partitioned table is currently not routed to foreign + table partitions.Again, let's not speculate about the future.
+ Note that it is currently not supported to propagate index definition + from the master partitioned table to its partitions; in fact, it is + not possible to define indexes on partitioned tables in the first + place. This might change in future releases.Same here.
+ There are currently the following limitations of using partitioned tables:
And here. Better to write "The following limitations apply to
partitioned tables:"
Fixed all of these.
+ It is currently not possible to define indexes on partitioned tables + that include all rows from all partitions in one global index. + Consequently, it is not possible to create constraints that are realized + using an index such as <literal>UNIQUE</>.This doesn't seem very grammatical, and it kind of overlaps with the
previous point, and the following point. How about just adding a
sentence to the previous paragraph: This also means that there is no
way to create a primary key, unique constraint, or exclusion
constraint spanning all partitions; it is only possible to constrain
each leaf partition individually.
OK, done.
+ <command>INSERT</command> statements with <literal>ON CONFLICT</> + clause are currently not allowed on partitioned tables.Obsolete.
Text from the patch you just committed now replaces this item.
+ implicit partition constraint of the original partition. This might + change in future releases.Remove speculation.
Done. Also, a few other "currently"s I had added.
+ In some cases, one may want to add columns to partitions that are not + present in the parent table which is not possible to do with the above + method. For such cases, partitioning can be implemented using + inheritance (see <xref linkend="ddl-inherit">).Hmm, I bet that's not the only advantage. And it doesn't seem like
the way to lead.e.g.
While the built-in declarative partitioning is suitable for most
common use cases, there are some circumstances where a more flexible
approach may be useful. Partitioning can be implemented using table
inheritance, which allows for several features which are not supported
by declarative partitioning, such as:- Partitioning enforces a rule that all partitions must have exactly
the same set of columns as the parent, but table inheritance allows
children to have extra columns not present in the parent.- Table inheritance allows for multiple inheritance.
- Declarative partitioning only supports list and range partitioning,
whereas table inheritance allows data to be divided in a manner of the
user's choosing. (Note, however, that if constraint exclusion is
unable to prune partitions effectively, query performance will be very
poor.)- Some operations require a stronger lock when using declarative
partitioning than when using table inheritance. (list these)
Thanks, that's a lot better.
Attached updated patch.
Regards,
Amit
Attachments:
0001-Rewrite-sections-in-ddl.sgml-related-to-partitioning.patchtext/x-diff; name=0001-Rewrite-sections-in-ddl.sgml-related-to-partitioning.patchDownload
From 442d749315d545abac9e864fa42a1402a63d9f44 Mon Sep 17 00:00:00 2001
From: amit <amitlangote09@gmail.com>
Date: Fri, 3 Mar 2017 16:39:24 +0900
Subject: [PATCH] Rewrite sections in ddl.sgml related to partitioning
Merge sections Partitioned Tables and Partitioning into one section
called Table Partitioning and Related Solutions.
---
doc/src/sgml/ddl.sgml | 1473 ++++++++++++++++++++++++-------------------------
1 file changed, 722 insertions(+), 751 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index d1e915c11a..2cd75a9673 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2772,14 +2772,123 @@ VALUES ('Albany', NULL, NULL, 'NY');
</sect2>
</sect1>
- <sect1 id="ddl-partitioned-tables">
- <title>Partitioned Tables</title>
+ <sect1 id="ddl-partitioning">
+ <title>Table Partitioning and Related Solutions</title>
+
+ <indexterm>
+ <primary>partitioning</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>partitioning</secondary>
+ </indexterm>
<indexterm>
<primary>partitioned table</primary>
</indexterm>
<para>
+ <productname>PostgreSQL</productname> supports basic table
+ partitioning. This section describes why and how to implement
+ partitioning as part of your database design.
+ </para>
+
+ <sect2 id="ddl-partitioning-overview">
+ <title>Overview</title>
+
+ <para>
+ Partitioning refers to splitting what is logically one large table into
+ smaller physical pieces. Partitioning can provide several benefits:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Query performance can be improved dramatically in certain situations,
+ particularly when most of the heavily accessed rows of the table are in a
+ single partition or a small number of partitions. The partitioning
+ substitutes for leading columns of indexes, reducing index size and
+ making it more likely that the heavily-used parts of the indexes
+ fit in memory.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When queries or updates access a large percentage of a single
+ partition, performance can be improved by taking advantage
+ of sequential scan of that partition instead of using an
+ index and random access reads scattered across the whole table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Bulk loads and deletes can be accomplished by adding or removing
+ partitions, if that requirement is planned into the partitioning design.
+ Doing <command>ALTER TABLE DETACH PARTITION</> followed by
+ <command>DROP TABLE</> is far faster than a bulk operation. These
+ commands also entirely avoid the <command>VACUUM</command> overhead
+ caused by a bulk <command>DELETE</>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Seldom-used data can be migrated to cheaper and slower storage media.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ The benefits will normally be worthwhile only when a table would
+ otherwise be very large. The exact point at which a table will
+ benefit from partitioning depends on the application, although a
+ rule of thumb is that the size of the table should exceed the physical
+ memory of the database server.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> offers built-in support for the
+ following forms of partitioning:
+
+ <variablelist>
+ <varlistentry>
+ <term>Range Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned into <quote>ranges</quote> defined
+ by a key column or set of columns, with no overlap between
+ the ranges of values assigned to different partitions. For
+ example one might partition by date ranges, or by ranges of
+ identifiers for particular business objects.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>List Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned by explicitly listing which key values
+ appear in each partition.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ If your application needs to use other forms of partitioning not listed
+ above, alternative methods such as inheritance and
+ <literal>UNION ALL</literal> views can be used instead. Such methods
+ offer flexibility but do not have some of the performance benefits
+ of built-in declarative partitioning.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-declarative">
+ <title>Declarative Partitioning</title>
+
+ <para>
PostgreSQL offers a way to specify how to divide a table into pieces
called partitions. The table that is divided is referred to as a
<firstterm>partitioned table</firstterm>. The specification consists
@@ -2790,74 +2899,72 @@ VALUES ('Albany', NULL, NULL, 'NY');
<para>
All rows inserted into a partitioned table will be routed to one of the
<firstterm>partitions</firstterm> based on the value of the partition
- key. Each partition has a subset defined by its <firstterm>partition
- bounds</firstterm>. Currently supported partitioning methods include
- range and list, wherein each partition is assigned a range of keys or
- a list of keys, respectively.
+ key. Each partition has a subset of the data defined by its
+ <firstterm>partition bounds</firstterm>. Currently supported
+ partitioning methods include range and list, where each partition is
+ assigned a range of keys and a list of keys, respectively.
</para>
<para>
- Partitions may have their own indexes, constraints and default values,
- distinct from other partitions. Partitions do not inherit indexes from
- the partitioned table.
+ Partitions may themselves be defined as partitioned tables, referred to as
+ <firstterm>sub-partitioning</firstterm>. Partitions may have their own
+ indexes, constraints and default values, distinct from other partitions.
+ They do not inherit indexes from the partitioned table. See
+ <xref linkend="sql-createtable"> for more details on creating partitioned
+ tables and partitions.
</para>
<para>
- Partitions may themselves be defined as partitioned tables, referred to as
- <firstterm>sub-partitioning</firstterm>. See <xref linkend="sql-createtable">
- for more details creating partitioned tables and partitions. It is not
- currently possible to alter a regular table into a partitioned table or
- vice versa. However, it is possible to add a regular table containing
- data into a partition of a partitioned table, or remove a partition; see
- <xref linkend="sql-altertable"> to learn more about the
- <command>ATTACH PARTITION</> and <command>DETACH PARTITION</> sub-commands.
+ It is not possible to turn a regular table into a partitioned table or
+ vice versa. However, it is possible to add a regular or partitioned table
+ containing data as a partition of a partitioned table, or remove a
+ partition from a partitioned table turning it into a standalone table;
+ see <xref linkend="sql-altertable"> to learn more about the
+ <command>ATTACH PARTITION</> and <command>DETACH PARTITION</>
+ sub-commands.
</para>
<para>
Individual partitions are linked to the partitioned table with inheritance
- behind-the-scenes, however it is not possible to use some of the inheritance
- features discussed in the previous section with partitioned tables and
- partitions. For example, partitions cannot have any other parents than
- the partitioned table it is a partition of, nor can a regular table inherit
- from a partitioned table making the latter its parent. That means
- partitioned table and partitions do not participate in inheritance with
- regular tables. Since a partition hierarchy consisting of the
- partitioned table and its partitions is still an inheritance hierarchy,
- all the normal rules of inheritance apply as described in the previous
- section (<xref linkend="ddl-inherit">) with some exceptions, most notably:
+ behind-the-scenes, however it is not possible to use some of the
+ inheritance features discussed in the previous section with partitioned
+ tables and partitions. For example, a partition cannot have any parents
+ other than the partitioned table it is a partition of, nor can a regular
+ table inherit from a partitioned table making the latter its parent.
+ That means partitioned table and partitions do not participate in
+ inheritance with regular tables. Since a partition hierarchy consisting
+ of the partitioned table and its partitions is still an inheritance
+ hierarchy, all the normal rules of inheritance apply as described in
+ <xref linkend="ddl-inherit"> with some exceptions, most notably:
<itemizedlist>
<listitem>
<para>
Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
constraints of a partitioned table are always inherited by all its
- partitions. There cannot be any <literal>CHECK</literal> constraints
- that are marked <literal>NO INHERIT</literal>.
+ partitions. <literal>CHECK</literal> constraints that are marked
+ <literal>NO INHERIT</literal> are not allowed.
</para>
</listitem>
<listitem>
<para>
The <literal>ONLY</literal> notation used to exclude child tables
- would either cause error or will be ignored in some cases for
- partitioned tables. For example, specifying <literal>ONLY</literal>
- when querying data from a partitioned table would not make much sense,
- because all the data is contained in partitions, so this raises an
- error. Specifying <literal>ONLY</literal> when modifying schema is
- not desirable in certain cases with partitioned tables where it may be
- fine for regular inheritance parents (for example, dropping a column
- from only the parent); an error will be thrown in that case.
+ will cause an error for partitioned tables in the case of
+ schema-modifying commands such as most <literal>ALTER TABLE</literal>
+ commands. For example, dropping a column from only the parent does
+ not make sense for partitioned tables.
</para>
</listitem>
<listitem>
<para>
- Partitions cannot have columns that are not present in the parent.
- It is neither possible to specify columns when creating partitions
- with <command>CREATE TABLE</> nor is it possible to add columns to
- partitions using <command>ALTER TABLE</>. Tables may be added with
- <command>ALTER TABLE ... ATTACH PARTITION</> if their columns exactly
- match the parent, including oids.
+ Partitions cannot have columns that are not present in the parent. It
+ is neither possible to specify columns when creating partitions with
+ <command>CREATE TABLE</> nor is it possible to add columns to
+ partitions after-the-fact using <command>ALTER TABLE</>. Tables may be
+ added as a partition with <command>ALTER TABLE ... ATTACH PARTITION</>
+ only if their columns exactly match the parent, including oids.
</para>
</listitem>
@@ -2871,487 +2978,505 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
- Partitions can also be foreign tables (see <xref linkend="ddl-foreign-data">),
- although certain limitations exist currently in their usage. For example,
- data inserted into the partitioned table cannot be routed to foreign table
+ Partitions can also be foreign tables (see <xref linkend="sql-createforeigntable">),
+ although certain limitations exist in their usage. For example, data
+ inserted into the partitioned table is not routed to foreign table
partitions.
</para>
+ <sect3 id="ddl-partitioning-declarative-example">
+ <title>Example</title>
+
<para>
- There are currently the following limitations of using partitioned tables:
- <itemizedlist>
+ Suppose we are constructing a database for a large ice cream company.
+ The company measures peak temperatures every day as well as ice cream
+ sales in each region. Conceptually, we want a table like:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+);
+</programlisting>
+
+ We know that most queries will access just the last week's, month's or
+ quarter's data, since the main use of this table will be to prepare
+ online reports for management. To reduce the amount of old data that
+ needs to be stored, we decide to only keep the most recent 3 years
+ worth of data. At the beginning of each month we will remove the oldest
+ month's data. In this situation we can use partitioning to help us meet
+ all of our different requirements for the measurements table.
+ </para>
+
+ <para>
+ To use declarative partitioning in this case, use the following steps:
+
+ <orderedlist spacing="compact">
<listitem>
<para>
- It is currently not possible to add same set of indexes on all partitions
- automatically. Indexes must be added to each partition with separate
- commands.
+ Create <structname>measurement</structname> table as a partitioned
+ table by specifying the <literal>PARTITION BY</literal> clause, which
+ includes the partitioning method (<literal>RANGE</literal> in this
+ case) and the list of column(s) to use as the partition key.
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (logdate);
+</programlisting>
</para>
- </listitem>
- <listitem>
<para>
- It is currently not possible to define indexes on partitioned tables
- that include all rows from all partitions in one global index.
- Consequently, it is not possible to create constraints that are realized
- using an index such as <literal>UNIQUE</>.
+ You may decide to use multiple columns in the partition key for range
+ partitioning if it's known that each of the selected columns will
+ divide the incoming data using successively more granular partition
+ criteria. Whereas using fewer columns may lead to coarser-grained
+ partitioning causing each partition to accept bigger set of data than
+ might be desirable. A query accessing the partitioned table will have
+ to scan fewer partitions if the conditions involve some or all of these
+ columns. For example, consider a table range partitioned using columns
+ <structfield>lastname</> and <structfield>firstname</> (in that order)
+ as the partition key.
</para>
- </listitem>
- <listitem>
<para>
- Since primary keys are not supported on partitioned tables,
- foreign keys referencing partitioned tables are not supported, nor
- are foreign key references from a partitioned table to some other table.
+ To be able to insert data into this table, one must create partitions,
+ as described below.
</para>
</listitem>
<listitem>
<para>
- Row triggers, if necessary, must be defined on individual partitions, not
- the partitioned table as it is currently not supported.
+ Create partitions. Each partition's definition must specify the bounds
+ that correspond to the partitioning method and partition key of the
+ parent. Note that specifying bounds such that the new partition's
+ values will overlap with those in one or more existing partitions will
+ cause an error. Inserting data into the parent table that does not map
+ to one of the existing partitions will cause an error; appropriate
+ partition must be added manually.
</para>
- </listitem>
- </itemizedlist>
- </para>
-
- <para>
- A detailed example that shows how to use partitioned tables is discussed in
- the next chapter.
- </para>
-
- </sect1>
- <sect1 id="ddl-partitioning">
- <title>Partitioning</title>
-
- <indexterm>
- <primary>partitioning</primary>
- </indexterm>
-
- <indexterm>
- <primary>table</primary>
- <secondary>partitioning</secondary>
- </indexterm>
+ <para>
+ Partitions thus created are in every way normal <productname>PostgreSQL</>
+ tables (or, possibly, foreign tables). It is possible to specify
+ tablespace, storage parameters for each partition separately.
+ </para>
- <para>
- <productname>PostgreSQL</productname> supports basic table
- partitioning. This section describes why and how to implement
- partitioning as part of your database design.
- </para>
+ <para>
+ It is not necessary to create table constraints describing partition
+ boundary condition for partitions. Instead, partition constraints are
+ generated implicitly from the partition bound specification whenever
+ there is need to refer to them. Also, since any data inserted into the
+ parent table is automatically inserted into the appropriate partition,
+ it is not necessary to create triggers for the same.
- <sect2 id="ddl-partitioning-overview">
- <title>Overview</title>
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
- <para>
- Partitioning refers to splitting what is logically one large table
- into smaller physical pieces.
- Partitioning can provide several benefits:
- <itemizedlist>
- <listitem>
- <para>
- Query performance can be improved dramatically in certain situations,
- particularly when most of the heavily accessed rows of the table are in a
- single partition or a small number of partitions. The partitioning
- substitutes for leading columns of indexes, reducing index size and
- making it more likely that the heavily-used parts of the indexes
- fit in memory.
- </para>
- </listitem>
+CREATE TABLE measurement_y2006m03 PARTITION OF measurement
+ FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
- <listitem>
- <para>
- When queries or updates access a large percentage of a single
- partition, performance can be improved by taking advantage
- of sequential scan of that partition instead of using an
- index and random access reads scattered across the whole table.
- </para>
- </listitem>
+...
+CREATE TABLE measurement_y2007m11 PARTITION OF measurement
+ FOR VALUES FROM ('2007-11-01') TO ('2007-12-01')
- <listitem>
- <para>
- Bulk loads and deletes can be accomplished by adding or removing
- partitions, if that requirement is planned into the partitioning design.
- <command>ALTER TABLE NO INHERIT</> or <command>ALTER TABLE DETACH PARTITION</>
- and <command>DROP TABLE</> are both far faster than a bulk operation.
- These commands also entirely avoid the <command>VACUUM</command>
- overhead caused by a bulk <command>DELETE</>.
- </para>
- </listitem>
+CREATE TABLE measurement_y2007m12 PARTITION OF measurement
+ FOR VALUES FROM ('2007-12-01') TO ('2008-01-01')
+ TABLESPACE fasttablespace;
- <listitem>
- <para>
- Seldom-used data can be migrated to cheaper and slower storage media.
- </para>
- </listitem>
- </itemizedlist>
+CREATE TABLE measurement_y2008m01 PARTITION OF measurement
+ FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
+ TABLESPACE fasttablespace
+ WITH (parallel_workers = 4);
+</programlisting>
+ </para>
- The benefits will normally be worthwhile only when a table would
- otherwise be very large. The exact point at which a table will
- benefit from partitioning depends on the application, although a
- rule of thumb is that the size of the table should exceed the physical
- memory of the database server.
- </para>
+ <para>
+ To implement sub-partitioning, specify the
+ <literal>PARTITION BY</literal> clause in the commands used to create
+ individual partitions, for example:
- <para>
- Currently, <productname>PostgreSQL</productname> supports partitioning
- using two methods:
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+ PARTITION BY RANGE (peaktemp);
+</programlisting>
- <variablelist>
- <varlistentry>
- <term>Using Table Inheritance</term>
+ After creating partitions of <structname>measurement_y2006m02</>,
+ any data inserted into <structname>measurement</> that is mapped to
+ <structname>measurement_y2006m02</> (or data that is directly inserted
+ into <structname>measurement_y2006m02</>, provided it satisfies its
+ partition constraint) will be further redirected to one of its
+ partitions based on the <structfield>peaktemp</> column. Partition
+ key specified may overlap with the parent's partition key, although
+ care must be taken when specifying the bounds of a sub-partition
+ such that the set of data it accepts constitutes a subset of what
+ the partition's own bounds allows; the system does not try to check
+ if that's really the case.
+ </para>
+ </listitem>
- <listitem>
- <para>
- Each partition must be created as a child table of a single parent
- table. The parent table itself is normally empty; it exists just to
- represent the entire data set. You should be familiar with
- inheritance (see <xref linkend="ddl-inherit">) before attempting to
- set up partitioning with it. This was the only method to implement
- partitioning in older versions.
- </para>
- </listitem>
- </varlistentry>
+ <listitem>
+ <para>
+ Create an index on the key column(s), as well as any other indexes you
+ might want for every partition.
- <varlistentry>
- <term>Using Partitioned Tables</term>
+<programlisting>
+CREATE INDEX ON measurement_y2006m02 (logdate);
+CREATE INDEX ON measurement_y2006m03 (logdate);
+...
+CREATE INDEX ON measurement_y2007m11 (logdate);
+CREATE INDEX ON measurement_y2007m12 (logdate);
+CREATE INDEX ON measurement_y2008m01 (logdate);
+</programlisting>
+ </para>
+ </listitem>
<listitem>
<para>
- See last section for some general information:
- <xref linkend="ddl-partitioned-tables">
+ Ensure that the <xref linkend="guc-constraint-exclusion">
+ configuration parameter is not disabled in <filename>postgresql.conf</>.
+ If it is, queries will not be optimized as desired.
</para>
</listitem>
- </varlistentry>
- </variablelist>
+ </orderedlist>
</para>
<para>
- The following forms of partitioning can be implemented in
- <productname>PostgreSQL</productname> using either of the above mentioned
- methods, although the latter provides dedicated syntax for each:
-
- <variablelist>
- <varlistentry>
- <term>Range Partitioning</term>
-
- <listitem>
- <para>
- The table is partitioned into <quote>ranges</quote> defined
- by a key column or set of columns, with no overlap between
- the ranges of values assigned to different partitions. For
- example one might partition by date ranges, or by ranges of
- identifiers for particular business objects.
- </para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
- <term>List Partitioning</term>
-
- <listitem>
- <para>
- The table is partitioned by explicitly listing which key values
- appear in each partition.
- </para>
- </listitem>
- </varlistentry>
- </variablelist>
+ In the above example we would be creating a new partition each month, so
+ it might be wise to write a script that generates the required DDL
+ automatically.
</para>
- </sect2>
+ </sect3>
- <sect2 id="ddl-partitioning-implementation">
- <title>Implementing Partitioning</title>
+ <sect3 id="ddl-partitioning-declarative-maintenance">
+ <title>Partition Maintenance</title>
<para>
- To set up a partitioned table using inheritance, do the following:
- <orderedlist spacing="compact">
- <listitem>
- <para>
- Create the <quote>master</quote> table, from which all of the
- partitions will inherit.
- </para>
- <para>
- This table will contain no data. Do not define any check
- constraints on this table, unless you intend them to
- be applied equally to all partitions. There is no point
- in defining any indexes or unique constraints on it, either.
- </para>
- </listitem>
+ Normally the set of partitions established when initially defining the
+ the table are not intended to remain static. It is common to want to
+ remove old partitions of data and periodically add new partitions for
+ new data. One of the most important advantages of partitioning is
+ precisely that it allows this otherwise painful task to be executed
+ nearly instantaneously by manipulating the partition structure, rather
+ than physically moving large amounts of data around.
+ </para>
- <listitem>
- <para>
- Create several <quote>child</quote> tables that each inherit from
- the master table. Normally, these tables will not add any columns
- to the set inherited from the master.
- </para>
+ <para>
+ The simplest option for removing old data is simply to drop the partition
+ that is no longer necessary:
+<programlisting>
+DROP TABLE measurement_y2006m02;
+</programlisting>
+ This can very quickly delete millions of records because it doesn't have
+ to individually delete every record. Note however that the above command
+ requires taking an <literal>ACCESS EXCLUSIVE</literal> lock on the parent
+ table.
+ </para>
- <para>
- We will refer to the child tables as partitions, though they
- are in every way normal <productname>PostgreSQL</> tables
- (or, possibly, foreign tables).
- </para>
- </listitem>
+ <para>
+ Another option that is often preferable is to remove the partition from
+ the partitioned table but retain access to it as a table in its own
+ right:
- <listitem>
- <para>
- Add table constraints to the partition tables to define the
- allowed key values in each partition.
- </para>
+<programlisting>
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+</programlisting>
+
+ This allows further operations to be performed on the data before
+ it is dropped. For example, this is often a useful time to back up
+ the data using <command>COPY</>, <application>pg_dump</>, or
+ similar tools. It might also be a useful time to aggregate data
+ into smaller formats, perform other data manipulations, or run
+ reports.
+ </para>
+
+ <para>
+ Similarly we can add a new partition to handle new data. We can create an
+ empty partition in the partitioned table just as the original partitions
+ were created above:
- <para>
- Typical examples would be:
<programlisting>
-CHECK ( x = 1 )
-CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
-CHECK ( outletID >= 100 AND outletID < 200 )
+CREATE TABLE measurement_y2008m02 PARTITION OF measurement
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01')
+ TABLESPACE fasttablespace;
</programlisting>
- Ensure that the constraints guarantee that there is no overlap
- between the key values permitted in different partitions. A common
- mistake is to set up range constraints like:
+
+ As an alternative, it is sometimes more convenient to create the
+ new table outside the partition structure, and make it a proper
+ partition later. This allows the data to be loaded, checked, and
+ transformed prior to it appearing in the partitioned table:
+
<programlisting>
-CHECK ( outletID BETWEEN 100 AND 200 )
-CHECK ( outletID BETWEEN 200 AND 300 )
+CREATE TABLE measurement_y2008m02
+ (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
+ TABLESPACE fasttablespace;
+
+ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
+ CHECK ( logdate >= DATE '2008-02-01' AND logdate < DATE '2008-03-01' );
+
+\copy measurement_y2008m02 from 'measurement_y2008m02'
+-- possibly some other data preparation work
+
+ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
</programlisting>
- This is wrong since it is not clear which partition the key value
- 200 belongs in.
- </para>
+ </para>
- <para>
- Note that there is no difference in
- syntax between range and list partitioning; those terms are
- descriptive only.
- </para>
- </listitem>
+ <para>
+ Before running the <command>ATTACH PARTITION</> command, it is
+ recommended to create a <literal>CHECK</> constraint on the table to
+ be attached describing the desired partition constraint. Using the
+ same, system is able to skip the scan to validate the implicit
+ partition constraint. Without such a constraint, the table will be
+ scanned to validate the partition constraint while holding an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
+ One may then drop the constraint after <command>ATTACH PARTITION</>
+ is finished, because it is no longer necessary.
+ </para>
+ </sect3>
- <listitem>
- <para>
- For each partition, create an index on the key column(s),
- as well as any other indexes you might want. (The key index is
- not strictly necessary, but in most scenarios it is helpful.
- If you intend the key values to be unique then you should
- always create a unique or primary-key constraint for each
- partition.)
- </para>
- </listitem>
+ <sect3 id="ddl-partitioning-declarative-limitations">
+ <title>Limitations</title>
- <listitem>
- <para>
- Optionally, define a trigger or rule to redirect data inserted into
- the master table to the appropriate partition.
- </para>
- </listitem>
+ <para>
+ The following limitations apply to partitioned tables:
+ <itemizedlist>
+ <listitem>
+ <para>
+ It is not possible to add same set of indexes on all partitions
+ automatically. Indexes must be added to each partition with separate
+ commands. This also means that there is no way to create a primary
+ key, unique constraint, or exclusion constraint spanning all
+ partitions; it is only possible to constrain each leaf partition
+ individually.
+ </para>
+ </listitem>
- <listitem>
- <para>
- Ensure that the <xref linkend="guc-constraint-exclusion">
- configuration parameter is not disabled in
- <filename>postgresql.conf</>.
- If it is, queries will not be optimized as desired.
- </para>
- </listitem>
+ <listitem>
+ <para>
+ Since primary keys are not supported on partitioned tables, foreign
+ keys referencing partitioned tables are not supported, nor are foreign
+ key references from a partitioned table to some other table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Using the <literal>ON CONFLICT</literal> clause with partitioned tables
+ will cause an error if <literal>DO UPDATE</literal> is specified as the
+ alternative action, because unique or exclusion constraints can only be
+ created on individual partitions. There is no support for enforcing
+ uniqueness (or an exclusion constraint) across an entire partitioning
+ hierarchy.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ An <command>UPDATE</> that causes a row to move from one partition to
+ another fails, because the new value of the row fails to satisfy the
+ implicit partition constraint of the original partition.
+ </para>
+ </listitem>
- </orderedlist>
+ <listitem>
+ <para>
+ Row triggers, if necessary, must be defined on individual partitions,
+ not the partitioned table as it is not supported.
+ </para>
+ </listitem>
+ </itemizedlist>
</para>
+ </sect3>
+ </sect2>
+ <sect2 id="ddl-partitioning-implementation-inheritance">
+ <title>Implementation Using Inheritance</title>
<para>
- To use partitioned tables, do the following:
- <orderedlist spacing="compact">
+ While the built-in declarative partitioning is suitable for most
+ common use cases, there are some circumstances where a more flexible
+ approach may be useful. Partitioning can be implemented using table
+ inheritance, which allows for several features which are not supported
+ by declarative partitioning, such as:
+
+ <itemizedlist>
<listitem>
<para>
- Create <quote>master</quote> table as a partitioned table by
- specifying the <literal>PARTITION BY</literal> clause, which includes
- the partitioning method (<literal>RANGE</literal> or
- <literal>LIST</literal>) and the list of column(s) to use as the
- partition key. To be able to insert data into the table, one must
- create partitions, as described below.
+ Partitioning enforces a rule that all partitions must have exactly
+ the same set of columns as the parent, but table inheritance allows
+ children to have extra columns not present in the parent.
</para>
-
- <note>
- <para>
- To decide when to use multiple columns in the partition key for range
- partitioning, consider whether queries accessing the partitioned
- in question will include conditions that involve multiple columns,
- especially the columns being considered to be the partition key.
- If so, the optimizer can create a plan that will scan fewer partitions
- if a query's conditions are such that there is equality constraint on
- leading partition key columns, because they limit the number of
- partitions of interest. The first partition key column with
- inequality constraint also further eliminates some partitions of
- those chosen by equality constraints on earlier columns.
- </para>
- </note>
</listitem>
<listitem>
<para>
- Create partitions of the master partitioned table, with the partition
- bounds specified for each partition matching the partitioning method
- and partition key of the master table. Note that specifying partition
- bounds such that the new partition's values will overlap with one or
- more existing partitions will cause an error. It is only after
- creating partitions that one is able to insert data into the master
- partitioned table, provided it maps to one of the existing partitions.
- If a data row does not map to any of the existing partitions, it will
- cause an error.
- </para>
-
- <para>
- Partitions thus created are also in every way normal
- <productname>PostgreSQL</> tables (or, possibly, foreign tables),
- whereas partitioned tables differ in a number of ways.
- </para>
-
- <para>
- It is not necessary to create table constraints for partitions.
- Instead, partition constraints are generated implicitly whenever
- there is a need to refer to them. Also, since any data inserted into
- the master partitioned table is automatically inserted into the
- appropriate partition, it is not necessary to create triggers for the
- same.
+ Table inheritance allows for multiple inheritance.
</para>
</listitem>
<listitem>
<para>
- Just like with inheritance, create an index on the key column(s),
- as well as any other indexes you might want for every partition.
- Note that it is currently not supported to propagate index definition
- from the master partitioned table to its partitions; in fact, it is
- not possible to define indexes on partitioned tables in the first
- place. This might change in future releases.
+ Declarative partitioning only supports list and range partitioning,
+ whereas table inheritance allows data to be divided in a manner of
+ the user's choosing. (Note, however, that if constraint exclusion is
+ unable to prune partitions effectively, query performance will be very
+ poor.)
</para>
</listitem>
<listitem>
<para>
- Currently, partitioned tables also depend on constraint exclusion
- for query optimization, so ensure that the
- <xref linkend="guc-constraint-exclusion"> configuration parameter is
- not disabled in <filename>postgresql.conf</>. This might change in
- future releases.
+ Some operations require a stronger lock when using declarative
+ partitioning than when using table inheritance. For example, adding
+ or removing a partition to or from a partitioned table requires taking
+ an <literal>ACCESS EXCLUSIVE</literal> lock on the parent table,
+ whereas a <literal>SHARE UPDATE EXCLUSIVE</literal> lock is enough
+ in the case of regular inheritance.
</para>
</listitem>
-
- </orderedlist>
+ </itemizedlist>
</para>
- <para>
- For example, suppose we are constructing a database for a large
- ice cream company. The company measures peak temperatures every
- day as well as ice cream sales in each region. Conceptually,
- we want a table like:
-
-<programlisting>
-CREATE TABLE measurement (
- city_id int not null,
- logdate date not null,
- peaktemp int,
- unitsales int
-);
-</programlisting>
+ <sect3 id="ddl-partitioning-inheritance-example">
+ <title>Example</title>
- We know that most queries will access just the last week's, month's or
- quarter's data, since the main use of this table will be to prepare
- online reports for management.
- To reduce the amount of old data that needs to be stored, we
- decide to only keep the most recent 3 years worth of data. At the
- beginning of each month we will remove the oldest month's data.
- </para>
+ <para>
+ We use the same <structname>measurement</structname> table we used
+ above. To implement it as a partitioned table using inheritance, use
+ the following steps:
- <para>
- In this situation we can use partitioning to help us meet all of our
- different requirements for the measurements table. Following the
- steps outlined above for both methods, partitioning can be set up as
- follows:
- </para>
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create the <quote>master</quote> table, from which all of the
+ partitions will inherit. This table will contain no data. Do not
+ define any check constraints on this table, unless you intend them
+ to be applied equally to all partitions. There is no point in
+ defining any indexes or unique constraints on it, either. For our
+ example, master table is the <structname>measurement</structname>
+ table as originally defined.
+ </para>
+ </listitem>
- <para>
- <orderedlist spacing="compact">
- <listitem>
- <para>
- The master table is the <structname>measurement</> table, declared
- exactly as above.
- </para>
- </listitem>
+ <listitem>
+ <para>
+ Create several <quote>child</quote> tables that each inherit from
+ the master table. Normally, these tables will not add any columns
+ to the set inherited from the master.
+ </para>
- <listitem>
- <para>
- Next we create one partition for each active month:
+ <para>
+ We will refer to the child tables as partitions, though they are
+ in every way normal <productname>PostgreSQL</> tables (or, possibly,
+ foreign tables).
+ </para>
+ <para>
+ This solves one of our problems: deleting old data. Each
+ month, all we will need to do is perform a <command>DROP
+ TABLE</command> on the oldest child table and create a new
+ child table for the new month's data.
<programlisting>
-CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
+CREATE TABLE measurement_y2006m02 () INHERITS (measurement);
+CREATE TABLE measurement_y2006m03 () INHERITS (measurement);
...
-CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
-CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
+CREATE TABLE measurement_y2007m11 () INHERITS (measurement);
+CREATE TABLE measurement_y2007m12 () INHERITS (measurement);
+CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
</programlisting>
+ </para>
+ </listitem>
- Each of the partitions are complete tables in their own right,
- but they inherit their definitions from the
- <structname>measurement</> table.
- </para>
+ <listitem>
+ <para>
+ Add non-overlapping table constraints to the partition tables to
+ define the allowed key values in each partition.
+ </para>
- <para>
- This solves one of our problems: deleting old data. Each
- month, all we will need to do is perform a <command>DROP
- TABLE</command> on the oldest child table and create a new
- child table for the new month's data.
- </para>
- </listitem>
+ <para>
+ Typical examples would be:
+<programlisting>
+CHECK ( x = 1 )
+CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
+CHECK ( outletID >= 100 AND outletID < 200 )
+</programlisting>
+ Ensure that the constraints guarantee that there is no overlap
+ between the key values permitted in different partitions. A common
+ mistake is to set up range constraints like:
+<programlisting>
+CHECK ( outletID BETWEEN 100 AND 200 )
+CHECK ( outletID BETWEEN 200 AND 300 )
+</programlisting>
+ This is wrong since it is not clear which partition the key value
+ 200 belongs in.
+ </para>
- <listitem>
- <para>
- We must provide non-overlapping table constraints. Rather than
- just creating the partition tables as above, the table creation
- script should really be:
+ <para>
+ It would be better to instead create partitions as follows:
<programlisting>
CREATE TABLE measurement_y2006m02 (
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2006m03 (
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
) INHERITS (measurement);
+
...
CREATE TABLE measurement_y2007m11 (
CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2007m12 (
CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
) INHERITS (measurement);
+
CREATE TABLE measurement_y2008m01 (
CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
) INHERITS (measurement);
</programlisting>
- </para>
- </listitem>
+ </para>
- <listitem>
- <para>
- We probably need indexes on the key columns too:
+ <para>
+ Note that there is no difference in syntax between range and list
+ partitioning; those terms are descriptive only.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ For each partition, create an index on the key column(s),
+ as well as any other indexes you might want. (The key index is
+ not strictly necessary, but in most scenarios it is helpful.
+ If you intend the key values to be unique then you should
+ always create a unique or primary-key constraint for each
+ partition.)
<programlisting>
CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
-...
CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
</programlisting>
+ </para>
+ </listitem>
- We choose not to add further indexes at this time.
- </para>
- </listitem>
-
- <listitem>
- <para>
- We want our application to be able to say <literal>INSERT INTO
- measurement ...</> and have the data be redirected into the
- appropriate partition table. We can arrange that by attaching
- a suitable trigger function to the master table.
- If data will be added only to the latest partition, we can
- use a very simple trigger function:
+ <listitem>
+ <para>
+ We want our application to be able to say <literal>INSERT INTO
+ measurement ...</> and have the data be redirected into the
+ appropriate partition table. We can arrange that by attaching
+ a suitable trigger function to the master table.
+ If data will be added only to the latest partition, we can
+ use a very simple trigger function:
<programlisting>
CREATE OR REPLACE FUNCTION measurement_insert_trigger()
@@ -3363,9 +3488,11 @@ END;
$$
LANGUAGE plpgsql;
</programlisting>
+ </para>
- After creating the function, we create a trigger which
- calls the trigger function:
+ <para>
+ After creating the function, we create a trigger which
+ calls the trigger function:
<programlisting>
CREATE TRIGGER insert_measurement_trigger
@@ -3373,15 +3500,15 @@ CREATE TRIGGER insert_measurement_trigger
FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
</programlisting>
- We must redefine the trigger function each month so that it always
- points to the current partition. The trigger definition does
- not need to be updated, however.
- </para>
+ We must redefine the trigger function each month so that it always
+ points to the current partition. The trigger definition does
+ not need to be updated, however.
+ </para>
- <para>
- We might want to insert data and have the server automatically
- locate the partition into which the row should be added. We
- could do this with a more complex trigger function, for example:
+ <para>
+ We might want to insert data and have the server automatically
+ locate the partition into which the row should be added. We
+ could do this with a more complex trigger function, for example:
<programlisting>
CREATE OR REPLACE FUNCTION measurement_insert_trigger()
@@ -3393,183 +3520,120 @@ BEGIN
ELSIF ( NEW.logdate >= DATE '2006-03-01' AND
NEW.logdate < DATE '2006-04-01' ) THEN
INSERT INTO measurement_y2006m03 VALUES (NEW.*);
- ...
- ELSIF ( NEW.logdate >= DATE '2008-01-01' AND
- NEW.logdate < DATE '2008-02-01' ) THEN
- INSERT INTO measurement_y2008m01 VALUES (NEW.*);
- ELSE
- RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
- END IF;
- RETURN NULL;
-END;
-$$
-LANGUAGE plpgsql;
-</programlisting>
-
- The trigger definition is the same as before.
- Note that each <literal>IF</literal> test must exactly match the
- <literal>CHECK</literal> constraint for its partition.
- </para>
-
- <para>
- While this function is more complex than the single-month case,
- it doesn't need to be updated as often, since branches can be
- added in advance of being needed.
- </para>
-
- <note>
- <para>
- In practice it might be best to check the newest partition first,
- if most inserts go into that partition. For simplicity we have
- shown the trigger's tests in the same order as in other parts
- of this example.
- </para>
- </note>
- </listitem>
- </orderedlist>
- </para>
-
- <para>
- Steps when using a partitioned table are as follows:
- </para>
-
- <para>
- <orderedlist spacing="compact">
- <listitem>
- <para>
- Create the <structname>measurement</> table as a partitioned table:
-
-<programlisting>
-CREATE TABLE measurement (
- city_id int not null,
- logdate date not null,
- peaktemp int,
- unitsales int
-) PARTITION BY RANGE (logdate);
+ ...
+ ELSIF ( NEW.logdate >= DATE '2008-01-01' AND
+ NEW.logdate < DATE '2008-02-01' ) THEN
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
+ ELSE
+ RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
+ END IF;
+ RETURN NULL;
+END;
+$$
+LANGUAGE plpgsql;
</programlisting>
- </para>
- </listitem>
- <listitem>
- <para>
- Then create partitions as follows:
+ The trigger definition is the same as before.
+ Note that each <literal>IF</literal> test must exactly match the
+ <literal>CHECK</literal> constraint for its partition.
+ </para>
-<programlisting>
-CREATE TABLE measurement_y2006m02 PARTITION OF measurement
- FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
-CREATE TABLE measurement_y2006m03 PARTITION OF measurement
- FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
-...
-CREATE TABLE measurement_y2007m11 PARTITION OF measurement
- FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
-CREATE TABLE measurement_y2007m12 PARTITION OF measurement
- FOR VALUES FROM ('2007-12-01') TO ('2008-01-01');
-CREATE TABLE measurement_y2008m01 PARTITION OF measurement
- FOR VALUES FROM ('2008-01-01') TO ('2008-02-01');
-</programlisting>
- </para>
- </listitem>
+ <para>
+ While this function is more complex than the single-month case,
+ it doesn't need to be updated as often, since branches can be
+ added in advance of being needed.
+ </para>
- <listitem>
- <para>
- Create indexes on the key columns just like in case of inheritance
- partitions.
- </para>
- </listitem>
- </orderedlist>
+ <note>
+ <para>
+ In practice it might be best to check the newest partition first,
+ if most inserts go into that partition. For simplicity we have
+ shown the trigger's tests in the same order as in other parts
+ of this example.
+ </para>
+ </note>
- <note>
- <para>
- To implement sub-partitioning, specify the
- <literal>PARTITION BY</literal> clause in the commands used to create
- individual partitions, for example:
+ <para>
+ A different approach to redirecting inserts into the appropriate
+ partition table is to set up rules, instead of a trigger, on the
+ master table. For example:
<programlisting>
-CREATE TABLE measurement_y2006m02 PARTITION OF measurement
- FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
- PARTITION BY RANGE (peaktemp);
+CREATE RULE measurement_insert_y2006m02 AS
+ON INSERT TO measurement WHERE
+ ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2006m02 VALUES (NEW.*);
+...
+CREATE RULE measurement_insert_y2008m01 AS
+ON INSERT TO measurement WHERE
+ ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
</programlisting>
- After creating partitions of <structname>measurement_y2006m02</>, any
- data inserted into <structname>measurement</> that is mapped to
- <structname>measurement_y2006m02</> will be further redirected to one
- of its partitions based on the <structfield>peaktemp</> column.
- Partition key specified may overlap with the parent's partition key,
- although care must be taken when specifying the bounds of sub-partitions
- such that the accepted set of data constitutes a subset of what a
- partition's own bounds allows; the system does not try to check if
- that's really the case.
- </para>
- </note>
- </para>
-
- <para>
- As we can see, a complex partitioning scheme could require a
- substantial amount of DDL, although significantly less when using
- partitioned tables. In the above example we would be creating a new
- partition each month, so it might be wise to write a script that
- generates the required DDL automatically.
- </para>
+ A rule has significantly more overhead than a trigger, but the
+ overhead is paid once per query rather than once per row, so this
+ method might be advantageous for bulk-insert situations. In most
+ cases, however, the trigger method will offer better performance.
+ </para>
- </sect2>
+ <para>
+ Be aware that <command>COPY</> ignores rules. If you want to
+ use <command>COPY</> to insert data, you'll need to copy into the
+ correct partition table rather than into the master. <command>COPY</>
+ does fire triggers, so you can use it normally if you use the trigger
+ approach.
+ </para>
- <sect2 id="ddl-partitioning-managing-partitions">
- <title>Managing Partitions</title>
+ <para>
+ Another disadvantage of the rule approach is that there is no simple
+ way to force an error if the set of rules doesn't cover the insertion
+ date; the data will silently go into the master table instead.
+ </para>
+ </listitem>
- <para>
- Normally the set of partitions established when initially
- defining the table are not intended to remain static. It is
- common to want to remove old partitions of data and periodically
- add new partitions for new data. One of the most important
- advantages of partitioning is precisely that it allows this
- otherwise painful task to be executed nearly instantaneously by
- manipulating the partition structure, rather than physically moving large
- amounts of data around.
- </para>
+ <listitem>
+ <para>
+ Ensure that the <xref linkend="guc-constraint-exclusion">
+ configuration parameter is not disabled in
+ <filename>postgresql.conf</>.
+ If it is, queries will not be optimized as desired.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
- <para>
- Both the inheritance-based and partitioned table methods allow this to
- be done, although the latter requires taking an <literal>ACCESS EXCLUSIVE</literal>
- lock on the master table for various commands mentioned below.
- </para>
+ <para>
+ As we can see, a complex partitioning scheme could require a
+ substantial amount of DDL. In the above example we would be creating
+ a new partition each month, so it might be wise to write a script that
+ generates the required DDL automatically.
+ </para>
+ </sect3>
- <para>
- The simplest option for removing old data is simply to drop the partition
- that is no longer necessary, which works using both methods of
- partitioning:
+ <sect3 id="ddl-partitioning-inheritance-maintenance">
+ <title>Partition Maintenance</title>
+ <para>
+ To remove old data quickly, simply to drop the partition that is no
+ longer necessary:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
- This can very quickly delete millions of records because it doesn't have
- to individually delete every record.
- </para>
-
- <para>
- Another option that is often preferable is to remove the partition from
- the partitioned table but retain access to it as a table in its own
- right:
-<programlisting>
-ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
-</programlisting>
+ </para>
- When using a partitioned table:
+ <para>
+ To remove the partition from the partitioned table but retain access to
+ it as a table in its own right:
<programlisting>
-ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</programlisting>
+ </para>
- This allows further operations to be performed on the data before
- it is dropped. For example, this is often a useful time to back up
- the data using <command>COPY</>, <application>pg_dump</>, or
- similar tools. It might also be a useful time to aggregate data
- into smaller formats, perform other data manipulations, or run
- reports.
- </para>
-
- <para>
- Similarly we can add a new partition to handle new data. We can create an
- empty partition in the partitioned table just as the original partitions
- were created above:
+ <para>
+ To add a new partition to handle new data, create an empty partition
+ just as the original partitions were created above:
<programlisting>
CREATE TABLE measurement_y2008m02 (
@@ -3577,17 +3641,9 @@ CREATE TABLE measurement_y2008m02 (
) INHERITS (measurement);
</programlisting>
- When using a partitioned table:
-
-<programlisting>
-CREATE TABLE measurement_y2008m02 PARTITION OF measurement
- FOR VALUES FROM ('2008-02-01') TO ('2008-03-01');
-</programlisting>
-
- As an alternative, it is sometimes more convenient to create the
- new table outside the partition structure, and make it a proper
- partition later. This allows the data to be loaded, checked, and
- transformed prior to it appearing in the partitioned table:
+ Alternatively, one may want to create the new table outside the partition
+ structure, and make it a partition after the data is loaded, checked,
+ and transformed.
<programlisting>
CREATE TABLE measurement_y2008m02
@@ -3598,31 +3654,64 @@ ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
-- possibly some other data preparation work
ALTER TABLE measurement_y2008m02 INHERIT measurement;
</programlisting>
+ </para>
+ </sect3>
- The last of the above commands when using a partitioned table would be:
+ <sect3 id="ddl-partitioning-inheritance-caveats">
+ <title>Caveats</title>
+ <para>
+ The following caveats apply to partitioned tables implemented using
+ inheritance:
+ <itemizedlist>
+ <listitem>
+ <para>
+ There is no automatic way to verify that all of the
+ <literal>CHECK</literal> constraints are mutually
+ exclusive. It is safer to create code that generates
+ partitions and creates and/or modifies associated objects than
+ to write each by hand.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The schemes shown here assume that the partition key column(s)
+ of a row never change, or at least do not change enough to require
+ it to move to another partition. An <command>UPDATE</> that attempts
+ to do that will fail because of the <literal>CHECK</> constraints.
+ If you need to handle such cases, you can put suitable update triggers
+ on the partition tables, but it makes management of the structure
+ much more complicated.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ If you are using manual <command>VACUUM</command> or
+ <command>ANALYZE</command> commands, don't forget that
+ you need to run them on each partition individually. A command like:
<programlisting>
-ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
- FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+ANALYZE measurement;
</programlisting>
- </para>
+ will only process the master table.
+ </para>
+ </listitem>
- <tip>
- <para>
- Before running the <command>ATTACH PARTITION</> command, it is
- recommended to create a <literal>CHECK</> constraint on the table to
- be attached describing the desired partition constraint. Using the
- same, system is able to skip the scan to validate the implicit
- partition constraint. Without such a constraint, the table will be
- scanned to validate the partition constraint, while holding an
- <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
- One may want to drop the constraint after <command>ATTACH PARTITION</>
- is finished, because it is no longer necessary.
- </para>
- </tip>
- </sect2>
+ <listitem>
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</>
+ clauses are unlikely to work as expected, as the <literal>ON CONFLICT</>
+ action is only taken in case of unique violations on the specified
+ target relation, not its child relations.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
+ </sect2>
- <sect2 id="ddl-partitioning-constraint-exclusion">
+ <sect2 id="ddl-partitioning-constraint-exclusion">
<title>Partitioning and Constraint Exclusion</title>
<indexterm>
@@ -3632,7 +3721,8 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
<para>
<firstterm>Constraint exclusion</> is a query optimization technique
that improves performance for partitioned tables defined in the
- fashion described above. As an example:
+ fashion described above (both declarative partitioned tables and those
+ implemented using inheritance). As an example:
<programlisting>
SET constraint_exclusion = on;
@@ -3715,160 +3805,15 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate >= DATE '2008-01-01';
are unlikely to benefit.
</para>
- <note>
- <para>
- Currently, constraint exclusion is also used for partitioned tables.
- However, we did not create any <literal>CHECK</literal> constraints
- for individual partitions as seen above. In this case, the optimizer
- uses internally generated constraint for every partition.
- </para>
- </note>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-alternatives">
- <title>Alternative Partitioning Methods</title>
-
- <para>
- A different approach to redirecting inserts into the appropriate
- partition table is to set up rules, instead of a trigger, on the
- master table (unless it is a partitioned table). For example:
-
-<programlisting>
-CREATE RULE measurement_insert_y2006m02 AS
-ON INSERT TO measurement WHERE
- ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
-DO INSTEAD
- INSERT INTO measurement_y2006m02 VALUES (NEW.*);
-...
-CREATE RULE measurement_insert_y2008m01 AS
-ON INSERT TO measurement WHERE
- ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
-DO INSTEAD
- INSERT INTO measurement_y2008m01 VALUES (NEW.*);
-</programlisting>
-
- A rule has significantly more overhead than a trigger, but the overhead
- is paid once per query rather than once per row, so this method might be
- advantageous for bulk-insert situations. In most cases, however, the
- trigger method will offer better performance.
- </para>
-
- <para>
- Be aware that <command>COPY</> ignores rules. If you want to
- use <command>COPY</> to insert data, you'll need to copy into the correct
- partition table rather than into the master. <command>COPY</> does fire
- triggers, so you can use it normally if you use the trigger approach.
- </para>
-
- <para>
- Another disadvantage of the rule approach is that there is no simple
- way to force an error if the set of rules doesn't cover the insertion
- date; the data will silently go into the master table instead.
- </para>
-
- <para>
- Partitioning can also be arranged using a <literal>UNION ALL</literal>
- view, instead of table inheritance. For example,
-
-<programlisting>
-CREATE VIEW measurement AS
- SELECT * FROM measurement_y2006m02
-UNION ALL SELECT * FROM measurement_y2006m03
-...
-UNION ALL SELECT * FROM measurement_y2007m11
-UNION ALL SELECT * FROM measurement_y2007m12
-UNION ALL SELECT * FROM measurement_y2008m01;
-</programlisting>
-
- However, the need to recreate the view adds an extra step to adding and
- dropping individual partitions of the data set. In practice this
- method has little to recommend it compared to using inheritance.
- </para>
-
- </sect2>
-
- <sect2 id="ddl-partitioning-caveats">
- <title>Caveats</title>
-
- <para>
- The following caveats apply to using inheritance to implement partitioning:
- <itemizedlist>
- <listitem>
- <para>
- There is no automatic way to verify that all of the
- <literal>CHECK</literal> constraints are mutually
- exclusive. It is safer to create code that generates
- partitions and creates and/or modifies associated objects than
- to write each by hand.
- </para>
- </listitem>
-
- <listitem>
- <para>
- The schemes shown here assume that the partition key column(s)
- of a row never change, or at least do not change enough to require
- it to move to another partition. An <command>UPDATE</> that attempts
- to do that will fail because of the <literal>CHECK</> constraints.
- If you need to handle such cases, you can put suitable update triggers
- on the partition tables, but it makes management of the structure
- much more complicated.
- </para>
- </listitem>
-
- <listitem>
- <para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each partition individually. A command like:
-<programlisting>
-ANALYZE measurement;
-</programlisting>
- will only process the master table.
- </para>
- </listitem>
-
- <listitem>
- <para>
- <command>INSERT</command> statements with <literal>ON CONFLICT</>
- clauses are unlikely to work as expected, as the <literal>ON CONFLICT</>
- action is only taken in case of unique violations on the specified
- target relation, not its child relations.
- </para>
- </listitem>
- </itemizedlist>
- </para>
-
<para>
- The following caveats apply to partitioned tables created with the
- explicit syntax:
- <itemizedlist>
- <listitem>
- <para>
- An <command>UPDATE</> that causes a row to move from one partition to
- another fails, because the new value of the row fails to satisfy the
- implicit partition constraint of the original partition. This might
- change in future releases.
- </para>
- </listitem>
-
- <listitem>
- <para>
- Using the <literal>ON CONFLICT</literal> clause with partitioned tables
- will cause an error if <literal>DO UPDATE</literal> is specified as the
- alternative action, because unique or exclusion constraints can only be
- created on individual partitions. There is no support for enforcing
- uniqueness (or an exclusion constraint) across an entire partitioning
- hierarchy.
- </para>
- </listitem>
-
- </itemizedlist>
+ Constraint exclusion is also used for declarative partitioning, however
+ it is not required to create <literal>CHECK</literal> constraints for
+ individual partitions as when using table inheritance.
</para>
<para>
- The following caveats apply to constraint exclusion, which is currently
- used by both inheritance and partitioned tables:
+ The following caveats apply to constraint exclusion, which is used by
+ both inheritance and partitioned tables:
<itemizedlist>
<listitem>
@@ -3909,6 +3854,32 @@ ANALYZE measurement;
</itemizedlist>
</para>
</sect2>
+
+ <sect2 id="ddl-partitioning-alternatives">
+ <title>Alternative Partitioning Methods</title>
+
+ <sect3 id="ddl-partitioning-alternatives-union-all">
+ <title>Using UNION ALL view</title>
+ <para>
+ Partitioning can also be arranged using a <literal>UNION ALL</literal>
+ view, instead of table inheritance. For example,
+
+<programlisting>
+CREATE VIEW measurement AS
+ SELECT * FROM measurement_y2006m02
+UNION ALL SELECT * FROM measurement_y2006m03
+...
+UNION ALL SELECT * FROM measurement_y2007m11
+UNION ALL SELECT * FROM measurement_y2007m12
+UNION ALL SELECT * FROM measurement_y2008m01;
+</programlisting>
+
+ However, the need to recreate the view adds an extra step to adding and
+ dropping individual partitions of the data set. In practice this
+ method has little to recommend it compared to using inheritance.
+ </para>
+ </sect3>
+ </sect2>
</sect1>
<sect1 id="ddl-foreign-data">
--
2.11.0
On Tue, Mar 28, 2017 at 6:35 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
Attached updated patch.
Committed with editing here and there. I just left out the thing
about UNION ALL views, which seemed to brief a treatment to deserve
its own subsection. Maybe a longer explanation of that is worthwhile
or maybe it's not, but that can be a separate patch.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2017/04/01 6:37, Robert Haas wrote:
On Tue, Mar 28, 2017 at 6:35 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:Attached updated patch.
Committed with editing here and there. I just left out the thing
about UNION ALL views, which seemed to brief a treatment to deserve
its own subsection. Maybe a longer explanation of that is worthwhile
or maybe it's not, but that can be a separate patch.
Thanks for committing.
I noticed what looks like a redundant line (or an incomplete sentence) in
the paragraph about multi-column partition keys. In the following text:
+ partitioning, if desired. Of course, this will often result in a
larger
+ number of partitions, each of which is individually smaller.
+ criteria. Using fewer columns may lead to coarser-grained
+ A query accessing the partitioned table will have
+ to scan fewer partitions if the conditions involve some or all of
these
This:
+ criteria. Using fewer columns may lead to coarser-grained
looks redundant. But maybe we can keep that sentence by completing it,
which the attached patch does as follows:
+ number of partitions, each of which is individually smaller. On the
+ other hand, using fewer columns may lead to a coarser-grained
+ partitioning criteria with smaller number of partitions.
The patch also fixes some typos I noticed.
Thanks,
Amit
Attachments:
doc-ddl-partitioning-fixes.patchtext/x-diff; name=doc-ddl-partitioning-fixes.patchDownload
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 5109778196..340c961b3f 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2932,7 +2932,7 @@ VALUES ('Albany', NULL, NULL, 'NY');
tables and partitions. For example, a partition cannot have any parents
other than the partitioned table it is a partition of, nor can a regular
table inherit from a partitioned table making the latter its parent.
- That means partitioned table and partitions do not participate in
+ That means partitioned tables and partitions do not participate in
inheritance with regular tables. Since a partition hierarchy consisting
of the partitioned table and its partitions is still an inheritance
hierarchy, all the normal rules of inheritance apply as described in
@@ -3036,11 +3036,12 @@ CREATE TABLE measurement (
<para>
You may decide to use multiple columns in the partition key for range
partitioning, if desired. Of course, this will often result in a larger
- number of partitions, each of which is individually smaller.
- criteria. Using fewer columns may lead to coarser-grained
- A query accessing the partitioned table will have
- to scan fewer partitions if the conditions involve some or all of these
- columns. For example, consider a table range partitioned using columns
+ number of partitions, each of which is individually smaller. On the
+ other hand, using fewer columns may lead to a coarser-grained
+ partitioning criteria with smaller number of partitions. A query
+ accessing the partitioned table will have to scan fewer partitions if
+ the conditions involve some or all of these columns.
+ For example, consider a table range partitioned using columns
<structfield>lastname</> and <structfield>firstname</> (in that order)
as the partition key.
</para>
@@ -3167,8 +3168,8 @@ CREATE INDEX ON measurement_y2008m01 (logdate);
</para>
<para>
- The simplest option for removing old data is simply to drop the partition
- that is no longer necessary:
+ The simplest option for removing old data is to drop the partition that
+ is no longer necessary:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
@@ -3595,8 +3596,8 @@ DO INSTEAD
<sect3 id="ddl-partitioning-inheritance-maintenance">
<title>Partition Maintenance</title>
<para>
- To remove old data quickly, simply to drop the partition that is no
- longer necessary:
+ To remove old data quickly, simply drop the partition that is no longer
+ necessary:
<programlisting>
DROP TABLE measurement_y2006m02;
</programlisting>
@@ -3692,7 +3693,7 @@ ANALYZE measurement;
Triggers or rules will be needed to route rows to the desired
partition, unless the application is explicitly aware of the
partitioning scheme. Triggers may be complicated to write, and will
- be much slower than the tuple routing performed interally by
+ be much slower than the tuple routing performed internally by
declarative partitioning.
</para>
</listitem>
On Mon, Apr 3, 2017 at 12:52 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
I noticed what looks like a redundant line (or an incomplete sentence) in
the paragraph about multi-column partition keys. In the following text:+ partitioning, if desired. Of course, this will often result in a larger + number of partitions, each of which is individually smaller. + criteria. Using fewer columns may lead to coarser-grained + A query accessing the partitioned table will have + to scan fewer partitions if the conditions involve some or all of theseThis:
+ criteria. Using fewer columns may lead to coarser-grained
looks redundant. But maybe we can keep that sentence by completing it,
which the attached patch does as follows:+ number of partitions, each of which is individually smaller. On the + other hand, using fewer columns may lead to a coarser-grained + partitioning criteria with smaller number of partitions.The patch also fixes some typos I noticed.
Thanks for the corrections. Committed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers