Partitioned Table Index Column Order
All,
I'm on PostgreSQL 13 and have a partitioned table with a primary key.
create table t( a integer, b integer, c varchar, d .. ) partitioned by
range( a );
As a best practice is it better to create the primary key starting or
ending with the partition column?
e.g.
1) t_pkey primary key (a, b, c)
or
2) t_pkey primary key (b, c, a)
Neither the PostgreSQL documentation nor Google have an answer - at least
as far as I could find. I see examples in the Postgres documentation that
use 2) but no explanation of why that choice was made.
Does it even make a difference?
Thoughts?
--
Rumpi Gravenstein
On 2021-Jun-23, Rumpi Gravenstein wrote:
As a best practice is it better to create the primary key starting or
ending with the partition column?
It is not relevant from the partitioning point of view. Other factors
can be used to decide the column order.
--
�lvaro Herrera Valdivia, Chile
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
On 2021-Jun-23, Rumpi Gravenstein wrote:
As a best practice is it better to create the primary key starting or
ending with the partition column?
It is not relevant from the partitioning point of view. Other factors
can be used to decide the column order.
See in particular the hints in
https://www.postgresql.org/docs/current/indexes-multicolumn.html
The only thing that's different about partitioned situations is that
any particular child partition might have only a few values of the
partitioning column, which'd suggest putting it last if there are
no other relevant considerations. However, if you need a particular
column order to match query requirements, that's certainly going
to be a more important consideration.
regards, tom lane
On Thu, 24 Jun 2021 at 10:55, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jun-23, Rumpi Gravenstein wrote:
As a best practice is it better to create the primary key starting or
ending with the partition column?It is not relevant from the partitioning point of view. Other factors
can be used to decide the column order.
I'm not so sure that's really 100% true. There is at least one
partitioning feature that will work when the partitioning column is
first and won't when it's not.
Ordered partition scans work with RANGE and LIST partitioning:
create table ab (a int, b int, primary key(a,b)) partition by range(a);
create table ab1 partition of ab for values from (0) to (10);
create table ab2 partition of ab for values from (10) to (20);
explain (costs off) select * from ab order by a;
QUERY PLAN
--------------------------------------------------
Append
-> Index Only Scan using ab1_pkey on ab1 ab_1
-> Index Only Scan using ab2_pkey on ab2 ab_2
Reverse the order and you get:
QUERY PLAN
----------------------------------
Sort
Sort Key: ab.a
-> Append
-> Seq Scan on ab1 ab_1
-> Seq Scan on ab2 ab_2
David
David Rowley <dgrowleyml@gmail.com> writes:
On Thu, 24 Jun 2021 at 10:55, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
It is not relevant from the partitioning point of view. Other factors
can be used to decide the column order.
I'm not so sure that's really 100% true. There is at least one
partitioning feature that will work when the partitioning column is
first and won't when it's not.
Ordered partition scans work with RANGE and LIST partitioning:
Sure, but is that any different from the behavior with unpartitioned
tables? You have to make the index column order agree with the
ORDER BY you want to use, in either case.
regards, tom lane
On Thu, 24 Jun 2021 at 11:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
On Thu, 24 Jun 2021 at 10:55, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
It is not relevant from the partitioning point of view. Other factors
can be used to decide the column order.I'm not so sure that's really 100% true. There is at least one
partitioning feature that will work when the partitioning column is
first and won't when it's not.
Ordered partition scans work with RANGE and LIST partitioning:Sure, but is that any different from the behavior with unpartitioned
tables? You have to make the index column order agree with the
ORDER BY you want to use, in either case.
The reason I mentioned it is that the performance of the ordered
partitioned scans pretty good. If the application does ORDER BY a,b
just as often as it does ORDER BY b,a and you just get to pick 1
index, then it's better to have the index with the partitioned key
first. At least one of the queries can get away without doing a Sort
that way. If you have the partition key last in the index then both
queries need to sort... You could fix that by adding a 2nd index, but
that's not always practical, so it seems worth a mention, at least to
me.
David