Declarative Range Partitioning Postgres 11

Started by Shatamjeev Dewanover 6 years ago12 messagesgeneral
Jump to latest
#1Shatamjeev Dewan
sdewan@nbsps.com

Hi,

I am trying to create a table in postgres 11 with timestamp column as a partition key using PARTITION BY RANGE (create_dtt). The table definition has also an id column which is a primary key.

ERROR: insufficient columns in PRIMARY KEY constraint definition
DETAIL: PRIMARY KEY constraint on table "audit_p" lacks column "create_dtt" which is part of the partition key.

I don't want partition key column : create_dtt to part of composite primary key. Is there any way I can create range partition on date column without including as part of primary key .

Please advise.

Thanks,
Shatamjeev

#2Michael Lewis
mlewis@entrata.com
In reply to: Shatamjeev Dewan (#1)
Re: Declarative Range Partitioning Postgres 11

No, what you want is not possible and probably won't ever be I would
expect. Scanning every partition to validate the primary key isn't scalable.

#3Ron
ronljohnsonjr@gmail.com
In reply to: Michael Lewis (#2)
Re: Declarative Range Partitioning Postgres 11

On 10/7/19 6:17 PM, Michael Lewis wrote:

No, what you want is not possible and probably won't ever be I would expect.

Sure it is.  Maybe not the (weird) way that Postgres does partitioning, but
the legacy RDBMS that I still occasionally maintain has for at least 25
years had partition key independent of any indexes.

Scanning every partition to validate the primary key isn't scalable.

That's only because of the way Pg implements partitioning.

--
Angular momentum makes the world go 'round.

#4Shatamjeev Dewan
sdewan@nbsps.com
In reply to: Michael Lewis (#2)
RE: Declarative Range Partitioning Postgres 11

Thanks Michael.

From: Michael Lewis <mlewis@entrata.com>
Sent: October-07-19 7:18 PM
To: Shatamjeev Dewan <sdewan@nbsps.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Declarative Range Partitioning Postgres 11

No, what you want is not possible and probably won't ever be I would expect. Scanning every partition to validate the primary key isn't scalable.

#5Shatamjeev Dewan
sdewan@nbsps.com
In reply to: Michael Lewis (#2)
RE: Declarative Range Partitioning Postgres 11

Hi Michael,

In this case , I always need to include partition key(date) in primary key ( if I have a primary key defined on non partition key column e.g id (in my case), to make it a composite primary key (id, date). This would allow duplicate id with different date,which is not desirable .

Thanks,
Shatamjeev

From: Michael Lewis <mlewis@entrata.com>
Sent: October-07-19 7:18 PM
To: Shatamjeev Dewan <sdewan@nbsps.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Declarative Range Partitioning Postgres 11

No, what you want is not possible and probably won't ever be I would expect. Scanning every partition to validate the primary key isn't scalable.

#6Michael Lewis
mlewis@entrata.com
In reply to: Ron (#3)
Re: Declarative Range Partitioning Postgres 11

On Mon, Oct 7, 2019 at 5:56 PM Ron <ronljohnsonjr@gmail.com> wrote:

On 10/7/19 6:17 PM, Michael Lewis wrote:

No, what you want is not possible and probably won't ever be I would

expect.

Sure it is. Maybe not the (weird) way that Postgres does partitioning,
but
the legacy RDBMS that I still occasionally maintain has for at least 25
years had partition key independent of any indexes.

Scanning every partition to validate the primary key isn't scalable.

That's only because of the way Pg implements partitioning.

I can dig that, but since this is a Postgres list and everything I have
heard indicates it is not a limitation that is likely to be removed in
Postgres, it seems like we are having two different discussions.

#7Michael Lewis
mlewis@entrata.com
In reply to: Shatamjeev Dewan (#5)
Re: Declarative Range Partitioning Postgres 11

On Tue, Oct 8, 2019 at 8:00 AM Shatamjeev Dewan <sdewan@nbsps.com> wrote:

Hi Michael,

In this case , I always need to include partition key(date) in primary
key ( if I have a primary key defined on non partition key column e.g id
(in my case), to make it a composite primary key (id, date). This would
allow duplicate id with different date,which is not desirable .

If you are generating the ID with a sequence, there isn't any real world
likelihood of conflict, but I do understand your concern in terms of
enforcing data integrity. Other than creating a custom stored procedure
that functions as a primary key constraint, I don't know of any way around
that.

Let's take a step back... why do you think you need to partition at all?
And why partition by the date/timestamp/timestamptz field? Also, from what
I have seen, PG12 is when partitioning really gets performant in terms of
more than 10 to 100 partitions, and you can then create FKeys to the
partitioned table (not possible in PG11). Also, if your frequent access of
the table is by date/timestamptz field, then you might consider a BRIN
index if you have high correlation between physical storage and values in
that field. That can mitigate the need for partitioning.

Our organization will be waiting until next quarter to upgrade to PG12 and
then partitioning a few of our largest tables. That is to say, I don't have
experience with partitioning in production yet so others may chime in with
better advice.

#8Shatamjeev Dewan
sdewan@nbsps.com
In reply to: Michael Lewis (#7)
RE: Declarative Range Partitioning Postgres 11

Thanks a lot Michael for invaluable advise . Appreciate your great help and support.

From: Michael Lewis <mlewis@entrata.com>
Sent: October-08-19 1:33 PM
To: Shatamjeev Dewan <sdewan@nbsps.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Declarative Range Partitioning Postgres 11

On Tue, Oct 8, 2019 at 8:00 AM Shatamjeev Dewan <sdewan@nbsps.com<mailto:sdewan@nbsps.com>> wrote:
Hi Michael,

In this case , I always need to include partition key(date) in primary key ( if I have a primary key defined on non partition key column e.g id (in my case), to make it a composite primary key (id, date). This would allow duplicate id with different date,which is not desirable .

If you are generating the ID with a sequence, there isn't any real world likelihood of conflict, but I do understand your concern in terms of enforcing data integrity. Other than creating a custom stored procedure that functions as a primary key constraint, I don't know of any way around that.

Let's take a step back... why do you think you need to partition at all? And why partition by the date/timestamp/timestamptz field? Also, from what I have seen, PG12 is when partitioning really gets performant in terms of more than 10 to 100 partitions, and you can then create FKeys to the partitioned table (not possible in PG11). Also, if your frequent access of the table is by date/timestamptz field, then you might consider a BRIN index if you have high correlation between physical storage and values in that field. That can mitigate the need for partitioning.

Our organization will be waiting until next quarter to upgrade to PG12 and then partitioning a few of our largest tables. That is to say, I don't have experience with partitioning in production yet so others may chime in with better advice.

#9Ron
ronljohnsonjr@gmail.com
In reply to: Michael Lewis (#7)
Re: Declarative Range Partitioning Postgres 11

On 10/8/19 12:33 PM, Michael Lewis wrote:

On Tue, Oct 8, 2019 at 8:00 AM Shatamjeev Dewan <sdewan@nbsps.com
<mailto:sdewan@nbsps.com>> wrote:

Hi Michael,

In this case , I always need to include partition key(date)  in
primary key ( if I have a primary key defined on non partition key
column e.g id (in my case), to make it a composite primary key (id,
date). This would allow duplicate id with different date,which is not
desirable .

If you are generating the ID with a sequence, there isn't any real world
likelihood of conflict, but I do understand your concern in terms of
enforcing data integrity. Other than creating a custom stored procedure
that functions as a primary key constraint, I don't know of any way around
that.

Let's take a step back... why do you think you need to partition at all?
And why partition by the date/timestamp/timestamptz field?

Because archiving old is (well, /should be/) easier that way.

--
Angular momentum makes the world go 'round.

#10Shatamjeev Dewan
sdewan@nbsps.com
In reply to: Michael Lewis (#7)
RE: Declarative Range Partitioning Postgres 11

Hi Michael,

I want to create a partition by year and subpartition by month in postgres 11 timestamp column. Please advise syntax.

Thanks,
Shatamjeev

From: Michael Lewis <mlewis@entrata.com>
Sent: October-08-19 1:33 PM
To: Shatamjeev Dewan <sdewan@nbsps.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Declarative Range Partitioning Postgres 11

On Tue, Oct 8, 2019 at 8:00 AM Shatamjeev Dewan <sdewan@nbsps.com<mailto:sdewan@nbsps.com>> wrote:
Hi Michael,

In this case , I always need to include partition key(date) in primary key ( if I have a primary key defined on non partition key column e.g id (in my case), to make it a composite primary key (id, date). This would allow duplicate id with different date,which is not desirable .

If you are generating the ID with a sequence, there isn't any real world likelihood of conflict, but I do understand your concern in terms of enforcing data integrity. Other than creating a custom stored procedure that functions as a primary key constraint, I don't know of any way around that.

Let's take a step back... why do you think you need to partition at all? And why partition by the date/timestamp/timestamptz field? Also, from what I have seen, PG12 is when partitioning really gets performant in terms of more than 10 to 100 partitions, and you can then create FKeys to the partitioned table (not possible in PG11). Also, if your frequent access of the table is by date/timestamptz field, then you might consider a BRIN index if you have high correlation between physical storage and values in that field. That can mitigate the need for partitioning.

Our organization will be waiting until next quarter to upgrade to PG12 and then partitioning a few of our largest tables. That is to say, I don't have experience with partitioning in production yet so others may chime in with better advice.

#11Michael Lewis
mlewis@entrata.com
In reply to: Shatamjeev Dewan (#10)
Re: Declarative Range Partitioning Postgres 11

On Fri, Nov 1, 2019 at 9:22 AM Shatamjeev Dewan <sdewan@nbsps.com> wrote:

Hi Michael,

I want to create a partition by year and subpartition by month in postgres
11 timestamp column. Please advise syntax.

https://www.postgresql.org/docs/11/ddl-partitioning.html

The documentation is rather clear with examples like-

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

Note- Don't partition on function results like date_part because
performance will likely suffer greatly. Also note that the top end is
always exclusive so the above give a continuous range for those two months.

I would hesitate to partition by more than year alone before upgrading to
PG v12. The speed improvements for more than 10-100 partitions (max
recommended for PG11) is huge in 12.

#12Shatamjeev Dewan
sdewan@nbsps.com
In reply to: Michael Lewis (#11)
RE: Declarative Range Partitioning Postgres 11

Thanks a ton Michael

From: Michael Lewis <mlewis@entrata.com>
Sent: November-01-19 3:20 PM
To: Shatamjeev Dewan <sdewan@nbsps.com>
Cc: pgsql-general <pgsql-general@postgresql.org>
Subject: Re: Declarative Range Partitioning Postgres 11

On Fri, Nov 1, 2019 at 9:22 AM Shatamjeev Dewan <sdewan@nbsps.com<mailto:sdewan@nbsps.com>> wrote:
Hi Michael,

I want to create a partition by year and subpartition by month in postgres 11 timestamp column. Please advise syntax.

https://www.postgresql.org/docs/11/ddl-partitioning.html

The documentation is rather clear with examples like-

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
CREATE TABLE measurement_y2006m03 PARTITION OF measurement
FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

Note- Don't partition on function results like date_part because performance will likely suffer greatly. Also note that the top end is always exclusive so the above give a continuous range for those two months.

I would hesitate to partition by more than year alone before upgrading to PG v12. The speed improvements for more than 10-100 partitions (max recommended for PG11) is huge in 12.