How Many Partitions are Good Performing

Started by Kumar, Virendraabout 8 years ago9 messagesgeneral
Jump to latest
#1Kumar, Virendra
Virendra.Kumar@guycarp.com

Can somebody tell us how many partitions are good number without impacting the performance. We are hearing around a thousand, is that a limit. Do we have plan to increase the number of partitions for a table. We would appreciate if somebody can help us with this?

Regards,
Virendra

________________________________

This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.

#2Rakesh Kumar
rakeshkumar464@mail.com
In reply to: Kumar, Virendra (#1)
Re: How Many Partitions are Good Performing

 You should have read carefully what I wrote. 1000 is not an upper limit. 1000 partition is the number after which performance starts dropping .

There is a blog in www.timescale.com which also highlights the same.

Sent: Monday, January 08, 2018 at 6:20 PM
From: "Kumar, Virendra" <Virendra.Kumar@guycarp.com>
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: How Many Partitions are Good Performing

Can somebody tell us how many partitions are good number without impacting the performance. We are hearing around a thousand, is that a limit. Do we have plan to increase the number of partitions for a table. We would appreciate if somebody can help us with this?
 
Regards,
Virendra
 
 
------------------------------------------------------------
This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.

#3Michael Paquier
michael@paquier.xyz
In reply to: Rakesh Kumar (#2)
Re: How Many Partitions are Good Performing

On Tue, Jan 09, 2018 at 12:54:18AM +0100, Rakesh Kumar wrote:

Can somebody tell us how many partitions are good number without
impacting the performance. We are hearing around a thousand, is that a
limit. Do we have plan to increase the number of partitions for a
table. We would appreciate if somebody can help us with this?

What matters here is that the bottleneck comes from the planner which
uses basically a O(N^2) algorithm to evaluate all the partitions, so a
too high number causes planning time to increase dramatically. In
Postgres 11, things get improved with more partition-wise logics.
--
Michael

#4Andrew Staller
andrew@timescale.com
In reply to: Rakesh Kumar (#2)
Re: How Many Partitions are Good Performing

This is the blog post that Rakesh referenced:
https://blog.timescale.com/time-series-data-postgresql-10-vs-timescaledb-816ee808bac5

Please note, this analysis is done in the context of working with
time-series data, where 1000s of chunks is not uncommon because of the
append-mostly nature of the workload.

On Mon, Jan 8, 2018 at 6:54 PM, Rakesh Kumar <rakeshkumar464@mail.com>
wrote:

You should have read carefully what I wrote. 1000 is not an upper
limit. 1000 partition is the number after which performance starts
dropping .

There is a blog in www.timescale.com which also highlights the same.

Sent: Monday, January 08, 2018 at 6:20 PM
From: "Kumar, Virendra" <Virendra.Kumar@guycarp.com>
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: How Many Partitions are Good Performing

Can somebody tell us how many partitions are good number without impacting
the performance. We are hearing around a thousand, is that a limit. Do we
have plan to increase the number of partitions for a table. We would
appreciate if somebody can help us with this?

Regards,
Virendra

------------------------------------------------------------
This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the
message
and its attachments and notify the sender immediately. Thank you.

--
TimescaleDB* | *Growth & Developer Evangelism
c: 908.581.9509

335 Madison Ave.
New York, NY 10017
http://www.timescale.com/
https://github.com/timescale/timescaledb

#5Kumar, Virendra
Virendra.Kumar@guycarp.com
In reply to: Andrew Staller (#4)
RE: How Many Partitions are Good Performing

Thank you Rakesh and Andrew!

We will not be doing time scaling but we have list of value based of which we will be partitioning the table and list is something around 7500 now.
For short term we are thinking of putting around a thousand partitions and when PG11 releases we will go for each value a partition.

Regards,
Virendra

From: Andrew Staller [mailto:andrew@timescale.com]
Sent: Tuesday, January 09, 2018 12:15 PM
To: Rakesh Kumar
Cc: Kumar, Virendra; pgsql-general@postgresql.org
Subject: Re: How Many Partitions are Good Performing

This is the blog post that Rakesh referenced:
https://blog.timescale.com/time-series-data-postgresql-10-vs-timescaledb-816ee808bac5

Please note, this analysis is done in the context of working with time-series data, where 1000s of chunks is not uncommon because of the append-mostly nature of the workload.

On Mon, Jan 8, 2018 at 6:54 PM, Rakesh Kumar <rakeshkumar464@mail.com<mailto:rakeshkumar464@mail.com>> wrote:

You should have read carefully what I wrote. 1000 is not an upper limit. 1000 partition is the number after which performance starts dropping .

There is a blog in www.timescale.com<http://www.timescale.com&gt; which also highlights the same.

Sent: Monday, January 08, 2018 at 6:20 PM
From: "Kumar, Virendra" <Virendra.Kumar@guycarp.com<mailto:Virendra.Kumar@guycarp.com>>
To: "pgsql-general@postgresql.org<mailto:pgsql-general@postgresql.org>" <pgsql-general@postgresql.org<mailto:pgsql-general@postgresql.org>>
Subject: How Many Partitions are Good Performing

Can somebody tell us how many partitions are good number without impacting the performance. We are hearing around a thousand, is that a limit. Do we have plan to increase the number of partitions for a table. We would appreciate if somebody can help us with this?

Regards,
Virendra

------------------------------------------------------------
This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.

--
TimescaleDB | Growth & Developer Evangelism
c: 908.581.9509

335 Madison Ave.
New York, NY 10017
http://www.timescale.com/
https://github.com/timescale/timescaledb

________________________________

This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.

#6Vincenzo Romano
vincenzo.romano@notorand.it
In reply to: Andrew Staller (#4)
Re: How Many Partitions are Good Performing

2018-01-09 18:15 GMT+01:00 Andrew Staller <andrew@timescale.com>:

This is the blog post that Rakesh referenced:
https://blog.timescale.com/time-series-data-postgresql-
10-vs-timescaledb-816ee808bac5

Please note, this analysis is done in the context of working with
time-series data, where 1000s of chunks is not uncommon because of the
append-mostly nature of the workload.

On Mon, Jan 8, 2018 at 6:54 PM, Rakesh Kumar <rakeshkumar464@mail.com>
wrote:

You should have read carefully what I wrote. 1000 is not an upper
limit. 1000 partition is the number after which performance starts
dropping .

There is a blog in www.timescale.com which also highlights the same.

Sent: Monday, January 08, 2018 at 6:20 PM
From: "Kumar, Virendra" <Virendra.Kumar@guycarp.com>
To: "pgsql-general@postgresql.org" <pgsql-general@postgresql.org>
Subject: How Many Partitions are Good Performing

Can somebody tell us how many partitions are good number without
impacting the performance. We are hearing around a thousand, is that a
limit. Do we have plan to increase the number of partitions for a table. We
would appreciate if somebody can help us with this?

Regards,
Virendra

------------------------------------------------------------
This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the
message
and its attachments and notify the sender immediately. Thank you.

--
TimescaleDB* | *Growth & Developer Evangelism
c: 908.581.9509

335 Madison Ave.
<https://maps.google.com/?q=335+Madison+Ave.%C2%A0New+York,+NY%C2%A010017&amp;entry=gmail&amp;source=g&gt;
New York, NY 10017
<https://maps.google.com/?q=335+Madison+Ave.%C2%A0New+York,+NY%C2%A010017&amp;entry=gmail&amp;source=g&gt;
http://www.timescale.com/
https://github.com/timescale/timescaledb

The data about the query performances would have shed more light on the
situation.
Unluckily there's none. Weird!

--
Vincenzo Romano - NotOrAnd.IT
Information Technologies
--
NON QVIETIS MARIBVS NAVTA PERITVS

#7pinker
pinker@onet.eu
In reply to: Kumar, Virendra (#1)
Re: How Many Partitions are Good Performing

I've run once a test on my laptop because was curious as well. From my
results (on laptop - 16GB RAM, 4 cores) the upper limit was 12k. Above it
planning time was unbearable high - much higher than execution time. It's
been tested on 9.5

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

#8Kumar, Virendra
Virendra.Kumar@guycarp.com
In reply to: pinker (#7)
RE: How Many Partitions are Good Performing

For test I created two tables with 7800 partitions each and joining them sees performance bottleneck. It is taking 5 seconds planning time. Please see attached plan.

Regards,
Virendra

-----Original Message-----
From: pinker [mailto:pinker@onet.eu]
Sent: Wednesday, January 10, 2018 12:07 PM
To: pgsql-general@postgresql.org
Subject: Re: How Many Partitions are Good Performing

I've run once a test on my laptop because was curious as well. From my results (on laptop - 16GB RAM, 4 cores) the upper limit was 12k. Above it planning time was unbearable high - much higher than execution time. It's been tested on 9.5

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html

________________________________

This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.

Attachments:

output.txttext/plain; name=output.txtDownload
#9pinker
pinker@onet.eu
In reply to: Kumar, Virendra (#8)
RE: How Many Partitions are Good Performing

yes, it doesn't look good. and it seems that statistics aren't accurate:
GroupAggregate (cost=271794.39..330553.67 rows=215630 width=152) (actual
time=30.641..37.303 rows=2792 loops=1)

--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html