pg_(total_)relation_size and partitioned tables
Hi.
You may have guessed from $subject that the two don't work together.
create table p (a int) partition by list (a);
create table p1 partition of p for values in (1, 2) partition by list (a);
create table p11 partition of p1 for values in (1);
create table p12 partition of p1 for values in (2);
insert into p select i % 2 + 1 from generate_series(1, 1000) i;
On HEAD:
select pg_relation_size('p');
pg_relation_size
------------------
0
(1 row)
select pg_total_relation_size('p');
pg_total_relation_size
------------------------
0
(1 row)
After applying the attached patch:
select pg_relation_size('p');
pg_relation_size
------------------
49152
(1 row)
select pg_total_relation_size('p');
pg_total_relation_size
------------------------
98304
(1 row)
The patch basically accumulates and returns the sizes of all leaf
partitions when passed a partitioned table. Will add it to next CF.
Thanks,
Amit
Attachments:
0001-Teach-dbsize.c-functions-about-partitioned-tables.patchtext/plain; charset=UTF-8; name=0001-Teach-dbsize.c-functions-about-partitioned-tables.patchDownload+85-17
On Thu, Dec 14, 2017 at 12:23 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
You may have guessed from $subject that the two don't work together.
It works exactly as documented:
pg_total_relation_size(regclass) - Total disk space used by the
specified table, including all indexes and TOAST data
It says nothing about including partitions. If we change this, then
we certainly need to update the documentation (that might be a good
idea if we decide not to update this).
Personally, I'm -1 on including partitions, because then you can no
longer expect that the sum of pg_total_relation_size(regclass) across
all relations in the database will equal the size of the database
itself. Partitions will be counted a number of times equal to their
depth in the partitioning hierarchy. However, I understand that I
might get outvoted.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 17 December 2017 at 16:24, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Dec 14, 2017 at 12:23 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:You may have guessed from $subject that the two don't work together.
It works exactly as documented:
pg_total_relation_size(regclass) - Total disk space used by the
specified table, including all indexes and TOAST dataIt says nothing about including partitions. If we change this, then
we certainly need to update the documentation (that might be a good
idea if we decide not to update this).Personally, I'm -1 on including partitions, because then you can no
longer expect that the sum of pg_total_relation_size(regclass) across
all relations in the database will equal the size of the database
itself. Partitions will be counted a number of times equal to their
depth in the partitioning hierarchy. However, I understand that I
might get outvoted.
I'd also vote to leave the relation_size functions alone.
Perhaps it's worth thinking of changing pg_table_size() instead. We
have taken measures to try and hide the fact that a table is made up
of a bunch of partitions from the user in some cases, e.g DROP TABLE
works without CASCADE for a partitioned table. I'm sure there are
arguments in both directions here too though.
I generally think of the difference between a relation and a table
similarly to the difference between a tuple and a row. A relation is
just what we use to implement tables, and there can be multiple
relations per table (in the partitioning case), similar to how there
can be multiple tuple versions for a single row. So that might back up
that pg_table_size should include all relations that make up that
table.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Dec 17, 2017 at 9:54 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
I'd also vote to leave the relation_size functions alone.
Perhaps it's worth thinking of changing pg_table_size() instead. We
have taken measures to try and hide the fact that a table is made up
of a bunch of partitions from the user in some cases, e.g DROP TABLE
works without CASCADE for a partitioned table. I'm sure there are
arguments in both directions here too though.
Yeah, I don't really understand why changing pg_table_size() is any
more desirable than changing pg_relation_size().
I mean, we could have a table-size function that takes an array of
things you want to include (indexes, toast, partitions, etc), but
changing the semantics of existing functions seems like it's going to
be more painful than helpful (aside from the arguments I brought up
before).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sun, Dec 17, 2017 at 11:54 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
I'd also vote to leave the relation_size functions alone.
Count me in that bucket as well.
Perhaps it's worth thinking of changing pg_table_size() instead. We
have taken measures to try and hide the fact that a table is made up
of a bunch of partitions from the user in some cases, e.g DROP TABLE
works without CASCADE for a partitioned table. I'm sure there are
arguments in both directions here too though.I generally think of the difference between a relation and a table
similarly to the difference between a tuple and a row. A relation is
just what we use to implement tables, and there can be multiple
relations per table (in the partitioning case), similar to how there
can be multiple tuple versions for a single row. So that might back up
that pg_table_size should include all relations that make up that
table.
The barrier here is thin. What's proposed here is already doable with
a WITH RECURSIVE query. So why not just documenting this query and be
done with it instead of complicating the code? It seems to me that the
performance in calling pg_relation_size() in a cascading times fashion
would not matter much. Or one could invent an additional cascading
option which scans inheritance and/or partition chains, or simply have
a new function.
--
Michael
On Mon, Dec 18, 2017 at 9:29 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
The barrier here is thin. What's proposed here is already doable with
a WITH RECURSIVE query. So why not just documenting this query and be
done with it instead of complicating the code? It seems to me that the
performance in calling pg_relation_size() in a cascading times fashion
would not matter much. Or one could invent an additional cascading
option which scans inheritance and/or partition chains, or simply have
a new function.
I just blogged on the matter, and here is one possibility here
compatible with v10:
WITH RECURSIVE partition_info
(relid,
relname,
relsize,
relispartition,
relkind) AS (
SELECT oid AS relid,
relname,
pg_relation_size(oid) AS relsize,
relispartition,
relkind
FROM pg_catalog.pg_class
WHERE relname = 'parent_name' AND
relkind = 'p'
UNION ALL
SELECT
c.oid AS relid,
c.relname AS relname,
pg_relation_size(c.oid) AS relsize,
c.relispartition AS relispartition,
c.relkind AS relkind
FROM partition_info AS p,
pg_catalog.pg_inherits AS i,
pg_catalog.pg_class AS c
WHERE p.relid = i.inhparent AND
c.oid = i.inhrelid AND
c.relispartition
)
SELECT * FROM partition_info;
This is not really straight-forward. You could as well have the
pg_relation_size call in the outer query.
--
Michael
Thanks all for your thoughts.
I agree with the Robert's point which both David and Michael seem to agree
with that we shouldn't really be changing what pg_relation_size() is doing
under the covers. And I guess the same for pg_table_size(), too. Both of
those functions and their siblings work with relations that possess
on-disk structures and have associated relations (TOAST, indexes) that in
turn possess on-disk structures. It seems quite clearly documented as
such. Partitioned tables are different in that they neither possess
on-disk structures nor have any relations (TOAST, indexes) associated
directly with them. Instead, they have partitions that are the relations
that aforementioned dbsize.c functions are familiar with.
So, I withdraw the patch I originally posted in favor of some other approach.
Reply continues below...
On 2017/12/18 11:51, Michael Paquier wrote:
On Mon, Dec 18, 2017 at 9:29 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:The barrier here is thin. What's proposed here is already doable with
a WITH RECURSIVE query. So why not just documenting this query and be
done with it instead of complicating the code? It seems to me that the
performance in calling pg_relation_size() in a cascading times fashion
would not matter much. Or one could invent an additional cascading
option which scans inheritance and/or partition chains, or simply have
a new function.I just blogged on the matter, and here is one possibility here
compatible with v10:
WITH RECURSIVE partition_info
(relid,
relname,
relsize,
relispartition,
relkind) AS (
SELECT oid AS relid,
relname,
pg_relation_size(oid) AS relsize,
relispartition,
relkind
FROM pg_catalog.pg_class
WHERE relname = 'parent_name' AND
relkind = 'p'
UNION ALL
SELECT
c.oid AS relid,
c.relname AS relname,
pg_relation_size(c.oid) AS relsize,
c.relispartition AS relispartition,
c.relkind AS relkind
FROM partition_info AS p,
pg_catalog.pg_inherits AS i,
pg_catalog.pg_class AS c
WHERE p.relid = i.inhparent AND
c.oid = i.inhrelid AND
c.relispartition
)
SELECT * FROM partition_info;This is not really straight-forward. You could as well have the
pg_relation_size call in the outer query.
Thanks Michael for coming up with that.
Do you (and/or others) think that's something that we can wrap inside a
built-in function(s), that is, one defined in system_views.sql? Or if we
decide to have new functions, say, pg_get_partitions() and/or
pg_get_partition_sizes(), we might as well implement them as C functions
inside dbsize.c. If so, do we have want to have "partition" variants of
all *_size() functions viz. pg_relation_size(), pg_total_relation_size(),
pg_indexes_size(), and pg_table_size()?
Thanks,
Amit
On Mon, Dec 18, 2017 at 2:17 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
Do you (and/or others) think that's something that we can wrap inside a
built-in function(s), that is, one defined in system_views.sql? Or if we
decide to have new functions, say, pg_get_partitions() and/or
pg_get_partition_sizes(), we might as well implement them as C functions
inside dbsize.c. If so, do we have want to have "partition" variants of
all *_size() functions viz. pg_relation_size(), pg_total_relation_size(),
pg_indexes_size(), and pg_table_size()?
I can see value in something like Robert is proposing upthread by
having a function one is able to customize to decide what to include
in the calculation. There could be options for at least:
- partitions, or relation cascading.
- index.
- toast tables.
- visibility maps.
- FSM.
The first three ones is what Robert are mentioned, still I would see
the last two ones are interesting things if optional. Or we could have
just a SRF that returns one row per object scanned.
--
Michael
On 12/18/17 00:17, Amit Langote wrote:
I agree with the Robert's point which both David and Michael seem to agree
with that we shouldn't really be changing what pg_relation_size() is doing
under the covers. And I guess the same for pg_table_size(), too. Both of
those functions and their siblings work with relations that possess
on-disk structures and have associated relations (TOAST, indexes) that in
turn possess on-disk structures. It seems quite clearly documented as
such. Partitioned tables are different in that they neither possess
on-disk structures nor have any relations (TOAST, indexes) associated
directly with them. Instead, they have partitions that are the relations
that aforementioned dbsize.c functions are familiar with.
Here is another idea. If we had a function
pg_partition_root(regclass) returns regclass
(returning itself for non-partitioned relations), then users can easily
construct queries to get the results they want in different shapes, e.g.,
select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
from pg_class c
order by 1
select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
from pg_class c
group by 1
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 29 December 2017 at 08:03, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
Here is another idea. If we had a function
pg_partition_root(regclass) returns regclass
(returning itself for non-partitioned relations), then users can easily
construct queries to get the results they want in different shapes, e.g.,select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
from pg_class c
order by 1select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
from pg_class c
group by 1
That seems much nicer. I assume "root" would mean the top level
partitioned table. If so, would we also want
pg_partition_parent(regclass)? Or maybe something to control the
number of "levels-up" the function would run for. If we had that then
maybe -1 could mean "go until you find a table with no parent".
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 12/28/17 16:24, David Rowley wrote:
select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
from pg_class c
order by 1select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
from pg_class c
group by 1That seems much nicer. I assume "root" would mean the top level
partitioned table. If so, would we also want
pg_partition_parent(regclass)? Or maybe something to control the
number of "levels-up" the function would run for. If we had that then
maybe -1 could mean "go until you find a table with no parent".
Hmm, we need to think through some scenarios for what one would really
want to do with this functionality.
Clearly, the existing behavior is useful for management tasks like bloat
and vacuum monitoring.
And on the other hand you might want to have a logical view of, how big
is this partitioned table altogether.
But what are the uses for dealing with partial partition hierarchies?
How easy do we need to make that?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut wrote:
But what are the uses for dealing with partial partition hierarchies?
How easy do we need to make that?
If you have multilevel partitioning, say partitions per year per site.
What is the volume of 2017 compared to 2016, on each site? I don't
think it needs to be super easy, but it should be reasonable.
I think pg_partition_parent() should be a simple function doing one
catalog lookup (already implemented as get_partition_parent(),
but needs a "missing_ok" case), and pg_partition_root() an iterative
version of that.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I'm setting this patch to Returned with feedback.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/01/17 22:51, Peter Eisentraut wrote:
I'm setting this patch to Returned with feedback.
OK. Sorry that I couldn't reply here since the CF started. I have
written some code to implement the pg_partition_root() idea you mentioned
upthread earlier this week, but hasn't been able to share it on the list
yet. I will post it soon and create a new entry in the next commit fest.
Thanks,
Amit
On 2018/01/02 22:45, Peter Eisentraut wrote:
On 12/28/17 16:24, David Rowley wrote:
select pg_partition_root(c.oid), c.relname, pg_table_size(c.oid)
from pg_class c
order by 1select pg_partition_root(c.oid), sum(pg_table_size(c.oid))
from pg_class c
group by 1That seems much nicer. I assume "root" would mean the top level
partitioned table. If so, would we also want
pg_partition_parent(regclass)? Or maybe something to control the
number of "levels-up" the function would run for. If we had that then
maybe -1 could mean "go until you find a table with no parent".Hmm, we need to think through some scenarios for what one would really
want to do with this functionality.Clearly, the existing behavior is useful for management tasks like bloat
and vacuum monitoring.And on the other hand you might want to have a logical view of, how big
is this partitioned table altogether.But what are the uses for dealing with partial partition hierarchies?
How easy do we need to make that?
I think having pg_partition_root() and pg_partition_parent() will give
users enough to get useful views as follows:
drop table p;
create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
create table p12 partition of p1 for values in (1, 2) partition by list (a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);
insert into p select 1 from generate_series(1, 100);
insert into p select 2 from generate_series(1, 100);
insert into p select 3 from generate_series(1, 100);
select pg_partition_root(oid) as root_parent,
pg_partition_parent(oid) as parent,
relname as relname,
pg_total_relation_size(oid) as size
from pg_class
where relnamespace = 'public'::regnamespace
order by 4;
root_parent | parent | relname | size
-------------+--------+---------+------
p | | p | 0
p | p | p123 | 0
p | p123 | p12 | 0
p | p12 | p1 | 8192
p | p12 | p2 | 8192
p | p123 | p3 | 8192
(6 rows)
select pg_partition_root(oid) as root_parent,
sum(pg_total_relation_size(oid)) as size
from pg_class
where relnamespace = 'public'::regnamespace
group by 1
order by 1;
root_parent | size
-------------+-------
p | 24576
(1 row)
Attached a WIP patch.
Thanks,
Amit
Attachments:
v1-0001-Add-a-pg_partition_root-and-pg_partition_parent.patchtext/plain; charset=UTF-8; name=v1-0001-Add-a-pg_partition_root-and-pg_partition_parent.patchDownload+102-2
On Thu, Jan 18, 2018 at 06:54:18PM +0900, Amit Langote wrote:
I think having pg_partition_root() and pg_partition_parent() will give
users enough to get useful views as follows:
So... pg_partition_root() gives you access to the highest relation in
the hierarchy, and pg_partition_parent() gives you access to the direct
parent.
drop table p;
create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
create table p12 partition of p1 for values in (1, 2) partition by list (a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);
You need to reorder those queries, the creation of the first p12 would
fail as p1 does not exist at this point. Wouldn't also a
pg_partition_tree() be useful? You could shape it as a function which
returns all regclass partitions in the tree as unique entries. Combined
with pg_partition_parent() it can be powerful as it returns NULL for the
partition at the top of the tree. So I think that we could live without
pg_partition_root(). At the end, let's design something which makes
unnecessary the use of WITH RECURSIVE when looking at a full partition
tree to ease the user's life.
Documentation, as well as regression tests, would be welcome :)
--
Michael
Thanks for taking a look.
On 2018/01/19 14:39, Michael Paquier wrote:
On Thu, Jan 18, 2018 at 06:54:18PM +0900, Amit Langote wrote:
I think having pg_partition_root() and pg_partition_parent() will give
users enough to get useful views as follows:So... pg_partition_root() gives you access to the highest relation in
the hierarchy, and pg_partition_parent() gives you access to the direct
parent.
Right.
drop table p;
create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list(a);
create table p12 partition of p1 for values in (1, 2) partition by list (a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);You need to reorder those queries, the creation of the first p12 would
fail as p1 does not exist at this point.
Oops. I had copy-pasted above commands from the psql's \s output and
ended up copying the command I didn't intend to. Here it is again, but
without the mistake I made in my last email:
drop table p;
create table p (a int) partition by list (a);
create table p123 partition of p for values in (1, 2, 3) partition by list
(a);
create table p12 partition of p123 for values in (1, 2) partition by list (a);
create table p1 partition of p12 for values in (1);
create table p2 partition of p12 for values in (2);
create table p3 partition of p123 for values in (3);
Wouldn't also a
pg_partition_tree() be useful? You could shape it as a function which
returns all regclass partitions in the tree as unique entries. Combined
with pg_partition_parent() it can be powerful as it returns NULL for the
partition at the top of the tree. So I think that we could live without
pg_partition_root(). At the end, let's design something which makes
unnecessary the use of WITH RECURSIVE when looking at a full partition
tree to ease the user's life.
Do you mean pg_partition_tree(regclass), that returns all partitions in
the partition tree whose root is passed as the parameter?
Perhaps, like the following (roughly implemented in the attached)?
select pg_partition_root(p) as root_parent,
pg_partition_parent(p) as parent,
p as relname,
pg_total_relation_size(p) as size
from pg_partition_tree_tables('p') p
order by 4;
root_parent | parent | relname | size
-------------+--------+---------+---------
p | | p | 0
p | p | p123 | 0
p | p123 | p12 | 0
p | p123 | p3 | 3653632
p | p12 | p1 | 3653632
p | p12 | p2 | 3653632
(6 rows)
Documentation, as well as regression tests, would be welcome :)
OK, I will add those things in the next version.
Thanks,
Amit
Attachments:
0001-Add-assorted-partition-reporting-functions.patchtext/plain; charset=UTF-8; name=0001-Add-assorted-partition-reporting-functions.patchDownload+152-2
On Fri, Jan 19, 2018 at 06:28:41PM +0900, Amit Langote wrote:
Do you mean pg_partition_tree(regclass), that returns all partitions in
the partition tree whose root is passed as the parameter?Perhaps, like the following (roughly implemented in the attached)?
select pg_partition_root(p) as root_parent,
pg_partition_parent(p) as parent,
p as relname,
pg_total_relation_size(p) as size
from pg_partition_tree_tables('p') p
order by 4;
root_parent | parent | relname | size
-------------+--------+---------+---------
p | | p | 0
p | p | p123 | 0
p | p123 | p12 | 0
p | p123 | p3 | 3653632
p | p12 | p1 | 3653632
p | p12 | p2 | 3653632
(6 rows)
It seems that you caught the idea. As long as each leaf and root is
listed uniquely, that would be acceptable to me, and useful for users.
If pg_partition_tree_tables() uses the top of the partition as input,
all the tree should show up. If you use a leaf, anything under the leaf
should show up. If a leaf is defined and it has no underlying leaves,
then only this outmost leaf should be listed.
+/*
+ * Returns Oids of tables in a publication.
+ */
Close enough, but that's not a publication.
Documentation, as well as regression tests, would be welcome :)
OK, I will add those things in the next version.
Thanks.
--
Michael
On 20 January 2018 at 23:14, Michael Paquier <michael.paquier@gmail.com> wrote:
If pg_partition_tree_tables() uses the top of the partition as input,
all the tree should show up. If you use a leaf, anything under the leaf
should show up. If a leaf is defined and it has no underlying leaves,
then only this outmost leaf should be listed.
hmm, that's thoroughly confusing. Just in case anyone else is stuck on
that, I just need to mention that a leaf is the does not have
branches, in nature or computer science.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Jan 21, 2018 at 07:16:38PM +1300, David Rowley wrote:
On 20 January 2018 at 23:14, Michael Paquier <michael.paquier@gmail.com> wrote:
If pg_partition_tree_tables() uses the top of the partition as input,
all the tree should show up. If you use a leaf, anything under the leaf
should show up. If a leaf is defined and it has no underlying leaves,
then only this outmost leaf should be listed.hmm, that's thoroughly confusing. Just in case anyone else is stuck on
that, I just need to mention that a leaf is the does not have
branches, in nature or computer science.
OK, sorry if my words are confusing. Imagine that you have the following
partition set:
p
/ \
/ \
p1 p2
/ \
/ \
p21 p22
If pg_partition_tree_tables() uses 'p' as input argument, then I would
imagine that it should return p, p1, p2, p21 and p22. If 'p2' is used,
then p2, p21 and p22 are the results. If either one of p1, p21 or p22 is
used as input, then the result is respectively p1, p21 or p22.
Amit's patch seems to be doing that kind of logic by using
find_all_inheritors, which is good. We need to make the difference
between relations that are part of a partition set and the other ones
which are part of an INHERIT link, and, at least it seems to me, the
patch is not careful with that. I haven't tested what is proposed
though, but this portion likely needs more thoughts.
--
Michael