Measuring relation free space

Started by Greg Smithover 14 years ago82 messageshackers
Jump to latest
#1Greg Smith
gsmith@gregsmith.com

Attached patch adds a new function to the pageinspect extension for
measuring total free space, in either tables or indexes. It returns the
free space as a percentage, so higher numbers mean more bloat. After
trying a couple of ways to quantify it, I've found this particular
measure correlates well with the nastiest bloat issues I've ran into in
production recently. For example, an index that had swelled to over 5X
its original size due to autovacuum issues registered at 0.86 on this
scale. I could easily see people putting an alert at something like
0.40 and picking candidates to reindex based on it triggering. That
would be about a million times smarter than how I've been muddling
through this class of problems so far.

Code by Jaime Casanova, based on a prototype by me. Thanks to attendees
and sponsors of the PgWest conference for helping to fund some deeper
exploration of this idea.

Here's a test case showing it in action:

create extension pageinspect;
create table t (k serial,v integer);
insert into t(v) (select generate_series(1,100000));
create index t_idx on t(k);
delete from t where k<50000;
vacuum t;

gsmith=# select relation_free_space('t');
relation_free_space
---------------------
0.445466

gsmith=# select relation_free_space('t_idx');
relation_free_space
---------------------
0.550946

Some open questions in my mind:

-Given this is doing a full table scan, should it hook into a ring
buffer to keep from trashing the buffer cache? Or might it loop over
the relation in a different way all together? I was thinking about
eyeing the FSM instead at one point, didn't explore that yet. There's
certainly a few ways to approach this, we just aimed at the easiest way
to get a working starter implementation, and associated results to
compare others against.

-Should there be a non-superuser version of this? We'd certainly need
to get a less cache demolishing version before that would seem wise.

-There were related things in the pageinspect module, but a case could
be made for this being a core function instead. It's a bit more likely
to be used in production than the rest of that extension.

-What if anything related to TOAST should this handle?

We're also planning to do a sampling version of this, using the same
approach ANALYZE does. Grab a number of blocks, extrapolate from
there. It shouldn't take many samples before the accuracy is better
than how people are estimated this now. That work is just waiting on
some better thinking about how to handle the full relation version first.

And, yes, the explanation in the docs and code should be clear that it's
returning a percentage, which I just realized when writing this. At
least I remembered to document something; still ahead of the average new
patch...

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

Attachments:

relation_free_space-v2.patchtext/x-patch; name=relation_free_space-v2.patchDownload+156-0
#2Magnus Hagander
magnus@hagander.net
In reply to: Greg Smith (#1)
Re: Measuring relation free space

On Sun, Nov 6, 2011 at 04:08, Greg Smith <greg@2ndquadrant.com> wrote:

Attached patch adds a new function to the pageinspect extension for
measuring total free space, in either tables or indexes.  It returns the
free space as a percentage, so higher numbers mean more bloat.  After trying
a couple of ways to quantify it, I've found this particular measure
correlates well with the nastiest bloat issues I've ran into in production
recently.  For example, an index that had swelled to over 5X its original
size due to autovacuum issues registered at 0.86 on this scale.  I could
easily see people putting an alert at something like 0.40 and picking
candidates to reindex based on it triggering.  That would be about a million
times smarter than how I've been muddling through this class of problems so
far.

Code by Jaime Casanova, based on a prototype by me.  Thanks to attendees and
sponsors of the PgWest conference for helping to fund some deeper
exploration of this idea.

Looks pretty useful.

One quick stylistic comment - we don't generally use "* 1.0" to turn
an int into a double - just use a cast.

-Given this is doing a full table scan, should it hook into a ring buffer to
keep from trashing the buffer cache?  Or might it loop over the relation in
a different way all together?  I was thinking about eyeing the FSM instead
at one point, didn't explore that yet.  There's certainly a few ways to
approach this, we just aimed at the easiest way to get a working starter
implementation, and associated results to compare others against.

Hooking into a ring buffer seems like an almost requirement before you
can run this on a larger production system, wouldn't it? I don't know
how hard that is code-wise, but it certainly seems worthwhile.

-Should there be a non-superuser version of this?  We'd certainly need to
get a less cache demolishing version before that would seem wise.

Not sure that's necessary - at least not for now. Many other
diagnostics functions are already superuser only...

-There were related things in the pageinspect module, but a case could be
made for this being a core function instead.  It's a bit more likely to be
used in production than the rest of that extension.

A case can be made for a lot of things in contrib to be in core ;) I
say let's keep it in pageinspect, but then also have you finish off
that "split up the contrib" patch :-)

-What if anything related to TOAST should this handle?

Similar data for TOAST relations would be intersting, no? But that's
easily done from userspace by just querying to the toast table
specifically, I assume?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#3Bernd Helmle
mailings@oopsware.de
In reply to: Greg Smith (#1)
Re: Measuring relation free space

--On 6. November 2011 01:08:11 -0200 Greg Smith <greg@2ndQuadrant.com> wrote:

Attached patch adds a new function to the pageinspect extension for measuring
total free space, in either tables or indexes.

I wonder if that should be done in the pgstattuple module, which output some
similar numbers.

--
Thanks

Bernd

#4Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Bernd Helmle (#3)
Re: Measuring relation free space

On 07/11/11 10:20, Bernd Helmle wrote:

--On 6. November 2011 01:08:11 -0200 Greg Smith <greg@2ndQuadrant.com>
wrote:

Attached patch adds a new function to the pageinspect extension for
measuring
total free space, in either tables or indexes.

I wonder if that should be done in the pgstattuple module, which
output some similar numbers.

Not meaning to disparage Greg's effort in any way, but I was thinking
the same thing about pg_freespacemap. I have not checked what - if any
differences there are in output, but it would be interesting to compare
which of the various (3 at present) extensions with slightly overlapping
areas of functionality should perhaps be merged.

I am guessing (at this point very much guessing) that pg_freespace map
may return its data faster, as pageinspect is gonna have to grovel
through all the pages for itself (whereas pg_freespacemap relies on
using info from the ... free space map fork).

regards

Mark

#5Greg Smith
gsmith@gregsmith.com
In reply to: Mark Kirkwood (#4)
Re: Measuring relation free space

On 11/06/2011 11:55 PM, Mark Kirkwood wrote:

I am guessing (at this point very much guessing) that pg_freespace map
may return its data faster, as pageinspect is gonna have to grovel
through all the pages for itself (whereas pg_freespacemap relies on
using info from the ... free space map fork).

I started with pageinspect because I wasn't sure if other methods would
be as accurate. For example, I wasn't sure until just before submission
that only the free space and size of the relation are needed to get a
useful measure here; at one point I was considering some other things
too. I've ruled them out as unnecessary as far as I can tell, but I
can't claim my tests are exhaustive. Some review confirmation that this
measure is useful for other people is one thing I was hoping for
feedback on, as one thing to consider in addition to the actual
implementation.

If this measurement is the only one needed, than as I said at the start
of the thread here it might easily be implemented to run just against
the free space map instead. I'm thinking of what's been sent so far as
a UI with matching output it should produce. If it's possible to get
the same numbers faster, exactly how to implement the function under the
hood is easy enough to change. Jaime already has a new version in
development that adds a ring buffer for example.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#6Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Magnus Hagander (#2)
Re: Measuring relation free space

On Sun, Nov 6, 2011 at 5:38 AM, Magnus Hagander <magnus@hagander.net> wrote:

Looks pretty useful.

thanks for the review, attached is a new version of it

One quick stylistic comment - we don't generally use "* 1.0" to turn
an int into a double - just use a cast.

ok

Hooking into a ring buffer seems like an almost requirement before you
can run this on a larger production system, wouldn't it? I don't know
how  hard that is code-wise, but it certainly seems worthwhile.

seems it wasn't too difficult... i just have to indicate the right
buffer access strategy so it's using a ring buffer now

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

Attachments:

relation_free_space-v3.patchtext/x-patch; charset=US-ASCII; name=relation_free_space-v3.patchDownload+170-0
#7Robert Treat
xzilla@users.sourceforge.net
In reply to: Greg Smith (#5)
Re: Measuring relation free space

On Tue, Nov 8, 2011 at 1:07 PM, Greg Smith <greg@2ndquadrant.com> wrote:

On 11/06/2011 11:55 PM, Mark Kirkwood wrote:

I am guessing (at this point very much guessing) that pg_freespace map may
return its data faster, as pageinspect is gonna have to grovel through all
the pages for itself (whereas pg_freespacemap relies on using info from the
... free space map fork).

I started with pageinspect because I wasn't sure if other methods would be
as accurate.  For example, I wasn't sure until just before submission that
only the free space and size of the relation are needed to get a useful
measure here; at one point I was considering some other things too.  I've
ruled them out as unnecessary as far as I can tell, but I can't claim my
tests are exhaustive.  Some review confirmation that this measure is useful
for other people is one thing I was hoping for feedback on, as one thing to
consider in addition to the actual implementation.

If this measurement is the only one needed, than as I said at the start of
the thread here it might easily be implemented to run just against the free
space map instead.  I'm thinking of what's been sent so far as a UI with
matching output it should produce.  If it's possible to get the same numbers
faster, exactly how to implement the function under the hood is easy enough
to change.  Jaime already has a new version in development that adds a ring
buffer for example.

It's already easy to get "good enough" numbers based on user space
tools with very little overhead, so I think it's more important that
the server side tool be accurate rather than fast. Of course, if we
can get both, that's a bonus, but I'd rather not go that route at the
expense of accuracy. Just my .02.

Robert Treat
conjecture: xzilla.net
consulting: omniti.com

#8Greg Smith
gsmith@gregsmith.com
In reply to: Robert Treat (#7)
Re: Measuring relation free space

On 11/08/2011 05:07 PM, Robert Treat wrote:

It's already easy to get "good enough" numbers based on user space
tools with very little overhead, so I think it's more important that
the server side tool be accurate rather than fast.

What user space method do you consider good enough here? I haven't
found any approximation that I was really happy with; wouldn't have
bothered with this otherwise.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jaime Casanova (#6)
Re: Measuring relation free space

Excerpts from Jaime Casanova's message of mar nov 08 18:12:25 -0300 2011:

On Sun, Nov 6, 2011 at 5:38 AM, Magnus Hagander <magnus@hagander.net> wrote:

Looks pretty useful.

thanks for the review, attached is a new version of it

Note that AFAIK you shouldn't update the 1.0 extension script ... you
have to create a 1.1 version (or whatever), update the default version
in the control file, and create an 1.0--1.1 script to upgrade from the
original version to 1.1.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#10Robert Treat
xzilla@users.sourceforge.net
In reply to: Greg Smith (#8)
Re: Measuring relation free space

On Tue, Nov 8, 2011 at 7:19 PM, Greg Smith <greg@2ndquadrant.com> wrote:

On 11/08/2011 05:07 PM, Robert Treat wrote:

It's already easy to get "good enough" numbers based on user space
tools with very little overhead, so I think it's more important that
the server side tool be accurate rather than fast.

What user space method do you consider good enough here?  I haven't found
any approximation that I was really happy with; wouldn't have bothered with
this otherwise.

check_postgres and the pg_bloat_report both use a method of comparing
on disk size vs estimated size based on table structure (or index
info). Run regularly, it's certainly possible to keep bloat under
control. That said, I'd still like to see something more accurate.

Robert Treat
conjecture: xzilla.net
consulting: omniti.com

#11Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Alvaro Herrera (#9)
Re: Measuring relation free space

On Wed, Nov 9, 2011 at 7:58 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Jaime Casanova's message of mar nov 08 18:12:25 -0300 2011:

On Sun, Nov 6, 2011 at 5:38 AM, Magnus Hagander <magnus@hagander.net> wrote:

Looks pretty useful.

thanks for the review, attached is a new version of it

Note that AFAIK you shouldn't update the 1.0 extension script ... you
have to create a 1.1 version (or whatever), update the default version
in the control file, and create an 1.0--1.1 script to upgrade from the
original version to 1.1.

good point... fixed that...
a question i have is: are we supposed to let the old script (1.0) around?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

Attachments:

relation_free_space-v4.patchtext/x-patch; charset=US-ASCII; name=relation_free_space-v4.patchDownload+294-6
#12Jeff Janes
jeff.janes@gmail.com
In reply to: Jaime Casanova (#11)
Re: Measuring relation free space

On Mon, Nov 14, 2011 at 2:02 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:

On Wed, Nov 9, 2011 at 7:58 AM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Jaime Casanova's message of mar nov 08 18:12:25 -0300 2011:

On Sun, Nov 6, 2011 at 5:38 AM, Magnus Hagander <magnus@hagander.net> wrote:

Looks pretty useful.

thanks for the review, attached is a new version of it

Note that AFAIK you shouldn't update the 1.0 extension script ... you
have to create a 1.1 version (or whatever), update the default version
in the control file, and create an 1.0--1.1 script to upgrade from the
original version to 1.1.

good point... fixed that...
a question i have is: are we supposed to let the old script (1.0) around?

Since the syntax to install a non-default version is supported, I
would argue the old script should be kept.
CREATE extension pageinspect with version "1.0"

This patch applies and builds cleanly. It works either for "CREATE
EXTENSION" from scratch, or for updating from the prior version with
"ALTER EXTENSION..UPDATE".

It seems to be using the buffer ring strategy as advertised.

It reports space that is free exclusively for updates as being free.
In other words, it considers space free even if it is reserved against
inserts in deference to fillfactor. This is in contrast to
pg_freespace, which only reports space available for inserts as being
available. I think this is reasonable behavior, but it is subtle and
should perhaps be documented. (Is it common to use fill factors other
than the default in the first place? Do we assume that people using
fillfactor are sophisticated enough not to shot themselves in the
foot?)

As noted by Greg, the documentation calls it "total amount of free
free [sic] space" when that is not what is reported. However, it also
is not reporting a percentage, but rather a decimal fraction. The
reported value should be multiplied by 100, especially if the docs are
going to be changed to call it a percentage.

Unless I am missing something, all indexes are handled via a procedure
designed for BTree indices, "GetBTRelationFreeSpace". I don't know
that the ultimate behavior of this is wrong, but it seems unusual. If
I get some more time, I will try to explore what is actually going on
when called on other types of indexes.

I have no insight into how to handle toast tables, or non-superusers.
I had thought that toast tables had names of their own which could be
used, but I could not figure out how to do that.

Even if there are other ways to get approximately the same
information, this functionality seems to be a natural thing to have in
the pageinspect extension.

Cheers,

Jeff

#13Greg Smith
gsmith@gregsmith.com
In reply to: Jeff Janes (#12)
Re: Measuring relation free space

On 11/25/2011 04:42 PM, Jeff Janes wrote:

It reports space that is free exclusively for updates as being free.
In other words, it considers space free even if it is reserved against
inserts in deference to fillfactor. This is in contrast to
pg_freespace, which only reports space available for inserts as being
available. I think this is reasonable behavior, but it is subtle and
should perhaps be documented.

Ah, that's right, this is why I first wandered this specific path.
Ignoring fillfactor seems to have even more downsides as I see it.
Certainly deserves a doc improvement, as well as fixing the description
of the value so it's clearly a ratio rather than a true percentage.

(Is it common to use fill factors other
than the default in the first place? Do we assume that people using
fillfactor are sophisticated enough not to shot themselves in the
foot?)

It's not common, and I think anyone who sets fillfactor themselves would
understand the downside. The bigger risk are people who inherit designs
from others that use this feature, but the new person doesn't understand
it. If using this feature calls attention to a problem there that
prompts an investigation, I'd see that as a good thing, rather than a
foot shot.

Unless I am missing something, all indexes are handled via a procedure
designed for BTree indices, "GetBTRelationFreeSpace". I don't know
that the ultimate behavior of this is wrong, but it seems unusual. If
I get some more time, I will try to explore what is actually going on
when called on other types of indexes.

This I think I'll punt back toward Jaime, as well as asking "did you
have a plan for TOAST here?"

#14Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Greg Smith (#13)
Re: Measuring relation free space

On Mon, Nov 28, 2011 at 5:40 AM, Greg Smith <greg@2ndquadrant.com> wrote:

Unless I am missing something, all indexes are handled via a procedure
designed for BTree indices, "GetBTRelationFreeSpace".  I don't know
that the ultimate behavior of this is wrong, but it seems unusual.  If
I get some more time, I will try to explore what is actually going on
when called on other types of indexes.

This I think I'll punt back toward Jaime, as well as asking "did you have a
plan for TOAST here?"

for indexes. it seems pageinspect only deals with btree indexes and i
neglected to put a similar limitation on this function... now, because
the free space is calculated using PageGetFreeSpace() for indexes it
should be doing the right thing for all kind of indexes, i only put
the function there because i was trying to avoid to create a new file.
But if the function is right for all kind of indexes that's maybe
enough to create a new file and rename the helper function so is
obvious that it can manage all kind of indexes

for toast tables. a simple test here seems to show that is as easy as
to add toast tables in the supported objects and treat them as normal
pages...

or there is something i'm missing?

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

#15Greg Smith
gsmith@gregsmith.com
In reply to: Greg Smith (#13)
Re: Measuring relation free space

On 11/28/2011 05:40 AM, Greg Smith wrote:

Ignoring fillfactor seems to have even more downsides as I see it.
Certainly deserves a doc improvement, as well as fixing the
description of the value so it's clearly a ratio rather than a true
percentage.

So: I'm very clear on what to do here now:

-Make the computation be in units that match it documetnation
-Take a look at other index types, as well as TOAST, at least to get the
easy ones right.
-Fully confirm the extension upgrade logic works as hoped

That's the must do stuff. Then there's two more features to consider
and do something with if sensible:

-Double check whether there is really a useful role in using
pg_freespace here. I don't think the numbers will be as good, but maybe
we don't care.
-Once the above is all sorted, add a second UI that samples some pages
and extrapolates, ANALYZE-style, rather than hitting everything.

This ones leaves as returned with feedback, feeling pretty good it will
be whipped into good shape for the last 9.2 CommitFest.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#16Noah Misch
noah@leadboat.com
In reply to: Bernd Helmle (#3)
Re: Measuring relation free space

On Sun, Nov 06, 2011 at 10:20:49PM +0100, Bernd Helmle wrote:

--On 6. November 2011 01:08:11 -0200 Greg Smith <greg@2ndQuadrant.com> wrote:

Attached patch adds a new function to the pageinspect extension for measuring
total free space, in either tables or indexes.

I wonder if that should be done in the pgstattuple module, which output
some similar numbers.

Indeed, pgstattuple already claims to show precisely the same measure. Its
reckoning is right in line for heaps, but the proposed pageinspect function
finds more free space in indexes:

[local] test=# SELECT t.free_percent, relation_free_space('pg_proc'), i.free_percent, relation_free_space('pg_proc_proname_args_nsp_index') FROM pgstattuple('pg_proc') t, pgstattuple('pg_proc_proname_args_nsp_index') i;
free_percent | relation_free_space | free_percent | relation_free_space
--------------+---------------------+--------------+---------------------
2.53 | 0.0253346 | 8.61 | 0.128041
(1 row)

Is one of those index figures simply wrong, or do they measure two senses of
free space, both of which are interesting to DBAs?

Thanks,
nm

#17Greg Smith
gsmith@gregsmith.com
In reply to: Noah Misch (#16)
Re: Measuring relation free space

On 12/15/2011 04:11 PM, Noah Misch wrote:

Is one of those index figures simply wrong, or do they measure two senses of
free space, both of which are interesting to DBAs?

I think the bigger one--the one I was aiming to measure--also includes
fill-factor space. It should be possible to isolate whether that's true
by running the function against a fresh index, or by trying tests with a
table where there's no useful fill. I need to add some of those to the
test example suite.

While in theory both measures of free space might be interesting to
DBAs, I'd prefer to have the one that reflects the lost space to
fill-factor if I'm checking an index. But as Robert Treat was pointing
out, even the very rough estimates being made with existing user-space
tools not using the contrib module features are helpful enough for a lot
of users. So long as it's easy and accuracy is good enough to find
bloated indexes, either implementation is probably good enough.

Shaking out the alternate implementation ideas was really my goal for
this CF here. The major goal of the next revision is to present the
options with a measure of their respective accuracy and runtime. If I
have to give up just a of bit of accuracy and make it much faster,
that's probably what most people want as an option. When Jaime and I
come back with an update, it really needs to have benchmarks and
accuracy numbers for each option. That may be complicated a bit
depending on how much of the table or index is cached, so isolating that
out will be a pain.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#18Noah Misch
noah@leadboat.com
In reply to: Greg Smith (#17)
Re: Measuring relation free space

On Fri, Dec 16, 2011 at 02:02:03AM -0500, Greg Smith wrote:

On 12/15/2011 04:11 PM, Noah Misch wrote:

Is one of those index figures simply wrong, or do they measure two senses of
free space, both of which are interesting to DBAs?

I think the bigger one--the one I was aiming to measure--also includes
fill-factor space. It should be possible to isolate whether that's true
by running the function against a fresh index, or by trying tests with a
table where there's no useful fill. I need to add some of those to the
test example suite.

No, both measures include fillfactor space. From a brief look at the code, the
proposed function counts space in non-leaf pages, while pgstattuple does not.
Also, the proposed function counts half-dead pages like live pages, while
pgstattuple counts them like dead pages.

One could perhaps justify those choices either way, but they seem too esoteric
for DBA exposure. I recommend choosing a policy on each and making both
pgstattuple() and any new code respect that policy.

Shaking out the alternate implementation ideas was really my goal for
this CF here. The major goal of the next revision is to present the
options with a measure of their respective accuracy and runtime. If I
have to give up just a of bit of accuracy and make it much faster,
that's probably what most people want as an option. When Jaime and I
come back with an update, it really needs to have benchmarks and
accuracy numbers for each option. That may be complicated a bit
depending on how much of the table or index is cached, so isolating that
out will be a pain.

The previous submission seemed to boil down to a speedier version of "SELECT
free_percent FROM pgstattuple('foo')". (Some of the other statistics aren't
cheap.) Considering that, the code does belong in the pgstattuple module.

The sampling approach you have mentioned sounds promising, especially for
indexes. For heap bloat, it may be hard to improve on pg_freespacemap-based and
check_postgres-style estimates with anything less than a full heap scan.

Thanks,
nm

#19Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#16)
Re: Measuring relation free space

On Thu, Dec 15, 2011 at 4:11 PM, Noah Misch <noah@leadboat.com> wrote:

On Sun, Nov 06, 2011 at 10:20:49PM +0100, Bernd Helmle wrote:

--On 6. November 2011 01:08:11 -0200 Greg Smith <greg@2ndQuadrant.com> wrote:

Attached patch adds a new function to the pageinspect extension for measuring
total free space, in either tables or indexes.

I wonder if that should be done in the pgstattuple module, which output
some similar numbers.

Indeed, pgstattuple already claims to show precisely the same measure.  Its
reckoning is right in line for heaps, but the proposed pageinspect function
finds more free space in indexes:

[local] test=# SELECT t.free_percent, relation_free_space('pg_proc'), i.free_percent, relation_free_space('pg_proc_proname_args_nsp_index') FROM pgstattuple('pg_proc') t, pgstattuple('pg_proc_proname_args_nsp_index') i;
 free_percent | relation_free_space | free_percent | relation_free_space
--------------+---------------------+--------------+---------------------
        2.53 |           0.0253346 |         8.61 |            0.128041
(1 row)

Is one of those index figures simply wrong, or do they measure two senses of
free space, both of which are interesting to DBAs?

i created a test env using pgbench -s 20 -F 90, i then create a new
table (that keep tracks actions that happens the the pgbench tables,
so insert only) and changed a few fillfactors:
"""
relname | reltuples | reloptions
-------------------------------------+---- -------+------------------
audit_log | 804977 |
pgbench_accounts | 1529890 | {fillfactor=90}
pgbench_accounts_pkey | 1529890 | {fillfactor=50}
pgbench_branches | 20 | {fillfactor=100}
pgbench_branches_pkey | 20 |
pgbench_history | 94062 |
pgbench_tellers | 200 | {fillfactor=100}
pgbench_tellers_pkey | 200 |
(8 rows)
"""

and after running "pgbench -n -c 4 -j 2 -T 300" a few times, i used
attached free_space.sql to see what pg_freespacemap, pgstattuple and
relation_free_space had to say about these tables. the result is
attached in result_free_space.out

my first conclusion is that pg_freespacemap is unreliable when indexes
are involved (and looking at the documentation of that module confirms
that), also the fact that FSM is not designed for accuracy make me
think is not an option.

pgstattuple and relation_free_space are very close in all the numbers
except for 2 indexes pgbench_branches_pkey and pgbench_tellers_pkey;
after a VACUUM FULL and a REINDEX (and the difference persistence) i
checked pgbench_tellers_pkey contents (it has only one page besides
the metapage) and the numbers that i look at where:

page size: 8192
free size: 4148

which in good romance means 50% of free space... so, answering Noah's
question: if that difference has some meaning i can't see it but
looking at the evidence the measure relation_free_space() is giving is
the good one

so, tomorrow (or ...looking at the clock... later today) i will update
the relation_free_space() patch to accept toast tables and other kind
of indexes and add it to the commitfest unless someone says that my
math is wrong and somehow there is a more accurate way of getting the
free space (which is entirely possible)

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación

Attachments:

free_space.sqltext/x-sql; charset=US-ASCII; name=free_space.sqlDownload
result_free_space.outapplication/octet-stream; name=result_free_space.outDownload
#20Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#19)
Re: Measuring relation free space

On Sat, Jan 14, 2012 at 04:41:57AM -0500, Jaime Casanova wrote:

pgstattuple and relation_free_space are very close in all the numbers
except for 2 indexes pgbench_branches_pkey and pgbench_tellers_pkey;
after a VACUUM FULL and a REINDEX (and the difference persistence) i
checked pgbench_tellers_pkey contents (it has only one page besides
the metapage) and the numbers that i look at where:

page size: 8192
free size: 4148

which in good romance means 50% of free space... so, answering Noah's
question: if that difference has some meaning i can't see it but
looking at the evidence the measure relation_free_space() is giving is
the good one

so, tomorrow (or ...looking at the clock... later today) i will update
the relation_free_space() patch to accept toast tables and other kind
of indexes and add it to the commitfest unless someone says that my
math is wrong and somehow there is a more accurate way of getting the
free space (which is entirely possible)

Did you see this followup[1]http://archives.postgresql.org/message-id/20111218165625.GB6393@tornado.leadboat.com? To summarize:

- pgstattuple() and relation_free_space() should emit the same number, even if
that means improving pgstattuple() at the same time.
- relation_free_space() belongs in the pgstattuple extension, because its role
is cheaper access to a single pgstattuple() metric.

Thanks,
nm

[1]: http://archives.postgresql.org/message-id/20111218165625.GB6393@tornado.leadboat.com

#21Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#20)
#22Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#21)
#23Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#22)
#24Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#23)
#25Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#24)
#26Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#25)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Noah Misch (#26)
#28Noah Misch
noah@leadboat.com
In reply to: Alvaro Herrera (#27)
#29Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#28)
#30Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#29)
#31Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#30)
#32Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#31)
#33Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#32)
#34Noah Misch
noah@leadboat.com
In reply to: Jaime Casanova (#33)
#35Jaime Casanova
jcasanov@systemguards.com.ec
In reply to: Noah Misch (#34)
#36Robert Haas
robertmhaas@gmail.com
In reply to: Jaime Casanova (#35)
#37Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Greg Smith (#1)
#38Robert Haas
robertmhaas@gmail.com
In reply to: Abhijit Menon-Sen (#37)
#39Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Robert Haas (#38)
#40Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#37)
#41Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Amit Kapila (#40)
#42Andres Freund
andres@anarazel.de
In reply to: Abhijit Menon-Sen (#41)
#43Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Andres Freund (#42)
#44Andres Freund
andres@anarazel.de
In reply to: Abhijit Menon-Sen (#43)
#45Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Andres Freund (#44)
#46Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#44)
#47Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#46)
#48Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#41)
#49Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Abhijit Menon-Sen (#45)
#50Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Abhijit Menon-Sen (#49)
#51Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Jim Nasby (#50)
#52Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Abhijit Menon-Sen (#51)
#53Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Jim Nasby (#52)
#54Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Abhijit Menon-Sen (#53)
#55Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tomas Vondra (#54)
#56Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Jim Nasby (#55)
#57Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tomas Vondra (#56)
#58Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Jim Nasby (#57)
#59Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tomas Vondra (#58)
#60Simon Riggs
simon@2ndQuadrant.com
In reply to: Jim Nasby (#59)
#61Mark Kirkwood
mark.kirkwood@catalyst.net.nz
In reply to: Simon Riggs (#60)
#62Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Mark Kirkwood (#61)
#63Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#49)
#64Amit Kapila
amit.kapila16@gmail.com
In reply to: Tomas Vondra (#54)
#65Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Amit Kapila (#64)
#66Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Abhijit Menon-Sen (#65)
#67Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#66)
#68Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Amit Kapila (#67)
#69Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#68)
#70Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Amit Kapila (#69)
#71Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#70)
#72Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Amit Kapila (#71)
#73Amit Kapila
amit.kapila16@gmail.com
In reply to: Abhijit Menon-Sen (#72)
#74Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Amit Kapila (#73)
#75Amit Kapila
amit.kapila16@gmail.com
In reply to: Tomas Vondra (#74)
#76Andres Freund
andres@anarazel.de
In reply to: Abhijit Menon-Sen (#72)
#77Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Andres Freund (#76)
#78Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Abhijit Menon-Sen (#77)
#79Andres Freund
andres@anarazel.de
In reply to: Abhijit Menon-Sen (#78)
#80Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Andres Freund (#79)
#81Andres Freund
andres@anarazel.de
In reply to: Abhijit Menon-Sen (#80)
#82Abhijit Menon-Sen
ams@2ndQuadrant.com
In reply to: Andres Freund (#81)