Why do I have holes in my pages?

Started by Aleksey Tsalolikhinover 13 years ago11 messagesgeneral
Jump to latest
#1Aleksey Tsalolikhin
atsaloli.tech@gmail.com

Why do I have holes in my pages?

Postgres 8.4.12

"select ctid from big_table" on my master shows that pages have
"holes" in them.

Here is example for page 431665:

(431665,2)
(431665,5)
(431665,8)
(431665,11)
(431665,14)
(431665,17)
(431665,20)
(431665,23)

Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).

So what happened to rows 1, 3 and 4 and so on?

I have to size a database server for next year's budget, and I will
have to explain to my mgmt why we have 400 GB taking up 2.7 TB. Help?
Would appreciate a pointer to the appropriate section in the manual
if this is documented.

Thanks,
-at

#2Bill Moran
wmoran@potentialtech.com
In reply to: Aleksey Tsalolikhin (#1)
Re: Why do I have holes in my pages?

In response to Aleksey Tsalolikhin <atsaloli.tech@gmail.com>:

Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).

I expect that the first thing that others are going to ask
is "what is telling you that your DB is 400G?"

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

#3Aleksey Tsalolikhin
atsaloli.tech@gmail.com
In reply to: Bill Moran (#2)
Re: Why do I have holes in my pages?

On Thu, Sep 20, 2012 at 12:34 PM, Bill Moran <wmoran@potentialtech.com> wrote:

In response to Aleksey Tsalolikhin <atsaloli.tech@gmail.com>:

Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB filesystem).

I expect that the first thing that others are going to ask
is "what is telling you that your DB is 400G?"

Right on. I got that out of my pgstatspack report.

\l+ in psql tells me the same thing - 400 GB

#4Victor Yegorov
vyegorov@gmail.com
In reply to: Aleksey Tsalolikhin (#3)
Re: Why do I have holes in my pages?

Take a look at this part of the documentation:
http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY

The “missing” entries belong to the tuples that you have DELETEd/UPDATEd
and that are no longer visible
to your current session, but still might be for the others, that started
some time ago. When tuples are no longer
needed, VACUUM will “release” the slots by adding them into the
FreeSpaceMap.

Still, if you have “empty” slots in the middle of your datafiles, VACUUM
cannot resize files.
This leads to the fact that while database size is being not so big, actual
disk space occupied by it
is bigger. This is called “bloat”.

Check the output of the query here:
http://wiki.postgresql.org/wiki/Show_database_bloat

Also, having such a big difference in the reported and actual size of the
database, may I ask:
- when was the last time you performed VACUUM?
- don't you have autovacuum = on (which is default) in your configuration?

2012/9/20 Aleksey Tsalolikhin <atsaloli.tech@gmail.com>

On Thu, Sep 20, 2012 at 12:34 PM, Bill Moran <wmoran@potentialtech.com>
wrote:

In response to Aleksey Tsalolikhin <atsaloli.tech@gmail.com>:

Current DB size is 400 GB and it takes up 2.7 TB (on a 6.6 TB

filesystem).

I expect that the first thing that others are going to ask
is "what is telling you that your DB is 400G?"

Right on. I got that out of my pgstatspack report.

\l+ in psql tells me the same thing - 400 GB

--
Victor Y. Yegorov

#5John R Pierce
pierce@hogranch.com
In reply to: Aleksey Tsalolikhin (#3)
Re: Why do I have holes in my pages?

On 09/20/12 1:34 PM, Aleksey Tsalolikhin wrote:

Right on. I got that out of my pgstatspack report.

\l+ in psql tells me the same thing - 400 GB

it might be interesting to see the output of...

du -hs $PGDATA/*

(assuming this is a linux or similar unix system)..... This will show
how much space is being used by the various directories under the PG
data directory. if pg_xlog is very large, you may have an issue with
wal archiving or something. if pg_log is very large, you may have an
issue with A) too much being logged, and B) nothing cleaning up stale
log files.

how did you arrive at the 2.7TB number? and what file system does
this 6.6TB volume use?

--
john r pierce N 37, W 122
santa cruz ca mid-left coast

#6Aleksey Tsalolikhin
atsaloli.tech@gmail.com
In reply to: John R Pierce (#5)
Re: Why do I have holes in my pages?

On Thu, Sep 20, 2012 at 1:53 PM, John R Pierce <pierce@hogranch.com> wrote:

On 09/20/12 1:34 PM, Aleksey Tsalolikhin wrote:

Right on. I got that out of my pgstatspack report.

\l+ in psql tells me the same thing - 400 GB

it might be interesting to see the output of...

du -hs $PGDATA/*

Well, that was it! Thanks, John!

2.3T /data/backups
400G /data/base

We store our pg_dumps on the same filesystem (they are copied off to
another server but we don't delete them) so it swelled the filesystem
size as reported by "df".

Sorry about that. And thanks for the help!

Aleksey

#7Jeff Janes
jeff.janes@gmail.com
In reply to: Victor Yegorov (#4)
Re: Why do I have holes in my pages?

On Thu, Sep 20, 2012 at 1:46 PM, Victor Yegorov <vyegorov@gmail.com> wrote:

Take a look at this part of the documentation:
http://www.postgresql.org/docs/current/interactive/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY

The “missing” entries belong to the tuples that you have DELETEd/UPDATEd and
that are no longer visible
to your current session, but still might be for the others, that started
some time ago. When tuples are no longer
needed, VACUUM will “release” the slots by adding them into the
FreeSpaceMap.

My understanding is (which is not very deep on this topic) is that it
is a lot more complicated than that.

To start with, it can be as you say where the ctid and its tuple are
interesting to someone, but not to you. But eventually the tuple is
not interesting to anyone, and its space can be reused. But the ctid
is still needed (to inform stragglers that it's corresponding tuple is
not interesting, and in fact no longer exists, so move on, nothing to
see). Then eventually even the ctid itself is not needed anymore even
for that purpose.

At that point the ctid can be re-used, but only if someone actually
wants a "new" ctid on that page. An ordinary vacuum will not close up
the gaps on un-used ctids. Only a vaccum full will do that.

The space used by these ctid gaps is not large, and as the OP
discovered, his wasted space was in fact happening outside of the
database itself.

Cheers,

Jeff

#8Victor Yegorov
vyegorov@gmail.com
In reply to: Jeff Janes (#7)
Re: Why do I have holes in my pages?

2012/9/21 Jeff Janes <jeff.janes@gmail.com>

To start with, it can be as you say where the ctid and its tuple are
interesting to someone, but not to you. But eventually the tuple is
not interesting to anyone, and its space can be reused. But the ctid
is still needed (to inform stragglers that it's corresponding tuple is
not interesting, and in fact no longer exists, so move on, nothing to
see). Then eventually even the ctid itself is not needed anymore even
for that purpose.

At that point the ctid can be re-used, but only if someone actually
wants a "new" ctid on that page. An ordinary vacuum will not close up
the gaps on un-used ctids. Only a vaccum full will do that.

In the “Routine Vacuuming” section of the documentation I read that:
«The standard form of VACUUM removes dead row versions in tables and
indexes and marks the space available for future reuse.»
and
«In contrast, VACUUM FULL actively compacts tables by writing a complete
new version of the table file with no dead space.»

What I wanted to say in the previous post was exactly this: vacuum will
mark the space (or gap) as free, while no space will be “returned” to the
OS, except if free pages are at the very end of the data file. I haven't
mentioned vacuum full at all.

It seems that this also matches your explanation, correct me if I'm wrong.

--
Victor Y. Yegorov

#9Jeff Janes
jeff.janes@gmail.com
In reply to: Victor Yegorov (#8)
Re: Why do I have holes in my pages?

On Fri, Sep 21, 2012 at 11:41 AM, Victor Yegorov <vyegorov@gmail.com> wrote:

It seems that this also matches your explanation, correct me if I'm wrong.

I think that the explanations do generally match. But, just because
you observe that the ctid space has not been reused (like the OP did),
does not mean that the tuple-space has not been reused. My
understanding is that tuple space is easier to reuse than ctid space
is. Sometimes that distinction is important, and sometimes it is not.
In general, doing "select ctid..." is a poor way of figuring out
where the space in your database is going.

Cheers,

Jeff

#10Aleksey Tsalolikhin
atsaloli.tech@gmail.com
In reply to: Jeff Janes (#9)
Re: Why do I have holes in my pages?

On Fri, Sep 21, 2012 at 12:45 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Fri, Sep 21, 2012 at 11:41 AM, Victor Yegorov <vyegorov@gmail.com> wrote:

It seems that this also matches your explanation, correct me if I'm wrong.

In general, doing "select ctid..." is a poor way of figuring out
where the space in your database is going.

Noted, thank you.

Aleksey

#11Sergey Konoplev
gray.ru@gmail.com
In reply to: Jeff Janes (#7)
Re: Why do I have holes in my pages?

On Fri, Sep 21, 2012 at 11:30 AM, Jeff Janes <jeff.janes@gmail.com> wrote:

At that point the ctid can be re-used, but only if someone actually
wants a "new" ctid on that page. An ordinary vacuum will not close up
the gaps on un-used ctids. Only a vaccum full will do that.

There are a couple of ways to do that except the vacuum full that
locks the table exclusively.

1. pg_reorg can re-organize tables on a postgres database without
locks. However it requires twice the space of the table size and might
lead to IO spikes.
2. pgcompactor a tables and indexes bloat reducing tool, without
locking also. It is slower than pg_reorg but does its job more gently.

The space used by these ctid gaps is not large, and as the OP
discovered, his wasted space was in fact happening outside of the
database itself.

Cheers,

Jeff

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +14158679984