Vaccuming dead rows on busy databases

Started by Greg Sabino Mullaneabout 19 years ago4 messages
#1Greg Sabino Mullane
greg@turnstep.com

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Is there a general strategy out there for removing dead rows on
busy databases? I've learned that both a vacuum and a vacuum full
will not remove dead rows if there is an open transaction anywhere
on the entire cluster. Closing open transactions is not always a
viable solution, especially for periodic cron-driven vacuums. What
do people generally do to get around this problem? I can think of
a few possible solutions, but wanted to see if there was a best
practice before I went too much further. All my solutions are
external as well: I'd like to see something like VACUUM FULL WAIT. :)

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200612181331
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iD8DBQFFht6qvJuQZxSWSsgRArZDAKD9zfrsuysep38o+UaW/2tDL/AdvACguZIO
jd0JERBi0AwBBZj+xu80Qtg=
=vSuy
-----END PGP SIGNATURE-----

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Sabino Mullane (#1)
Re: Vaccuming dead rows on busy databases

"Greg Sabino Mullane" <greg@turnstep.com> writes:

I'd like to see something like VACUUM FULL WAIT. :)

Sounds like a deadlock waiting to happen :-(

AFAIK the general practice is to just accept the fact that vacuum can't
remove recently-dead tuples. You should look into whether you can't
shorten your transactions --- very-long-running transactions create
other performance issues besides vacuum not removing stuff.

regards, tom lane

#3Matthew O'Connor
matthew@zeut.net
In reply to: Tom Lane (#2)
Re: Vaccuming dead rows on busy databases

Tom Lane wrote:

"Greg Sabino Mullane" <greg@turnstep.com> writes:

I'd like to see something like VACUUM FULL WAIT. :)

Sounds like a deadlock waiting to happen :-(

AFAIK the general practice is to just accept the fact that vacuum can't
remove recently-dead tuples. You should look into whether you can't
shorten your transactions --- very-long-running transactions create
other performance issues besides vacuum not removing stuff.

It seems to me that the most common support problem I keep seeing on the
mailing lists is VACUUM not working well because of long running
transactions. If I understand it correctly, people have talked about
reducing the problem by tracking xmin (or something, sorry if I'm
getting this wrong) on a per table basis rather and per cluster. Now
I'm sure this is not simple and I know I don't have the skills to do it,
but I think it would resolve (or at least significantly mitigate) what I
perceive as one of the biggest usage problems with PostgreSQL.

Comments?

#4Greg Sabino Mullane
greg@turnstep.com
In reply to: Tom Lane (#2)
Re: Vaccuming dead rows on busy databases

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...VACUUM FULL WAIT

Sounds like a deadlock waiting to happen :-(

Yeah, this sounds more and more like a job for a client application.

AFAIK the general practice is to just accept the fact that vacuum can't
remove recently-dead tuples. You should look into whether you can't
shorten your transactions --- very-long-running transactions create
other performance issues besides vacuum not removing stuff.

Sure, but does a long-running transaction on a different database in
the same cluster have any other consequences?

At any rate, I suppose this is something I can probably code around. If it
gets too bad, I'll try to coordinate the timing a bit more between the
databases, increase the frequency of vacuum, or simply kill some of the
long-running transactions before kicking off the vacuum.

Brian Hurt wrote:

My understanding is that vacuum can not delete any row that was deleted
after the oldest outstanding transaction. [snip]

Thanks, that was a good explanation.

Which is why having only a single transaction open, but it's been open
for 24 hours, is a problem.

Well, 24 hours is a bit much :), but perhaps until a database-specific xmin
is enabled, I'll also consider using an entirely different cluster for
databases which do long-runnning queries.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation
PGP Key: 0x14964AC8 200612181419
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iD8DBQFFhvZvvJuQZxSWSsgRArxYAKCh5YUbJosJiMDhon2vghIq0f0yIACeKdjD
0QK0N8P+C4odb7Vfvi5wy/U=
=Cvwh
-----END PGP SIGNATURE-----