TRUNCATE - timing of the return of disk space - caused by long-lived client?

Started by Vince Negriover 18 years ago4 messagesgeneral
Jump to latest
#1Vince Negri
vnegri@asl-electronics.co.uk

Hi All,

Running postgres 8.1.9 on SLES10 (kernel 2.6.16) currently, I noticed something about the behaviour
of TRUNCATE on a large table.

The docs imply that TRUNCATE is the best way to delete everything in a table
(rather than DELETE) since there is no need to VACUUM afterward - the disk space
used is immediately returned to the operating system.

In the setup in question, there is one table in the cluster that acts as a sequential
log. A long-lived process (24/7) connects to the cluster and writes rows to the table
(existing rows are never altered.) The client does not use transactions, only single INSERT commands.

Obviously this table can't be allowed to grow for ever, but it is important not to
disrupt the client connection. One approach is to periodically
DELETE old entries and then do a VACUUM so that they can be re-used. This is quite slow since
the table is large.

So I tried TRUNCATE on the table. It appeared to work - in that the row count dropped to zero
and the connected client was not disrupted, and "du" on the postgres data directory showed a fall.

But the available disk space (reported by "df") did not fall.

So I used "lsof | grep pgsql | grep deleted" to look for files that have been deleted but are held open
and sure enough, there is the file for the table I just truncated. It is referenced by a number of
postmaster processes(threads?) Most of which are associated with connections that have *never queried* the
table in question, which is odd, but one process is associated with the long-lived connection.

What causes the file handles of the truncated table to be released by all postmaster processes?
I am concerned that some of these files will only get fully deleted once all clients have disconnected
or the postgres server shuts down (neither of which is desirable.)

Vince

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Vince Negri (#1)
Re: TRUNCATE - timing of the return of disk space - caused by long-lived client?

Vince Negri wrote:

So I tried TRUNCATE on the table. It appeared to work - in that the
row count dropped to zero and the connected client was not disrupted,
and "du" on the postgres data directory showed a fall.

But the available disk space (reported by "df") did not fall.

I think you shouldn't worry about it. The file will eventually be
closed (maybe after a couple of checkpoints) and the space returned to
the filesystem.

FYI what TRUNCATE does is create a new, separate file for the table and
index storages. The old one can still be open for a while, but it
should get detached not long after the transaction commits.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vince Negri (#1)
Re: TRUNCATE - timing of the return of disk space - caused by long-lived client?

"Vince Negri" <vnegri@asl-electronics.co.uk> writes:

What causes the file handles of the truncated table to be released by all postmaster processes?

It should happen when the other backends process the sinval message
about the TRUNCATE, which at the latest should be the next time they
begin command execution. What were the other clients doing, just
sitting idle?

regards, tom lane

#4Vince Negri
vnegri@asl-electronics.co.uk
In reply to: Tom Lane (#3)
Re: TRUNCATE - timing of the return of disk space - caused by long-lived client?

Hi Tom (and all)

Yes, in the meantime I realised that the other relevant clients (the ones that
seemed to be holding the file handle) were ones that sat idle most of the time
and rarely executed any query. You are right, as each of these executed a query
(thus processing sinval) they released the filehandle.

Thanks for the pointers.

Vince

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 26 October 2007 13:22
To: Vince Negri
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] TRUNCATE - timing of the return of disk space -
caused by long-lived client?

"Vince Negri" <vnegri@asl-electronics.co.uk> writes:

What causes the file handles of the truncated table to be released by all postmaster processes?

It should happen when the other backends process the sinval message
about the TRUNCATE, which at the latest should be the next time they
begin command execution. What were the other clients doing, just
sitting idle?

regards, tom lane