walwriter not closing old files

Started by Magnus Hagander, almost 16 years ago (4 messages, hackers)
#1Magnus Hagander
magnus@hagander.net

I've just applied the attached file to the walwriter, to solve a case
when it keeps handles around to old xlog segments, preventing them
from actually being removed, and as such also causing alerts in some
monitoring systems. The way to provoke the problem is:

1. Do something that makes the walwriter active. For example, open a
transaction, do something, wait a while, do something, commit.
2. Now feed the system a steady stream of *small*, short-running
transactions. It's easier to provoke if you just do a simple insert
and then pg_switch_xlog(), but it can be done with a regular stream of
inserts. The important thing is that the updates must be short-lived
enough that walwriter *doesn't* trigger to flush anything out. If
you're unlucky (or lucky) you'll hit a walwriter run anyway, in which
case you just repeat the test.

This will leave walwriter holding the old segment open. If your system
*only* does these small and fairly quick transactions, it'll keep the
file open "forever". This is obviously only likely to happen on
lightly loaded systems, but it does keep the file from being properly
removed.
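
To illustrate why an open handle keeps a file from being "properly removed", here is a minimal sketch of the general POSIX behaviour, not PostgreSQL code. The function name and the temp-file prefix are made up for the example, and it assumes a *nix system (on Windows, unlinking an open file usually fails instead).

```python
import os
import tempfile


def hold_deleted_file(payload: bytes = b"x" * 4096):
    """Unlink a file while keeping a descriptor open; report what the OS still sees."""
    # Hypothetical stand-in for an old xlog segment; the prefix is an assumption.
    fd, path = tempfile.mkstemp(prefix="fake_xlog_segment_")
    try:
        os.write(fd, payload)
        os.unlink(path)              # the directory entry is gone ...
        info = os.fstat(fd)          # ... but the descriptor still reaches the data
        os.lseek(fd, 0, os.SEEK_SET)
        still_readable = os.read(fd, len(payload)) == payload
        return {
            "path_exists": os.path.exists(path),
            "link_count": info.st_nlink,   # typically 0 once unlinked
            "size_bytes": info.st_size,
            "still_readable": still_readable,
        }
    finally:
        os.close(fd)                 # only now can the space be reclaimed


if __name__ == "__main__":
    print(hold_deleted_file())
```

The name disappears from the directory right away, but the blocks stay allocated until the last descriptor is closed, which is the situation the walwriter ends up in with an old segment.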

The attached patch will close the open xlog file if it's no longer in
use. The check runs only if there was nothing for walwriter to do - if
it had something to do, the regular xlog routines will switch the file
for us.
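
As a rough model of that idea (not the patch itself, which is C inside the xlog code), the logic could be sketched like this. The class `ToyWalWriter`, its method names, and the "open segment differs from current segment" test are all assumptions made for illustration; the real check may be phrased differently.

```python
import os
import tempfile


class ToyWalWriter:
    """Toy model of a background writer that caches one open segment handle."""

    def __init__(self, directory):
        self.directory = directory
        self.open_fd = None        # cached handle kept between wake-ups
        self.open_segment = None   # segment number the handle refers to

    def _segment_path(self, segno):
        return os.path.join(self.directory, f"segment_{segno:08d}")

    def close_segment(self):
        if self.open_fd is not None:
            os.close(self.open_fd)
            self.open_fd = None
            self.open_segment = None

    def flush(self, segno):
        """Regular write path: it switches files itself when the segment changes."""
        if self.open_segment != segno:
            self.close_segment()
            self.open_fd = os.open(
                self._segment_path(segno), os.O_CREAT | os.O_WRONLY, 0o600
            )
            self.open_segment = segno
        os.write(self.open_fd, b"record")
        os.fsync(self.open_fd)

    def background_pass(self, pending_segno, current_segno):
        """One wake-up; pending_segno is None when there is nothing to flush."""
        if pending_segno is not None:
            self.flush(pending_segno)   # the write path handles the switch
            return "flushed"
        # Nothing to do: drop the handle if the system has moved past that segment.
        if self.open_segment is not None and self.open_segment != current_segno:
            self.close_segment()
            return "closed stale handle"
        return "idle"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        w = ToyWalWriter(d)
        print(w.background_pass(pending_segno=1, current_segno=1))
        print(w.background_pass(pending_segno=None, current_segno=3))
```

The extra check sits only on the idle branch, so the busy path pays nothing for it and keeps relying on the normal file switch.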

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

walwriter.patch (application/octet-stream; name=walwriter.patch), +13 -1
#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Magnus Hagander (#1)
Re: walwriter not closing old files

Magnus Hagander wrote:

I've just applied the attached file to the walwriter, to solve a
case when it keeps handles around to old xlog segments, preventing
them from actually being removed, and as such also causing alerts
in some monitoring systems.

Thanks! I wasted some time on these a while back; I'm sure this will
save others that kind of bother.

The way to provoke the problem is:

The way I ran into it was to have a web application which only ran
read-only transactions. Sooner or later it would need to write a
page from the buffer to make space to read a new page, and then it
would forever be holding a WAL file open, even after it was deleted.
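
For anyone trying to spot this condition, a small sketch of one way to list deleted-but-open files on Linux follows. It relies on the kernel appending " (deleted)" to the link targets under /proc/<pid>/fd. The function names are made up for the example, it is Linux-only, and reading another user's process needs the right privileges.

```python
import os

DELETED_SUFFIX = " (deleted)"


def deleted_targets(link_targets):
    """Return the paths (suffix stripped) that the kernel marks as deleted."""
    return [
        target[: -len(DELETED_SUFFIX)]
        for target in link_targets
        if target.endswith(DELETED_SUFFIX)
    ]


def open_deleted_files(pid="self"):
    """List deleted-but-open files of a process via /proc/<pid>/fd (Linux only)."""
    fd_dir = f"/proc/{pid}/fd"
    targets = []
    for name in os.listdir(fd_dir):
        try:
            targets.append(os.readlink(os.path.join(fd_dir, name)))
        except OSError:
            continue  # descriptor went away between listdir and readlink
    return deleted_targets(targets)


if __name__ == "__main__":
    for path in open_deleted_files():
        print(path)
```

A file whose name really ends in " (deleted)" would be a false positive, so treat the output as a hint rather than proof.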

Previous thread on the topic starts here:

http://archives.postgresql.org/pgsql-hackers/2009-11/msg01754.php

continuing here:

http://archives.postgresql.org/pgsql-hackers/2009-12/msg00060.php

Resulting in a TODO listed with this description:

Close deleted WAL files held open in *nix by long-lived read-only
backends

-Kevin

#3Magnus Hagander
magnus@hagander.net
In reply to: Kevin Grittner (#2)
Re: walwriter not closing old files

On Wed, Jun 9, 2010 at 14:04, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Resulting in a TODO listed with this description:

Close deleted WAL files held open in *nix by long-lived read-only
backends

That seems to be about the same issue, but it is not actually fixed by
this patch. This fix *only* covers the case where it happens in the
walwriter, not in regular backends. OTOH, this happened in non-readonly mode :) My
understanding is that the issue you're talking about happens with the
regular backends, not walwriter, right?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#4Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Kevin Grittner (#2)
Re: walwriter not closing old files

On 09/06/10 15:04, Kevin Grittner wrote:

Resulting in a TODO listed with this description:

Close deleted WAL files held open in *nix by long-lived read-only
backends

This patch only helps with walwriter, though, not backends. Your
scenario is probably even more common, but will need a different fix.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com