WIP archive_timeout patch

Started by Simon Riggs over 19 years ago | 11 messages | hackers
#1 Simon Riggs
simon@2ndQuadrant.com

WIP archive_timeout.

All we need to do is add LWLock support to archiver.
Thoughts/ideas/hints welcome.

This is a patch-on-patch atop the xswitch.patch recently posted.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Attachments:

archivetimeout.patch (text/x-patch; charset=UTF-8), +71 -37

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#1)
Re: WIP archive_timeout patch

Simon Riggs <simon@2ndquadrant.com> writes:

WIP archive_timeout.
All we need to do is add LWLock support to archiver.
Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution. I'd suggest bgwriter as a
reasonably appropriate place instead.

regards, tom lane

#3 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#2)
Re: WIP archive_timeout patch

On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

WIP archive_timeout.
All we need to do is add LWLock support to archiver.
Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution. I'd suggest bgwriter as a
reasonably appropriate place instead.

OK

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#4 Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#3)
Re: WIP archive_timeout patch

On Thu, 2006-08-03 at 19:03 +0100, Simon Riggs wrote:

On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

WIP archive_timeout.
All we need to do is add LWLock support to archiver.
Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution. I'd suggest bgwriter as a
reasonably appropriate place instead.

OK

A slightly fuller answer:

Yes, that's a safer place than archiver, so I'll add it to bgwriter as
you suggest. I should have a patch complete by Tuesday, since I'm
travelling now.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#5 Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#1)
Re: WIP archive_timeout patch

On Wed, 2006-08-16 at 10:09 +0100, Simon Riggs wrote:

On Thu, 2006-08-03 at 19:03 +0100, Simon Riggs wrote:

On Thu, 2006-08-03 at 13:38 -0400, Tom Lane wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

WIP archive_timeout.
All we need to do is add LWLock support to archiver.
Thoughts/ideas/hints welcome.

Hint: this isn't the archiver's problem, and so you don't need to get
the archiver involved in the solution. I'd suggest bgwriter as a
reasonably appropriate place instead.

Revised patch enclosed, now believed to be production ready. This
implements regular log switching using the archive_timeout GUC.

Further patch enclosed implementing these changes plus the record type
version of pg_xlogfile_name_offset()

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

Attachments:

archive_timeout++.patch (text/x-patch; charset=utf-8), +291 -149

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#5)
Re: [PATCHES] WIP archive_timeout patch

Simon Riggs <simon@2ndquadrant.com> writes:

Revised patch enclosed, now believed to be production ready. This
implements regular log switching using the archive_timeout GUC.

Further patch enclosed implementing these changes plus the record type
version of pg_xlogfile_name_offset()

Applied with minor changes --- it seemed better to me to put tracking of
the last xlog switch time directly into xlog.c, instead of having the
bgwriter code try to determine whether a switch had happened recently.

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one. I'm not sure how much trouble it's worth taking to
prevent this scenario, though. If you can't afford a WAL file switch
every five minutes, you probably shouldn't be using archive_timeout
anyway ...

regards, tom lane
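
The interaction described in the message above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the logic in xlog.c: the class `ToyWal`, the function `simulate_idle`, the record labels, and the timeout values are all hypothetical names chosen for illustration. It models two independent "has anything happened?" checks, each of which treats the other's record as activity.

```python
from dataclasses import dataclass


@dataclass
class ToyWal:
    """Toy WAL model (hypothetical): counts records, stores no data."""
    records_in_segment: int = 0       # records written to the current segment
    last_record: str = "checkpoint"   # type of the most recent record
    checkpoints: int = 0
    switches: int = 0

    def insert(self, kind):
        self.records_in_segment += 1
        self.last_record = kind

    def checkpoint(self):
        # Skip if nothing was logged since the previous checkpoint record.
        if self.last_record == "checkpoint":
            return False
        self.insert("checkpoint")
        self.checkpoints += 1
        return True

    def request_switch(self):
        # Skip if the current segment is empty.
        if self.records_in_segment == 0:
            return False
        self.insert("switch")
        self.records_in_segment = 0   # a new, empty segment begins
        self.switches += 1
        return True


def simulate_idle(duration_s, checkpoint_timeout_s=300, archive_timeout_s=60):
    """Run the two timers on a system with no user activity."""
    wal = ToyWal()
    wal.insert("data")                # last real activity before going idle
    last_checkpoint = last_switch = 0
    for now in range(1, duration_s + 1):
        if now - last_checkpoint >= checkpoint_timeout_s:
            wal.checkpoint()          # sees the switch record as activity
            last_checkpoint = now
        if now - last_switch >= archive_timeout_s:
            if wal.request_switch():  # sees the checkpoint record as activity
                last_switch = now
    return wal


if __name__ == "__main__":
    wal = simulate_idle(3600)
    print(wal.checkpoints, wal.switches)
```

In this model, one simulated idle hour with a 300-second checkpoint timeout ends with 12 checkpoints and 13 segment switches: after the first switch, every checkpoint is followed by another switch, even though no "data" record is ever written again.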

#7 Florian Pflug
fgp@phlo.org
In reply to: Tom Lane (#6)
Re: [PATCHES] WIP archive_timeout patch

Tom Lane wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

Revised patch enclosed, now believed to be production ready. This
implements regular log switching using the archive_timeout GUC.

Further patch enclosed implementing these changes plus the record type
version of pg_xlogfile_name_offset()

Applied with minor changes --- it seemed better to me to put tracking of
the last xlog switch time directly into xlog.c, instead of having the
bgwriter code try to determine whether a switch had happened recently.

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one. I'm not sure how much trouble it's worth taking to
prevent this scenario, though. If you can't afford a WAL file switch
every five minutes, you probably shouldn't be using archive_timeout
anyway ...

Actually, this behaviour IMHO even has its advantages - if you can be
sure that at least one WAL segment will be archived every 5 minutes, then
it's easy to monitor the replication - you can just watch the logfile of
the slave, and send a failure notice if no logfile is imported at least
every 10 minutes or so.

Of course, for this to be useful, the documentation would have to tell
people about that behaviour, and it couldn't easily be changed in the next
release...

greetings, Florian Pflug
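
A monitoring check along these lines might look like the following minimal sketch. It makes an assumption the message does not: instead of reading the slave's logfile, it looks at the modification time of the newest file in a directory where archived segments arrive. The function names, the directory argument, and the 600-second threshold are hypothetical.

```python
import time
from pathlib import Path


def newest_file_age(archive_dir, now=None):
    """Age in seconds of the most recently modified file, or None if empty."""
    now = time.time() if now is None else now
    mtimes = [p.stat().st_mtime for p in Path(archive_dir).iterdir() if p.is_file()]
    if not mtimes:
        return None
    return now - max(mtimes)


def replication_looks_stalled(archive_dir, max_age_s=600, now=None):
    """True if no file has arrived within max_age_s (or none exists at all)."""
    age = newest_file_age(archive_dir, now)
    return age is None or age > max_age_s
```

A cron job could call `replication_looks_stalled()` on the archive directory and send the failure notice when it returns True. The check is only meaningful if a segment is guaranteed to arrive at least once per interval, which is exactly the behaviour under discussion.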

#8 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#6)
Re: [PATCHES] WIP archive_timeout patch

On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

Revised patch enclosed, now believed to be production ready. This
implements regular log switching using the archive_timeout GUC.

Further patch enclosed implementing these changes plus the record type
version of pg_xlogfile_name_offset()

Applied with minor changes --- it seemed better to me to put tracking of
the last xlog switch time directly into xlog.c, instead of having the
bgwriter code try to determine whether a switch had happened recently.

Code location: sure.

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one. I'm not sure how much trouble it's worth taking to
prevent this scenario, though. If you can't afford a WAL file switch
every five minutes, you probably shouldn't be using archive_timeout
anyway ...

I noticed that minor annoyance and understood that I had fixed it before
submitting. That was the reason for putting the code in bgwriter to
check whether the pointer had moved before attempting the switch...
perhaps that functionality has been removed?

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

#9 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Tom Lane (#6)
Re: [PATCHES] WIP archive_timeout patch

I noticed a minor annoyance while testing: when the system is
completely idle, you get a forced segment switch every
checkpoint_timeout seconds, even though there is nothing
useful to log. The checkpoint code is smart enough not to do
a checkpoint if nothing has happened since the last one, and
the xlog switch code is smart enough not to do a switch if
nothing has happened since the last one ... but they aren't
talking to each other and so each one's change looks like
"something happened"
to the other one. I'm not sure how much trouble it's worth
taking to prevent this scenario, though. If you can't afford
a WAL file switch every five minutes, you probably shouldn't
be using archive_timeout anyway ...

Um, I would have thought practical timeouts would be rather more
than 5 minutes, not less. So this does seem like a problem to me :-(

Andreas

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#8)
Re: [PATCHES] WIP archive_timeout patch

Simon Riggs <simon@2ndquadrant.com> writes:

On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one.

I noticed that minor annoyance and understood that I had fixed it before
submitting. That was the reason for putting the code in bgwriter to
check whether the pointer had moved before attempting the switch...
perhaps that functionality has been removed?

No, the original form of the patch was equally vulnerable. AFAICS the
only way to prevent this would be for XLogRequestSwitch (or really
XLogInsert, which does the heavy lifting for this) to suppress a switch
if the current segment is empty *or* contains only a checkpoint WAL
record. Basically it'd have to pretend the checkpoint record is not
there. This is doable but seems a bit weird --- in particular, that
would mean that pg_switch_xlog sometimes returns a pointer less than
pg_current_xlog_location, which might confuse backup scripts.

On the whole I'm leaning towards not changing it. As Florian mentioned,
guaranteed segment-every-checkpoint isn't completely without its uses.
And people who are looking for low WAL volume ought to be stretching
out their checkpoint intervals anyway.

regards, tom lane
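
The suppression rule discussed above (and ultimately not adopted in this thread) can be stated as a small predicate. This is a sketch only: `should_switch` and the record-type strings are hypothetical, and the real decision would live in XLogInsert and operate on WAL positions rather than a list of labels.

```python
def should_switch(segment_records):
    """Decide whether a forced segment switch is worthwhile.

    segment_records: list of record-type strings in the current segment
    (hypothetical representation). The switch is suppressed when the
    segment is empty or holds nothing but checkpoint records.
    """
    return any(kind != "checkpoint" for kind in segment_records)
```

For example, `should_switch([])` and `should_switch(["checkpoint"])` are both False, while `should_switch(["checkpoint", "data"])` is True. The sketch does not model the drawback noted in the message: with this rule, the pointer returned by pg_switch_xlog could be less than pg_current_xlog_location.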

#11 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#10)
Re: [PATCHES] WIP archive_timeout patch

On Fri, 2006-08-18 at 08:52 -0400, Tom Lane wrote:

Simon Riggs <simon@2ndquadrant.com> writes:

On Thu, 2006-08-17 at 19:11 -0400, Tom Lane wrote:

I noticed a minor annoyance while testing: when the system is completely
idle, you get a forced segment switch every checkpoint_timeout seconds,
even though there is nothing useful to log. The checkpoint code is
smart enough not to do a checkpoint if nothing has happened since the
last one, and the xlog switch code is smart enough not to do a switch
if nothing has happened since the last one ... but they aren't talking
to each other and so each one's change looks like "something happened"
to the other one.

I noticed that minor annoyance and understood that I had fixed it before
submitting. That was the reason for putting the code in bgwriter to
check whether the pointer had moved before attempting the switch...
perhaps that functionality has been removed?

No, the original form of the patch was equally vulnerable. AFAICS the
only way to prevent this would be for XLogRequestSwitch (or really
XLogInsert, which does the heavy lifting for this) to suppress a switch
if the current segment is empty *or* contains only a checkpoint WAL
record. Basically it'd have to pretend the checkpoint record is not
there. This is doable but seems a bit weird --- in particular, that
would mean that pg_switch_xlog sometimes returns a pointer less than
pg_current_xlog_location, which might confuse backup scripts.

On the whole I'm leaning towards not changing it. As Florian mentioned,
guaranteed segment-every-checkpoint isn't completely without its uses.
And people who are looking for low WAL volume ought to be stretching
out their checkpoint intervals anyway.

Agreed.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com