ThisTimeLineID in checkpointer and bgwriter processes

Started by Heikki Linnakangasabout 13 years ago17 messages
#1Heikki Linnakangas
hlinnakangas@vmware.com

In both checkpointer.c and bgwriter.c, we do this before entering the
main loop:

/*
* Use the recovery target timeline ID during recovery
*/
if (RecoveryInProgress())
ThisTimeLineID = GetRecoveryTargetTLI();

That seems reasonable. However, since it's only done once, when the
process starts up, ThisTimeLineID is never updated in those processes,
even though the startup process changes recovery target timeline.

That actually seems harmless to me, and I also haven't heard of any
complaints of misbehavior in 9.1 or 9.2 caused by that. I'm not sure why
we bother to set ThisTimeLineID in those processes in the first place. I
think we did it when streaming replication was introduced because it was
an easy thing to do, because back then recovery target timeline never
changed after recovery was started. But now that it can change, I don't
think that makes sense anymore.

So, I propose that we simply remove those, and leave ThisTimeLineID at
zero in checkpointer and bgwriter processes, while we're still in
recovery. ThisTimeLineID is updated anyway before performing the first
checkpoint after finishing recovery, and AFAICS that's the first time
the value is needed.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Amit Kapila
amit.kapila@huawei.com
In reply to: Heikki Linnakangas (#1)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Wednesday, December 19, 2012 9:30 PM Heikki Linnakangas wrote:

In both checkpointer.c and bgwriter.c, we do this before entering the
main loop:

/*
* Use the recovery target timeline ID during recovery
*/
if (RecoveryInProgress())
ThisTimeLineID = GetRecoveryTargetTLI();

That seems reasonable. However, since it's only done once, when the
process starts up, ThisTimeLineID is never updated in those processes,
even though the startup process changes recovery target timeline.

That actually seems harmless to me, and I also haven't heard of any
complaints of misbehavior in 9.1 or 9.2 caused by that. I'm not sure
why
we bother to set ThisTimeLineID in those processes in the first place.

This is used in RemoveOldXlogFiles(), so if during recovery when it's not
set and
this function gets called, it might have some problem.
I think it could get called from CreateRestartPoint() during recovery.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Amit Kapila (#2)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 20.12.2012 12:08, Amit Kapila wrote:

On Wednesday, December 19, 2012 9:30 PM Heikki Linnakangas wrote:

In both checkpointer.c and bgwriter.c, we do this before entering the
main loop:

/*
* Use the recovery target timeline ID during recovery
*/
if (RecoveryInProgress())
ThisTimeLineID = GetRecoveryTargetTLI();

That seems reasonable. However, since it's only done once, when the
process starts up, ThisTimeLineID is never updated in those processes,
even though the startup process changes recovery target timeline.

That actually seems harmless to me, and I also haven't heard of any
complaints of misbehavior in 9.1 or 9.2 caused by that. I'm not sure
why
we bother to set ThisTimeLineID in those processes in the first place.

This is used in RemoveOldXlogFiles(), so if during recovery when it's not
set and
this function gets called, it might have some problem.
I think it could get called from CreateRestartPoint() during recovery.

Hmm, right, it's used for this:

XLogFileName(lastoff, ThisTimeLineID, segno);

The resulting lastoff string, which is a xlog filename like
"000000020000000000000008", is used to compare filenames of files in
pg_xlog. However, the tli part, first 8 characters, are ignored for
comparison purposes. In addition to that, 'lastoff' is printed in a
DEBUG2 line, purely for debugging purposes.

So I think we're good on that front. But I'll add a comment there, and
use 0 explicitly instead of ThisTimeLineID, for clarity.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Amit Kapila
amit.kapila@huawei.com
In reply to: Heikki Linnakangas (#3)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Thursday, December 20, 2012 5:12 PM Heikki Linnakangas wrote:

On 20.12.2012 12:08, Amit Kapila wrote:

On Wednesday, December 19, 2012 9:30 PM Heikki Linnakangas wrote:

In both checkpointer.c and bgwriter.c, we do this before entering

the

main loop:

/*
* Use the recovery target timeline ID during recovery
*/
if (RecoveryInProgress())
ThisTimeLineID = GetRecoveryTargetTLI();

That seems reasonable. However, since it's only done once, when the
process starts up, ThisTimeLineID is never updated in those

processes,

even though the startup process changes recovery target timeline.

That actually seems harmless to me, and I also haven't heard of any
complaints of misbehavior in 9.1 or 9.2 caused by that. I'm not sure
why
we bother to set ThisTimeLineID in those processes in the first

place.

This is used in RemoveOldXlogFiles(), so if during recovery when it's

not

set and
this function gets called, it might have some problem.
I think it could get called from CreateRestartPoint() during

recovery.

Hmm, right, it's used for this:

XLogFileName(lastoff, ThisTimeLineID, segno);

So I think we're good on that front. But I'll add a comment there, and
use 0 explicitly instead of ThisTimeLineID, for clarity.

True, it might not have any functionality effect in RemoveOldXlogFiles().
However it can be used in PreallocXlogFiles()->XLogFileInit() as well.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#1)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 20 December 2012 13:19, Amit Kapila <amit.kapila@huawei.com> wrote:

So I think we're good on that front. But I'll add a comment there, and
use 0 explicitly instead of ThisTimeLineID, for clarity.

True, it might not have any functionality effect in RemoveOldXlogFiles().
However it can be used in PreallocXlogFiles()->XLogFileInit() as well.

PreallocXlogFiles() should have been put to the sword long ago. It's a
performance tweak aimed at people without a performance problem in
this area.

I'll happily accept this excuse to remove it.

We can discuss whether any replacement is required.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Andres Freund
andres@2ndquadrant.com
In reply to: Simon Riggs (#5)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 2012-12-20 13:32:36 +0000, Simon Riggs wrote:

On 20 December 2012 13:19, Amit Kapila <amit.kapila@huawei.com> wrote:

So I think we're good on that front. But I'll add a comment there, and
use 0 explicitly instead of ThisTimeLineID, for clarity.

True, it might not have any functionality effect in RemoveOldXlogFiles().
However it can be used in PreallocXlogFiles()->XLogFileInit() as well.

PreallocXlogFiles() should have been put to the sword long ago. It's a
performance tweak aimed at people without a performance problem in
this area.

Creating, zeroing & fsync()'ing a 16MB file shouldn't be done in
individual transactions because it would increase latency noticeably. So
it seems it addresses a real performance problem. What do you dislike
about it?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Amit Kapila
amit.kapila@huawei.com
In reply to: Simon Riggs (#5)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Thursday, December 20, 2012 7:03 PM Simon Riggs wrote:

On 20 December 2012 13:19, Amit Kapila <amit.kapila@huawei.com> wrote:

So I think we're good on that front. But I'll add a comment there,

and

use 0 explicitly instead of ThisTimeLineID, for clarity.

True, it might not have any functionality effect in

RemoveOldXlogFiles().

However it can be used in PreallocXlogFiles()->XLogFileInit() as

well.

PreallocXlogFiles() should have been put to the sword long ago. It's a
performance tweak aimed at people without a performance problem in
this area.

Yes, it seems to be for a rare scenario.

I'll happily accept this excuse to remove it.

We can discuss whether any replacement is required.

We should think of alternatives if there is no other reasonable fix for the
problem.

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#3)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Thu, Dec 20, 2012 at 8:41 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

On 20.12.2012 12:08, Amit Kapila wrote:

On Wednesday, December 19, 2012 9:30 PM Heikki Linnakangas wrote:

In both checkpointer.c and bgwriter.c, we do this before entering the
main loop:

/*
* Use the recovery target timeline ID during recovery
*/
if (RecoveryInProgress())
ThisTimeLineID = GetRecoveryTargetTLI();

That seems reasonable. However, since it's only done once, when the
process starts up, ThisTimeLineID is never updated in those processes,
even though the startup process changes recovery target timeline.

That actually seems harmless to me, and I also haven't heard of any
complaints of misbehavior in 9.1 or 9.2 caused by that. I'm not sure
why
we bother to set ThisTimeLineID in those processes in the first place.

This is used in RemoveOldXlogFiles(), so if during recovery when it's not
set and
this function gets called, it might have some problem.
I think it could get called from CreateRestartPoint() during recovery.

Hmm, right, it's used for this:

XLogFileName(lastoff, ThisTimeLineID, segno);

The resulting lastoff string, which is a xlog filename like
"000000020000000000000008", is used to compare filenames of files in
pg_xlog. However, the tli part, first 8 characters, are ignored for
comparison purposes. In addition to that, 'lastoff' is printed in a DEBUG2
line, purely for debugging purposes.

InstallXLogFileSegment() also uses ThisTimeLineID. But your recent commit
doesn't take care of it and prevents the standby from recycling the WAL files
properly. Specifically, the standby recycles the WAL file to wrong name.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#5)
Re: ThisTimeLineID in checkpointer and bgwriter processes

Simon Riggs <simon@2ndQuadrant.com> writes:

PreallocXlogFiles() should have been put to the sword long ago. It's a
performance tweak aimed at people without a performance problem in
this area.

This claim seems remarkably lacking in any supporting evidence.

I'll freely grant that PreallocXlogFiles could stand to be improved
(by which I mean made more aggressive, ie willing to create files more
often and/or further in advance). I don't see how it follows that it's
okay to remove the functionality altogether. To the extent that we can
make that activity happen in checkpointer rather than in foreground
processes, it's surely a good thing.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#1)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 20 December 2012 13:19, Amit Kapila <amit.kapila@huawei.com> wrote:

True, it might not have any functionality effect in RemoveOldXlogFiles().
However it can be used in PreallocXlogFiles()->XLogFileInit() as well.

Which is never called in recovery because we never write WAL.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#9)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 20 December 2012 16:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

PreallocXlogFiles() should have been put to the sword long ago. It's a
performance tweak aimed at people without a performance problem in
this area.

This claim seems remarkably lacking in any supporting evidence.

I'll freely grant that PreallocXlogFiles could stand to be improved
(by which I mean made more aggressive, ie willing to create files more
often and/or further in advance). I don't see how it follows that it's
okay to remove the functionality altogether. To the extent that we can
make that activity happen in checkpointer rather than in foreground
processes, it's surely a good thing.

"More aggressive" implies it is currently in some way aggressive.

Removing it will make as little difference as keeping it, so let it stay.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Fujii Masao
masao.fujii@gmail.com
In reply to: Simon Riggs (#10)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Fri, Dec 21, 2012 at 1:46 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 20 December 2012 13:19, Amit Kapila <amit.kapila@huawei.com> wrote:

True, it might not have any functionality effect in RemoveOldXlogFiles().
However it can be used in PreallocXlogFiles()->XLogFileInit() as well.

Which is never called in recovery because we never write WAL.

No. CreateRestartPoint() calls PreallocXlogFiles(). Walreceiver may
write WAL, so PreallocXlogFiles() would be useful even during recovery
to some extent.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Andres Freund
andres@2ndquadrant.com
In reply to: Simon Riggs (#10)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 2012-12-20 16:46:21 +0000, Simon Riggs wrote:

On 20 December 2012 13:19, Amit Kapila <amit.kapila@huawei.com> wrote:

True, it might not have any functionality effect in RemoveOldXlogFiles().
However it can be used in PreallocXlogFiles()->XLogFileInit() as well.

Which is never called in recovery because we never write WAL.

It's called from CreateRestartPoint. And the pre-inited files get used
by the walreceiver and I would guess thats beneficial at elast for sync
rep...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Fujii Masao (#8)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 20.12.2012 18:19, Fujii Masao wrote:

InstallXLogFileSegment() also uses ThisTimeLineID. But your recent commit
doesn't take care of it and prevents the standby from recycling the WAL files
properly. Specifically, the standby recycles the WAL file to wrong name.

A-ha, good catch. So that's actually a live bug in 9.1 and 9.2 as well:
after the recovery target timeline has changed, restartpoints will
continue to preallocate/recycle WAL files for the old timeline. That's
otherwise harmless, but the useless WAL files waste space, and
walreceiver will have to always create new files.

So instead of always running with ThisTimeLineID = 0 in the checkpointer
process, I guess we'll have to update it to the timeline being replayed,
when creating a restartpoint.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#14)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Fri, Dec 21, 2012 at 2:45 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

On 20.12.2012 18:19, Fujii Masao wrote:

InstallXLogFileSegment() also uses ThisTimeLineID. But your recent commit
doesn't take care of it and prevents the standby from recycling the WAL
files
properly. Specifically, the standby recycles the WAL file to wrong name.

A-ha, good catch. So that's actually a live bug in 9.1 and 9.2 as well:
after the recovery target timeline has changed, restartpoints will continue
to preallocate/recycle WAL files for the old timeline. That's otherwise
harmless, but the useless WAL files waste space, and walreceiver will have
to always create new files.

So instead of always running with ThisTimeLineID = 0 in the checkpointer
process, I guess we'll have to update it to the timeline being replayed,
when creating a restartpoint.

Yes. Thanks for fixing that.

Regards,

--
Fujii Masao

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Amit Kapila
amit.kapila@huawei.com
In reply to: Heikki Linnakangas (#14)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On Thursday, December 20, 2012 11:15 PM Heikki Linnakangas wrote:

On 20.12.2012 18:19, Fujii Masao wrote:

InstallXLogFileSegment() also uses ThisTimeLineID. But your recent

commit

doesn't take care of it and prevents the standby from recycling the

WAL files

properly. Specifically, the standby recycles the WAL file to wrong

name.

A-ha, good catch. So that's actually a live bug in 9.1 and 9.2 as well:
after the recovery target timeline has changed, restartpoints will
continue to preallocate/recycle WAL files for the old timeline. That's
otherwise harmless, but the useless WAL files waste space, and
walreceiver will have to always create new files.

So instead of always running with ThisTimeLineID = 0 in the
checkpointer
process, I guess we'll have to update it to the timeline being
replayed,
when creating a restartpoint.

Shouldn't there be a check if(RecoveryInProgress), before assigning
RecoveryTargetTLI to ThisTimeLineID in CreateRestartPoint()?

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Amit Kapila (#16)
Re: ThisTimeLineID in checkpointer and bgwriter processes

On 21.12.2012 08:18, Amit Kapila wrote:

On Thursday, December 20, 2012 11:15 PM Heikki Linnakangas wrote:

On 20.12.2012 18:19, Fujii Masao wrote:

InstallXLogFileSegment() also uses ThisTimeLineID. But your recent

commit

doesn't take care of it and prevents the standby from recycling the

WAL files

properly. Specifically, the standby recycles the WAL file to wrong

name.

A-ha, good catch. So that's actually a live bug in 9.1 and 9.2 as well:
after the recovery target timeline has changed, restartpoints will
continue to preallocate/recycle WAL files for the old timeline. That's
otherwise harmless, but the useless WAL files waste space, and
walreceiver will have to always create new files.

So instead of always running with ThisTimeLineID = 0 in the
checkpointer
process, I guess we'll have to update it to the timeline being
replayed,
when creating a restartpoint.

Shouldn't there be a check if(RecoveryInProgress), before assigning
RecoveryTargetTLI to ThisTimeLineID in CreateRestartPoint()?

Hmm, I don't think so. You're not supposed to get that far in
CreateRestartPoint() if recovery has already ended, or just being ended.
The startup process "ends recovery", ie. makes RecoveryInProgress()
return false, only after writing the end-of-recovery checkpoint. And
after the end-of-recovery checkpoint has been written,
CreateRestartPoint() will do nothing, because the end-of-recovery
checkpoint is later than the last potential restartpoint. I'm talking
about this check in CreateRestartPoint():

if (XLogRecPtrIsInvalid(lastCheckPointRecPtr) ||
XLByteLE(lastCheckPoint.redo, ControlFile->checkPointCopy.redo))
{
ereport(DEBUG2,
(errmsg("skipping restartpoint, already performed at %X/%X",
(uint32) (lastCheckPoint.redo >> 32), (uint32) lastCheckPoint.redo)));
...
return false;
}

However, there's this just before we recycle WAL segments:

/*
* Update pg_control, using current time. Check that it still shows
* IN_ARCHIVE_RECOVERY state and an older checkpoint, else do nothing;
* this is a quick hack to make sure nothing really bad happens if somehow
* we get here after the end-of-recovery checkpoint.
*/
LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&
XLByteLT(ControlFile->checkPointCopy.redo, lastCheckPoint.redo))
{
...

but I believe that "quick hack" is just paranoia. You should not get
that far after the end-of-recovery checkpoint.

In any case, if you somehow get there anyway, the worst that will happen
is that some old WAL segments are recycled/preallocated on the old
timeline, wasting some space until the next checkpoint.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers