Extra XLOG in Checkpoint for StandbySnapshot

Started by Amit Kapila about 13 years ago, 15 messages

amit.kapila@huawei.com

about 13 years ago

I have observed that whenever a checkpoint happens with wal_level set to
hot_standby, one standby snapshot XLOG record is written with the
"running transactions" information.

So once the first checkpoint has happened at the configured interval, it
creates a new XLOG record in LogStandbySnapshot, and because of that record
the checkpoint is not skipped at the next interval either. This is fine if
there are running transactions, but the XLOG record is written even when
there is no running xact.

As per my analysis, this is the code doing it:

    running = GetRunningTransactionData();
    LogCurrentRunningXacts(running);

So the snapshot is logged irrespective of the value of running.

We could change this in LogStandbySnapshot as below:

    running = GetRunningTransactionData();
    if (running->xcnt > 0)
        LogCurrentRunningXacts(running);

This check would make sure that if nothing is happening, i.e. there is no
running transaction, the running-transactions snapshot is not logged, and
hence subsequent checkpoints are skipped.

Let me know if I am missing something.

With Regards,

Amit Kapila.

Simon Riggs

simon@2ndQuadrant.com

about 13 years ago

In reply to: Amit Kapila (#1)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 7 January 2013 12:39, Amit Kapila <amit.kapila@huawei.com> wrote:

We could change this in LogStandbySnapshot as below:

    running = GetRunningTransactionData();
    if (running->xcnt > 0)
        LogCurrentRunningXacts(running);

This check would make sure that if nothing is happening, i.e. there is no
running transaction, the running-transactions snapshot is not logged, and
hence subsequent checkpoints are skipped.

Let me know if I am missing something.

It's not the same test. The fact that nothing is running at that
moment is not the same thing as saying nothing at all has run since
last checkpoint.

If we skip the WAL record in the way you suggest, we'd be unable to
start quickly in some cases.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Amit Kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Simon Riggs (#2)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:

It's not the same test. The fact that nothing is running at that
moment is not the same thing as saying nothing at all has run since
last checkpoint.

But isn't the purpose of LogStandbySnapshot() to log "all running xids" and
"all current AccessExclusiveLocks"? For RunningTransactionLocks, WAL is
avoided in a similar way.

If we skip the WAL record in the way you suggest, we'd be unable to
start quickly in some cases.

If any operations have generated WAL, then the checkpoint should still
happen at the next checkpoint interval.
In which cases would it be unable to start quickly?

With Regards,
Amit Kapila


Simon Riggs

simon@2ndQuadrant.com

about 13 years ago

In reply to: Simon Riggs (#2)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 7 January 2013 13:33, Amit Kapila <amit.kapila@huawei.com> wrote:

If we skip the WAL record in the way you suggest, we'd be unable to
start quickly in some cases.

If any operations have generated WAL, then the checkpoint should still
happen at the next checkpoint interval.
In which cases would it be unable to start quickly?

The case where we do lots of work but momentarily we weren't doing
anything when we took the snapshot.

The absence of write transactions at one specific moment gives no
indication of behaviour at other points across the whole checkpoint
period.

If you make the correct test, I'd be more inclined to accept the premise.
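
The difference between the two tests can be sketched with a toy model. Everything below (ToyServer and the toy_* functions) is hypothetical and exists only to illustrate the point; it is not PostgreSQL code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a server; not PostgreSQL code. */
typedef struct ToyServer
{
    int      running_now;           /* transactions in progress right now */
    uint64_t total_started;         /* transactions started since startup */
    uint64_t started_at_last_ckpt;  /* total_started as of the last checkpoint */
} ToyServer;

void toy_begin(ToyServer *s)
{
    s->running_now++;
    s->total_started++;
}

void toy_commit(ToyServer *s)
{
    s->running_now--;
}

void toy_checkpoint(ToyServer *s)
{
    s->started_at_last_ckpt = s->total_started;
}

/* The test in the proposal: is anything running at this instant? */
bool toy_idle_right_now(const ToyServer *s)
{
    return s->running_now == 0;
}

/* The stronger test: has anything run since the last checkpoint? */
bool toy_idle_since_checkpoint(const ToyServer *s)
{
    return s->total_started == s->started_at_last_ckpt;
}
```

In this model, a transaction that begins and commits between two checkpoints leaves toy_idle_right_now() true while toy_idle_since_checkpoint() is false, which is the gap between the two tests.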

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Andres Freund

andres@anarazel.de

about 13 years ago

In reply to: Amit Kapila (#3)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:

But isn't the purpose of LogStandbySnapshot() to log "all running xids" and
"all current AccessExclusiveLocks"? For RunningTransactionLocks, WAL is
avoided in a similar way.

The information that no transactions are currently running allows you to
build a recovery snapshot; without that information the standby won't
start answering queries. That doesn't matter if all standbys have already
built a snapshot, but the primary cannot know that.
Having to issue a checkpoint while ensuring transactions are running
just to get a standby up doesn't seem like a good idea to me :)
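
A rough sketch of that dependency, as a deliberately simplified toy model: the Toy* types and functions are hypothetical, and the real standby logic has more conditions than shown here. The only point is that a standby which has not yet seen a running-transactions record cannot become ready if the primary suppresses the empty ones.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a running-transactions WAL record. */
typedef struct ToyRunningXacts
{
    int xcnt;               /* number of transactions running on the primary */
} ToyRunningXacts;

/* Hypothetical standby state. */
typedef struct ToyStandby
{
    bool snapshot_ready;    /* may the standby answer queries yet? */
    int  known_running;     /* running xacts learned from the primary */
} ToyStandby;

/*
 * Primary side: returns true if a record was written into *out.
 * skip_when_empty models the proposed "if (running->xcnt > 0)" guard.
 */
bool toy_log_standby_snapshot(int xcnt, bool skip_when_empty,
                              ToyRunningXacts *out)
{
    if (skip_when_empty && xcnt == 0)
        return false;
    out->xcnt = xcnt;
    return true;
}

/* Standby side: replaying a record is what makes the snapshot usable. */
void toy_standby_replay(ToyStandby *sb, const ToyRunningXacts *rec)
{
    sb->known_running = rec->xcnt;
    sb->snapshot_ready = true;
}
```

With skip_when_empty set and an idle primary, no record is ever produced, so a freshly started standby in this model never reaches snapshot_ready.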

Greetings,

Andres Freund


Amit Kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Andres Freund (#5)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Monday, January 07, 2013 7:15 PM Andres Freund wrote:

The information that no transactions are currently running allows you to
build a recovery snapshot, without that information the standby won't
start answering queries. Now that doesn't matter if all standbys already
have built a snapshot, but the primary cannot know that.

Can't we make sure that the checkpoint is skipped under either of the
conditions below?
a. nothing has happened during or after the last checkpoint
OR
b. nothing except the standby snapshot WAL record has been written

Currently it is skipped only for condition a.

Having to issue a checkpoint while ensuring transactions are running
just to get a standby up doesn't seem like a good idea to me :)

Simon:

If you make the correct test, I'd be more inclined to accept the premise.

I am not sure exactly what you expect from the test.
The test is: perform any one operation on the system, then leave the system
idle. From then on, a standby snapshot WAL record is logged at every
checkpoint interval.

With Regards,
Amit Kapila.


Andres Freund

andres@2ndquadrant.com

about 13 years ago

In reply to: Amit Kapila (#6)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 2013-01-08 19:51:39 +0530, Amit Kapila wrote:

Can't we make sure that the checkpoint is skipped under either of the
conditions below?
a. nothing has happened during or after the last checkpoint
OR
b. nothing except the standby snapshot WAL record has been written

Currently it is skipped only for condition a.

I am not sure exactly what you expect from the test.
The test is: perform any one operation on the system, then leave the system
idle. From then on, a standby snapshot WAL record is logged at every
checkpoint interval.

I can't really follow what you want to do here. The snapshot is only
logged if a checkpoint is performed anyway? As recovery starts at (the
logical) checkpoint's location, we need to log a snapshot exactly
there. If you want to avoid activity when the system is idle, you need to
prevent the checkpoints themselves from occurring. There was a thread some
time back about that, and it's not as trivial as it seems at first glance.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Amit Kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Andres Freund (#7)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Tuesday, January 08, 2013 8:01 PM Andres Freund wrote:

I can't really follow what you want to do here. The snapshot is only
logged if a checkpoint is performed anyway? As recovery starts at (the
logical) checkpoint's location, we need to log a snapshot exactly
there. If you want to avoid activity when the system is idle, you need to
prevent the checkpoints themselves from occurring.

Even if a checkpoint is scheduled, it doesn't actually perform the operation
if nothing has been logged between the previous checkpoint and the current
one, due to the check below in CreateCheckPoint():

    if (curInsert == ControlFile->checkPoint +
        MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
        ControlFile->checkPoint == ControlFile->checkPointCopy.redo)

But if wal_level is set to hot_standby, the checkpoint logs a snapshot. The
next time CreateCheckPoint() is called for a scheduled checkpoint, the check
above fails and a snapshot is logged again. This continues even if the
system is totally idle.
I understand that it doesn't cause any problem, but I think it would be
better if the repeated logging of the snapshot could be avoided in this
scenario.
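
The loop described above can be sketched in a toy model of WAL positions. The record sizes and all Toy* names are assumptions made for illustration; only the shape of the skip check follows the CreateCheckPoint() snippet quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record sizes in bytes; the real sizes differ. */
enum { TOY_CKPT_REC_SIZE = 96, TOY_SNAPSHOT_REC_SIZE = 56 };

/* Hypothetical stand-in for the WAL insert position and control file. */
typedef struct ToyWal
{
    uint64_t insert_pos;      /* where the next WAL record would go */
    uint64_t last_ckpt_pos;   /* start of the last checkpoint record */
    uint64_t last_ckpt_redo;  /* redo position of the last checkpoint */
    bool     hot_standby;     /* models wal_level = hot_standby */
} ToyWal;

/* Returns true if a checkpoint was performed, false if skipped as idle. */
bool toy_create_checkpoint(ToyWal *w)
{
    /* Idle test: nothing follows the last checkpoint record, and that
     * record sat exactly at its own redo position. */
    if (w->insert_pos == w->last_ckpt_pos + TOY_CKPT_REC_SIZE &&
        w->last_ckpt_pos == w->last_ckpt_redo)
        return false;

    w->last_ckpt_redo = w->insert_pos;
    if (w->hot_standby)
        w->insert_pos += TOY_SNAPSHOT_REC_SIZE;  /* standby snapshot record */
    w->last_ckpt_pos = w->insert_pos;
    w->insert_pos += TOY_CKPT_REC_SIZE;          /* checkpoint record */
    return true;
}
```

Starting from a state with some prior WAL, for example {1000, 0, 0, false}, the second call in this model is skipped. With hot_standby set to true the snapshot record separates the redo position from the checkpoint record, so every call performs a checkpoint even though nothing else was written.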

There was a thread some time
back about that and it's not as trivial as it seems at the first glance.

I know part of that history: a change was made to avoid checkpoints on a
low-activity system, and it was later rolled back due to another problem.
But I am not sure whether it was agreed that nothing needs to be done for
the scenario above.

With Regards,
Amit Kapila.


Andres Freund

andres@2ndquadrant.com

about 13 years ago

In reply to: Amit Kapila (#8)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 2013-01-08 20:33:28 +0530, Amit Kapila wrote:

Even if a checkpoint is scheduled, it doesn't actually perform the operation
if nothing has been logged between the previous checkpoint and the current
one, due to the check below in CreateCheckPoint():

    if (curInsert == ControlFile->checkPoint +
        MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
        ControlFile->checkPoint == ControlFile->checkPointCopy.redo)

But if wal_level is set to hot_standby, the checkpoint logs a snapshot. The
next time CreateCheckPoint() is called for a scheduled checkpoint, the check
above fails and a snapshot is logged again. This continues even if the
system is totally idle.
I understand that it doesn't cause any problem, but I think it would be
better if the repeated logging of the snapshot could be avoided in this
scenario.

ISTM in that case you "just" need a way to cope with the additionally
logged record in the above piece of code. Not logging seems to be the
entirely wrong way to go at this.
I admit it's not totally simple, but making HS less predictable seems
like a cure *far* worse than the disease.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#10

Amit Kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Andres Freund (#9)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Tuesday, January 08, 2013 8:57 PM Andres Freund wrote:

ISTM in that case you "just" need a way to cope with the additionally
logged record in the above piece of code. Not logging seems to be the
entirely wrong way to go at this.

I think one way the code can be modified is as below:

+        /* size of the running-transactions record when there is no active transaction */
+        if (!shutdown && XLogStandbyInfoActive())
+        {
+                runningXactXLog = MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord;
+        }

Current check:

! if (curInsert == ControlFile->checkPoint +
!     MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
!     ControlFile->checkPoint == ControlFile->checkPointCopy.redo)

Proposed check:

! if (curInsert == ControlFile->checkPoint +
!     MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
!     ControlFile->checkPoint == ControlFile->checkPointCopy.redo + runningXactXLog)

The second condition compares the last checkpoint's WAL position with its
redo position. ControlFile->checkPointCopy.redo holds the position before
the "running xacts" WAL record was inserted, and ControlFile->checkPoint
holds the position after it was inserted. So if no WAL other than the
"running xacts" and "checkpoint" records was inserted, this condition will
be true.

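The effect of the proposed check can be sketched in a toy model of WAL positions. The record sizes and all Toy* names are assumptions for illustration; the model only mirrors the shape of the modified condition and says nothing about whether the approach is safe in the real server.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record sizes in bytes; the real sizes differ. */
enum { TOY_CKPT_REC_SIZE = 96, TOY_SNAPSHOT_REC_SIZE = 56 };

/* Hypothetical stand-in for the WAL insert position and control file. */
typedef struct ToyWalState
{
    uint64_t insert_pos;   /* where the next WAL record would go */
    uint64_t ckpt_pos;     /* start of the last checkpoint record */
    uint64_t ckpt_redo;    /* redo position of the last checkpoint */
    bool     hot_standby;  /* models wal_level = hot_standby */
} ToyWalState;

/* Models ordinary WAL activity of the given size. */
void toy_insert_record(ToyWalState *w, uint64_t size)
{
    w->insert_pos += size;
}

/* Returns true if a checkpoint was performed, false if skipped as idle. */
bool toy_checkpoint_proposed(ToyWalState *w)
{
    /* Plays the role of runningXactXLog: the bytes an empty snapshot adds. */
    uint64_t snapshot_allowance = w->hot_standby ? TOY_SNAPSHOT_REC_SIZE : 0;

    if (w->insert_pos == w->ckpt_pos + TOY_CKPT_REC_SIZE &&
        w->ckpt_pos == w->ckpt_redo + snapshot_allowance)
        return false;

    w->ckpt_redo = w->insert_pos;
    if (w->hot_standby)
        w->insert_pos += TOY_SNAPSHOT_REC_SIZE;  /* standby snapshot record */
    w->ckpt_pos = w->insert_pos;
    w->insert_pos += TOY_CKPT_REC_SIZE;          /* checkpoint record */
    return true;
}
```

Starting from {1000, 0, 0, true}, the first call performs a checkpoint and the second is skipped, because the only bytes between the redo position and the checkpoint record match the allowance. Any record inserted after the checkpoint record moves insert_pos, so the first half of the condition fails and the next call performs a checkpoint again. The check compares byte distances only, so it cannot tell which record type occupies those bytes.
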
Not logging seems to be the entirely wrong way to go at this.

True.

I admit it's not totally simple, but making HS less predictable seems
like a cure *far* worse than the disease.

Right, that's why I am trying to figure out if there can be a way to handle
without any compromise on HS.

With Regards,
Amit Kapila.


#11

Andres Freund

andres@2ndquadrant.com

about 13 years ago

In reply to: Amit Kapila (#10)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 2013-01-09 14:04:32 +0530, Amit Kapila wrote:

I think one way the code can be modified is as below:

+        /* size of the running-transactions record when there is no active transaction */
+        if (!shutdown && XLogStandbyInfoActive())
+        {
+                runningXactXLog = MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord;
+        }

Current check:

! if (curInsert == ControlFile->checkPoint +
!     MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
!     ControlFile->checkPoint == ControlFile->checkPointCopy.redo)

Proposed check:

! if (curInsert == ControlFile->checkPoint +
!     MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
!     ControlFile->checkPoint == ControlFile->checkPointCopy.redo + runningXactXLog)

The second condition compares the last checkpoint's WAL position with its
redo position. ControlFile->checkPointCopy.redo holds the position before
the "running xacts" WAL record was inserted, and ControlFile->checkPoint
holds the position after it was inserted. So if no WAL other than the
"running xacts" and "checkpoint" records was inserted, this condition will
be true.

I don't think that's safe: another record that happens to be
MinSizeOfXactRunningXacts big could have been inserted, and we would still
skip the checkpoint.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#12

Amit Kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Andres Freund (#11)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Wednesday, January 09, 2013 2:28 PM Andres Freund wrote:

On 2013-01-09 14:04:32 +0530, Amit Kapila wrote:

On Tuesday, January 08, 2013 8:57 PM Andres Freund wrote:

On 2013-01-08 20:33:28 +0530, Amit Kapila wrote:

On Tuesday, January 08, 2013 8:01 PM Andres Freund wrote:

On 2013-01-08 19:51:39 +0530, Amit Kapila wrote:

On Monday, January 07, 2013 7:15 PM Andres Freund wrote:

On 2013-01-07 19:03:35 +0530, Amit Kapila wrote:

On Monday, January 07, 2013 6:30 PM Simon Riggs wrote:

On 7 January 2013 12:39, Amit Kapila

<amit.kapila@huawei.com>

wrote:

The information that no transactions are currently running

allows

you

to
build a recovery snapshot, without that information the

standby

won't

start answering queries. Now that doesn't matter if all

standbys

already
have built a snapshot, but the primary cannot know that.

Can't we make sure that checkpoint operation doesn't happen

for

below

conds.

a. nothing has happened during or after last checkpoint
OR
b. nothing except snapshotstanby WAL has happened

Currently it is done for point a.

Having to issue a checkpoint while ensuring transactions

are

running

just to get a standby up doesn't seem like a good idea to

me :)

Simon:

If you make the correct test, I'd be more inclined to

accept

the

premise.

Not sure, what exact you are expecting from test?
The test is do any one operation on system and then keep the

system

idle.

Now at each checkpoint interval, it logs WAL for

SnapshotStandby.

I can't really follow what you want to do here. The snapshot is

only

logged if a checkpoint is performed anyway? As recovery starts

at

(the

logical) checkpoint's location we need to log a snapshot

exactly

there. If you want to avoid activity when the system is idle

you

need

to
prevent checkpoints from occurring itself.

Even if the checkpoint is scheduled, it doesn't perform actual

operation if

there's nothing logged between
current and previous checkpoint due to below check in

CreateCheckPoint()

function.
if (curInsert == ControlFile->checkPoint +
MAXALIGN(SizeOfXLogRecord +

sizeof(CheckPoint)) &&

ControlFile->checkPoint ==
ControlFile->checkPointCopy.redo)

But if we set the wal_level as hot_standby, it will log snapshot,

now

next

time again when function CreateCheckPoint()
will get called due to scheduled checkpoint, the above check will

fail and

it will again log snapshot, so this will continue, even if the

system

is

totally idle.
I understand that it doesn't cause any problem, but I think it is

better if

the repeated log of snapshot in this scenario can be avoided.

ISTM in that case you "just" need a way to cope with the

additionally

logged record in the above piece of code. Not logging seems to be

the
entirely wrong way to go at this.

I think one of the ways code can be modified is as below:
+		    /*size of running transactions log when there is no
active transation*/
+                if (!shutdown && XLogStandbyInfoActive())
+                {
+                        runningXactXLog =
MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord;
+                }
! if (curInsert == ControlFile->checkPoint +
! MAXALIGN(SizeOfXLogRecord +
sizeof(CheckPoint)) &&

! ControlFile->checkPoint ==
ControlFile->checkPointCopy.redo)

! if (curInsert == ControlFile->checkPoint +
! MAXALIGN(SizeOfXLogRecord +

sizeof(CheckPoint)) &&

! ControlFile->checkPoint ==
ControlFile->checkPointCopy.redo + runningXactXLog)

The second condition compares the last checkpoint's WAL position with the
current one. Since ControlFile->checkPointCopy.redo holds the value before
the "running Xact" WAL was inserted and ControlFile->checkPoint holds the
value after the "running Xact" WAL got inserted, if no new WAL was inserted
apart from the "running Xacts" and "Checkpoint" WAL, then this condition
will be true.
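To make the behaviour described above easier to see, here is a small toy model of the skip check. It is only a sketch and not PostgreSQL code: the WalModel struct, the helper names and the two record sizes are made up for illustration (the real sizes come from MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) and MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord), and WAL page boundaries are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record sizes; the real ones come from the PostgreSQL headers. */
enum
{
	CHECKPOINT_REC_SIZE = 96,		/* stands in for the checkpoint record */
	RUNNING_XACTS_REC_SIZE = 56		/* stands in for an empty running-xacts record */
};

typedef struct
{
	uint64_t	redo;			/* plays the role of ControlFile->checkPointCopy.redo */
	uint64_t	checkPoint;		/* plays the role of ControlFile->checkPoint */
	uint64_t	curInsert;		/* next WAL insert position */
} WalModel;

/* The skip test; runningXactXLog == 0 gives the existing (unpatched) check. */
static bool
can_skip(const WalModel *w, uint64_t runningXactXLog)
{
	return w->curInsert == w->checkPoint + CHECKPOINT_REC_SIZE &&
		w->checkPoint == w->redo + runningXactXLog;
}

/* A checkpoint: remember redo, optionally log the snapshot, log the checkpoint. */
static void
do_checkpoint(WalModel *w, bool hot_standby)
{
	w->redo = w->curInsert;
	if (hot_standby)
		w->curInsert += RUNNING_XACTS_REC_SIZE;
	w->checkPoint = w->curInsert;
	w->curInsert += CHECKPOINT_REC_SIZE;
}

/* How many of n scheduled checkpoints really run while the system stays idle. */
static int
idle_checkpoints(bool hot_standby, bool patched, int n)
{
	WalModel	w = {0, 0, 1000};	/* 1000: some activity happened earlier */
	uint64_t	allowance = (patched && hot_standby) ? RUNNING_XACTS_REC_SIZE : 0;
	int			performed = 0;

	assert(n >= 0);
	for (int i = 0; i < n; i++)
	{
		if (can_skip(&w, allowance))
			continue;
		do_checkpoint(&w, hot_standby);
		performed++;
	}
	return performed;
}
```

In this model, idle_checkpoints(false, false, 5) and idle_checkpoints(true, true, 5) both perform only the first checkpoint, while idle_checkpoints(true, false, 5) performs all five, which is the repeated logging reported in this thread.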

I don't think that's safe, there could have been another record inserted
that happens to be MinSizeOfXactRunningXacts big and we would still
skip the checkpoint.

I think that can happen only when a checkpoint is triggered for the first
time, and even then the first part of the check (curInsert ==
ControlFile->checkPoint + MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)))
will fail.

A value will be assigned to runningXactXLog only if wal_level is hot_standby.
In that case, if a checkpoint is scheduled for the second or a later time,
it will include WAL for "running Xact" along with WAL for any other data.
So even if the other data is of size MinSizeOfXactRunningXacts, the
check should fail and the checkpoint should not be skipped.

Also, why can't the same thing happen for the Checkpoint record, given that
there is an almost identical check for the Checkpoint record in the same if
condition?

With Regards,
Amit Kapila.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Andres Freund

andres@2ndquadrant.com

about 13 years ago

In reply to: Amit Kapila (#12)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On 2013-01-09 15:06:04 +0530, Amit Kapila wrote:

I don't think that's safe, there could have been another record inserted
that happens to be MinSizeOfXactRunningXacts big and we would still
skip the checkpoint.

I think that can happen only when a checkpoint is triggered for the first
time, and even then the first part of the check (curInsert ==
ControlFile->checkPoint + MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)))
will fail.

A value will be assigned to runningXactXLog only if wal_level is hot_standby.
In that case, if a checkpoint is scheduled for the second or a later time,
it will include WAL for "running Xact" along with WAL for any other data.
So even if the other data is of size MinSizeOfXactRunningXacts, the
check should fail and the checkpoint should not be skipped.

Hm. The locking around checkpoints probably prevents the case I was
worried about in combination with the wal_level not changing while
running.

Also, why can't the same thing happen for the Checkpoint record, given that
there is an almost identical check for the Checkpoint record in the same if
condition?

Because we compare the address of the checkpoint record and count the
size from that. That misses some cases (when wrapping to a new xlog
page) but it won't give false positives.
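
A rough sketch of the page-wrap case mentioned above, under simplifying assumptions: the page size, page header size and record size below are illustrative constants, and end_of_record() and idle_skip_detected() are hypothetical helpers, not PostgreSQL functions. The point is only that when the checkpoint record crosses onto a new WAL page, the page header pushes the insert position past checkPoint + size, so the equality test fails even though nothing else was logged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; the real values live in the PostgreSQL headers. */
enum
{
	WAL_PAGE_SIZE = 8192,
	WAL_PAGE_HEADER = 24,
	CHECKPOINT_REC_SIZE = 96
};

/*
 * Position just past a record of len bytes that starts at start. Each time
 * the record continues onto a new page, a page header is accounted for.
 */
static uint64_t
end_of_record(uint64_t start, uint64_t len)
{
	uint64_t	pos = start;

	while (len > 0)
	{
		uint64_t	room = WAL_PAGE_SIZE - pos % WAL_PAGE_SIZE;

		if (len <= room)
		{
			pos += len;
			len = 0;
		}
		else
		{
			/* fill the rest of this page, then skip the next page's header */
			pos += room + WAL_PAGE_HEADER;
			len -= room;
		}
	}
	return pos;
}

/* Idle system: only the checkpoint record was written after checkpoint_pos. */
static bool
idle_skip_detected(uint64_t checkpoint_pos)
{
	uint64_t	curInsert = end_of_record(checkpoint_pos, CHECKPOINT_REC_SIZE);

	return curInsert == checkpoint_pos + CHECKPOINT_REC_SIZE;
}
```

With these numbers, a checkpoint record starting at offset 1000 is detected as "nothing happened", while one starting 40 bytes before a page boundary is not, so an unnecessary checkpoint is performed. In this model any extra record can only move the insert position further, so the equality never holds by accident.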

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14

Amit Kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Andres Freund (#13)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Wednesday, January 09, 2013 5:49 PM Andres Freund wrote:

On 2013-01-09 15:06:04 +0530, Amit Kapila wrote:

I don't think that's safe, there could have been another record inserted
that happens to be MinSizeOfXactRunningXacts big and we would still
skip the checkpoint.

I think that can happen only when a checkpoint is triggered for the first
time, and even then the first part of the check (curInsert ==
ControlFile->checkPoint + MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)))
will fail.

A value will be assigned to runningXactXLog only if wal_level is hot_standby.
In that case, if a checkpoint is scheduled for the second or a later time,
it will include WAL for "running Xact" along with WAL for any other data.
So even if the other data is of size MinSizeOfXactRunningXacts, the
check should fail and the checkpoint should not be skipped.

Hm. The locking around checkpoints probably prevents the case I was
worried about in combination with the wal_level not changing while
running.

In that case, can we consider this as a patch for commit?
If there is no objection, shall I prepare a bug-fix patch for the reported
scenario?

With Regards,
Amit Kapila.

#15

Amit kapila

amit.kapila@huawei.com

about 13 years ago

In reply to: Amit Kapila (#14)

1 attachment(s)

Re: Extra XLOG in Checkpoint for StandbySnapshot

On Wednesday, January 09, 2013 7:15 PM Amit Kapila wrote:
On Wednesday, January 09, 2013 5:49 PM Andres Freund wrote:

On 2013-01-09 15:06:04 +0530, Amit Kapila wrote:

I don't think that's safe, there could have been another record inserted
that happens to be MinSizeOfXactRunningXacts big and we would still
skip the checkpoint.

I think that can happen only when a checkpoint is triggered for the first
time, and even then the first part of the check (curInsert ==
ControlFile->checkPoint + MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)))
will fail.

A value will be assigned to runningXactXLog only if wal_level is hot_standby.
In that case, if a checkpoint is scheduled for the second or a later time,
it will include WAL for "running Xact" along with WAL for any other data.
So even if the other data is of size MinSizeOfXactRunningXacts, the
check should fail and the checkpoint should not be skipped.

Hm. The locking around checkpoints probably prevents the case I was
worried about in combination with the wal_level not changing while
running.

Patch to address the issue is attached with this mail.
Suggestions?

With Regards,
Amit Kapila.

Attachments:

avoid_Extra_WAL_checkpoint.patch (application/octet-stream)

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 7084,7094 **** CreateCheckPoint(int flags)
  				  CHECKPOINT_FORCE)) == 0)
  	{
  		XLogRecPtr	curInsert;
  
  		INSERT_RECPTR(curInsert, Insert, Insert->curridx);
  		if (curInsert == ControlFile->checkPoint + 
  			MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
! 			ControlFile->checkPoint == ControlFile->checkPointCopy.redo)
  		{
  			LWLockRelease(WALInsertLock);
  			LWLockRelease(CheckpointLock);
--- 7084,7110 ----
  				  CHECKPOINT_FORCE)) == 0)
  	{
  		XLogRecPtr	curInsert;
+ 		uint32 runningXactXLog = 0;
  
+ 		/* 
+ 		  * Take the size of RUNNING XACT XLOG, only if HOT STANDBY is enabled.
+ 		  * Because if it is not enabled, then RUNNING XACT XLOG is not logged.
+ 		  */
+ 		if (!shutdown && XLogStandbyInfoActive())
+ 		{
+ 			runningXactXLog = MAXALIGN(MinSizeOfXactRunningXacts) + SizeOfXLogRecord;
+ 		}		
+ 
+ 		/*
+ 		  * ControlFile->checkPointCopy.redo points to the offset before 
+ 		  * RUNNING XACT XLOG was inserted but ControlFile->checkPoint
+ 		  * points to the offset after insertion, so we should add size of RUNNING XACT XLOG
+ 		  * to check if any other new XLOG has been inserted
+ 		  */		  
  		INSERT_RECPTR(curInsert, Insert, Insert->curridx);
  		if (curInsert == ControlFile->checkPoint + 
  			MAXALIGN(SizeOfXLogRecord + sizeof(CheckPoint)) &&
! 			ControlFile->checkPoint == ControlFile->checkPointCopy.redo + runningXactXLog)
  		{
  			LWLockRelease(WALInsertLock);
  			LWLockRelease(CheckpointLock);