pgsql: Fix bug leading to restoring unlogged relations from empty files

Started by Andres Freundover 10 years ago3 messagescomitters
Jump to latest
#1Andres Freund
andres@anarazel.de

Fix bug leading to restoring unlogged relations from empty files.

At the end of crash recovery, unlogged relations are reset to the empty
state, using their init fork as the template. The init fork is copied to
the main fork without going through shared buffers. Unfortunately WAL
replay so far has not necessarily flushed writes from shared buffers to
disk at that point. In normal crash recovery, and before the
introduction of 'fast promotions' in fd4ced523 / 9.3, the
END_OF_RECOVERY checkpoint flushes the buffers out in time. But with
fast promotions that's not the case anymore.

To fix, force WAL writes targeting the init fork to be flushed
immediately (using the new FlushOneBuffer() function). In 9.5+ that
flush can centrally be triggered from the code dealing with restoring
full page writes (XLogReadBufferForRedoExtended), in earlier releases
that responsibility is in the hands of XLOG_HEAP_NEWPAGE's replay
function.

Backpatch to 9.1, even if this currently is only known to trigger in
9.3+. Flushing earlier is more robust, and it is advantageous to keep
the branches similar.

Typical symptoms of this bug are errors like
'ERROR: index "..." contains unexpected zero page at block 0'
shortly after promoting a node.

Reported-By: Thom Brown
Author: Andres Freund and Michael Paquier
Discussion: 20150326175024.GJ451@alap3.anarazel.de
Backpatch: 9.1-

Branch
------
REL9_5_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/5b51805fe4b9fec95274924c5733093a5f685a58

Modified Files
--------------
src/backend/access/transam/xlogutils.c | 9 +++++++++
src/backend/storage/buffer/bufmgr.c | 21 +++++++++++++++++++++
src/include/storage/bufmgr.h | 1 +
3 files changed, 31 insertions(+)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Andres Freund (#1)
Re: pgsql: Fix bug leading to restoring unlogged relations from empty files

Coverity is complaining that in

void
FlushOneBuffer(Buffer buffer)
{
BufferDesc *bufHdr;

/* currently not needed, but no fundamental reason not to support */
Assert(!BufferIsLocal(buffer));

Assert(BufferIsPinned(buffer));

bufHdr = GetBufferDescriptor(buffer - 1);

LWLockHeldByMe(bufHdr->content_lock);

FlushBuffer(bufHdr, NULL);
}

the call to LWLockHeldByMe() is useless.

On 12/10/15 10:50 AM, Andres Freund wrote:

Fix bug leading to restoring unlogged relations from empty files.

At the end of crash recovery, unlogged relations are reset to the empty
state, using their init fork as the template. The init fork is copied to
the main fork without going through shared buffers. Unfortunately WAL
replay so far has not necessarily flushed writes from shared buffers to
disk at that point. In normal crash recovery, and before the
introduction of 'fast promotions' in fd4ced523 / 9.3, the
END_OF_RECOVERY checkpoint flushes the buffers out in time. But with
fast promotions that's not the case anymore.

To fix, force WAL writes targeting the init fork to be flushed
immediately (using the new FlushOneBuffer() function). In 9.5+ that
flush can centrally be triggered from the code dealing with restoring
full page writes (XLogReadBufferForRedoExtended), in earlier releases
that responsibility is in the hands of XLOG_HEAP_NEWPAGE's replay
function.

Backpatch to 9.1, even if this currently is only known to trigger in
9.3+. Flushing earlier is more robust, and it is advantageous to keep
the branches similar.

Typical symptoms of this bug are errors like
'ERROR: index "..." contains unexpected zero page at block 0'
shortly after promoting a node.

Reported-By: Thom Brown
Author: Andres Freund and Michael Paquier
Discussion: 20150326175024.GJ451@alap3.anarazel.de
Backpatch: 9.1-

Branch
------
REL9_5_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/5b51805fe4b9fec95274924c5733093a5f685a58

Modified Files
--------------
src/backend/access/transam/xlogutils.c | 9 +++++++++
src/backend/storage/buffer/bufmgr.c | 21 +++++++++++++++++++++
src/include/storage/bufmgr.h | 1 +
3 files changed, 31 insertions(+)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

#3Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#2)
Re: pgsql: Fix bug leading to restoring unlogged relations from empty files

On Mon, Dec 14, 2015 at 2:26 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

the call to LWLockHeldByMe() is useless.

Yes, but it should be an Assert. I guess that Andres is on it..
--
Michael

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers