[PATCH] Assert that the correct locks are held when calling PageGetLSN()
Hello all,
While working on checksum support for GPDB, we noticed that several
callers of PageGetLSN() didn't follow the correct locking procedure.
To try to help ferret out present and future mistakes, we added an
assertion to PageGetLSN() that checks whether those locks were being
held correctly. This patch is a first-draft attempt to port that
assertion back up to postgres master, based on work by Asim Praveen,
Ashwin Agrawal, and myself.
The patch is really two pieces: add the assertion, and fix the callers
that would trip it. The latter part is still in progress, because I'm
running into some places where I'm not sure what the correct way
forward is.
(I'm a newbie to this list and this code base, so please don't
hesitate to correct me on anything, whether that's code- or
culture-related!)
= Notes/Observations =
- This change introduces a subtle source incompatibility: whereas
previously both Pages and PageHeaders could be passed to PageGetLSN(),
now callers must explicitly cast to Page if they don't already have
one.
- I originally tried to put (and the GPDB patch succeeds in putting)
the entire assertion implementation into PageGetLSN in bufpage.h. That
seems unworkable here -- buf_internals.h is no longer publicized, and
the assertion needs those internals to perform its checks. So I moved
the assertion into its own helper function in bufmgr.c. If assertions
are disabled, that helper declaration gets turned into an empty macro,
so there shouldn't be a performance hit.
= Open Questions =
- Is my use of FRONTEND here correct? The documentation made it seem
like some compilers could try to link against the
AssertPageIsLockedForLSN function, even if PageGetLSN was never
called.
- Use of PageGetLSN() in heapam.c seems to be incorrect, but I don't
really know what the best way is to fix it. It explicitly unlocks the
buffer, with the comment that visibilitymap_set() needs to handle
locking itself, but then calls PageGetLSN() to determine whether
visibilitymap_set() needs to be called.
- TestForOldSnapshot is also problematic. The function is called in
places where only a shared lock is held, so it needs to hold the
header spinlock at least some of the time, but we only pass the Page
as an argument. I could do the same "find buffer descriptor from page
pointer" logic that we utilize in the assertion implementation, but is
there a better way?
- Should I try to add this assertion to PageSetLSN() as well? We were
focused mostly on the GetLSN side of things, since that was the more
risky piece of our backport. PageSetLSN is called from many more
places, and the review there would be much larger.
= Known Issues =
- This approach doesn't seem to work when checksums are disabled. In
that case, BufferGetLSNAtomic isn't actually atomic, so it seems to
always trip the assertion. I'm not sure of the best way to proceed --
is it really correct not to follow the locking contract if checksums
are disabled?
What do you think? Is this something worth pursuing? Any and all
comments welcome.
Thanks!
--Jacob
Attachments:
PageGetLSN-assert-locks-held.patchapplication/octet-stream; name=PageGetLSN-assert-locks-held.patchDownload
From e2bbc273a17e8effa8366764cfc1bea5625b7f1d Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@pivotal.io>
Date: Mon, 28 Aug 2017 10:36:31 -0700
Subject: [RFC PATCH] PageGetLSN: assert that locks are properly held
When accessing a shared page, callers must either hold an exclusive lock
on the buffer, OR a shared lock plus the buffer header spinlock (this
additional constraint exists because it's legal to write the LSN without
holding an exclusive lock). Several callers are currently not following
this contract.
To help developers find issues, assert that the proper locks are held
for shared pages inside PageGetLSN. The implementation for this check is
added in a helper function, AssertPageIsLockedForLSN, which is called
from PageGetLSN whenever backend code is built with assertions enabled.
This introduces a slight source incompatibility: previously, both Pages
and PageHeaders could be passed to the macro, but now callers must
explicitly pass a Page.
In places where PageGetLSN is clearly insufficient, move to
BufferGetLSNAtomic instead. This is a work in progress.
---
contrib/pageinspect/rawpage.c | 2 +-
src/backend/access/gist/gist.c | 4 ++--
src/backend/access/gist/gistget.c | 4 ++--
src/backend/access/gist/gistvacuum.c | 2 +-
src/backend/access/nbtree/nbtsearch.c | 2 +-
src/backend/access/nbtree/nbtutils.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 25 +++++++++++++++++++++++++
src/include/storage/bufpage.h | 17 +++++++++++++++--
8 files changed, 48 insertions(+), 10 deletions(-)
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 25af22f453..e3a1ca0e3b 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -250,7 +250,7 @@ page_header(PG_FUNCTION_ARGS)
/* Extract information from the page header */
- lsn = PageGetLSN(page);
+ lsn = PageGetLSN((Page) page);
/* pageinspect >= 1.2 uses pg_lsn instead of text for the LSN field. */
if (TupleDescAttr(tupdesc, 0)->atttypid == TEXTOID)
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 565525bbdf..281d2b2a16 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -640,7 +640,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate)
}
stack->page = (Page) BufferGetPage(stack->buffer);
- stack->lsn = PageGetLSN(stack->page);
+ stack->lsn = BufferGetLSNAtomic(stack->buffer);
Assert(!RelationNeedsWAL(state.r) || !XLogRecPtrIsInvalid(stack->lsn));
/*
@@ -890,7 +890,7 @@ gistFindPath(Relation r, BlockNumber child, OffsetNumber *downlinkoffnum)
break;
}
- top->lsn = PageGetLSN(page);
+ top->lsn = BufferGetLSNAtomic(buffer);
/*
* If F_FOLLOW_RIGHT is set, the page to the right doesn't have a
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 760ea0c997..68711bfc09 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -61,7 +61,7 @@ gistkillitems(IndexScanDesc scan)
* read. killedItems could be not valid so LP_DEAD hints applying is not
* safe.
*/
- if (PageGetLSN(page) != so->curPageLSN)
+ if (BufferGetLSNAtomic(buffer) != so->curPageLSN)
{
UnlockReleaseBuffer(buffer);
so->numKilled = 0; /* reset counter */
@@ -384,7 +384,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* safe to apply LP_DEAD hints to the page later. This allows us to drop
* the pin for MVCC scans, which allows vacuum to avoid blocking.
*/
- so->curPageLSN = PageGetLSN(page);
+ so->curPageLSN = BufferGetLSNAtomic(buffer);
/*
* check all tuples on page
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 77d9d12f0b..54a61cc427 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -249,7 +249,7 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
ptr = (GistBDItem *) palloc(sizeof(GistBDItem));
ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
- ptr->parentlsn = PageGetLSN(page);
+ ptr->parentlsn = BufferGetLSNAtomic(buffer);
ptr->next = stack->next;
stack->next = ptr;
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 642c8943e7..1ca2623dea 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -1224,7 +1224,7 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum)
* safe to apply LP_DEAD hints to the page later. This allows us to drop
* the pin for MVCC scans, which allows vacuum to avoid blocking.
*/
- so->currPos.lsn = PageGetLSN(page);
+ so->currPos.lsn = BufferGetLSNAtomic(so->currPos.buf);
/*
* we must save the page's right-link while scanning it; this tells us
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index dbfb775dec..4360661789 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -1772,7 +1772,7 @@ _bt_killitems(IndexScanDesc scan)
return;
page = BufferGetPage(buf);
- if (PageGetLSN(page) == so->currPos.lsn)
+ if (BufferGetLSNAtomic(buf) == so->currPos.lsn)
so->currPos.buf = buf;
else
{
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 15795b0c5a..b871c76db7 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4355,3 +4355,28 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+void
+AssertPageIsLockedForLSN(Page page)
+{
+ char *pagePtr = page;
+
+ /*
+ * We only want to assert that we hold a lock on the page contents if the
+ * page is shared (i.e. it is one of the BufferBlocks).
+ */
+ if (BufferBlocks <= pagePtr
+ && pagePtr < (BufferBlocks + NBuffers * BLCKSZ))
+ {
+ ptrdiff_t bufId = (pagePtr - BufferBlocks) / BLCKSZ;
+ BufferDesc *buf = GetBufferDescriptor(bufId);
+ LWLock *content_lock = BufferDescriptorGetContentLock(buf);
+ uint32 buf_state = pg_atomic_read_u32(&buf->state);
+
+ Assert(LWLockHeldByMeInMode(content_lock, LW_EXCLUSIVE)
+ || (LWLockHeldByMeInMode(content_lock, LW_SHARED)
+ && (buf_state & BM_LOCKED)));
+ }
+}
+#endif
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 50c72a3c8d..db94310586 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -359,8 +359,21 @@ PageValidateSpecialPointer(Page page)
* Additional macros for access to page headers. (Beware multiple evaluation
* of the arguments!)
*/
-#define PageGetLSN(page) \
- PageXLogRecPtrGet(((PageHeader) (page))->pd_lsn)
+
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+/* in bufmgr.c; used in PageGetLSN */
+extern void AssertPageIsLockedForLSN(Page page);
+#else
+#define AssertPageIsLockedForLSN(page)
+#endif
+
+static inline XLogRecPtr
+PageGetLSN(Page page)
+{
+ AssertPageIsLockedForLSN(page);
+ return PageXLogRecPtrGet(((PageHeader) page)->pd_lsn);
+}
+
#define PageSetLSN(page, lsn) \
PageXLogRecPtrSet(((PageHeader) (page))->pd_lsn, lsn)
--
2.14.0
On Thu, Aug 31, 2017 at 2:53 PM, Jacob Champion <pchampion@pivotal.io> wrote:
While working on checksum support for GPDB, we noticed that several
callers of PageGetLSN() didn't follow the correct locking procedure.
To try to help ferret out present and future mistakes, we added an
assertion to PageGetLSN() that checks whether those locks were being
held correctly. This patch is a first-draft attempt to port that
assertion back up to postgres master, based on work by Asim Praveen,
Ashwin Agrawal, and myself.The patch is really two pieces: add the assertion, and fix the callers
that would trip it. The latter part is still in progress, because I'm
running into some places where I'm not sure what the correct way
forward is.(I'm a newbie to this list and this code base, so please don't
hesitate to correct me on anything, whether that's code- or
culture-related!)
It's a good idea to add patches to commitfest.postgresql.org
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 1, 2017 at 9:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
It's a good idea to add patches to commitfest.postgresql.org
Hi Robert, I added it yesterday as
https://commitfest.postgresql.org/14/1279/ . Let me know if I need to
touch anything up there.
Thanks!
--Jacob
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 1, 2017 at 12:24 PM, Jacob Champion <pchampion@pivotal.io> wrote:
On Fri, Sep 1, 2017 at 9:12 AM, Robert Haas <robertmhaas@gmail.com> wrote:
It's a good idea to add patches to commitfest.postgresql.org
Hi Robert, I added it yesterday as
https://commitfest.postgresql.org/14/1279/ . Let me know if I need to
touch anything up there.
Nah, looks fine, I just somehow missed it when I looked the first time.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 1, 2017 at 3:53 AM, Jacob Champion <pchampion@pivotal.io> wrote:
The patch is really two pieces: add the assertion, and fix the callers
that would trip it. The latter part is still in progress, because I'm
running into some places where I'm not sure what the correct way
forward is.
I looked at your patch.
While working on checksum support for GPDB, we noticed that several
callers of PageGetLSN() didn't follow the correct locking procedure.
Well, PageGetLSN can be used in some hot code paths, xloginsert.c
being one, so it does not seem wise to me to switch it to something
more complicated than a macro, and also it is not bounded to any
locking contracts now. For example, pageinspect works on a copy of the
buffer, so that's one case where we just don't care. So we should it
be tightened in Postgres? That's the portion of the proposal I don't
get. You could introduce a custom API for this purpose, isn't the
origin of your problems with GPDB for checksums that data is
replicated across many nodes so things need to be locked uniquely?
- This change introduces a subtle source incompatibility: whereas
previously both Pages and PageHeaders could be passed to PageGetLSN(),
now callers must explicitly cast to Page if they don't already have
one.
That would produce warnings for extensions, so people would likely
catch that when compiling with a newer version, if they use -Werror...
- Is my use of FRONTEND here correct? The documentation made it seem
like some compilers could try to link against the
AssertPageIsLockedForLSN function, even if PageGetLSN was never
called.
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+void
+AssertPageIsLockedForLSN(Page page)
+{
From a design point of view, bufmgr.c is an API interface for buffer
management on the backend-size, so just using FRONTEND there looks
wrong to me (bufmgr.h is already bad enough IMO now).
- Use of PageGetLSN() in heapam.c seems to be incorrect, but I don't
really know what the best way is to fix it. It explicitly unlocks the
buffer, with the comment that visibilitymap_set() needs to handle
locking itself, but then calls PageGetLSN() to determine whether
visibilitymap_set() needs to be called.
This bit in src/backend/access/transam/README may interest you:
Note that we must only use PageSetLSN/PageGetLSN() when we know the action
is serialised. Only Startup process may modify data blocks during recovery,
so Startup process may execute PageGetLSN() without fear of serialisation
problems. All other processes must only call PageSet/GetLSN when holding
either an exclusive buffer lock or a shared lock plus buffer header lock,
or be writing the data block directly rather than through shared buffers
while holding AccessExclusiveLock on the relation.
So I see no problem with the exiting caller.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Michael, thanks for the review!
On Sun, Sep 3, 2017 at 11:14 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
While working on checksum support for GPDB, we noticed that several
callers of PageGetLSN() didn't follow the correct locking procedure.Well, PageGetLSN can be used in some hot code paths, xloginsert.c
being one, so it does not seem wise to me to switch it to something
more complicated than a macro,
If raw perf is an issue -- now that I've refactored the assertion into
its own function, I could probably push the implementation back into a
macro. Something like
#define PageGetLSN(page) (AssertPageIsLockedForLSN((Page) page),
PageXLogRecPtrGet(((PageHeader) page)->pd_lsn))
Callers would need to watch out for the double-evaluation, but that's
already documented with this group of macros.
and also it is not bounded to any
locking contracts now. For example, pageinspect works on a copy of the
buffer, so that's one case where we just don't care. So we should it
be tightened in Postgres?
The proposed patch doesn't tighten the contract in this case -- if the
buffer that's passed to PageGetLSN() isn't a shared buffer (i.e. if
it's not part of the BufferBlocks array), the assertion passes
unconditionally.
That's the portion of the proposal I don't
get. You could introduce a custom API for this purpose, isn't the
origin of your problems with GPDB for checksums that data is
replicated across many nodes so things need to be locked uniquely?
GPDB introduces additional problems, but I think this issue is
independent of anything GPDB-specific. One way to look at it: are the
changes in the proposed patch, from PageGetLSN to BufferGetLSNAtomic,
correct? If not, well, that's a good discussion to have. But if they
are correct, then why were they missed? and is there a way to help
future contributors out in the future? That's the goal of this patch;
I don't think it's something that a custom API solves.
- This change introduces a subtle source incompatibility: whereas
previously both Pages and PageHeaders could be passed to PageGetLSN(),
now callers must explicitly cast to Page if they don't already have
one.That would produce warnings for extensions, so people would likely
catch that when compiling with a newer version, if they use -Werror...- Is my use of FRONTEND here correct? The documentation made it seem
like some compilers could try to link against the
AssertPageIsLockedForLSN function, even if PageGetLSN was never
called.+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND) +void +AssertPageIsLockedForLSN(Page page) +{ From a design point of view, bufmgr.c is an API interface for buffer management on the backend-size, so just using FRONTEND there looks wrong to me (bufmgr.h is already bad enough IMO now).
The comments suggested that bufmgr.h could be incidentally pulled into
frontend code through other headers. Or do you mean that I don't need
to check for FRONTEND in the C source file (since, I assume, it's only
ever being compiled for the backend)? I'm not sure what you mean by
"bufmgr.h is already bad enough".
- Use of PageGetLSN() in heapam.c seems to be incorrect, but I don't
really know what the best way is to fix it. It explicitly unlocks the
buffer, with the comment that visibilitymap_set() needs to handle
locking itself, but then calls PageGetLSN() to determine whether
visibilitymap_set() needs to be called.This bit in src/backend/access/transam/README may interest you:
Note that we must only use PageSetLSN/PageGetLSN() when we know the action
is serialised. Only Startup process may modify data blocks during recovery,
so Startup process may execute PageGetLSN() without fear of serialisation
problems. All other processes must only call PageSet/GetLSN when holding
either an exclusive buffer lock or a shared lock plus buffer header lock,
or be writing the data block directly rather than through shared buffers
while holding AccessExclusiveLock on the relation.So I see no problem with the exiting caller.
Is heap_xlog_visible only called during startup? If so, is there a
general rule for which locks are required during startup and which
aren't?
Currently there is no exception in the assertion made for startup
processes, so that's something that would need to be changed. Is there
a way to determine whether the current process considers itself to be
a startup process?
Thanks again!
--Jacob
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Sep 6, 2017 at 1:15 AM, Jacob Champion <pchampion@pivotal.io> wrote:
On Sun, Sep 3, 2017 at 11:14 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND) +void +AssertPageIsLockedForLSN(Page page) +{ From a design point of view, bufmgr.c is an API interface for buffer management on the backend-size, so just using FRONTEND there looks wrong to me (bufmgr.h is already bad enough IMO now).The comments suggested that bufmgr.h could be incidentally pulled into
frontend code through other headers. Or do you mean that I don't need
to check for FRONTEND in the C source file (since, I assume, it's only
ever being compiled for the backend)? I'm not sure what you mean by
"bufmgr.h is already bad enough".
I mean that this should not become worse without a better reason. This
patch makes it so.
This bit in src/backend/access/transam/README may interest you:
Note that we must only use PageSetLSN/PageGetLSN() when we know the action
is serialised. Only Startup process may modify data blocks during recovery,
so Startup process may execute PageGetLSN() without fear of serialisation
problems. All other processes must only call PageSet/GetLSN when holding
either an exclusive buffer lock or a shared lock plus buffer header lock,
or be writing the data block directly rather than through shared buffers
while holding AccessExclusiveLock on the relation.So I see no problem with the exiting caller.
Is heap_xlog_visible only called during startup?
Recovery is more exact. When replaying a XLOG_HEAP2_VISIBLE record.
If so, is there a
general rule for which locks are required during startup and which
aren't?
You can surely say that a process is fine to read a variable without a
lock if it is the only one updating it, other processes would need a
lock to read a variable. In this case the startup process is the only
one updating blocks, so that's at least one code path where the is no
need to take a lock when looking at a page LSN with PageGetLSN.
Another example is pageinspect which works on a copy of a page and
uses PageGetLSN, so no direct locks on the buffer page is needed.
In short, it seems to me that this patch should be rejected in its
current shape. There is no need to put more constraints on a API which
does not need more constraints and which is used widely by extensions.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Sep 5, 2017 at 10:49 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
On Wed, Sep 6, 2017 at 1:15 AM, Jacob Champion <pchampion@pivotal.io> wrote:
On Sun, Sep 3, 2017 at 11:14 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND) +void +AssertPageIsLockedForLSN(Page page) +{ From a design point of view, bufmgr.c is an API interface for buffer management on the backend-size, so just using FRONTEND there looks wrong to me (bufmgr.h is already bad enough IMO now).The comments suggested that bufmgr.h could be incidentally pulled into
frontend code through other headers. Or do you mean that I don't need
to check for FRONTEND in the C source file (since, I assume, it's only
ever being compiled for the backend)? I'm not sure what you mean by
"bufmgr.h is already bad enough".I mean that this should not become worse without a better reason. This
patch makes it so.This bit in src/backend/access/transam/README may interest you:
Note that we must only use PageSetLSN/PageGetLSN() when we know the action
is serialised. Only Startup process may modify data blocks during recovery,
so Startup process may execute PageGetLSN() without fear of serialisation
problems. All other processes must only call PageSet/GetLSN when holding
either an exclusive buffer lock or a shared lock plus buffer header lock,
or be writing the data block directly rather than through shared buffers
while holding AccessExclusiveLock on the relation.So I see no problem with the exiting caller.
Is heap_xlog_visible only called during startup?
Recovery is more exact. When replaying a XLOG_HEAP2_VISIBLE record.
If so, is there a
general rule for which locks are required during startup and which
aren't?You can surely say that a process is fine to read a variable without a
lock if it is the only one updating it, other processes would need a
lock to read a variable. In this case the startup process is the only
one updating blocks, so that's at least one code path where the is no
need to take a lock when looking at a page LSN with PageGetLSN.
Is there a way to programmatically determine whether or not we're in
such a situation? That could be added to the assertion.
The code here is pretty inconsistent on whether or not PageGetLSN is
called inside or outside a lock -- see XLogReadBufferForRedoExtended
for instance.
Another example is pageinspect which works on a copy of a page and
uses PageGetLSN, so no direct locks on the buffer page is needed.
And again -- this case should be correctly handled by the new
assertion. Copies of pages are not checked for locks; we can't recover
a buffer header from a page that isn't part of the BufferBlocks array.
In short, it seems to me that this patch should be rejected in its
current shape.
Is the half of the patch that switches PageGetLSN to
BufferGetLSNAtomic correct, at least?
There is no need to put more constraints on a API which
does not need more constraints and which is used widely by extensions.
The goal here is not to add more constraints, but to verify that the
existing constraints are followed. Perhaps this patch doesn't do that
well/correctly, but I want to make sure we're on the same page
regarding the end goal.
--Jacob
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Sep 6, 2017 at 8:37 AM, Jacob Champion <pchampion@pivotal.io> wrote:
On Tue, Sep 5, 2017 at 10:49 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:In short, it seems to me that this patch should be rejected in its
current shape.Is the half of the patch that switches PageGetLSN to
BufferGetLSNAtomic correct, at least?
Any further thoughts on this? If the BufferGetLSNAtomic fixes made
here are not correct to begin with, then the rest of the patch is
probably moot; I just want to double-check that that is the case.
--Jacob
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 20 Sep 2017, at 00:29, Jacob Champion <pchampion@pivotal.io> wrote:
On Wed, Sep 6, 2017 at 8:37 AM, Jacob Champion <pchampion@pivotal.io> wrote:
On Tue, Sep 5, 2017 at 10:49 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:In short, it seems to me that this patch should be rejected in its
current shape.Is the half of the patch that switches PageGetLSN to
BufferGetLSNAtomic correct, at least?Any further thoughts on this? If the BufferGetLSNAtomic fixes made
here are not correct to begin with, then the rest of the patch is
probably moot; I just want to double-check that that is the case.
Based on the discussions in this thread, I’m marking this patch Returned with
feedback. Please re-submit a new version in an upcoming commitfest when ready.
cheers ./daniel
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Sep 4, 2017 at 2:14 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
A>> that would trip it. The latter part is still in progress, because I'm
Well, PageGetLSN can be used in some hot code paths, xloginsert.c
being one, so it does not seem wise to me to switch it to something
more complicated than a macro, and also it is not bounded to any
locking contracts now.
I don't see how turning it into a static inline function is worse.
We've been doing a fair amount of that lately and it generally makes
things better, not worse, often avoiding multiple evaluation hazards
at the same time it generates better machine code.
I find this patch sort of messy; in particular, the definition of
AssertPageIsLockedForLSN tries to reverse-engineer a buffer ID from a
Page, and that's sort of ugly. But I think the concept of trying to
make sure that our code is adhering to necessary locking rules is a
pretty good one.
I think the first question we ought to be asking ourselves is whether
any of the PageGetLSN -> BufferGetLSNAtomic changes the patch
introduces are live bugs. If they are, then we ought to fix those
separately (and probably back-patch). If they are not, then we need
to think about how to adjust the patch so that it doesn't complain
about things that are in fact OK.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Oct 2, 2017 at 9:09 PM, Robert Haas <robertmhaas@gmail.com> wrote:
I think the first question we ought to be asking ourselves is whether
any of the PageGetLSN -> BufferGetLSNAtomic changes the patch
introduces are live bugs. If they are, then we ought to fix those
separately (and probably back-patch). If they are not, then we need
to think about how to adjust the patch so that it doesn't complain
about things that are in fact OK.
If you look at each item one by one that the patch touches based on
the contract defined in transam/README...
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -640,7 +640,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size
freespace, GISTSTATE *giststate)
}
stack->page = (Page) BufferGetPage(stack->buffer);
- stack->lsn = PageGetLSN(stack->page);
+ stack->lsn = BufferGetLSNAtomic(stack->buffer);
There is an incorrect use of PageGetLSN here. A shared lock can be
taken on the page but there is no buffer header lock used when using
PageGetLSN.
@@ -890,7 +890,7 @@ gistFindPath(Relation r, BlockNumber child,
OffsetNumber *downlinkoffnum)
break;
}
- top->lsn = PageGetLSN(page);
+ top->lsn = BufferGetLSNAtomic(buffer);
Here as well only a shared lock is taken on the page.
@@ -61,7 +61,7 @@ gistkillitems(IndexScanDesc scan)
* read. killedItems could be not valid so LP_DEAD hints applying is not
* safe.
*/
- if (PageGetLSN(page) != so->curPageLSN)
+ if (BufferGetLSNAtomic(buffer) != so->curPageLSN)
{
UnlockReleaseBuffer(buffer);
so->numKilled = 0; /* reset counter */
@@ -384,7 +384,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem
*pageItem, double *myDistances,
* safe to apply LP_DEAD hints to the page later. This allows us to drop
* the pin for MVCC scans, which allows vacuum to avoid blocking.
*/
- so->curPageLSN = PageGetLSN(page);
+ so->curPageLSN = BufferGetLSNAtomic(buffer);
Those two as well only use a shared lock.
@@ -249,7 +249,7 @@ gistbulkdelete(IndexVacuumInfo *info,
IndexBulkDeleteResult *stats,
ptr = (GistBDItem *) palloc(sizeof(GistBDItem));
ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
- ptr->parentlsn = PageGetLSN(page);
+ ptr->parentlsn = BufferGetLSNAtomic(buffer);
ptr->next = stack->next;
stack->next = ptr;
Here also a shared lock is only taken, that's a VACUUM code path for
Gist indexes.
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -1224,7 +1224,7 @@ _bt_readpage(IndexScanDesc scan, ScanDirection
dir, OffsetNumber offnum)
* safe to apply LP_DEAD hints to the page later. This allows us to drop
* the pin for MVCC scans, which allows vacuum to avoid blocking.
*/
- so->currPos.lsn = PageGetLSN(page);
+ so->currPos.lsn = BufferGetLSNAtomic(so->currPos.buf);
Here the caller of _bt_readpage should have taken a lock, but the page
is not necessarily pinned. Still, _bt_getbuf makes sure that the pin
is done.
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -1772,7 +1772,7 @@ _bt_killitems(IndexScanDesc scan)
return;
page = BufferGetPage(buf);
- if (PageGetLSN(page) == so->currPos.lsn)
+ if (BufferGetLSNAtomic(buf) == so->currPos.lsn)
so->currPos.buf = buf;
Same here thanks to _bt_getbuf.
So those bits could be considered for integration.
Also, if I read the gist code correctly, there is one other case in
gistFindCorrectParent. And in btree, there is one occurence in
_bt_split. In XLogRecordAssemble, there could be more checks regarding
the locks taken on the buffer registered.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Oct 2, 2017 at 10:19 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
So those bits could be considered for integration.
Yes, and they also tend to suggest that the rest of the patch has value.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Oct 2, 2017 at 11:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 2, 2017 at 10:19 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:So those bits could be considered for integration.
Yes, and they also tend to suggest that the rest of the patch has value.
Well, there are cases where you don't need any locking checks, and the
proposed patch ignores that. Take for example pageinspect which works
on a copy of a page, or just WAL replay which serializes everything,
and in both cases PageGetLSN can be used safely. So for compatibility
reasons I think that PageGetLSN should be kept alone to not surprise
anybody using it, or at least an equivalent should be introduced. It
would be interesting to make BufferGetLSNAtomic hold tighter checks,
like something that makes use of LWLockHeldByMe for example.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier wrote:
On Mon, Oct 2, 2017 at 11:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 2, 2017 at 10:19 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:So those bits could be considered for integration.
Yes, and they also tend to suggest that the rest of the patch has value.
Well, there are cases where you don't need any locking checks, and the
proposed patch ignores that. Take for example pageinspect which works
on a copy of a page, or just WAL replay which serializes everything,
and in both cases PageGetLSN can be used safely. So for compatibility
reasons I think that PageGetLSN should be kept alone to not surprise
anybody using it, or at least an equivalent should be introduced. It
would be interesting to make BufferGetLSNAtomic hold tighter checks,
like something that makes use of LWLockHeldByMe for example.
I'd argue about this in the same direction I argued about
BufferGetPage() needing an LSN check that's applied separately: if it's
too easy for a developer to do the wrong thing (i.e. fail to apply the
separate LSN check after reading the page) then the routine should be
patched to do the check itself, and another routine should be offered
for the cases that explicitly can do without the check.
I was eventually outvoted in the BufferGetPage() saga, though, so I'm
not sure that that discussion sets precedent.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Oct 2, 2017 at 11:05 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Well, there are cases where you don't need any locking checks, and the
proposed patch ignores that.
I understand that, but shouldn't we then look for a way to adjust the
patch so that it doesn't have that issue any longer, rather than just
kicking it to the curb? I mean, just saying "patch suxxor, next"
doesn't seem like the right approach to something that has apparently
already found real problems.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Oct 3, 2017 at 12:54 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Michael Paquier wrote:
On Mon, Oct 2, 2017 at 11:44 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Oct 2, 2017 at 10:19 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:So those bits could be considered for integration.
Yes, and they also tend to suggest that the rest of the patch has value.
Well, there are cases where you don't need any locking checks, and the
proposed patch ignores that. Take for example pageinspect which works
on a copy of a page, or just WAL replay which serializes everything,
and in both cases PageGetLSN can be used safely. So for compatibility
reasons I think that PageGetLSN should be kept alone to not surprise
anybody using it, or at least an equivalent should be introduced. It
would be interesting to make BufferGetLSNAtomic hold tighter checks,
like something that makes use of LWLockHeldByMe for example.
Jacob, here are some ideas to make this thread move on. I would
suggest to produce a set of patches that do things incrementally:
1) One patch that changes the calls of PageGetLSN to
BufferGetLSNAtomic which are now not appropriate. You have spotted
some of them in the btree and gist code, but not all based on my first
lookup. There is still one in gistFindCorrectParent and one in btree
_bt_split. The monitoring of the other calls (sequence.c and
vacuumlazy.c) looked safe. There is another one in XLogRecordAssemble
that should be fixed I think.
2) A second patch that strengthens a bit checks around
BufferGetLSNAtomic. One idea would be to use LWLockHeldByMe, as you
are originally suggesting.
A comment could be as well added in bufpage.h for PageGetLSN to let
users know that it should be used carefully, in the vein of what is
mentioned in src/backend/access/transam/README.
I'd argue about this in the same direction I argued about
BufferGetPage() needing an LSN check that's applied separately: if it's
too easy for a developer to do the wrong thing (i.e. fail to apply the
separate LSN check after reading the page) then the routine should be
patched to do the check itself, and another routine should be offered
for the cases that explicitly can do without the check.I was eventually outvoted in the BufferGetPage() saga, though, so I'm
not sure that that discussion sets precedent.
Oh... I don't recall this discussion. A quick lookup at the archives
does not show me a specific thread either.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier wrote:
On Tue, Oct 3, 2017 at 12:54 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I'd argue about this in the same direction I argued about
BufferGetPage() needing an LSN check that's applied separately: if it's
too easy for a developer to do the wrong thing (i.e. fail to apply the
separate LSN check after reading the page) then the routine should be
patched to do the check itself, and another routine should be offered
for the cases that explicitly can do without the check.I was eventually outvoted in the BufferGetPage() saga, though, so I'm
not sure that that discussion sets precedent.Oh... I don't recall this discussion. A quick lookup at the archives
does not show me a specific thread either.
I mean Kevin patch for snapshot-too-old:
/messages/by-id/CACjxUsPPCbov6DDPnuGpR=FmxHsjSn_MRi3rJYgvbRMCRfFz+A@mail.gmail.com
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier wrote:
On Tue, Oct 3, 2017 at 12:54 AM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
I'd argue about this in the same direction I argued about
BufferGetPage() needing an LSN check that's applied separately: if it's
too easy for a developer to do the wrong thing (i.e. fail to apply the
separate LSN check after reading the page) then the routine should be
patched to do the check itself, and another routine should be offered
for the cases that explicitly can do without the check.I was eventually outvoted in the BufferGetPage() saga, though, so I'm
not sure that that discussion sets precedent.Oh... I don't recall this discussion. A quick lookup at the archives
does not show me a specific thread either.
Just search for "�sop" in the archives:
/messages/by-id/CACjxUsPPCbov6DDPnuGpR=FmxHsjSn_MRi3rJYgvbRMCRfFz+A@mail.gmail.com
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Oct 6, 2017 at 7:34 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Just search for "Æsop" in the archives:
/messages/by-id/CACjxUsPPCbov6DDPnuGpR=FmxHsjSn_MRi3rJYgvbRMCRfFz+A@mail.gmail.com
(laugh)
I didn't know this one.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Michael
On Mon, Oct 2, 2017 at 6:48 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
Jacob, here are some ideas to make this thread move on. I would
suggest to produce a set of patches that do things incrementally:
1) One patch that changes the calls of PageGetLSN to
BufferGetLSNAtomic which are now not appropriate. You have spotted
some of them in the btree and gist code, but not all based on my first
lookup. There is still one in gistFindCorrectParent and one in btree
_bt_split. The monitoring of the other calls (sequence.c and
vacuumlazy.c) looked safe. There is another one in XLogRecordAssemble
that should be fixed I think.
Thank you for your suggestions. Please find the first patch attached as
"0001-...". We verified both, gistFindCorrectParent and _bt_split and all
calls to PageGetLSN are made with exclusive lock on the buffer contents
held.
2) A second patch that strengthens a bit checks around
BufferGetLSNAtomic. One idea would be to use LWLockHeldByMe, as you
are originally suggesting.
A comment could be as well added in bufpage.h for PageGetLSN to let
users know that it should be used carefully, in the vein of what is
mentioned in src/backend/access/transam/README.
The second patch "0002-..." does the above. We have a comment added to
AssertPageIsLockedForLSN as suggested.
The assertion added caught at least one code path where TestForOldSnapshot
calls PageGetLSN without holding any lock. The snapshot_too_old test in
"check-world" failed due to the assertion failure. This needs to be fixed,
see the open question in the opening mail on this thread.
Asim and Jacob
Attachments:
0001-Change-incorrect-calls-to-PageGetLSN-to-BufferGetLSN.patchapplication/octet-stream; name=0001-Change-incorrect-calls-to-PageGetLSN-to-BufferGetLSN.patchDownload
From 20082b295637de311a126658d14ebc34402d565d Mon Sep 17 00:00:00 2001
From: Jacob Champion <pchampion@pivotal.io>
Date: Fri, 3 Nov 2017 16:47:44 -0700
Subject: [PATCH 1/2] Change incorrect calls to PageGetLSN to
BufferGetLSNAtomic
When accessing a shared page, callers must either hold an exclusive lock
on the buffer, OR a shared lock plus the buffer header spinlock (this
additional constraint exists because it's legal to write the LSN without
holding an exclusive lock). Several callers are currently not following
this contract because they call PageGetLSN while holding only a shared
content lock.
---
src/backend/access/gist/gist.c | 4 ++--
src/backend/access/gist/gistget.c | 4 ++--
src/backend/access/gist/gistvacuum.c | 2 +-
src/backend/access/nbtree/nbtsearch.c | 2 +-
src/backend/access/nbtree/nbtutils.c | 2 +-
5 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index aec174cd00..8709cffaed 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -640,7 +640,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate)
}
stack->page = (Page) BufferGetPage(stack->buffer);
- stack->lsn = PageGetLSN(stack->page);
+ stack->lsn = BufferGetLSNAtomic(stack->buffer);
Assert(!RelationNeedsWAL(state.r) || !XLogRecPtrIsInvalid(stack->lsn));
/*
@@ -890,7 +890,7 @@ gistFindPath(Relation r, BlockNumber child, OffsetNumber *downlinkoffnum)
break;
}
- top->lsn = PageGetLSN(page);
+ top->lsn = BufferGetLSNAtomic(buffer);
/*
* If F_FOLLOW_RIGHT is set, the page to the right doesn't have a
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 06dac0bb53..97c3572306 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -61,7 +61,7 @@ gistkillitems(IndexScanDesc scan)
* read. killedItems could be not valid so LP_DEAD hints applying is not
* safe.
*/
- if (PageGetLSN(page) != so->curPageLSN)
+ if (BufferGetLSNAtomic(buffer) != so->curPageLSN)
{
UnlockReleaseBuffer(buffer);
so->numKilled = 0; /* reset counter */
@@ -384,7 +384,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* safe to apply LP_DEAD hints to the page later. This allows us to drop
* the pin for MVCC scans, which allows vacuum to avoid blocking.
*/
- so->curPageLSN = PageGetLSN(page);
+ so->curPageLSN = BufferGetLSNAtomic(buffer);
/*
* check all tuples on page
diff --git a/src/backend/access/gist/gistvacuum.c b/src/backend/access/gist/gistvacuum.c
index 77d9d12f0b..54a61cc427 100644
--- a/src/backend/access/gist/gistvacuum.c
+++ b/src/backend/access/gist/gistvacuum.c
@@ -249,7 +249,7 @@ gistbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
ptr = (GistBDItem *) palloc(sizeof(GistBDItem));
ptr->blkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
- ptr->parentlsn = PageGetLSN(page);
+ ptr->parentlsn = BufferGetLSNAtomic(buffer);
ptr->next = stack->next;
stack->next = ptr;
diff --git a/src/backend/access/nbtree/nbtsearch.c b/src/backend/access/nbtree/nbtsearch.c
index 642c8943e7..1ca2623dea 100644
--- a/src/backend/access/nbtree/nbtsearch.c
+++ b/src/backend/access/nbtree/nbtsearch.c
@@ -1224,7 +1224,7 @@ _bt_readpage(IndexScanDesc scan, ScanDirection dir, OffsetNumber offnum)
* safe to apply LP_DEAD hints to the page later. This allows us to drop
* the pin for MVCC scans, which allows vacuum to avoid blocking.
*/
- so->currPos.lsn = PageGetLSN(page);
+ so->currPos.lsn = BufferGetLSNAtomic(so->currPos.buf);
/*
* we must save the page's right-link while scanning it; this tells us
diff --git a/src/backend/access/nbtree/nbtutils.c b/src/backend/access/nbtree/nbtutils.c
index dbfb775dec..4360661789 100644
--- a/src/backend/access/nbtree/nbtutils.c
+++ b/src/backend/access/nbtree/nbtutils.c
@@ -1772,7 +1772,7 @@ _bt_killitems(IndexScanDesc scan)
return;
page = BufferGetPage(buf);
- if (PageGetLSN(page) == so->currPos.lsn)
+ if (BufferGetLSNAtomic(buf) == so->currPos.lsn)
so->currPos.buf = buf;
else
{
--
2.13.5 (Apple Git-94)
0002-PageGetLSN-assert-that-locks-are-properly-held.patchapplication/octet-stream; name=0002-PageGetLSN-assert-that-locks-are-properly-held.patchDownload
From 443221e84b1360b0b20159c84e2f5bf5c787d6a8 Mon Sep 17 00:00:00 2001
From: Asim R P <apraveen@pivotal.io>
Date: Fri, 3 Nov 2017 16:48:32 -0700
Subject: [PATCH 2/2] PageGetLSN: assert that locks are properly held
To help developers find issues, assert that the proper locks are held
for shared pages inside PageGetLSN. The implementation for this check is
added in a helper function, AssertPageIsLockedForLSN, which is called
from PageGetLSN whenever backend code is built with assertions enabled.
This introduces a slight source incompatibility: previously, both Pages
and PageHeaders could be passed to the macro, but now callers must
explicitly pass a Page.
---
contrib/pageinspect/rawpage.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 35 +++++++++++++++++++++++++++++++++++
src/include/storage/bufpage.h | 17 +++++++++++++++--
3 files changed, 51 insertions(+), 3 deletions(-)
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 25af22f453..e3a1ca0e3b 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -250,7 +250,7 @@ page_header(PG_FUNCTION_ARGS)
/* Extract information from the page header */
- lsn = PageGetLSN(page);
+ lsn = PageGetLSN((Page) page);
/* pageinspect >= 1.2 uses pg_lsn instead of text for the LSN field. */
if (TupleDescAttr(tupdesc, 0)->atttypid == TEXTOID)
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 15795b0c5a..8fbb8e79b3 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4355,3 +4355,38 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+void
+AssertPageIsLockedForLSN(Page page)
+{
+ char *pagePtr = page;
+
+ /*
+ * We only want to assert that we hold a lock on the page contents if the
+ * page is shared (i.e. it is one of the BufferBlocks).
+ */
+ if (BufferBlocks <= pagePtr
+ && pagePtr < (BufferBlocks + NBuffers * BLCKSZ))
+ {
+ ptrdiff_t bufId = (pagePtr - BufferBlocks) / BLCKSZ;
+ BufferDesc *buf = GetBufferDescriptor(bufId);
+ LWLock *content_lock = BufferDescriptorGetContentLock(buf);
+ uint32 buf_state = pg_atomic_read_u32(&buf->state);
+
+ /*
+ * Either:
+ * 1) we hold the exclusive lock for the buffer contents, so reading is
+ * always safe, or
+ * 2) we hold the shared lock and the header spinlock, in which case we
+ * are protected against other exclusive lock holders and other WAL
+ * log hint writers, or
+ * 3) XLog hint bits aren't enabled (so there are no WAL log hint
+ * writers) and we just need the shared lock by itself.
+ */
+ Assert(LWLockHeldByMeInMode(content_lock, LW_EXCLUSIVE)
+ || (LWLockHeldByMeInMode(content_lock, LW_SHARED)
+ && (!XLogHintBitIsNeeded() || (buf_state & BM_LOCKED))));
+ }
+}
+#endif
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 50c72a3c8d..db94310586 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -359,8 +359,21 @@ PageValidateSpecialPointer(Page page)
* Additional macros for access to page headers. (Beware multiple evaluation
* of the arguments!)
*/
-#define PageGetLSN(page) \
- PageXLogRecPtrGet(((PageHeader) (page))->pd_lsn)
+
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+/* in bufmgr.c; used in PageGetLSN */
+extern void AssertPageIsLockedForLSN(Page page);
+#else
+#define AssertPageIsLockedForLSN(page)
+#endif
+
+static inline XLogRecPtr
+PageGetLSN(Page page)
+{
+ AssertPageIsLockedForLSN(page);
+ return PageXLogRecPtrGet(((PageHeader) page)->pd_lsn);
+}
+
#define PageSetLSN(page, lsn) \
PageXLogRecPtrSet(((PageHeader) (page))->pd_lsn, lsn)
--
2.13.5 (Apple Git-94)
On Tue, Nov 7, 2017 at 7:27 AM, Asim Praveen <apraveen@pivotal.io> wrote:
On Mon, Oct 2, 2017 at 6:48 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:Jacob, here are some ideas to make this thread move on. I would
suggest to produce a set of patches that do things incrementally:
1) One patch that changes the calls of PageGetLSN to
BufferGetLSNAtomic which are now not appropriate. You have spotted
some of them in the btree and gist code, but not all based on my first
lookup. There is still one in gistFindCorrectParent and one in btree
_bt_split. The monitoring of the other calls (sequence.c and
vacuumlazy.c) looked safe. There is another one in XLogRecordAssemble
that should be fixed I think.Thank you for your suggestions. Please find the first patch attached as
"0001-...". We verified both, gistFindCorrectParent and _bt_split and all
calls to PageGetLSN are made with exclusive lock on the buffer contents
held.
Cool. Thanks for double-checking. XLogRecordAssemble() is fine after
more lookup at this code, XLogRegisterBuffer already doing some sanity
checks.
2) A second patch that strengthens a bit checks around
BufferGetLSNAtomic. One idea would be to use LWLockHeldByMe, as you
are originally suggesting.
A comment could be as well added in bufpage.h for PageGetLSN to let
users know that it should be used carefully, in the vein of what is
mentioned in src/backend/access/transam/README.The second patch "0002-..." does the above. We have a comment added to
AssertPageIsLockedForLSN as suggested.
Did you really test WAL replay? This still ignores that PageGetLSN is
as well taken in some code paths, like recovery, where actions on the
page are guaranteed to be serialized, like during recovery, so this
patch would cause the system to blow up. Note that pageinspect,
amcheck and wal_consistency_checking also process on page copies. So
the assertion failure of 0002 would trigger in those cases.
The assertion added caught at least one code path where TestForOldSnapshot
calls PageGetLSN without holding any lock. The snapshot_too_old test in
"check-world" failed due to the assertion failure. This needs to be fixed,
see the open question in the opening mail on this thread.
Good point. I am looping Kevin Grittner here for his input, as the
author and committer of old_snapshot_threshold. Things can be
addressed with a separate patch by roughly moving the check on
PageGetLSN() to TestForOldSnapshot_impl() and use BufferGetLSNAtomic()
instead.
The commit fest has lost view of this entry, and so did I. So I have
added a new entry:
https://commitfest.postgresql.org/16/1371/
BufferGetLSNAtomic() could really use LWLockHeldByMe(). Could you
consider it with an extra patch on top of 0001?
It seems to me that 0001 is good for a committer lookup, that will get
rid of all existing bugs. For 0002, what you are proposing is still
not a good idea for anything using page copies. Here are some
suggestions:
- Implement a PageGetLSNFromCopy, dedicated at working correctly when
working on a page copy. Then switch callers of amcheck, pageinspect
and wal_consistency_checking to use that.
- Implement something like GetLSNFromLockedPage, and switch of
backend's PageGetLSN to that. Performance impact could be seen..
- Have a PageGetLSNSafe, which can be used safely for serialized actions.
It could be an idea to remove PageGetLSN to force a breakage of
extensions calling it, so as they would review any of its calls. Not a
fan of that though.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Michael,
On Mon, Nov 6, 2017 at 6:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
Did you really test WAL replay?
Is there a way to test this other than installcheck-world? The only
failure we've run into at the moment is in the snapshot-too-old tests.
Maybe we're not configuring with all the tests enabled?
The assertion added caught at least one code path where TestForOldSnapshot
calls PageGetLSN without holding any lock. The snapshot_too_old test in
"check-world" failed due to the assertion failure. This needs to be fixed,
see the open question in the opening mail on this thread.Good point. I am looping Kevin Grittner here for his input, as the
author and committer of old_snapshot_threshold. Things can be
addressed with a separate patch by roughly moving the check on
PageGetLSN() to TestForOldSnapshot_impl() and use BufferGetLSNAtomic()
instead.
It still doesn't pass the tests, as not all code paths ensure that a
content lock is held before calling TestForOldSnapshot.
BufferGetLSNAtomic only adds the spinlock.
The commit fest has lost view of this entry, and so did I. So I have
added a new entry:
https://commitfest.postgresql.org/16/1371/
Thank you!
BufferGetLSNAtomic() could really use LWLockHeldByMe(). Could you
consider it with an extra patch on top of 0001?
The LWLockHeldByMe() calls are added to BufferGetLSNAtomic in patch
0002 (because it does all its work through PageGetLSN).
It seems to me that 0001 is good for a committer lookup, that will get
rid of all existing bugs. For 0002, what you are proposing is still
not a good idea for anything using page copies.
I think there is still significant confusion here. To be clear: this
patch is intended to add no changes for local page copies. As I've
tried to say (three or four times now): to our understanding, local
page copies are not allocated in the shared BufferBlocks region and
are therefore not subject to the new assertions. Am I missing a corner
case, or completely misunderstanding your point? I never got a direct
response to my earlier questions on this.
--Jacob
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Nov 7, 2017 at 9:26 AM, Jacob Champion <pchampion@pivotal.io> wrote:
On Mon, Nov 6, 2017 at 6:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:It seems to me that 0001 is good for a committer lookup, that will get
rid of all existing bugs. For 0002, what you are proposing is still
not a good idea for anything using page copies.I think there is still significant confusion here. To be clear: this
patch is intended to add no changes for local page copies.
Maybe a better way to put this is: we see no failures with the
pageinspect regression tests, which exercise PageGetLSN() via the
page_header() function. Are there other tests we should be paying
attention to that might show a problem with our local-copy logic?
--Jacob
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Nov 8, 2017 at 2:26 AM, Jacob Champion <pchampion@pivotal.io> wrote:
On Mon, Nov 6, 2017 at 6:18 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:Did you really test WAL replay?
Is there a way to test this other than installcheck-world? The only
failure we've run into at the moment is in the snapshot-too-old tests.
Maybe we're not configuring with all the tests enabled?
Not automatically. The simplest method I use in this case is
installcheck with a standby replaying changes behind.
The assertion added caught at least one code path where TestForOldSnapshot
calls PageGetLSN without holding any lock. The snapshot_too_old test in
"check-world" failed due to the assertion failure. This needs to be fixed,
see the open question in the opening mail on this thread.Good point. I am looping Kevin Grittner here for his input, as the
author and committer of old_snapshot_threshold. Things can be
addressed with a separate patch by roughly moving the check on
PageGetLSN() to TestForOldSnapshot_impl() and use BufferGetLSNAtomic()
instead.It still doesn't pass the tests, as not all code paths ensure that a
content lock is held before calling TestForOldSnapshot.
BufferGetLSNAtomic only adds the spinlock.
I would prefer waiting for more input from Kevin here. This may prove
to be a more invasive change. There are no objections into fixing the
existing callers in index AMs though.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi Michael
On Mon, Nov 6, 2017 at 6:18 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:
Did you really test WAL replay? This still ignores that PageGetLSN is
as well taken in some code paths, like recovery, where actions on the
page are guaranteed to be serialized, like during recovery, so this
patch would cause the system to blow up. Note that pageinspect,
amcheck and wal_consistency_checking also process on page copies. So
the assertion failure of 0002 would trigger in those cases.
Indeed, the assertion tripped during WAL replay on the standby. This was
caught by TAP tests under src/test/recovery. The assertion is now fixed so
that WAL replay is exempt from the check. Please find the new patch
attached. The tests now pass with the fix. I also manually verified that
recovery works with "wal_consistency_checking=all".
Asim
Attachments:
0002-PageGetLSN-assert-that-locks-are-properly-held.patchapplication/octet-stream; name=0002-PageGetLSN-assert-that-locks-are-properly-held.patchDownload
From ac0146edd7548ad6283b74c9c6b7c1847cca4b6f Mon Sep 17 00:00:00 2001
From: Asim R P <apraveen@pivotal.io>
Date: Wed, 8 Nov 2017 14:46:29 -0800
Subject: [PATCH 2/2] PageGetLSN: assert that locks are properly held
To help developers find issues, assert that the proper locks are held
for shared pages inside PageGetLSN. The implementation for this check is
added in a helper function, AssertPageIsLockedForLSN, which is called
from PageGetLSN whenever backend code is built with assertions enabled.
This introduces a slight source incompatibility: previously, both Pages
and PageHeaders could be passed to the macro, but now callers must
explicitly pass a Page.
---
contrib/pageinspect/rawpage.c | 2 +-
src/backend/storage/buffer/bufmgr.c | 37 +++++++++++++++++++++++++++++++++++++
src/include/storage/bufpage.h | 17 +++++++++++++++--
3 files changed, 53 insertions(+), 3 deletions(-)
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 25af22f453..e3a1ca0e3b 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -250,7 +250,7 @@ page_header(PG_FUNCTION_ARGS)
/* Extract information from the page header */
- lsn = PageGetLSN(page);
+ lsn = PageGetLSN((Page) page);
/* pageinspect >= 1.2 uses pg_lsn instead of text for the LSN field. */
if (TupleDescAttr(tupdesc, 0)->atttypid == TEXTOID)
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 15795b0c5a..df27936943 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -4355,3 +4355,40 @@ TestForOldSnapshot_impl(Snapshot snapshot, Relation relation)
(errcode(ERRCODE_SNAPSHOT_TOO_OLD),
errmsg("snapshot too old")));
}
+
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+void
+AssertPageIsLockedForLSN(Page page)
+{
+ char *pagePtr = page;
+
+ /*
+ * We only want to assert that we hold a lock on the page contents if the
+ * page is shared (i.e. it is one of the BufferBlocks). Crash recovery or
+ * WAL replay is exempt from this assertion because page acecss is
+ * guaranteed to be serialized.
+ */
+ if (!InRecovery && BufferBlocks <= pagePtr
+ && pagePtr < (BufferBlocks + NBuffers * BLCKSZ))
+ {
+ ptrdiff_t bufId = (pagePtr - BufferBlocks) / BLCKSZ;
+ BufferDesc *buf = GetBufferDescriptor(bufId);
+ LWLock *content_lock = BufferDescriptorGetContentLock(buf);
+ uint32 buf_state = pg_atomic_read_u32(&buf->state);
+
+ /*
+ * Either:
+ * 1) we hold the exclusive lock for the buffer contents, so reading is
+ * always safe, or
+ * 2) we hold the shared lock and the header spinlock, in which case we
+ * are protected against other exclusive lock holders and other WAL
+ * log hint writers, or
+ * 3) XLog hint bits aren't enabled (so there are no WAL log hint
+ * writers) and we just need the shared lock by itself.
+ */
+ Assert(LWLockHeldByMeInMode(content_lock, LW_EXCLUSIVE)
+ || (LWLockHeldByMeInMode(content_lock, LW_SHARED)
+ && (!XLogHintBitIsNeeded() || (buf_state & BM_LOCKED))));
+ }
+}
+#endif
diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 50c72a3c8d..db94310586 100644
--- a/src/include/storage/bufpage.h
+++ b/src/include/storage/bufpage.h
@@ -359,8 +359,21 @@ PageValidateSpecialPointer(Page page)
* Additional macros for access to page headers. (Beware multiple evaluation
* of the arguments!)
*/
-#define PageGetLSN(page) \
- PageXLogRecPtrGet(((PageHeader) (page))->pd_lsn)
+
+#if defined(USE_ASSERT_CHECKING) && !defined(FRONTEND)
+/* in bufmgr.c; used in PageGetLSN */
+extern void AssertPageIsLockedForLSN(Page page);
+#else
+#define AssertPageIsLockedForLSN(page)
+#endif
+
+static inline XLogRecPtr
+PageGetLSN(Page page)
+{
+ AssertPageIsLockedForLSN(page);
+ return PageXLogRecPtrGet(((PageHeader) page)->pd_lsn);
+}
+
#define PageSetLSN(page, lsn) \
PageXLogRecPtrSet(((PageHeader) (page))->pd_lsn, lsn)
--
2.13.5 (Apple Git-94)
On Thu, Nov 9, 2017 at 8:25 AM, Asim Praveen <apraveen@pivotal.io> wrote:
Indeed, the assertion tripped during WAL replay on the standby. This was
caught by TAP tests under src/test/recovery. The assertion is now fixed so
that WAL replay is exempt from the check. Please find the new patch
attached. The tests now pass with the fix. I also manually verified that
recovery works with "wal_consistency_checking=all".
I still have a bad feeling about that bit... Still, it does not change
the fact that patch 0001 in
/messages/by-id/CANXE4TccH_VjdKaHc9=KyH0Y7WORqZN+=mH5f=mP0Bw3gzX1Sw@mail.gmail.com
needs a committer per the fact that it visibly fixes incorrect backend
code and API contract. So I am switching the CF entry to ready for
committer, but only for 0001.
The other things could always be taken care of later.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
So ... gistdoinsert can sometimes hold an exclusive lock, so we could do
this instead:
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0e499598a4..2ea19d2683 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -566,7 +566,8 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate)
}
stack->page = (Page) BufferGetPage(stack->buffer);
- stack->lsn = PageGetLSN(stack->page);
+ stack->lsn = xlocked ?
+ PageGetLSN(stack->page) : BufferGetLSNAtomic(stack->buffer);
Assert(!RelationNeedsWAL(state.r) || !XLogRecPtrIsInvalid(stack->lsn));
I suppose it doesn't really hurt all that much. I admit I'm a bit
nervous about the fact that code is uncovered.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Pushed 0001.
I confirm the finding that 0002 makes TestForOldSnapshot blow up. I
suppose the right fix is to pass the buffer, not just the page, rather
than trying to reverse-engineer the buffer from the page.
Regarding adding the proposed checks, which I think we should clearly
do, my preferred fix would be to split PageGetLSN in two: one that is
applied to shared buffers, which always runs the assertion, which
retains the current name, and another that is applied to other buffers,
which never does but it does check that the page is not in
shared_buffers and uses another name.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera wrote:
Pushed 0001.
I marked the CF entry as "committed", BTW. I assume you're going to
ship an updated version of 0002 to the next commitfest. If you have a
new version during this commitfest, feel free to turn this entry back to
"needs review".
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Jan 09, 2018 at 05:19:00PM -0300, Alvaro Herrera wrote:
Alvaro Herrera wrote:
So ... gistdoinsert can sometimes hold an exclusive lock, so we could do
this instead:diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c index 0e499598a4..2ea19d2683 100644 --- a/src/backend/access/gist/gist.c +++ b/src/backend/access/gist/gist.c @@ -566,7 +566,8 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate) }stack->page = (Page) BufferGetPage(stack->buffer); - stack->lsn = PageGetLSN(stack->page); + stack->lsn = xlocked ? + PageGetLSN(stack->page) : BufferGetLSNAtomic(stack->buffer); Assert(!RelationNeedsWAL(state.r) || !XLogRecPtrIsInvalid(stack->lsn));
Indeed. That's better. Thanks.
I marked the CF entry as "committed", BTW. I assume you're going to
ship an updated version of 0002 to the next commitfest. If you have a
new version during this commitfest, feel free to turn this entry back to
"needs review".
That's fine for me. The rest can always be revisited later once the
issues raised are addressed.
--
Michael