Remove HeapTuple and Buffer dependency for predicate locking functions
Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.
- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.
- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.
Attaching patch which makes these changes.
This way PredicateLockTID(), CheckForSerializableConflictIn() and
CheckForSerializableConflictOut() functions become usable by any AM.
1]
/messages/by-id/CAEepm=2QbqQ_+KQQCnhKukF6NEAeq4SqiO3Qxe+fHza5-H-jKA@mail.gmail.com
Attachments:
v1-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload+150-114
Hi,
On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:
Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.
Indeed.
- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.
Right.
- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.
Hm. I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation. But even if, that's
probably a separate patch.
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.
I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.
Attaching patch which makes these changes.
Please make sure that there's a CF entry for this (I'm in a plane with a
super slow connection, otherwise I'd check).
Greetings,
Andres Freund
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:
Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.Indeed.
+1
- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.Right.
+1
- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.Hm. I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation. But even if, that's
probably a separate patch.
What do you have in mind? This is certainly the traditional way to do
lock hierarchies (archeological fun: we used to have
src/backend/storage/lock/multi.c, a "standard multi-level lock manager
as per the Gray paper", before commits 3f7fbf85 and e6e9e18e
decommissioned it; SSI brought the concept back again in a parallel
lock manager), but admittedly it has assumptions about physical
storage baked into the naming. Perhaps you just want to give those
things different labels, "TID range" instead of page, for the benefit
of "logical" TID users? Perhaps you want to permit more levels? That
seems premature as long as TIDs are defined in terms of blocks and
offsets, so this stuff is reflected all over the source tree...
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.
Thoughts on this Ashwin?
Attaching patch which makes these changes.
Please make sure that there's a CF entry for this (I'm in a plane with a
super slow connection, otherwise I'd check).
This all makes sense, and I'd like to be able to commit this soon. We
come up with something nearly identical for zheap. I'm subscribing
Kuntal Ghosh to see if he has comments, as he worked on that. The
main point of difference seems to be the assumption about how
subtransactions work.
--
Thomas Munro
https://enterprisedb.com
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.Thoughts on this Ashwin?
I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.
Hi,
On 2019-07-31 09:57:58 +1200, Thomas Munro wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
Hm. I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation. But even if, that's
probably a separate patch.What do you have in mind?
My concern is that continuing to inferring the granularity levels from
the tid doesn't seem like a great path forward. An AMs use of tids might
not necessarily be very amenable to that, if the mapping isn't actually
block based.
Perhaps you just want to give those things different labels, "TID
range" instead of page, for the benefit of "logical" TID users?
Perhaps you want to permit more levels? That seems premature as long
as TIDs are defined in terms of blocks and offsets, so this stuff is
reflected all over the source tree...
I'm mostly wondering if the different levels shouldn't be computed
outside of predicate.c.
Greetings,
Andres Freund
Hi,
On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.Thoughts on this Ashwin?
I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.
Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.
Greetings,
Andres Freund
On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com>
wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de>
wrote:
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option couldbe
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required afterperforming
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to movethe
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.Thoughts on this Ashwin?
I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.
Okay, agree, its costly function and better to avoid the call if possible.
Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.
On Wed, Jul 31, 2019 at 12:37 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com>
wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de>
wrote:
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option couldbe
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required afterperforming
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought tomove the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.Thoughts on this Ashwin?
I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.Okay, agree, its costly function and better to avoid the call if possible.
Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.
Added argument to function to make the subtransaction handling optional in
attached v2 of patch.
Attachments:
v2-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload+162-119
Hi,
On 2019-07-31 12:37:58 -0700, Ashwin Agrawal wrote:
On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:
On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com>
wrote:
On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de>
wrote:
- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option couldbe
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required afterperforming
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to movethe
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.Thoughts on this Ashwin?
I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.Okay, agree, its costly function and better to avoid the call if possible.
Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.
Looking at the code as of master, we currently have:
- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.
TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.
- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.
In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.
The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.
But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().
I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.
I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.
Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
updating xid etc. And while you can argue that an update is an insert
in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
if (valid)
{
ItemPointerSetOffsetNumber(tid, offnum);
- PredicateLockTuple(relation, heapTuple, snapshot);
+ PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+ HeapTupleHeaderGetXmin(heapTuple->t_data));
if (all_dead)
*all_dead = false;
return true;
What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.
I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?
- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.
Greetings,
Andres Freund
On Thu, Aug 1, 2019 at 2:36 AM Andres Freund <andres@anarazel.de> wrote:
I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.Okay, agree, its costly function and better to avoid the call if possible.
Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.Looking at the code as of master, we currently have:
- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.
Yeah. this is the only part where predicate locking uses the subxids.
Since, predicate locking always use the top xid, IMHO, it'll be good
to make this api independent of subxids.
TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.
A check for top transaction id first and usage of simple sound like
good optimizations. But, I'm not sure whether these changes should be
part of this patch or a separate one.
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:
Looking at the code as of master, we currently have:
Super awesome feedback and insights, thank you!
- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.
Agree. v1-0002 patch attached does that now. Please let me know if that's
what you meant.
TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.
v1-0001 patch checks for GetTopTransactionIdIfAny() first in
TransactionIdIsCurrentTransactionId() which seems yes better in general and
more for future. That mostly meets the needs for current discussion.
The alternative of not using binary search seems bigger refactoring and
should be handled as separate optimization exercise outside of this thread.
- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().
That optimization, as Kuntal also mentioned, seems something which can be
done on-top afterwards on current patch.
I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.
Okay, I moved the sub transaction handling out of
CheckForSerializableConflictOut() and have it along side HTSV() now.
The reason I felt leaving subtransaction handling in generic place, was it
might be premature to thing no future AM will need it. Plus, all
serializable function api's having same expectations is easier. Like
PredicateLockTuple() can be passed top or subtransaction id and it can
handle it but with the change CheckForSerializableConflictOut() only be
feed top transaction ID. But its fine and can see the point of AM needing
it can easily get top transaction ID and feed it as heap.
I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.
Maybe, will give thought to it separate from the current patch.
Minor notes: - I don't think 'insert_xid' is necessarily great - it could also be the updating xid etc. And while you can argue that an update is an insert in the current heap, that's not the case for future AMs. - to me @@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer, if (valid) { ItemPointerSetOffsetNumber(tid, offnum); - PredicateLockTuple(relation, heapTuple, snapshot); + PredicateLockTID(relation, &(heapTuple)->t_self, snapshot, + HeapTupleHeaderGetXmin(heapTuple->t_data)); if (all_dead) *all_dead = false; return true;What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.
Fixed. No idea what I was thinking there, mostly feel I intended to have it
as like &(heapTuple->t_self).
I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?
Yes, root is being locked here instead of the HOT. But I don't have full
context on the same. If we wish to fix it though, can be easily done now
with the patch by passing "tid" instead of &(heapTuple->t_self).
- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.
I am not sure on portion of the code part? SerializationNeededForRead() is
static inline function in C file. Can't inline
CheckForSerializableConflictOutNeeded() without moving
SerializationNeededForRead() and some other variables to header file.
CheckForSerializableConflictOut() wasn't inline either, so a function call
was performed earlier as well when serializable isn't even in use.
I understand that with refactor, HeapCheckForSerializableConflictOut() is
called which calls CheckForSerializableConflictOutNeeded(). If that's the
problem, for addressing the same, I had proposed alternative way to
refactor. CheckForSerializableConflictOut() can take callback function and
void* callback argument for AM specific check instead. So, the flow would
be AM calling CheckForSerializableConflictOut() as today and only if
serializable in use will invoke the callback to check with AM if more work
should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. Due to void* callback argument aspect I didn't like that solution
and felt AM performing checks and calling CheckForSerializableConflictOut()
seems more straight forward.
Attachments:
v1-0001-Optimize-TransactionIdIsCurrentTransactionId.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Optimize-TransactionIdIsCurrentTransactionId.patchDownload+3-1
v1-0002-Optimize-PredicateLockTuple.patchtext/x-patch; charset=US-ASCII; name=v1-0002-Optimize-PredicateLockTuple.patchDownload+4-19
v1-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v1-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload+164-124
On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:
I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).
Here are three relevant commits:
1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).
2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).
3. Commit b89e151054a "Introduce logical decoding." (2014) also did
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
offnum), for the benefit of historical MVCC snapshots (unnecessarily,
considering the change in the commit #2), but then, intending to
"reset to original, non-redirected, tid", clobbered it, reintroducing
the bug fixed by #2.
My first guess is that commit #3 above was developed before commit #2,
and finished up clobbering it. In fact, both logical decoding and SSI
want offnum, so we should be able to just remove the "reset" bit
(perhaps like in the attached sketch, not really tested, though it
passes). This must be in want of an isolation test, but I haven't yet
tried to get my head around how to write a test that would show the
difference.
--
Thomas Munro
https://enterprisedb.com
Attachments:
fix.txttext/plain; charset=US-ASCII; name=fix.txtDownload+2-4
Hi,
On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:
I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).
Here are three relevant commits:
Thanks for digging!
1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).
Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.
3. Commit b89e151054a "Introduce logical decoding." (2014) also did
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
offnum), for the benefit of historical MVCC snapshots (unnecessarily,
considering the change in the commit #2), but then, intending to
"reset to original, non-redirected, tid", clobbered it, reintroducing
the bug fixed by #2.
My first guess is that commit #3 above was developed before commit #2,
and finished up clobbering it.
Yea, that sounds likely.
This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.
Indeed.
Greetings,
Andres Freund
On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:
On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.
Adding Heikki and Kevin.
I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it. I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there. It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.
As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain? (2) Is it
OK to do that with a HOT root?
The answer to the first question is given in README-SSI[1]Excerpt from README-SSI:.
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here. By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version. That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root. Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root. A HOT root might never be considered
again by concurrent writers, no?
As a minor consequence, the optimisation in
CheckTargetForConflictsIn() assumes that a tuple being updated has the
same tag as we locked when reading the tuple, which isn't the case if
we locked the root while reading but now have the TID for the version
we actually read, so in master we leak a tuple lock unnecessarily
until end-of-transaction when we update a HOT tuple.
This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.Indeed.
One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks). So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1]Excerpt from README-SSI:. I'm out of steam for this problem today though.
The simple test from the report[3]/messages/by-id/523C29A8.20904@vmware.com that resulted in commit 81fbbfe3352
doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row). The best I've come up with so far is an assertion
that we predicate-lock the same row version that we emitted to the
user, when reached via an index lookup that visits a HOT row. The
test outputs 'f' for master, but 't' with the change to heapam.c.
[1]: Excerpt from README-SSI:
===
* PostgreSQL does not use "update in place" with a rollback log
for its MVCC implementation. Where possible it uses "HOT" updates on
the same page (if there is room and no indexed value is changed).
For non-HOT updates the old tuple is expired in place and a new tuple
is inserted at a new location. Because of this difference, a tuple
lock in PostgreSQL doesn't automatically lock any other versions of a
row. We don't try to copy or expand a tuple lock to any other
versions of the row, based on the following proof that any additional
serialization failures we would get from that would be false
positives:
o If transaction T1 reads a row version (thus acquiring a
predicate lock on it) and a second transaction T2 updates that row
version (thus creating a rw-conflict graph edge from T1 to T2), must a
third transaction T3 which re-updates the new version of the row also
have a rw-conflict in from T1 to prevent anomalies? In other words,
does it matter whether we recognize the edge T1 -> T3?
o If T1 has a conflict in, it certainly doesn't. Adding the
edge T1 -> T3 would create a dangerous structure, but we already had
one from the edge T1 -> T2, so we would have aborted something anyway.
(T2 has already committed, else T3 could not have updated its output;
but we would have aborted either T1 or T1's predecessor(s). Hence
no cycle involving T1 and T3 can survive.)
o Now let's consider the case where T1 doesn't have a
rw-conflict in. If that's the case, for this edge T1 -> T3 to make a
difference, T3 must have a rw-conflict out that induces a cycle in the
dependency graph, i.e. a conflict out to some transaction preceding T1
in the graph. (A conflict out to T1 itself would be problematic too,
but that would mean T1 has a conflict in, the case we already
eliminated.)
o So now we're trying to figure out if there can be an
rw-conflict edge T3 -> T0, where T0 is some transaction that precedes
T1. For T0 to precede T1, there has to be some edge, or sequence of
edges, from T0 to T1. At least the last edge has to be a wr-dependency
or ww-dependency rather than a rw-conflict, because T1 doesn't have a
rw-conflict in. And that gives us enough information about the order
of transactions to see that T3 can't have a rw-conflict to T0:
- T0 committed before T1 started (the wr/ww-dependency implies this)
- T1 started before T2 committed (the T1->T2 rw-conflict implies this)
- T2 committed before T3 started (otherwise, T3 would get aborted
because of an update conflict)
o That means T0 committed before T3 started, and therefore
there can't be a rw-conflict from T3 to T0.
o So in all cases, we don't need the T1 -> T3 edge to
recognize cycles. Therefore it's not necessary for T1's SIREAD lock
on the original tuple version to cover later versions as well.
===
[2]: /messages/by-id/52527E4D.4060302@vmware.com
[3]: /messages/by-id/523C29A8.20904@vmware.com
--
Thomas Munro
https://enterprisedb.com
Attachments:
0001-Predicate-lock-the-visible-heap-tuple-not-the-HOT-ro.patchapplication/octet-stream; name=0001-Predicate-lock-the-visible-heap-tuple-not-the-HOT-ro.patchDownload+46-5
Hello Thomas,
On Tue, Aug 6, 2019 at 9:50 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:
On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.Adding Heikki and Kevin.
I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it. I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there. It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain? (2) Is it
OK to do that with a HOT root?The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here. By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version. That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root. Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root. A HOT root might never be considered
again by concurrent writers, no?
If I understand the problem, this is the same serialization issue as
with in-place updates for zheap. I had a discussion with Kevin
regarding the same in this thread [1]Re: In-place updates and serializable transactions: /messages/by-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com. It seems if we're locking the
hot root id, we may report some false positive serializable errors.
This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.Indeed.
One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks). So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1]. I'm out of steam for this problem today though.The simple test from the report[3] that resulted in commit 81fbbfe3352
doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row). The best I've come up with so far is an assertion
that we predicate-lock the same row version that we emitted to the
user, when reached via an index lookup that visits a HOT row. The
test outputs 'f' for master, but 't' with the change to heapam.c.
Here is an example from the multiple-row-versions isolation test which
fails if we perform in-place updates for zheap. I think the same will
be relevant if we lock root tuple id instead of the tuple itself.
Step 1: T1-> BEGIN; Read FROM t where id=1000000;
Step 2: T2-> BEGIN; UPDATE t where id=1000000; COMMIT; (creates T1->T2)
Step 3: T3-> BEGIN; UPDATE t where id=1000000; Read FROM t where id=500000;
Step 4: T4-> BEGIN; UPDATE t where id= 500000; Read FROM t where id=1;
COMMIT; (creates T3->T4)
Step 5: T3-> COMMIT;
Step 6: T1-> UPDATE t where id=1; COMMIT; (creates T4->T1,)
At step 6, when the update statement is executed, T1 is rolled back
because of T3->T4->T1.
But for zheap, step 3 also creates a dependency T1->T3 because of
in-place update. When T4 commits in step 4, it marks T3 as doomed
because of T1 --> T3 --> T4. Hence, in step 5, T3 is rolled back.
[1]: Re: In-place updates and serializable transactions: /messages/by-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com
/messages/by-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com
On 06/08/2019 07:20, Thomas Munro wrote:
On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:
On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:
1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.Adding Heikki and Kevin.
I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it. I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there. It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain? (2) Is it
OK to do that with a HOT root?The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here. By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version. That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root. Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root. A HOT root might never be considered
again by concurrent writers, no?
Your analysis is spot on. Thanks for the clear write-up!
This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.Indeed.
One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks). So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1]. I'm out of steam for this problem today though.
I had some steam, and wrote a spec that reproduces this bug. It wasn't
actually that hard to reproduce, fortunately. Or unfortunately; people
might well be hitting it in production. I used the "freezetest.spec"
from the 2013 thread as the starting point, and added one UPDATE to the
initialization, so that the test starts with an already HOT-updated
tuple. It should throw a serialization error, but on current master, it
does not. After applying your fix.txt, it does.
Your fix.txt seems correct. For clarity, I'd prefer moving things around
a bit, though, so that the t_self is set earlier in the function, at the
same place where the other HeapTuple fields are set. And set blkno and
offnum together, in one ItemPointerSet call. With that, I'm not sure we
need such a verbose comment explaining why t_self needs to be updated
but I kept it for now.
Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.
PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
to the returned tuple version, updating *tid is redundant. And the call
in heapam_index_fetch_tuple() wouldn't need to do
"bslot->base.tupdata.t_self = *tid".
- Heikki
Attachments:
0001-Fix-predicate-locking-of-HOT-updated-rows.patchtext/x-patch; name=0001-Fix-predicate-locking-of-HOT-updated-rows.patchDownload+69-17
On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
I had some steam, and wrote a spec that reproduces this bug. It wasn't
actually that hard to reproduce, fortunately. Or unfortunately; people
might well be hitting it in production. I used the "freezetest.spec"
from the 2013 thread as the starting point, and added one UPDATE to the
initialization, so that the test starts with an already HOT-updated
tuple. It should throw a serialization error, but on current master, it
does not. After applying your fix.txt, it does.
Thanks! Ahh, right, I was expecting it to be harder to see an
undetected anomaly, because of the index page lock, but of course we
never actually write to that page so it's just the heap tuple lock
holding everything together.
Your fix.txt seems correct. For clarity, I'd prefer moving things around
a bit, though, so that the t_self is set earlier in the function, at the
same place where the other HeapTuple fields are set. And set blkno and
offnum together, in one ItemPointerSet call. With that, I'm not sure we
need such a verbose comment explaining why t_self needs to be updated
but I kept it for now.
+1
Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.
LGTM.
PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
to the returned tuple version, updating *tid is redundant. And the call
in heapam_index_fetch_tuple() wouldn't need to do
"bslot->base.tupdata.t_self = *tid".
Right, that sounds like a separate improvement for master only.
--
Thomas Munro
https://enterprisedb.com
On 06/08/2019 13:35, Thomas Munro wrote:
On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.LGTM.
Pushed, thanks!
- Heikki
On Fri, Aug 2, 2019 at 4:56 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:
Looking at the code as of master, we currently have:
Super awesome feedback and insights, thank you!
- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.Agree. v1-0002 patch attached does that now. Please let me know if that's
what you meant.TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.v1-0001 patch checks for GetTopTransactionIdIfAny() first in
TransactionIdIsCurrentTransactionId() which seems yes better in general and
more for future. That mostly meets the needs for current discussion.The alternative of not using binary search seems bigger refactoring and
should be handled as separate optimization exercise outside of this thread.- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().That optimization, as Kuntal also mentioned, seems something which can be
done on-top afterwards on current patch.I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.Okay, I moved the sub transaction handling out of
CheckForSerializableConflictOut() and have it along side HTSV() now.The reason I felt leaving subtransaction handling in generic place, was it
might be premature to thing no future AM will need it. Plus, all
serializable function api's having same expectations is easier. Like
PredicateLockTuple() can be passed top or subtransaction id and it can
handle it but with the change CheckForSerializableConflictOut() only be
feed top transaction ID. But its fine and can see the point of AM needing
it can easily get top transaction ID and feed it as heap.I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.Maybe, will give thought to it separate from the current patch.
Minor notes: - I don't think 'insert_xid' is necessarily great - it could also be the updating xid etc. And while you can argue that an update is an insert in the current heap, that's not the case for future AMs. - to me @@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer, if (valid) { ItemPointerSetOffsetNumber(tid, offnum); - PredicateLockTuple(relation, heapTuple, snapshot); + PredicateLockTID(relation, &(heapTuple)->t_self, snapshot, + HeapTupleHeaderGetXmin(heapTuple->t_data)); if (all_dead) *all_dead = false; return true;What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.Fixed. No idea what I was thinking there, mostly feel I intended to have
it as like &(heapTuple->t_self).I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?Yes, root is being locked here instead of the HOT. But I don't have full
context on the same. If we wish to fix it though, can be easily done now
with the patch by passing "tid" instead of &(heapTuple->t_self).- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.I am not sure on portion of the code part? SerializationNeededForRead() is
static inline function in C file. Can't inline
CheckForSerializableConflictOutNeeded() without moving
SerializationNeededForRead() and some other variables to header file.
CheckForSerializableConflictOut() wasn't inline either, so a function call
was performed earlier as well when serializable isn't even in use.I understand that with refactor, HeapCheckForSerializableConflictOut() is
called which calls CheckForSerializableConflictOutNeeded(). If that's the
problem, for addressing the same, I had proposed alternative way to
refactor. CheckForSerializableConflictOut() can take callback function and
void* callback argument for AM specific check instead. So, the flow would
be AM calling CheckForSerializableConflictOut() as today and only if
serializable in use will invoke the callback to check with AM if more work
should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. Due to void* callback argument aspect I didn't like that solution
and felt AM performing checks and calling CheckForSerializableConflictOut()
seems more straight forward.
Attaching re-based version of the patches on top of current master, which
has the fix for HOT serializable predicate locking bug spotted by Andres
committed now.
Attachments:
v2-0001-Optimize-TransactionIdIsCurrentTransactionId.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Optimize-TransactionIdIsCurrentTransactionId.patchDownload+3-1
v2-0002-Optimize-PredicateLockTuple.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Optimize-PredicateLockTuple.patchDownload+4-19
v2-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v2-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload+164-124
On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:
- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.
I agree that it's strange that we do these high frequency function
calls just to figure out that we're not even using this stuff, which
ultimately comes down to the static global variable MySerializableXact
being not reachable from (say) an inline function defined in a header.
That's something to look into in another thread.
Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate locking bug spotted by Andres committed now.
I'm planning to commit these three patches on Monday. I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.