SSI freezing bug
Hi,
Prompted by Andres Freund's comments on my Freezing without Write I/O
patch, I realized that there's there's an existing bug in the way
predicate locking handles freezing (or rather, it doesn't handle it).
When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid. That
effectively invalidates any predicate lock on the tuple, as checking for
a lock on the same tuple later won't find it as the xmin is different.
Attached is an isolationtester spec to demonstrate this.
- Heikki
Attachments:
freezetest.spectext/x-rpm-spec; name=freezetest.specDownload
Hi,
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is ctid+xmin. However,
when a tuple is frozen, its xmin is changed to FrozenXid. That effectively
invalidates any predicate lock on the tuple, as checking for a lock on the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2013-09-20 13:53:04 +0200, Andres Freund wrote:
Hi,
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is ctid+xmin. However,
when a tuple is frozen, its xmin is changed to FrozenXid. That effectively
invalidates any predicate lock on the tuple, as checking for a lock on the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?
A better solution probably is to promote tuple-level locks if they exist
to a relation level one upon freezing I guess?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid. That
effectively
invalidates any predicate lock on the tuple, as checking for a lock on
the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?A better solution probably is to promote tuple-level locks if they exist
to a relation level one upon freezing I guess?
That would work. A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.
I'm not sure which is best. Will review, probably this weekend.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> schrieb:
Kevin Grittner <kgrittn@ymail.com> wrote:
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is
ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That
effectively
invalidates any predicate lock on the tuple, as checking for alock
on
the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon
below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?A better solution probably is to promote tuple-level locks if they
exist
to a relation level one upon freezing I guess?
That would work. A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.
SSI locks can live longer than one TX/snapshot.
Andres
--
Please excuse brevity and formatting - I am writing this on my mobile phone.
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Import Notes
Reply to msg id not found: 30fc74df-39e3-44da-90b8-1bc2be6bffde@email.android.com
Kevin Grittner <kgrittn@ymail.com> wrote:
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.That
effectively
invalidates any predicate lock on the tuple, as checking for a lockon
the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon
below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?A better solution probably is to promote tuple-level locks if they
exist
to a relation level one upon freezing I guess?
That would work. A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.
What's the point of using xmin as part of the lock key in the first place, wouldn't ctid alone suffice? If the tuple was visible to the reading transaction, it cannot be vacuumed away until the transaction commits. No other tuple can appear with the same ctid.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 09/21/2013 10:46 PM, Andres Freund wrote:
Heikki Linnakangas <hlinnakangas@vmware.com> schrieb:
Kevin Grittner <kgrittn@ymail.com> wrote:
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is
ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That
effectively
invalidates any predicate lock on the tuple, as checking for alock
on
the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon
below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?A better solution probably is to promote tuple-level locks if they
exist
to a relation level one upon freezing I guess?
That would work. A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.SSI locks can live longer than one TX/snapshot.
But the only way that ctid can change without xmin changing is when
you update the tuple in the same TX that created it.
Could it be the case here or can we safely exclude this ?
--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 09/20/2013 12:55 PM, Heikki Linnakangas wrote:
Hi,
Prompted by Andres Freund's comments on my Freezing without Write I/O
patch, I realized that there's there's an existing bug in the way
predicate locking handles freezing (or rather, it doesn't handle it).When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That effectively invalidates any predicate lock on the tuple, as
checking for a lock on the same tuple later won't find it as the xmin
is different.Attached is an isolationtester spec to demonstrate this.
The case is even fishier than that.
That is, you can get bad behaviour on at least v9.2.4 even without
VACUUM FREEZE.
You just need to run
permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row.
the first time it does get serialization error at "c2"
but the 2nd time both "c1" and "c2" complete successfully
Cheers
--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic O�
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 22.09.2013 00:12, Hannu Krosing wrote:
On 09/21/2013 10:46 PM, Andres Freund wrote:
Heikki Linnakangas<hlinnakangas@vmware.com> schrieb:
Kevin Grittner<kgrittn@ymail.com> wrote:
Andres Freund<andres@2ndquadrant.com> wrote:
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is
ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That
effectively
invalidates any predicate lock on the tuple, as checking for alock
on
the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon
below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?A better solution probably is to promote tuple-level locks if they
exist
to a relation level one upon freezing I guess?
That would work. A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.SSI locks can live longer than one TX/snapshot.
But the only way that ctid can change without xmin changing is when
you update the tuple in the same TX that created it.
I don't think that's relevant. We're not talking about the "ctid" field
of a tuple, ie. HeapTupleHeader.t_ctid. We're talking about its physical
location, ie. HeapTuple->t_self.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 21.09.2013 23:46, Andres Freund wrote:
Heikki Linnakangas<hlinnakangas@vmware.com> schrieb:
Kevin Grittner<kgrittn@ymail.com> wrote:
Andres Freund<andres@2ndquadrant.com> wrote:
On 2013-09-20 13:55:36 +0300, Heikki Linnakangas wrote:
When a tuple is predicate-locked, the key of the lock is
ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That
effectively
invalidates any predicate lock on the tuple, as checking for alock
on
the
same tuple later won't find it as the xmin is different.Attached is an isolationtester spec to demonstrate this.
Do you have any idea to fix that besides keeping the xmin horizon
below the
lowest of the xids that are predicate locked? Which seems nasty to
compute and is probably not trivial to fit into the procarray.c
machinery?A better solution probably is to promote tuple-level locks if they
exist
to a relation level one upon freezing I guess?
That would work. A couple other ideas would be to use the oldest
serializable xmin which is being calculated in predicate.c to
either prevent freezing of newer transaction or to summarize
predicate locks (using the existing SLRU-based summarization
functions) which use xmin values for xids which are being frozen.
The transactions must already be committed, and so are eligible for
summarization.What's the point of using xmin as part of the lock key in the first
place, wouldn't ctid alone suffice? If the tuple was visible to the
reading transaction, it cannot be vacuumed away until the transaction
commits. No other tuple can appear with the same ctid.SSI locks can live longer than one TX/snapshot.
Hmm, true.
Another idea: we could vacuum away predicate locks, when the
corresponding tuples are vacuumed from the heap. Before the 2nd phase of
vacuum, before removing the dead tuples, we could scan the predicate
lock hash and remove any locks belonging to the tuples we're about to
remove. And not include xmin in the lock key.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 23.09.2013 01:07, Hannu Krosing wrote:
On 09/20/2013 12:55 PM, Heikki Linnakangas wrote:
Hi,
Prompted by Andres Freund's comments on my Freezing without Write I/O
patch, I realized that there's there's an existing bug in the way
predicate locking handles freezing (or rather, it doesn't handle it).When a tuple is predicate-locked, the key of the lock is ctid+xmin.
However, when a tuple is frozen, its xmin is changed to FrozenXid.
That effectively invalidates any predicate lock on the tuple, as
checking for a lock on the same tuple later won't find it as the xmin
is different.Attached is an isolationtester spec to demonstrate this.
The case is even fishier than that.
That is, you can get bad behaviour on at least v9.2.4 even without
VACUUM FREEZE.You just need to run
permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row.
the first time it does get serialization error at "c2"
but the 2nd time both "c1" and "c2" complete successfully
Oh, interesting. I did some debugging on this: there are actually *two*
bugs, either one of which alone is enough to cause this on its own:
1. in heap_hot_search_buffer(), the PredicateLockTuple() call is passed
wrong offset number. heapTuple->t_self is set to the tid of the first
tuple in the chain that's visited, not the one actually being read.
2. CheckForSerializableConflictIn() uses the tuple's t_ctid field
instead of t_self to check for exiting predicate locks on the tuple. If
the tuple was updated, but the updater rolled back, t_ctid points to the
aborted dead tuple.
After fixing both of those bugs, running the test case twice in a row
works, ie. causes a conflict and a rollback both times. Anyone see a
problem with this?
That still leaves the original problem I spotted, with freezing; that's
yet another unrelated bug.
- Heikki
Attachments:
ssi-hot-fix-1.patchtext/x-diff; name=ssi-hot-fix-1.patchDownload+5-3
Andres Freund <andres@2ndquadrant.com> wrote:
A better solution probably is to promote tuple-level locks if
they exist to a relation level one upon freezing I guess?
It would be sufficient to promote the tuple lock to a page lock.
It would be pretty easy to add a function to predicate.c which
would accept a Relation and HeapTuple, check for a predicate lock
for the tuple, and add a page lock if found (which will
automatically clear the tuple lock). This new function would be
called when a tuple was chosen for freezing. Since freezing always
causes WAL-logging and disk I/O, the cost of a couple hash table
operations should not be noticeable.
This seems like a bug fix which should be back-patched to 9.1, yes?
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2013-10-01 07:41:46 -0700, Kevin Grittner wrote:
Andres Freund <andres@2ndquadrant.com> wrote:
A better solution probably is to promote tuple-level locks if
they exist to a relation level one upon freezing I guess?It would be sufficient to promote the tuple lock to a page lock.
It would be pretty easy to add a function to predicate.c which
would accept a Relation and HeapTuple, check for a predicate lock
for the tuple, and add a page lock if found (which will
automatically clear the tuple lock).� This new function would be
called when a tuple was chosen for freezing.� Since freezing always
causes WAL-logging and disk I/O, the cost of a couple hash table
operations should not be noticeable.
Yea, not sure why I was thinking of table level locks.
This seems like a bug fix which should be back-patched to 9.1, yes?
Yes.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Andres Freund <andres@2ndquadrant.com> wrote:
On 2013-10-01 07:41:46 -0700, Kevin Grittner wrote:
Andres Freund <andres@2ndquadrant.com> wrote:
A better solution probably is to promote tuple-level locks if
they exist to a relation level one upon freezing I guess?It would be sufficient to promote the tuple lock to a page lock.
It would be pretty easy to add a function to predicate.c which
would accept a Relation and HeapTuple, check for a predicate lock
for the tuple, and add a page lock if found (which will
automatically clear the tuple lock). This new function would be
called when a tuple was chosen for freezing. Since freezing always
causes WAL-logging and disk I/O, the cost of a couple hash table
operations should not be noticeable.Yea, not sure why I was thinking of table level locks.
This seems like a bug fix which should be back-patched to 9.1, yes?
Yes.
Patch attached, including new isolation test based on Heikki's
example. This patch does change the signature of
heap_freeze_tuple(). If anyone thinks there is risk that external
code may be calling this, I could keep the old function with its
old behavior (including the bug) and add a new function name with
the new parameters added -- the old function could call the new one
with NULL for the last two parameters. I'm not sure whether that
is better than breaking a compile of code which uses the old
signature, which would force a choice about what to do.
Will give it a couple days for feedback before pushing.
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
freeze-ssi-v1.patchtext/x-diff; name=freeze-ssi-v1.patchDownload+2618-15
On 03.10.2013 01:05, Kevin Grittner wrote:
Andres Freund<andres@2ndquadrant.com> wrote:
On 2013-10-01 07:41:46 -0700, Kevin Grittner wrote:
Andres Freund<andres@2ndquadrant.com> wrote:
A better solution probably is to promote tuple-level locks if
they exist to a relation level one upon freezing I guess?It would be sufficient to promote the tuple lock to a page lock.
It would be pretty easy to add a function to predicate.c which
would accept a Relation and HeapTuple, check for a predicate lock
for the tuple, and add a page lock if found (which will
automatically clear the tuple lock). This new function would be
called when a tuple was chosen for freezing. Since freezing always
causes WAL-logging and disk I/O, the cost of a couple hash table
operations should not be noticeable.Yea, not sure why I was thinking of table level locks.
This seems like a bug fix which should be back-patched to 9.1, yes?
Yes.
Patch attached, including new isolation test based on Heikki's
example. This patch does change the signature of
heap_freeze_tuple(). If anyone thinks there is risk that external
code may be calling this, I could keep the old function with its
old behavior (including the bug) and add a new function name with
the new parameters added -- the old function could call the new one
with NULL for the last two parameters. I'm not sure whether that
is better than breaking a compile of code which uses the old
signature, which would force a choice about what to do.Will give it a couple days for feedback before pushing.
IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.
In fact, I cannot even come up with a situation where you would have a
problem if we just removed xmin from the key, even if we didn't vacuum
away old locks. I don't think the old lock can conflict with anything
that would see the new tuple that gets inserted in the same slot. I have
a feeling that you could probably prove that if you stare long enough at
the diagram of a dangerous structure and the properties required for a
conflict.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.In fact, I cannot even come up with a situation where you would have a
problem if we just removed xmin from the key, even if we didn't vacuum
away old locks. I don't think the old lock can conflict with anything
that would see the new tuple that gets inserted in the same slot. I have
a feeling that you could probably prove that if you stare long enough at
the diagram of a dangerous structure and the properties required for a
conflict.
You are the one who suggested adding xmin to the key:
/messages/by-id/4D5A36FC.6010203@enterprisedb.com
I will review that thread in light of your recent comments, but the
fact is that xmin was not originally in the lock key, testing
uncovered bugs, and adding xmin fixed those bugs. I know I tried
some other approach first, which turned out to be complex and quite
messy -- it may have been similar to what you are proposing now.
It seems to me that a change such as you are now suggesting is
likely to be too invasive to back-patch. Do you agree that it
would make sense to apply the patch I have proposed, back to 9.1,
and then consider any alternative as 9.4 material?
--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.
This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.
In fact, I cannot even come up with a situation where you would have a
problem if we just removed xmin from the key, even if we didn't vacuum
away old locks. I don't think the old lock can conflict with anything
that would see the new tuple that gets inserted in the same slot. I have
a feeling that you could probably prove that if you stare long enough at
the diagram of a dangerous structure and the properties required for a
conflict.
This would also be safe, in the sense that it's OK to flag a
conflict even if one doesn't exist. I'm not convinced that it isn't
possible to have false positives this way. I think it's possible for a
tuple to be vacuumed away and the ctid reused before the predicate
locks on it are eligible for cleanup. (In fact, isn't this what was
happening in the thread Kevin linked?)
You are the one who suggested adding xmin to the key:
/messages/by-id/4D5A36FC.6010203@enterprisedb.com
I will review that thread in light of your recent comments, but the
fact is that xmin was not originally in the lock key, testing
uncovered bugs, and adding xmin fixed those bugs.� I know I tried
some other approach first, which turned out to be complex and quite
messy -- it may have been similar to what you are proposing now.
At the time, we thought it was necessary for a predicate lock to lock
*all future versions* of a tuple, and so we had a bunch of code to
maintain a version chain. That was fraught with bugs, and turned out
not to be necessary (IIRC, we worked that out at the pub at PGcon).
That made it critical to distinguish different tuples that had the same
ctid because they could wind up in the wrong chain or cause a cycle.
With that code ripped out, that's no longer an issue.
But all this is an exceptionally subtle part of what was an
exceptionally complex patch, so a lot of careful thought is needed
here...
It seems to me that a change such as you are now suggesting is
likely to be too invasive to back-patch.� Do you agree that it
would make sense to apply the patch I have proposed, back to 9.1,
and then consider any alternative as 9.4 material?
I agree with this.
Dan
--
Dan R. K. Ports UW CSE http://drkp.net/
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2013-10-03 21:14:17 -0700, Dan Ports wrote:
On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.
But we're talking about freezing a tuple, not removing a dead tuple. I
don't see anything preventing modification of a frozen tuple. Right?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 04.10.2013 13:23, Andres Freund wrote:
On 2013-10-03 21:14:17 -0700, Dan Ports wrote:
On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
Heikki Linnakangas<hlinnakangas@vmware.com> wrote:
IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.But we're talking about freezing a tuple, not removing a dead tuple. I
don't see anything preventing modification of a frozen tuple. Right?
Right, but if we no longer include xmin in the lock key, freezing a
tuple makes no difference. Currently, the problem is that when a tuple
is frozen, the locks on the tuple on the tuple are "lost", as the
xmin+ctid of the lock no longer matches xmin+ctid of the tuple.
Removing xmin from the lock altogether solves that problem, but it
introduces the possibility that when an old tuple is vacuumed away and a
new tuple is inserted on the same slot, a lock on the old tuple is
confused to be a lock on the new tuple. And that problem can be fixed by
vacuuming locks, when the corresponding tuple is vacuumed.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2013-10-04 13:51:00 +0300, Heikki Linnakangas wrote:
On 04.10.2013 13:23, Andres Freund wrote:
On 2013-10-03 21:14:17 -0700, Dan Ports wrote:
On Thu, Oct 03, 2013 at 06:19:49AM -0700, Kevin Grittner wrote:
Heikki Linnakangas<hlinnakangas@vmware.com> wrote:
IMHO it would be better to remove xmin from the lock key, and vacuum
away the old predicate locks when the corresponding tuple is vacuumed.
The xmin field is only required to handle the case that a tuple is
vacuumed, and a new unrelated tuple is inserted to the same slot.
Removing the lock when the tuple is removed fixes that.This seems definitely safe: we need the predicate locks to determine if
someone is modifying a tuple we read, and certainly if it's eligible
for vacuum nobody's going to be modifying that tuple anymore.But we're talking about freezing a tuple, not removing a dead tuple. I
don't see anything preventing modification of a frozen tuple. Right?Right, but if we no longer include xmin in the lock key, freezing a tuple
makes no difference. Currently, the problem is that when a tuple is frozen,
the locks on the tuple on the tuple are "lost", as the xmin+ctid of the lock
no longer matches xmin+ctid of the tuple.Removing xmin from the lock altogether solves that problem, but it
introduces the possibility that when an old tuple is vacuumed away and a new
tuple is inserted on the same slot, a lock on the old tuple is confused to
be a lock on the new tuple. And that problem can be fixed by vacuuming
locks, when the corresponding tuple is vacuumed.
But locks on those still can have meaning, right? From my very limited
understanding of predicate.c/SSI I don't see why we cannot have
meaningful conflicts on tuples that are already vacuumable. You'd need
some curiously interleaved transactions, sure, but it seems possible?
ISTM we'd need to peg the xmin horizon for vacuum to the oldest xmin
predicate.c keeps track of.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers