Small SSI issues

Started by Heikki Linnakangasover 14 years ago13 messages

heikki.linnakangas@enterprisedb.com

over 14 years ago

It makes wonders to take a couple of months break from looking at a
piece of code, and then review it in detail again. It's like a whole new
pair of eyes :-).

Here's a bunch of small issues that I spotted:

* The oldserxid code is broken for non-default BLCKSZ.
o The warning will come either too early or too late
o There is no safeguard against actually wrapping around the SLRU,
just the warning
o I'm suspicious of the OldSerXidPagePrecedesLogically() logic with
32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger than
necessary to cover the whole range of 2^32 transactions, so at high
XIDs, say 2^32-1, doesn't it incorrectly think that low XIDs, say
10, are in the past, not the future?

We've discussed these SLRU issues already, but still..

/*
* Keep a pointer to the currently-running serializable transaction (if any)
* for quick reference.
* TODO SSI: Remove volatile qualifier and the then-unnecessary casts?
*/
static volatile SERIALIZABLEXACT *MySerializableXact = InvalidSerializableXact;

* So, should we remove it? In most places where contents of
MySerializableXact are modified, we're holding
SerializableXactHashLock in exclusive mode. However, there's two
exceptions:

o GetSafeSnapshot() modifies MySerializableXact->flags without any
lock. It also inspects MySerializableXact->possibleUnsafeConflicts
without a lock. What if somone sets some other flag in the flags
bitmap just when GetSafeSnapshot() is about to set
SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost :-(.

o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE without
holding a lock. The same danger is here if someone else tries to
set some other flag concurrently.

I think we should simply acquire SerializableXactHashLock in
GetSafeSnapshot(). It shouldn't be so much of a hotspot that it would
make any difference in performance. CheckForSerializableConflictIn()
might be called a lot, however, so maybe we need to check if the flag
is set first, and only acquire the lock and set it if it's not set
already.

* Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
ReleasePredicateLocks() for a fleeting moment while the function
releases all conflicts and locks held by the transaction, and finally
the sxact struct itself containing the flag. Also, isn't a
transaction that's already been marked for death the same as one that
has already rolled back, for the purposes of detecting conflicts?

* The HavePartialClearThrough/CanPartialClearThrough mechanism needs a
better explanation. The only explanation currently is this:

if (--(PredXact->WritableSxactCount) == 0)
{
/*
* Release predicate locks and rw-conflicts in for all committed
* transactions. There are no longer any transactions which might
* conflict with the locks and no chance for new transactions to
* overlap. Similarly, existing conflicts in can't cause pivots,
* and any conflicts in which could have completed a dangerous
* structure would already have caused a rollback, so any
* remaining ones must be benign.
*/
PredXact->CanPartialClearThrough = PredXact->LastSxactCommitSeqNo;
}

If I understand that correctly, any predicate lock belonging to any
already committed transaction can be safely zapped away at that
instant. We don't do it immediately, because it might be expensive.
Instead, we set CanPartialClearThrough, and do it lazily in
ClearOldPredicateLocks(). But what is the purpose of
HavePartialClearThrough?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Heikki Linnakangas (#1)

Re: Small SSI issues

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

Here's a bunch of small issues that I spotted:

Thanks for going over it again. It is encouraging that you didn't
spot any *big* issues.

* The oldserxid code is broken for non-default BLCKSZ.
o The warning will come either too early or too late

Good point. That part is easily fixed -- if we want to keep the
warning, in light of the next few points.

o There is no safeguard against actually wrapping around the
SLRU, just the warning

Any thoughts on what we should do instead? If someone holds open a
transaction long enough to burn through a billion transaction IDs
(or possibly less if someone uses a smaller BLCKSZ), should we
generate a FATAL error? Of course, one other option would be to
allow more SLRU segment files, as you raised on another thread.
Then this issue goes away because they would hit other, existing,
protections against transaction wraparound first.

o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
than necessary to cover the whole range of 2^32 transactions,
so at high XIDs, say 2^32-1, doesn't it incorrectly think
that low XIDs, say 10, are in the past, not the future?

I will look that over to see; but if this is broken, then one of the
other SLRU users is probably broken, because I think I stole this
code pretty directly from another spot.

/*
* Keep a pointer to the currently-running serializable
* transaction (if any) for quick reference.
* TODO SSI: Remove volatile qualifier and the then-unnecessary
* casts?
*/
static volatile SERIALIZABLEXACT *MySerializableXact =
InvalidSerializableXact;

* So, should we remove it? In most places where contents of
MySerializableXact are modified, we're holding
SerializableXactHashLock in exclusive mode. However, there's
two exceptions:

o GetSafeSnapshot() modifies MySerializableXact->flags without
any lock. It also inspects
MySerializableXact->possibleUnsafeConflicts without a lock.
What if somone sets some other flag in the flags bitmap just
when GetSafeSnapshot() is about to set
SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
:-(.

o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
without holding a lock. The same danger is here if someone
else tries to set some other flag concurrently.

I think we should simply acquire SerializableXactHashLock in
GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
would make any difference in performance.
CheckForSerializableConflictIn() might be called a lot,
however, so maybe we need to check if the flag is set first,
and only acquire the lock and set it if it's not set already.

OK.

Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes? (That would also
allow dropping a number of casts which exist just to avoid
warnings.)

* Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
ReleasePredicateLocks() for a fleeting moment while the
function releases all conflicts and locks held by the
transaction, and finally the sxact struct itself containing the
flag.

I think that one can go away. It had more of a point many months
ago before we properly sorted out what belongs in
PreCommit_CheckForSerializationFailure() and what belongs in
ReleasePredicateLocks(). The point at which we reached clarity on
that and moved things around, this flag probably became obsolete.

Also, isn't a transaction that's already been marked for death
the same as one that has already rolled back, for the purposes
of detecting conflicts?

Yes.

We should probably ignore any marked-for-death transaction during
conflict detection and serialization failure detection. As a start,
anywhere there is now a check for rollback and not for this, we
should change it to this. There may be some places this can be
checked which haven't yet been identified and touched.

* The HavePartialClearThrough/CanPartialClearThrough mechanism
needs a better explanation. The only explanation currently is
this:

if (--(PredXact->WritableSxactCount) == 0)
{
/*
* Release predicate locks and rw-conflicts in for
* all committed transactions. There are no longer any
* transactions which might conflict with the locks and no
* chance for new transactions to overlap. Similarly,
* existing conflicts in can't cause pivots, and any
* conflicts in which could have completed a dangerous
* structure would already have caused a rollback, so any
* remaining ones must be benign.
*/
PredXact->CanPartialClearThrough =
PredXact->LastSxactCommitSeqNo;
}

If I understand that correctly, any predicate lock belonging to
any already committed transaction can be safely zapped away at
that instant.

Correct.

We don't do it immediately, because it might be expensive.

I think it has more to do with getting the LW locks right. We make
the call to ClearOldPredicateLocks() farther down in the function,
so this is effectively setting things up for that call.

But what is the purpose of HavePartialClearThrough?

To avoid doing unnecessary work for completed transactions on which
we still need to keep some information but for which we were
previously able to clear predicate locks. This relates to the
"mitigation" work discussed in these and other posts from around
that time:

http://archives.postgresql.org/pgsql-hackers/2010-10/msg01754.php

http://archives.postgresql.org/pgsql-hackers/2010-12/msg02119.php

I'm happy to work on modifications for any of this or to stay out of
your way if you want to. Should I put together a patch for those
items where we seem to agree and have a clear way forward?

-Kevin

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Kevin Grittner (#2)

Re: Small SSI issues

On 10.06.2011 18:05, Kevin Grittner wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote:

o There is no safeguard against actually wrapping around the
SLRU, just the warning

Any thoughts on what we should do instead? If someone holds open a
transaction long enough to burn through a billion transaction IDs
(or possibly less if someone uses a smaller BLCKSZ), should we
generate a FATAL error?

FATAL is a bit harsh, ERROR seems more appropriate.

o I'm suspicious of the OldSerXidPagePrecedesLogically() logic
with 32k BLCKSZ. In that case, OLDSERXID_MAX_PAGE is a larger
than necessary to cover the whole range of 2^32 transactions,
so at high XIDs, say 2^32-1, doesn't it incorrectly think
that low XIDs, say 10, are in the past, not the future?

I will look that over to see; but if this is broken, then one of the
other SLRU users is probably broken, because I think I stole this
code pretty directly from another spot.

It was copied from async.c, which doesn't have this problem because it's
not mapping XIDs to the slru. There, the max queue size is determined by
the *_MAX_PAGE, and you point to a particular location in the SLRU with
simply page+offset.

/*
* Keep a pointer to the currently-running serializable
* transaction (if any) for quick reference.
* TODO SSI: Remove volatile qualifier and the then-unnecessary
* casts?
*/
static volatile SERIALIZABLEXACT *MySerializableXact =
InvalidSerializableXact;

* So, should we remove it? In most places where contents of
MySerializableXact are modified, we're holding
SerializableXactHashLock in exclusive mode. However, there's
two exceptions:

o GetSafeSnapshot() modifies MySerializableXact->flags without
any lock. It also inspects
MySerializableXact->possibleUnsafeConflicts without a lock.
What if somone sets some other flag in the flags bitmap just
when GetSafeSnapshot() is about to set
SXACT_FLAG_DEFERRABLE_WAITING? One of the flags might be lost
:-(.

o CheckForSerializableConflictIn() sets SXACT_FLAG_DID_WRITE
without holding a lock. The same danger is here if someone
else tries to set some other flag concurrently.

I think we should simply acquire SerializableXactHashLock in
GetSafeSnapshot(). It shouldn't be so much of a hotspot that it
would make any difference in performance.
CheckForSerializableConflictIn() might be called a lot,
however, so maybe we need to check if the flag is set first,
and only acquire the lock and set it if it's not set already.

OK.

Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes? (That would also
allow dropping a number of casts which exist just to avoid
warnings.)

I believe we can drop it, I'll double-check.

I'm happy to work on modifications for any of this or to stay out of
your way if you want to. Should I put together a patch for those
items where we seem to agree and have a clear way forward?

I'll fix the MySerializableXact locking issue, and come back to the
other issues on Monday. If you have the time and energy to write a patch
by then, feel free, but I can look into them otherwise.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Dan Ports

drkp@csail.mit.edu

over 14 years ago

In reply to: Heikki Linnakangas (#3)

Re: Small SSI issues

On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote:

Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes? (That would also
allow dropping a number of casts which exist just to avoid
warnings.)

I believe we can drop it, I'll double-check.

Yes, dropping it seems like the thing to do. It's been on my list for a
while. We are not really getting anything out of declaring it volatile
since we cast the volatile qualifier away most of the time.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Dan Ports (#4)

Re: Small SSI issues

Dan Ports wrote:
On Fri, Jun 10, 2011 at 09:43:58PM +0300, Heikki Linnakangas wrote:

Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes? (That would
also allow dropping a number of casts which exist just to avoid
warnings.)

I believe we can drop it, I'll double-check.

Yes, dropping it seems like the thing to do. It's been on my list
for a while. We are not really getting anything out of declaring it
volatile since we cast the volatile qualifier away most of the
time.

I'm not concerned about references covered by
SerializableXactHashLock. I am more concerned about some of the
tests for whether the (MySerializableXact == InvalidSerializableXact)
checks and any other tests not covered by that lock are OK without it
(and OK with it). Since my knowledge of weak memory ordering
behavior is, well, weak I didn't want to try to make that call.

-Kevin

Import Notes

Resolved by subject fallback

Dan Ports

drkp@csail.mit.edu

over 14 years ago

In reply to: Kevin Grittner (#5)

Re: Small SSI issues

On Sat, Jun 11, 2011 at 01:38:31PM -0500, Kevin Grittner wrote:

I'm not concerned about references covered by
SerializableXactHashLock. I am more concerned about some of the
tests for whether the (MySerializableXact == InvalidSerializableXact)
checks and any other tests not covered by that lock are OK without it
(and OK with it). Since my knowledge of weak memory ordering
behavior is, well, weak I didn't want to try to make that call.

Oh, those checks are definitely not an issue -- MySerializableXact
itself (the pointer, not the thing it's pointing to) is in
backend-local memory, so it won't change under us.

The volatile qualifier (as written) doesn't help with that anyway, it
attaches to the data being pointed to, not the pointer itself.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Dan Ports (#6)

1 attachment(s)

Re: Small SSI issues

Heikki Linnakangas wrote:
On 10.06.2011 18:05, Kevin Grittner wrote:

Heikki Linnakangas wrote:

o There is no safeguard against actually wrapping around the
SLRU, just the warning

Any thoughts on what we should do instead? If someone holds open a
transaction long enough to burn through a billion transaction IDs
(or possibly less if someone uses a smaller BLCKSZ), should we
generate a FATAL error?

FATAL is a bit harsh, ERROR seems more appropriate.

If we don't cancel the long-running transaction, don't we continue to
have a problem?

Do checks such as that argue for keeping the volatile flag, or do
you think we can drop it if we make those changes? (That would
also allow dropping a number of casts which exist just to avoid
warnings.)

I believe we can drop it, I'll double-check.

I see you committed a patch for this, but there were some casts which
become unnecessary with that change that you missed. Patch attached
to clean up the ones I could spot.

-Kevin

Attachments:

ssi-nocast-1.patchapplication/octet-stream; name=ssi-nocast-1.patchDownload

*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***************
*** 1436,1443 **** GetSafeSnapshot(Snapshot origSnapshot)
  		 * them marked us as conflicted.
  		 */
  		MySerializableXact->flags |= SXACT_FLAG_DEFERRABLE_WAITING;
! 		while (!(SHMQueueEmpty((SHM_QUEUE *)
! 							 &MySerializableXact->possibleUnsafeConflicts) ||
  				 SxactIsROUnsafe(MySerializableXact)))
  		{
  			LWLockRelease(SerializableXactHashLock);
--- 1436,1442 ----
  		 * them marked us as conflicted.
  		 */
  		MySerializableXact->flags |= SXACT_FLAG_DEFERRABLE_WAITING;
! 		while (!(SHMQueueEmpty(&MySerializableXact->possibleUnsafeConflicts) ||
  				 SxactIsROUnsafe(MySerializableXact)))
  		{
  			LWLockRelease(SerializableXactHashLock);
***************
*** 3120,3132 **** ReleasePredicateLocks(const bool isCommit)
  		 * opposed to 'outLink' for the r/w xacts.
  		 */
  		possibleUnsafeConflict = (RWConflict)
! 			SHMQueueNext((SHM_QUEUE *) &MySerializableXact->possibleUnsafeConflicts,
! 				  (SHM_QUEUE *) &MySerializableXact->possibleUnsafeConflicts,
  						 offsetof(RWConflictData, inLink));
  		while (possibleUnsafeConflict)
  		{
  			nextConflict = (RWConflict)
! 				SHMQueueNext((SHM_QUEUE *) &MySerializableXact->possibleUnsafeConflicts,
  							 &possibleUnsafeConflict->inLink,
  							 offsetof(RWConflictData, inLink));
  
--- 3119,3131 ----
  		 * opposed to 'outLink' for the r/w xacts.
  		 */
  		possibleUnsafeConflict = (RWConflict)
! 			SHMQueueNext(&MySerializableXact->possibleUnsafeConflicts,
! 						 &MySerializableXact->possibleUnsafeConflicts,
  						 offsetof(RWConflictData, inLink));
  		while (possibleUnsafeConflict)
  		{
  			nextConflict = (RWConflict)
! 				SHMQueueNext(&MySerializableXact->possibleUnsafeConflicts,
  							 &possibleUnsafeConflict->inLink,
  							 offsetof(RWConflictData, inLink));
  
***************
*** 3159,3171 **** ReleasePredicateLocks(const bool isCommit)
  	 * previously committed transactions.
  	 */
  	conflict = (RWConflict)
! 		SHMQueueNext((SHM_QUEUE *) &MySerializableXact->outConflicts,
! 					 (SHM_QUEUE *) &MySerializableXact->outConflicts,
  					 offsetof(RWConflictData, outLink));
  	while (conflict)
  	{
  		nextConflict = (RWConflict)
! 			SHMQueueNext((SHM_QUEUE *) &MySerializableXact->outConflicts,
  						 &conflict->outLink,
  						 offsetof(RWConflictData, outLink));
  
--- 3158,3170 ----
  	 * previously committed transactions.
  	 */
  	conflict = (RWConflict)
! 		SHMQueueNext(&MySerializableXact->outConflicts,
! 					 &MySerializableXact->outConflicts,
  					 offsetof(RWConflictData, outLink));
  	while (conflict)
  	{
  		nextConflict = (RWConflict)
! 			SHMQueueNext(&MySerializableXact->outConflicts,
  						 &conflict->outLink,
  						 offsetof(RWConflictData, outLink));
  
***************
*** 3192,3204 **** ReleasePredicateLocks(const bool isCommit)
  	 * we're rolling back, clear them all.
  	 */
  	conflict = (RWConflict)
! 		SHMQueueNext((SHM_QUEUE *) &MySerializableXact->inConflicts,
! 					 (SHM_QUEUE *) &MySerializableXact->inConflicts,
  					 offsetof(RWConflictData, inLink));
  	while (conflict)
  	{
  		nextConflict = (RWConflict)
! 			SHMQueueNext((SHM_QUEUE *) &MySerializableXact->inConflicts,
  						 &conflict->inLink,
  						 offsetof(RWConflictData, inLink));
  
--- 3191,3203 ----
  	 * we're rolling back, clear them all.
  	 */
  	conflict = (RWConflict)
! 		SHMQueueNext(&MySerializableXact->inConflicts,
! 					 &MySerializableXact->inConflicts,
  					 offsetof(RWConflictData, inLink));
  	while (conflict)
  	{
  		nextConflict = (RWConflict)
! 			SHMQueueNext(&MySerializableXact->inConflicts,
  						 &conflict->inLink,
  						 offsetof(RWConflictData, inLink));
  
***************
*** 3219,3231 **** ReleasePredicateLocks(const bool isCommit)
  		 * up if they are known safe or known unsafe.
  		 */
  		possibleUnsafeConflict = (RWConflict)
! 			SHMQueueNext((SHM_QUEUE *) &MySerializableXact->possibleUnsafeConflicts,
! 				  (SHM_QUEUE *) &MySerializableXact->possibleUnsafeConflicts,
  						 offsetof(RWConflictData, outLink));
  		while (possibleUnsafeConflict)
  		{
  			nextConflict = (RWConflict)
! 				SHMQueueNext((SHM_QUEUE *) &MySerializableXact->possibleUnsafeConflicts,
  							 &possibleUnsafeConflict->outLink,
  							 offsetof(RWConflictData, outLink));
  
--- 3218,3230 ----
  		 * up if they are known safe or known unsafe.
  		 */
  		possibleUnsafeConflict = (RWConflict)
! 			SHMQueueNext(&MySerializableXact->possibleUnsafeConflicts,
! 						 &MySerializableXact->possibleUnsafeConflicts,
  						 offsetof(RWConflictData, outLink));
  		while (possibleUnsafeConflict)
  		{
  			nextConflict = (RWConflict)
! 				SHMQueueNext(&MySerializableXact->possibleUnsafeConflicts,
  							 &possibleUnsafeConflict->outLink,
  							 offsetof(RWConflictData, outLink));
  
***************
*** 3296,3302 **** ReleasePredicateLocks(const bool isCommit)
  	/* Add this to the list of transactions to check for later cleanup. */
  	if (isCommit)
  		SHMQueueInsertBefore(FinishedSerializableTransactions,
! 						  (SHM_QUEUE *) &(MySerializableXact->finishedLink));
  
  	if (!isCommit)
  		ReleaseOneSerializableXact(MySerializableXact, false, false);
--- 3295,3301 ----
  	/* Add this to the list of transactions to check for later cleanup. */
  	if (isCommit)
  		SHMQueueInsertBefore(FinishedSerializableTransactions,
! 							 &MySerializableXact->finishedLink);
  
  	if (!isCommit)
  		ReleaseOneSerializableXact(MySerializableXact, false, false);
***************
*** 3796,3802 **** CheckForSerializableConflictOut(const bool visible, const Relation relation,
  					  errhint("The transaction might succeed if retried.")));
  
  			if (SxactHasSummaryConflictIn(MySerializableXact)
! 			|| !SHMQueueEmpty((SHM_QUEUE *) &MySerializableXact->inConflicts))
  				ereport(ERROR,
  						(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
  						 errmsg("could not serialize access due to read/write dependencies among transactions"),
--- 3795,3801 ----
  					  errhint("The transaction might succeed if retried.")));
  
  			if (SxactHasSummaryConflictIn(MySerializableXact)
! 				|| !SHMQueueEmpty(&MySerializableXact->inConflicts))
  				ereport(ERROR,
  						(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
  						 errmsg("could not serialize access due to read/write dependencies among transactions"),
***************
*** 4469,4476 **** PreCommit_CheckForSerializationFailure(void)
  	}
  
  	nearConflict = (RWConflict)
! 		SHMQueueNext((SHM_QUEUE *) &MySerializableXact->inConflicts,
! 					 (SHM_QUEUE *) &MySerializableXact->inConflicts,
  					 offsetof(RWConflictData, inLink));
  	while (nearConflict)
  	{
--- 4468,4475 ----
  	}
  
  	nearConflict = (RWConflict)
! 		SHMQueueNext(&MySerializableXact->inConflicts,
! 					 &MySerializableXact->inConflicts,
  					 offsetof(RWConflictData, inLink));
  	while (nearConflict)
  	{
***************
*** 4503,4509 **** PreCommit_CheckForSerializationFailure(void)
  		}
  
  		nearConflict = (RWConflict)
! 			SHMQueueNext((SHM_QUEUE *) &MySerializableXact->inConflicts,
  						 &nearConflict->inLink,
  						 offsetof(RWConflictData, inLink));
  	}
--- 4502,4508 ----
  		}
  
  		nearConflict = (RWConflict)
! 			SHMQueueNext(&MySerializableXact->inConflicts,
  						 &nearConflict->inLink,
  						 offsetof(RWConflictData, inLink));
  	}
***************
*** 4550,4558 **** AtPrepare_PredicateLocks(void)
  	 * outConflicts lists, if they're non-empty we'll represent that by
  	 * setting the appropriate summary conflict flags.
  	 */
! 	if (!SHMQueueEmpty((SHM_QUEUE *) &MySerializableXact->inConflicts))
  		xactRecord->flags |= SXACT_FLAG_SUMMARY_CONFLICT_IN;
! 	if (!SHMQueueEmpty((SHM_QUEUE *) &MySerializableXact->outConflicts))
  		xactRecord->flags |= SXACT_FLAG_SUMMARY_CONFLICT_OUT;
  
  	RegisterTwoPhaseRecord(TWOPHASE_RM_PREDICATELOCK_ID, 0,
--- 4549,4557 ----
  	 * outConflicts lists, if they're non-empty we'll represent that by
  	 * setting the appropriate summary conflict flags.
  	 */
! 	if (!SHMQueueEmpty(&MySerializableXact->inConflicts))
  		xactRecord->flags |= SXACT_FLAG_SUMMARY_CONFLICT_IN;
! 	if (!SHMQueueEmpty(&MySerializableXact->outConflicts))
  		xactRecord->flags |= SXACT_FLAG_SUMMARY_CONFLICT_OUT;
  
  	RegisterTwoPhaseRecord(TWOPHASE_RM_PREDICATELOCK_ID, 0,

Import Notes

Resolved by subject fallback

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Kevin Grittner (#7)

Re: Small SSI issues

On 12.06.2011 17:59, Kevin Grittner wrote:

Heikki Linnakangas wrote:
On 10.06.2011 18:05, Kevin Grittner wrote:

Heikki Linnakangas wrote:

o There is no safeguard against actually wrapping around the
SLRU, just the warning

Any thoughts on what we should do instead? If someone holds open a
transaction long enough to burn through a billion transaction IDs
(or possibly less if someone uses a smaller BLCKSZ), should we
generate a FATAL error?

FATAL is a bit harsh, ERROR seems more appropriate.

If we don't cancel the long-running transaction, don't we continue to
have a problem?

Yes, but ERROR is enough to kill the transaction. Unless it's in a
subtransaction, I guess. But anyway, there's no guarantee that the
long-running transaction will hit the problem first, or at all. You're
much more likely to kill an innocent new transaction that tries to
acquire its first locks, than an old long-running transaction that's
been running for a while already, probably idle doing nothing, or doing
a long seqscan on a large table with the whole table locked already.

I see you committed a patch for this, but there were some casts which
become unnecessary with that change that you missed. Patch attached
to clean up the ones I could spot.

Ah, thanks, applied. I didn't realize those were also only needed
because of the volatile qualifier.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Kevin Grittner (#2)

Re: Small SSI issues

On 10.06.2011 18:05, Kevin Grittner wrote:

Heikki Linnakangas<heikki.linnakangas@enterprisedb.com> wrote:

* Is the SXACT_FLAG_ROLLED_BACK flag necessary? It's only set in
ReleasePredicateLocks() for a fleeting moment while the
function releases all conflicts and locks held by the
transaction, and finally the sxact struct itself containing the
flag.

I think that one can go away. It had more of a point many months
ago before we properly sorted out what belongs in
PreCommit_CheckForSerializationFailure() and what belongs in
ReleasePredicateLocks(). The point at which we reached clarity on
that and moved things around, this flag probably became obsolete.

Also, isn't a transaction that's already been marked for death
the same as one that has already rolled back, for the purposes
of detecting conflicts?

Yes.

We should probably ignore any marked-for-death transaction during
conflict detection and serialization failure detection. As a start,
anywhere there is now a check for rollback and not for this, we
should change it to this.

Ok, I removed the SXACT_FLAG_ROLLED_BACK flag. I also renamed the
marked-for-death flag into SXACT_FLAG_DOOMED; that's a lot shorter.

There may be some places this can be
checked which haven't yet been identified and touched.

Yeah - in 9.2.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Heikki Linnakangas (#9)

1 attachment(s)

Re: Small SSI issues

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote:

There may be some places this can be checked which haven't yet
been identified and touched.

Yeah - in 9.2.

No argument here. I'm all for stabilizing and getting the thing out
-- I think we've established that performance is good for many
workloads as it stands, and that there are workloads where it will
never be useful. Chipping away at the gray area, to make it perform
well in a few workloads where it currently doesn't (and, of course,
*even better* in workloads where it's currently better than the
alternatives), seems like future release work to me.

There is one issue you raised in this post:

http://archives.postgresql.org/message-id/4DEF3194.6030305@enterprisedb.com

Robert questioned whether it should be 9.1 material here:

http://archives.postgresql.org/message-id/BANLkTint2i2fHDTdr=Xq3K=YrxegovGmTw@mail.gmail.com

I posted a patch here:

http://archives.postgresql.org/message-id/4DEFB169020000250003E3BD@gw.wicourts.gov

Should I put that patch into a 9.2 CF?

There is an unnecessary include of predicate.h in nbtree.c we should
delete. That seems safe enough.

You questioned whether OldSerXidPagePrecedesLogically was buggy. I
will look at that by this weekend at the latest. If it is buggy we
obviously should fix that. Are there any other changes you think we
should make to handle the odd corner cases in the SLRU for SSI? It
did occur to me that we should be safe from actual overwriting of an
old entry by the normal transaction wrap-around protections -- the
worst that should happen with the current code (I think) is that in
extreme cases we may get LOG level messages or accumulate a
surprising number of SLRU segment files. That's because SLRU will
start to nag about things at one billion transactions, but we need
to get all the way to two billion transactions used up while a
single serializable transaction remains active before we could
overwrite something.

It seems like it might be a good idea to apply pgindent formating to
the latest SSI changes, to minimize conflict on back-patching any
bug fixes. I've attached a patch to do this formatting -- entirely
whitespace changes from a pgindent run against selected files.

Unless I'm missing something, the only remaining changes needed are
for documentation (as mentioned in previous posts). I will work on
those after I look at OldSerXidPagePrecedesLogically.

-Kevin

Attachments:

ssi-pgindent-post-beta2.patchtext/plain; name=ssi-pgindent-post-beta2.patchDownload

*** a/src/backend/access/nbtree/nbtsearch.c
--- b/src/backend/access/nbtree/nbtsearch.c
***************
*** 849,856 **** _bt_first(IndexScanDesc scan, ScanDirection dir)
  	if (!BufferIsValid(buf))
  	{
  		/*
! 		 * We only get here if the index is completely empty.
! 		 * Lock relation because nothing finer to lock exists.
  		 */
  		PredicateLockRelation(rel, scan->xs_snapshot);
  		return false;
--- 849,856 ----
  	if (!BufferIsValid(buf))
  	{
  		/*
! 		 * We only get here if the index is completely empty. Lock relation
! 		 * because nothing finer to lock exists.
  		 */
  		PredicateLockRelation(rel, scan->xs_snapshot);
  		return false;
*** a/src/backend/access/transam/twophase_rmgr.c
--- b/src/backend/access/transam/twophase_rmgr.c
***************
*** 26,32 **** const TwoPhaseCallback twophase_recover_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  	NULL,						/* END ID */
  	lock_twophase_recover,		/* Lock */
  	NULL,						/* pgstat */
! 	multixact_twophase_recover,	/* MultiXact */
  	predicatelock_twophase_recover		/* PredicateLock */
  };
  
--- 26,32 ----
  	NULL,						/* END ID */
  	lock_twophase_recover,		/* Lock */
  	NULL,						/* pgstat */
! 	multixact_twophase_recover, /* MultiXact */
  	predicatelock_twophase_recover		/* PredicateLock */
  };
  
***************
*** 44,50 **** const TwoPhaseCallback twophase_postabort_callbacks[TWOPHASE_RM_MAX_ID + 1] =
  	NULL,						/* END ID */
  	lock_twophase_postabort,	/* Lock */
  	pgstat_twophase_postabort,	/* pgstat */
! 	multixact_twophase_postabort,	/* MultiXact */
  	NULL						/* PredicateLock */
  };
  
--- 44,50 ----
  	NULL,						/* END ID */
  	lock_twophase_postabort,	/* Lock */
  	pgstat_twophase_postabort,	/* pgstat */
! 	multixact_twophase_postabort,		/* MultiXact */
  	NULL						/* PredicateLock */
  };
  
*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***************
*** 483,492 **** SerializationNeededForRead(Relation relation, Snapshot snapshot)
  	 * MySerializableXact, so that subsequent calls to this function can exit
  	 * quickly.
  	 *
! 	 * A transaction is flagged as RO_SAFE if all concurrent R/W
! 	 * transactions commit without having conflicts out to an earlier
! 	 * snapshot, thus ensuring that no conflicts are possible for this
! 	 * transaction.
  	 */
  	if (SxactIsROSafe(MySerializableXact))
  	{
--- 483,491 ----
  	 * MySerializableXact, so that subsequent calls to this function can exit
  	 * quickly.
  	 *
! 	 * A transaction is flagged as RO_SAFE if all concurrent R/W transactions
! 	 * commit without having conflicts out to an earlier snapshot, thus
! 	 * ensuring that no conflicts are possible for this transaction.
  	 */
  	if (SxactIsROSafe(MySerializableXact))
  	{
***************
*** 498,504 **** SerializationNeededForRead(Relation relation, Snapshot snapshot)
  	if (!PredicateLockingNeededForRelation(relation))
  		return false;
  
! 	return true;	/* no excuse to skip predicate locking */
  }
  
  /*
--- 497,503 ----
  	if (!PredicateLockingNeededForRelation(relation))
  		return false;
  
! 	return true;				/* no excuse to skip predicate locking */
  }
  
  /*
***************
*** 516,522 **** SerializationNeededForWrite(Relation relation)
  	if (!PredicateLockingNeededForRelation(relation))
  		return false;
  
! 	return true;	/* no excuse to skip predicate locking */
  }
  
  
--- 515,521 ----
  	if (!PredicateLockingNeededForRelation(relation))
  		return false;
  
! 	return true;				/* no excuse to skip predicate locking */
  }
  
  
*** a/src/include/storage/predicate.h
--- b/src/include/storage/predicate.h
***************
*** 54,60 **** extern void ReleasePredicateLocks(const bool isCommit);
  
  /* conflict detection (may also trigger rollback) */
  extern void CheckForSerializableConflictOut(const bool valid, const Relation relation, const HeapTuple tuple,
! 											const Buffer buffer, const Snapshot snapshot);
  extern void CheckForSerializableConflictIn(const Relation relation, const HeapTuple tuple, const Buffer buffer);
  extern void CheckTableForSerializableConflictIn(const Relation relation);
  
--- 54,60 ----
  
  /* conflict detection (may also trigger rollback) */
  extern void CheckForSerializableConflictOut(const bool valid, const Relation relation, const HeapTuple tuple,
! 								const Buffer buffer, const Snapshot snapshot);
  extern void CheckForSerializableConflictIn(const Relation relation, const HeapTuple tuple, const Buffer buffer);
  extern void CheckTableForSerializableConflictIn(const Relation relation);
  
*** a/src/include/storage/predicate_internals.h
--- b/src/include/storage/predicate_internals.h
***************
*** 90,98 **** typedef struct SERIALIZABLEXACT
  	int			pid;			/* pid of associated process */
  } SERIALIZABLEXACT;
  
! #define SXACT_FLAG_COMMITTED				0x00000001 /* already committed */
! #define SXACT_FLAG_PREPARED					0x00000002 /* about to commit */
! #define SXACT_FLAG_DOOMED					0x00000004 /* will roll back */
  /*
   * The following flag actually means that the flagged transaction has a
   * conflict out *to a transaction which committed ahead of it*.  It's hard
--- 90,98 ----
  	int			pid;			/* pid of associated process */
  } SERIALIZABLEXACT;
  
! #define SXACT_FLAG_COMMITTED				0x00000001	/* already committed */
! #define SXACT_FLAG_PREPARED					0x00000002	/* about to commit */
! #define SXACT_FLAG_DOOMED					0x00000004	/* will roll back */
  /*
   * The following flag actually means that the flagged transaction has a
   * conflict out *to a transaction which committed ahead of it*.  It's hard
***************
*** 132,139 **** typedef struct PredXactListData
  	/*
  	 * These global variables are maintained when registering and cleaning up
  	 * serializable transactions.  They must be global across all backends,
! 	 * but are not needed outside the predicate.c source file. Protected
! 	 * by SerializableXactHashLock.
  	 */
  	TransactionId SxactGlobalXmin;		/* global xmin for active serializable
  										 * transactions */
--- 132,139 ----
  	/*
  	 * These global variables are maintained when registering and cleaning up
  	 * serializable transactions.  They must be global across all backends,
! 	 * but are not needed outside the predicate.c source file. Protected by
! 	 * SerializableXactHashLock.
  	 */
  	TransactionId SxactGlobalXmin;		/* global xmin for active serializable
  										 * transactions */

#11

Kevin Grittner

Kevin.Grittner@wicourts.gov

over 14 years ago

In reply to: Kevin Grittner (#10)

Re: Small SSI issues

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

Unless I'm missing something, the only remaining changes needed
are for documentation (as mentioned in previous posts).

I just found notes that we also need regression tests for the
SSI/DDL combination and a comment in lazy_truncate_heap.

At any rate, not anything which is part of executable code....

-Kevin

#12

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Kevin Grittner (#10)

Re: Small SSI issues

On 15.06.2011 19:10, Kevin Grittner wrote:

There is an unnecessary include of predicate.h in nbtree.c we should
delete. That seems safe enough.
...
It seems like it might be a good idea to apply pgindent formating to
the latest SSI changes, to minimize conflict on back-patching any
bug fixes. I've attached a patch to do this formatting -- entirely
whitespace changes from a pgindent run against selected files.

Ok, committed the pgindent patch and removed the unnecessary include.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#13

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 14 years ago

In reply to: Kevin Grittner (#10)

Re: Small SSI issues

On 15.06.2011 19:10, Kevin Grittner wrote:

There is one issue you raised in this post:

http://archives.postgresql.org/message-id/4DEF3194.6030305@enterprisedb.com

Robert questioned whether it should be 9.1 material here:

http://archives.postgresql.org/message-id/BANLkTint2i2fHDTdr=Xq3K=YrxegovGmTw@mail.gmail.com

I posted a patch here:

http://archives.postgresql.org/message-id/4DEFB169020000250003E3BD@gw.wicourts.gov

Should I put that patch into a 9.2 CF?

Yeah. I've added it to the September commitfest.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com