Coping with nLocks overflow

Started by Tom Lane over 17 years ago | 3 messages | hackers
#1 Tom Lane
tgl@sss.pgh.pa.us

We have recently seen one definite and one probable report of overflow
of the nLocks field of a backend's local lock table:
http://archives.postgresql.org/pgsql-bugs/2008-09/msg00021.php

While it's still unclear exactly why 8.3 seems more prone to this than
earlier releases, the general shape of the problem seems clear enough.
We have many code paths that intentionally take a lock on some object
and leave it locked until end of transaction. Repeat such a command on
the same object enough times within one transaction, and voila,
overflow. What's news, perhaps, is that we've reached a performance
level where this can actually happen within transactions of lengths that
people might try to run.

I'm considering that a simple solution to this might be to widen nLocks
from int to int64. This would definitely fix it on machines that have
working int64 arithmetic, and if there are any left that do not, they're
probably not fast enough to encounter the overflow in real-world usage
anyway. For machines that aren't native 64-bit it would add a couple
of cycles to lock acquisition/release, but that's hardly likely to be
measurable compared to all the other work done in
LockAcquire/LockRelease.
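
To make the proposal concrete, here is a minimal sketch of a widened counter. It is not the actual lock.c code: WideLocalLock, wide_lock_acquire and wide_lock_release are hypothetical names, and the struct stands in for a backend's local lock table entry with everything but the counter left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a backend-local lock table entry. */
typedef struct WideLocalLock
{
    int64_t nLocks;     /* times this lock is held; previously a plain int */
} WideLocalLock;

/* Record one more acquisition of a lock this backend already holds. */
static void
wide_lock_acquire(WideLocalLock *lock)
{
    lock->nLocks++;
}

/* Drop one acquisition; returns 1 once the count reaches zero. */
static int
wide_lock_release(WideLocalLock *lock)
{
    assert(lock->nLocks > 0);
    lock->nLocks--;
    return lock->nLocks == 0;
}
```

With a 32-bit int the counter would overflow on the acquisition after INT32_MAX; with int64_t that same step simply yields INT32_MAX + 1, and the limit moves far beyond anything a single transaction could reach.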

Comments, objections?

regards, tom lane

#2 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#1)
Re: Coping with nLocks overflow

Tom Lane <tgl@sss.pgh.pa.us> writes:

> We have recently seen one definite and one probable report of overflow
> of the nLocks field of a backend's local lock table:
> http://archives.postgresql.org/pgsql-bugs/2008-09/msg00021.php
> ...
> Comments, objections?

In that case the problem could have been postponed by making nLocks unsigned.
Not much point in that, I guess.

Alternatively perhaps we could indicate when taking a lock that we intend to
hold the lock until the end of the transaction. In that case we don't need the
usage counter at all and could just mark it with a special value which we
never increment or decrement, and just wait until we release all locks at the end
of transaction?
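
A rough sketch of what that special value might look like, for illustration only. DEMO_HOLD_TILL_XACT_END, SentinelLock and the sentinel_* functions are hypothetical names, and the choice of -1 as the marker is an assumption, not anything taken from the existing lock code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker: lock is pinned until end of transaction. */
#define DEMO_HOLD_TILL_XACT_END (-1)

typedef struct SentinelLock
{
    int nLocks;         /* usage count, or DEMO_HOLD_TILL_XACT_END */
} SentinelLock;

/* Acquire; the caller says whether the lock is meant to be kept to xact end. */
static void
sentinel_acquire(SentinelLock *lock, bool hold_till_end)
{
    if (lock->nLocks == DEMO_HOLD_TILL_XACT_END)
        return;                         /* already pinned: never count */
    if (hold_till_end)
        lock->nLocks = DEMO_HOLD_TILL_XACT_END;
    else
        lock->nLocks++;
}

/* Release one acquisition; returns true if the lock is now fully released. */
static bool
sentinel_release(SentinelLock *lock)
{
    if (lock->nLocks == DEMO_HOLD_TILL_XACT_END)
        return false;                   /* pinned: ignore until xact end */
    assert(lock->nLocks > 0);
    lock->nLocks--;
    return lock->nLocks == 0;
}

/* End of transaction: everything goes, pinned or not. */
static void
sentinel_release_all(SentinelLock *lock)
{
    lock->nLocks = 0;
}
```

Once a lock is pinned, repeated acquisitions no longer touch the counter, so it cannot overflow however many times the command is repeated. The cost is that every caller has to declare its intent up front.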

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: Coping with nLocks overflow

Gregory Stark <stark@enterprisedb.com> writes:

> Alternatively perhaps we could indicate when taking a lock that we intend to
> hold the lock until the end of the transaction. In that case we don't need the
> usage counter at all and could just mark it with a special value which we
> never increment or decrement, and just wait until we release all locks at the end
> of transaction?

I considered that, and also considered installing an overflow flag (the
idea being that once nLocks overflows we'd just insist on holding the
lock till transaction end). But the point isn't clear ... I mean, I
think no one but me even believes anymore in the concept of keeping the
code base minimally safe for INT64_IS_BUSTED machines ;-). Given the
risk of creating a bug or masking future lock-acquisition bugs, I
thought that adding complexity here wasn't warranted.
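
For comparison, the overflow-flag alternative mentioned above could be sketched roughly as follows. This is an illustration of the idea, not code that was proposed or committed; FlaggedLock and the flagged_* functions are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

typedef struct FlaggedLock
{
    int  nLocks;        /* usage count, stops at INT_MAX */
    bool overflowed;    /* set once the count can no longer be trusted */
} FlaggedLock;

/* Acquire; at INT_MAX, set the flag instead of incrementing. */
static void
flagged_acquire(FlaggedLock *lock)
{
    if (lock->overflowed)
        return;
    if (lock->nLocks == INT_MAX)
    {
        lock->overflowed = true;        /* hold till transaction end */
        return;
    }
    lock->nLocks++;
}

/* Release one acquisition; returns true if the lock is now fully released. */
static bool
flagged_release(FlaggedLock *lock)
{
    if (lock->overflowed)
        return false;                   /* count is unreliable: keep lock */
    assert(lock->nLocks > 0);
    lock->nLocks--;
    return lock->nLocks == 0;
}

/* End of transaction: drop the lock and clear the flag. */
static void
flagged_release_all(FlaggedLock *lock)
{
    lock->nLocks = 0;
    lock->overflowed = false;
}
```

Even in this small form the extra state adds branches to both acquire and release, which is the kind of added complexity the message argues against.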

regards, tom lane