Silent deadlock possible in current sources

Started by Tom Lane, over 25 years ago (1 message, hackers)

Tom Lane
tgl@sss.pgh.pa.us

Observe:

heap_update()
{
    /* lock page containing old copy of tuple */
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

    ...

    /* Find buffer for new tuple */
    if ((unsigned) MAXALIGN(newtup->t_len) <= PageGetFreeSpace((Page) dp))
        newbuf = buffer;
    else
        newbuf = RelationGetBufferForTuple(relation, newtup->t_len, buffer);

    ...

    if (newbuf != buffer)
    {
        LockBuffer(newbuf, BUFFER_LOCK_UNLOCK);
        WriteBuffer(newbuf);
    }
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    WriteBuffer(buffer);
}

RelationGetBufferForTuple(Relation relation, Size len, Buffer Ubuf)
{
    if (!relation->rd_myxactonly)
        LockPage(relation, 0, ExclusiveLock);

    ...

    buffer = ReadBuffer(relation, lastblock - 1);

    if (buffer != Ubuf)
        LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

    ...

    if (!relation->rd_myxactonly)
        UnlockPage(relation, 0, ExclusiveLock);

    ...
}

In other words, if heap_update can't fit the new copy of the tuple on
the same page it's already on, then *while still holding the exclusive
lock on the old tuple's buffer*, it calls RelationGetBufferForTuple
which tries to grab the relation's extension lock and then the exclusive
lock on the last page of the relation. The code is smart enough to deal
with the case that the old tuple is in the last page of the relation
(in which case we already have the exclusive buffer lock on that page,
and mustn't ask for it twice).

BUT: suppose two different processes are trying to do this at about
the same time. Process A is updating a tuple in the last page of the
relation and Process B is updating a tuple in some earlier page. Both
are able to get their exclusive buffer locks on their old tuples' pages.
Now, suppose that Process B is a little bit ahead and so it is first
to reach the LockPage operation. It gets the relation extension lock.
Now it wants to get an exclusive buffer lock on the last page of the
relation. It can't, because Process A already has that lock --- but
now Process A will be waiting to get the relation extension lock that
Process B has.
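The interleaving above can be replayed in a small single-threaded model. This is only a sketch: ToyLock, toy_try_lock and replay_interleaving are hypothetical names invented for illustration and are not PostgreSQL code. A lock here is just an owner field, and "blocking" is modelled as a failed try-lock.

```c
#include <stdio.h>

/* Toy lock: owner is 0 when free, otherwise the id of the holder.
 * Illustrative only; these are not PostgreSQL data structures. */
typedef struct
{
    const char *name;
    int         owner;
} ToyLock;

/* Returns 1 if pid now holds the lock, 0 if pid would have to wait. */
static int
toy_try_lock(ToyLock *lock, int pid)
{
    if (lock->owner != 0 && lock->owner != pid)
    {
        printf("process %d waits for %s (held by process %d)\n",
               pid, lock->name, lock->owner);
        return 0;
    }
    lock->owner = pid;
    return 1;
}

/* Replays the described interleaving.
 * Returns 1 if A and B end up waiting on each other, else 0. */
static int
replay_interleaving(void)
{
    enum { PROC_A = 1, PROC_B = 2 };
    ToyLock last_page = {"buffer lock on last page", 0};
    ToyLock earlier_page = {"buffer lock on earlier page", 0};
    ToyLock extension = {"relation extension lock", 0};
    int     a_blocked;
    int     b_blocked;

    /* Each process locks the page holding its old tuple. */
    (void) toy_try_lock(&last_page, PROC_A);
    (void) toy_try_lock(&earlier_page, PROC_B);

    /* B reaches LockPage first and gets the extension lock. */
    (void) toy_try_lock(&extension, PROC_B);

    /* B now wants the last page, which A still holds. */
    b_blocked = !toy_try_lock(&last_page, PROC_B);

    /* A now wants the extension lock, which B still holds. */
    a_blocked = !toy_try_lock(&extension, PROC_A);

    return a_blocked && b_blocked;
}
```

Calling replay_interleaving() from a main() returns 1 in this model: each process holds exactly the lock the other one is waiting for.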

This deadlock is not detected or reported because the buffer lock
mechanism doesn't have any deadlock detection capability (buffer locks
aren't done via the lock manager, which might be considered a bug in
itself). Instead, the two processes just silently lock up, and
thereafter so will all other processes that try to update or insert
in that relation.
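For comparison, deadlock detection is commonly described as a search for a cycle in a "wait-for" graph. The sketch below is a generic illustration of that idea, not the lock manager's actual algorithm; the function name has_deadlock and the assumption that each blocked process waits on exactly one other process are simplifications made here.

```c
/* waits_for[i] is the index of the process that process i is blocked on,
 * or -1 if process i is not waiting.  Simplifying assumption: a process
 * waits on at most one other process.
 *
 * Returns 1 if following the chain from `start` leads back to `start`,
 * i.e. `start` is part of a wait-for cycle; otherwise returns 0. */
static int
has_deadlock(const int waits_for[], int nprocs, int start)
{
    int cur = start;
    int steps;

    /* A simple cycle through `start` has at most nprocs edges. */
    for (steps = 0; steps < nprocs; steps++)
    {
        cur = waits_for[cur];
        if (cur < 0)
            return 0;           /* chain ends at a running process */
        if (cur == start)
            return 1;           /* came back around: cycle */
    }
    return 0;
}
```

With process 0 standing for A and process 1 for B, the scenario above is waits_for = {1, 0}, which has_deadlock reports as a cycle. Buffer locks never feed a check of this kind, since they are not taken through the lock manager.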

This bug did not exist in 7.0.* because heap_update used to release
its exclusive lock on the source page while extending the relation:

    /*
     * New item won't fit on same page as old item, have to look for a
     * new place to put it. Note that we have to unlock current buffer
     * context - not good but RelationPutHeapTupleAtEnd uses extend
     * lock.
     */
    LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
    RelationPutHeapTupleAtEnd(relation, newtup);
    LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);

I'm inclined to think that that is the correct solution and the new
approach is simply broken. But, not knowing what Vadim had in mind
while making this change, I'm going to leave it to him to fix this.
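To illustrate why the 7.0.* ordering avoids this particular lockup, the same interleaving can be replayed with each process releasing its source-page lock before going after the extension lock. As before this is a toy model under stated assumptions: ToyLock, toy_try_lock, toy_unlock and replay_release_first are hypothetical names, and it simplifies what RelationPutHeapTupleAtEnd does to "take the extension lock, then lock the last page".

```c
/* Toy lock: owner is 0 when free, otherwise the id of the holder.
 * Illustrative only; not a PostgreSQL data structure. */
typedef struct
{
    int owner;
} ToyLock;

/* Returns 1 if pid now holds the lock, 0 if pid would have to wait. */
static int
toy_try_lock(ToyLock *lock, int pid)
{
    if (lock->owner != 0 && lock->owner != pid)
        return 0;
    lock->owner = pid;
    return 1;
}

/* Releases the lock if pid is the holder. */
static void
toy_unlock(ToyLock *lock, int pid)
{
    if (lock->owner == pid)
        lock->owner = 0;
}

/* Replays the interleaving with the release-first ordering.
 * Returns 1 if A had to wait but both processes finished, else 0. */
static int
replay_release_first(void)
{
    enum { PROC_A = 1, PROC_B = 2 };
    ToyLock last_page = {0};
    ToyLock earlier_page = {0};
    ToyLock extension = {0};
    int     b_done;
    int     a_waited;
    int     a_done;

    /* Each process locks the page holding its old tuple. */
    (void) toy_try_lock(&last_page, PROC_A);
    (void) toy_try_lock(&earlier_page, PROC_B);

    /* Neither new tuple fits, so each unlocks its source page first. */
    toy_unlock(&earlier_page, PROC_B);
    toy_unlock(&last_page, PROC_A);

    /* B is first to the extension lock; the last page is now free. */
    b_done = toy_try_lock(&extension, PROC_B) &&
             toy_try_lock(&last_page, PROC_B);

    /* A has to wait for the extension lock, but holds nothing B needs. */
    a_waited = !toy_try_lock(&extension, PROC_A);

    /* B stores its tuple and releases both locks. */
    toy_unlock(&last_page, PROC_B);
    toy_unlock(&extension, PROC_B);

    /* A's wait is over and it proceeds the same way. */
    a_done = toy_try_lock(&extension, PROC_A) &&
             toy_try_lock(&last_page, PROC_A);

    return b_done && a_waited && a_done;
}
```

In this model A still waits, but only on a lock whose holder can make progress, so the wait ends. The cost, which the original comment already calls "not good", is that the source page is unlocked for a while in the middle of the update.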

Although this specific lockup mode didn't exist in 7.0.*, it does
suggest a possible cause of the deadlocks-with-no-deadlock-report
behavior that a couple of people have reported with 7.0: maybe there
is another logic path that allows a deadlock involving two buffer locks,
or a buffer lock and a normal lock. I'm on the warpath now ...

regards, tom lane

From: Mario Weilguni <mweilguni@sime.com>
To: Postgres Hacker Lister <pgsql-hackers@postgresql.org>
Subject: Patch for TNS services
Date: Wed, 30 Aug 2000 20:04:33 +0200

Last week I created a patch for the Postgres client side libraries to
allow something like a (not so mighty) form of Oracle TNS, but nobody
showed any interest. Currently, the patch is not perfect yet, but works
fine for us. I want to avoid improving the patch if there is no interest
in it, so if you think it might be a worthy improvement please drop me a
line.

It works like this:
The patch lets you supply another parameter in the Postgres connect
string, called "service". So, instead of having a connect string (e.g.
in PHP) like
"dbname=foo host=bar port=5433 user=foouser password=barpass"
the string would be
"service=stupid_name_here"
or more often
"service=stupid_name_here user=foouser password=barpass"

There's a config file /etc/pg_service.conf, having an entry like:
[stupid_name_here]
dbname=foo
host=bar
port=5433
....

The advantage is that you can switch to a different database host,
database, port or whatever without having to touch the scripts or
applications. We're currently in the process of migrating all of our PHP
and Python scripts from localhost, port 5433 to another machine, port
5432, and it's not something I ever want to do again: I had to change
around 100 files and I'm still not sure if I've missed one.

The patch is client-side only, around 100 lines, needs no changes to the
backend and is compatible with all applications supplying a connection
string (that is, not using PQsetdbLogin).

--
Why is it always Segmentation's fault?

[Attachment: postgres-7.0-services.patch]
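The service lookup described in the message above could be sketched roughly as follows. This is not the attached patch: service_lookup is a hypothetical helper written for illustration, and it reads the configuration from an in-memory string rather than from /etc/pg_service.conf so that it can be tried without touching the filesystem. Treating lines that start with '#' as comments is an assumption of the sketch.

```c
#include <ctype.h>
#include <string.h>

/* Looks up `key` inside section [service] of INI-style text `conf`.
 * On success copies the value into `out` (truncated to outlen - 1
 * characters, always NUL-terminated) and returns 1; otherwise returns 0. */
static int
service_lookup(const char *conf, const char *service, const char *key,
               char *out, size_t outlen)
{
    size_t slen = strlen(service);
    size_t klen = strlen(key);
    int    in_section = 0;

    if (outlen == 0)
        return 0;

    while (*conf)
    {
        /* Determine the extent of the current line, then advance. */
        const char *eol = strchr(conf, '\n');
        size_t      len = eol ? (size_t) (eol - conf) : strlen(conf);
        const char *line = conf;

        conf += len + (eol ? 1 : 0);

        /* Strip leading and trailing whitespace. */
        while (len > 0 && isspace((unsigned char) *line))
        {
            line++;
            len--;
        }
        while (len > 0 && isspace((unsigned char) line[len - 1]))
            len--;

        /* Skip empty lines and comments. */
        if (len == 0 || line[0] == '#')
            continue;

        /* A "[name]" line starts a new section. */
        if (line[0] == '[')
        {
            in_section = (len == slen + 2 &&
                          strncmp(line + 1, service, slen) == 0 &&
                          line[slen + 1] == ']');
            continue;
        }

        /* Inside the wanted section, match "key=value". */
        if (in_section && len > klen &&
            strncmp(line, key, klen) == 0 && line[klen] == '=')
        {
            size_t vlen = len - klen - 1;

            if (vlen >= outlen)
                vlen = outlen - 1;
            memcpy(out, line + klen + 1, vlen);
            out[vlen] = '\0';
            return 1;
        }
    }
    return 0;
}
```

Given the example entry from the message, looking up "host" in service "stupid_name_here" would yield "bar", and a caller could fill in dbname, host and port that way before applying any values given explicitly in the connect string.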