Missing extension locks in the nbtree code

Started by Andres Freund, almost 11 years ago, 2 messages, hackers
#1 Andres Freund
andres@anarazel.de

Hi,

There have recently been too many reports of "unexpected data beyond
EOF in block %u of relation %s" for me to still think that it's likely
to be caused by a kernel bug. It has now been reproduced on at least
somewhat recent Linux and FreeBSD versions.

So I started looking around for causes. Not for the first time.

One probably harmless thing is that _bt_getroot() creates the initial
root page without an extension lock. That's not pretty, but it should
happen on the first write and be safe due to the content lock on the
metapage. ISTM we should still not do that, but it's probably not the
explanation.

The fix is just to change
    if (fd == -1 || XLByteInSeg(change->lsn, curOpenSegNo))
into
    if (fd == -1 || !XLByteInSeg(change->lsn, curOpenSegNo))

The bug doesn't have any correctness implications afaics, just
performance. We read all the spilled files till the end, so even a change
spilled to the wrong segment gets picked up.
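
To illustrate why the wrong condition only costs performance, here is a
small self-contained model, not the actual PostgreSQL code. Everything
prefixed toy_/Toy is hypothetical, the 16MB segment size is an assumption,
and toy_lsn_in_seg() merely stands in for XLByteInSeg(). The model counts
how often a spill file would be (re)opened and how many changes would land
in a file belonging to a different segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the WAL position and segment number types. */
typedef uint64_t ToyLsn;
typedef uint64_t ToySegNo;

/* Assumed segment size for this sketch only. */
#define TOY_SEG_SIZE ((ToyLsn) 16 * 1024 * 1024)

/* Segment that contains the given position. */
static ToySegNo
toy_lsn_to_seg(ToyLsn lsn)
{
    return lsn / TOY_SEG_SIZE;
}

/* Stand-in for XLByteInSeg(): is lsn inside segment segno? */
static bool
toy_lsn_in_seg(ToyLsn lsn, ToySegNo segno)
{
    return toy_lsn_to_seg(lsn) == segno;
}

typedef struct ToySpillStats
{
    int opens;      /* how many times a spill file was (re)opened */
    int misplaced;  /* changes written to another segment's file */
} ToySpillStats;

/*
 * Walk the changes in order and decide for each one whether to switch
 * files. negate_check = false models the original condition,
 * negate_check = true models the condition with the added '!'.
 */
static ToySpillStats
toy_spill(const ToyLsn *lsns, size_t n, bool negate_check)
{
    ToySpillStats stats = {0, 0};
    bool have_file = false;     /* models fd != -1 */
    ToySegNo open_seg = 0;      /* models curOpenSegNo */

    assert(lsns != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        bool in_seg = toy_lsn_in_seg(lsns[i], open_seg);
        bool reopen = !have_file || (negate_check ? !in_seg : in_seg);

        if (reopen)
        {
            /* Switch to the file for the segment this change starts in. */
            open_seg = toy_lsn_to_seg(lsns[i]);
            have_file = true;
            stats.opens++;
        }

        /* The change is appended to whatever file is open now. */
        if (!toy_lsn_in_seg(lsns[i], open_seg))
            stats.misplaced++;
    }
    return stats;
}
```

For three changes in one segment followed by two in the next, the model
gives 3 opens and 2 misplaced changes with the original condition, versus
2 opens and 0 misplaced changes with the negated one. Nothing is lost in
either case, since the misplaced changes are still written to some file.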

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: Missing extension locks in the nbtree code

On 2015-07-06 23:21:12 +0200, Andres Freund wrote:

> There've recently been more and more reports of "unexpected data beyond
> EOF in block %u of relation %s" for me to think that it's likely to be
> caused by a kernel bug. It's now been reproduced at least on somewhat
> recent linux and freebsd versions.
>
> So I started looking around for causes. Not for the first time.
>
> One, probably harmless thing is that _bt_getroot() creates the initial
> root page without an extension lock. That's not pretty, but it should
> happen on the first write and be safe due to the content lock on the
> metapage. ISTM we should still not do that, but it's probably not the
> explanation.

Uh, this was a mixup, I didn't want to send this email yet. The below
obviously is from a different thread.

> The fix is just to change
> if (fd == -1 || XLByteInSeg(change->lsn, curOpenSegNo))
> into
> if (fd == -1 || !XLByteInSeg(change->lsn, curOpenSegNo))
>
> the bug doesn't have any correctness implications afaics, just
> performance. We read all the spilled files till the end, so even change
> spilled to the wrong segment gets picked up.
