CreateFakeRelcacheEntry versus Hot Standby

Started by Tom Lanealmost 16 years ago3 messages
#1Tom Lane
tgl@sss.pgh.pa.us

I was rather surprised to find this code still present:

/*
* We set up the lockRelId in case anything tries to lock the dummy
* relation. Note that this is fairly bogus since relNode may be
* different from the relation's OID. It shouldn't really matter though,
* since we are presumably running by ourselves and can't have any lock
* conflicts ...
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
rel->rd_lockInfo.lockRelId.relId = rnode.relNode;

Seems quite unsafe in HS.

regards, tom lane

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#1)
Re: CreateFakeRelcacheEntry versus Hot Standby

Tom Lane wrote:

I was rather surprised to find this code still present:

/*
* We set up the lockRelId in case anything tries to lock the dummy
* relation. Note that this is fairly bogus since relNode may be
* different from the relation's OID. It shouldn't really matter though,
* since we are presumably running by ourselves and can't have any lock
* conflicts ...
*/
rel->rd_lockInfo.lockRelId.dbId = rnode.dbNode;
rel->rd_lockInfo.lockRelId.relId = rnode.relNode;

Seems quite unsafe in HS.

Good catch.

I started looking at the callers of CreateFakeRelcacheEntry to see if
any of them actually take a lock on the Relation. It seems not, so maybe
we should just set those to InvalidOid, and add an
Assert(OidIsValid(dbId) && OidIsValid(relId)) to LockAcquire() to ensure
that a fake relcache entry is not used to grab locks in the future either.

However, I spotted another bug. In ginContinueSplit(), we call
CreateFakeRelcacheEntry(), and pass the fake relcache entry to
prepareEntryScan() or prepareDataScan(). Those functions store the
passed-in Relation ptr in the passed in GinBtreeData struct. After the
call, we free the fake relcache entry, but the pointer is still present
in the GinBtreeData. The findParents() call later in the function uses
it, and will thus access free'd memory.

A trivial fix is to delay freeing the fake relcache entry in
ginContinueSplit(). But this once again shows that our WAL redo
functions are not as well tested as they should be, and the WAL redo
cleanup functions in particularly.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#2)
Re: CreateFakeRelcacheEntry versus Hot Standby

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

I started looking at the callers of CreateFakeRelcacheEntry to see if
any of them actually take a lock on the Relation. It seems not, so maybe
we should just set those to InvalidOid, and add an
Assert(OidIsValid(dbId) && OidIsValid(relId)) to LockAcquire() to ensure
that a fake relcache entry is not used to grab locks in the future either.

+1, I was thinking that would be a reasonable thing to try. (You can't
assert dbID != 0, but the relid test should be sufficient.)

However, I spotted another bug. In ginContinueSplit(), we call
CreateFakeRelcacheEntry(), and pass the fake relcache entry to
prepareEntryScan() or prepareDataScan(). Those functions store the
passed-in Relation ptr in the passed in GinBtreeData struct. After the
call, we free the fake relcache entry, but the pointer is still present
in the GinBtreeData. The findParents() call later in the function uses
it, and will thus access free'd memory.

A trivial fix is to delay freeing the fake relcache entry in
ginContinueSplit(). But this once again shows that our WAL redo
functions are not as well tested as they should be, and the WAL redo
cleanup functions in particularly.

Well, HS will at least help us shake out the redo functions. Cleanup
is a hard case to test ...

regards, tom lane