BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Started by PG Bug reporting formalmost 3 years ago33 messagesbugs

noreply@postgresql.org

almost 3 years ago

The following bug has been logged on the website:

Bug reference: 17994
Logged by: Alexander Lakhin
Email address: exclusion@gmail.com
PostgreSQL version: 16beta1
Operating system: Ubuntu 22.04
Description:

The following script:
for ((n=1;n<=10;n++)); do
echo "ITERATION $n"

createdb test
numclients=50
for ((c=1;c<=$numclients;c++)); do
cat << EOF | psql test -o /dev/null &
CREATE PUBLICATION testpub_$c FOR ALL TABLES;
CREATE TYPE rec_$c AS (f1 text, f2 text);
CREATE TEMP TABLE tbl_$c (f rec_$c);
INSERT INTO tbl_$c VALUES ((repeat('1234567890', 100000), 'a'));

UPDATE tbl_$c SET f.f2 = 'b';
EOF
done
wait
dropdb test

grep 'TRAP:' server.log && break;
done

invokes errors or a server crash:
ITERATION 1
ITERATION 2
ERROR: unsupported byval length: 32639
ITERATION 3
ITERATION 4
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost
WARNING: terminating connection because of crash of another server
process
DETAIL: The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
dropdb: error: database removal failed: server closed the connection
unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
TRAP: failed Assert("(thisatt->attalign) == TYPALIGN_SHORT"), File:
"heaptuple.c", Line: 1312, PID: 1737623

The stack trace of the crash:
Core was generated by `postgres: law test [local] UPDATE
'.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/1737623' in core file too small.
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140071061268416) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0 __pthread_kill_implementation (no_tid=0, signo=6,
threadid=140071061268416) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140071061268416) at
./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140071061268416, signo=signo@entry=6) at
./nptl/pthread_kill.c:89
#3 0x00007f64d7c27476 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#4 0x00007f64d7c0d7f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x0000555ede8dc3b3 in ExceptionalCondition (
conditionName=0x555edeb49700 "(thisatt->attalign) == TYPALIGN_SHORT",
fileName=0x555edeb49020 "heaptuple.c", lineNumber=1312) at assert.c:66
#6 0x0000555edc7b76d7 in heap_deform_tuple (tuple=0x7ffe49cfe570,
tupleDesc=0x7f64ca4562d8,
values=0x62500005ad40, isnull=0x62500005b130) at heaptuple.c:1312
#7 0x0000555edd389d0b in ExecEvalFieldStoreDeForm (state=0x62500005b230,
op=0x62500005b338,
econtext=0x62500005ac20) at execExprInterp.c:3210
#8 0x0000555edd3729b4 in ExecInterpExpr (state=0x62500005b230,
econtext=0x62500005ac20, isnull=0x7ffe49cfe900)
at execExprInterp.c:1446
#9 0x0000555edd3791f2 in ExecInterpExprStillValid (state=0x62500005b230,
econtext=0x62500005ac20,
isNull=0x7ffe49cfe900) at execExprInterp.c:1870
#10 0x0000555edd406e84 in ExecEvalExprSwitchContext (state=0x62500005b230,
econtext=0x62500005ac20,
isNull=0x7ffe49cfe900) at ../../../src/include/executor/executor.h:355
#11 0x0000555edd4070ab in ExecProject (projInfo=0x62500005b228) at
../../../src/include/executor/executor.h:389
#12 0x0000555edd40883e in ExecScan (node=0x62500005ab10,
accessMtd=0x555edd5843cf <SeqNext>,
recheckMtd=0x555edd5847cd <SeqRecheck>) at execScan.c:237
#13 0x0000555edd58484e in ExecSeqScan (pstate=0x62500005ab10) at
nodeSeqscan.c:112
#14 0x0000555edd3f5a71 in ExecProcNodeFirst (node=0x62500005ab10) at
execProcnode.c:464
#15 0x0000555edd5495a5 in ExecProcNode (node=0x62500005ab10) at
../../../src/include/executor/executor.h:273
#16 0x0000555edd5674b3 in ExecModifyTable (pstate=0x62500005a490) at
nodeModifyTable.c:3616
#17 0x0000555edd3f5a71 in ExecProcNodeFirst (node=0x62500005a490) at
execProcnode.c:464
#18 0x0000555edd3af500 in ExecProcNode (node=0x62500005a490) at
../../../src/include/executor/executor.h:273
#19 0x0000555edd3be25a in ExecutePlan (estate=0x62500005a218,
planstate=0x62500005a490,
use_parallel_mode=false, operation=CMD_UPDATE, sendTuples=false,
numberTuples=0,
direction=ForwardScanDirection, dest=0x625000072770, execute_once=true)
at execMain.c:1670
#20 0x0000555edd3b17f2 in standard_ExecutorRun (queryDesc=0x619000001598,
direction=ForwardScanDirection,
count=0, execute_once=true) at execMain.c:365
#21 0x0000555edd3b0e37 in ExecutorRun (queryDesc=0x619000001598,
direction=ForwardScanDirection, count=0,
execute_once=true) at execMain.c:309
#22 0x0000555ede0d688a in ProcessQuery (plan=0x625000006c28,
sourceText=0x625000005218 "UPDATE tbl_14 SET f.f2 = 'b';", params=0x0,
queryEnv=0x0, dest=0x625000072770,
qc=0x7ffe49cff1c0) at pquery.c:160
#23 0x0000555ede0deb02 in PortalRunMulti (portal=0x625000025a18,
isTopLevel=true, setHoldSnapshot=false,
dest=0x625000072770, altdest=0x625000072770, qc=0x7ffe49cff1c0) at
pquery.c:1277
#24 0x0000555ede0dba90 in PortalRun (portal=0x625000025a18,
count=9223372036854775807, isTopLevel=true,
run_once=true, dest=0x625000072770, altdest=0x625000072770,
qc=0x7ffe49cff1c0) at pquery.c:791
#25 0x0000555ede0c592d in exec_simple_query (query_string=0x625000005218
"UPDATE tbl_14 SET f.f2 = 'b';")
at postgres.c:1274
#26 0x0000555ede0d347a in PostgresMain (dbname=0x62900001b358 "test",
username=0x6250000020f8 "law")
at postgres.c:4632
#27 0x0000555eddcf401f in BackendRun (port=0x614000016040) at
postmaster.c:4461
#28 0x0000555eddcf2723 in BackendStartup (port=0x614000016040) at
postmaster.c:4189
#29 0x0000555eddce84dd in ServerLoop () at postmaster.c:1779
#30 0x0000555eddce69fe in PostmasterMain (argc=3, argv=0x6030000006a0) at
postmaster.c:1463
#31 0x0000555edd67850e in main (argc=3, argv=0x6030000006a0) at main.c:198

(gdb) frame 6
#6 0x0000555edc7b76d7 in heap_deform_tuple (tuple=0x7ffe49cfe570,
tupleDesc=0x7f64ca4562d8,
values=0x62500005ad40, isnull=0x62500005b130) at heaptuple.c:1312
1312 off = att_align_nominal(off,
thisatt->attalign);
(gdb) p thisatt->attalign
$1 = 0 '\000'

(gdb) frame 7
#7 0x0000555edd389d0b in ExecEvalFieldStoreDeForm (state=0x62500005b230,
op=0x62500005b338,
econtext=0x62500005ac20) at execExprInterp.c:3210
3210 heap_deform_tuple(&tmptup, tupDesc,
(gdb) p *tupDesc
$2 = {natts = 18406, tdtypeid = 1952409456, tdtypmod = 1953718639,
tdrefcount = 859320671, constr = 0x3338,
attrs = 0x7f64ca4562f0}

With some debug logging added I see (note the natts value):
2023-06-24 15:23:12.488 MSK [1741232] LOG: get_cached_rowtype() after
lookup_type_cache| type_id: 16389, typentry->tupDesc: 0x7ff12f045640,
typentry->tupDesc->natts: 2
2023-06-24 15:23:12.488 MSK [1741232] STATEMENT: UPDATE tbl_5 SET f.f2 =
'b';
2023-06-24 15:23:12.488 MSK [1741293] LOG: statement: CREATE TYPE rec_23 AS
(f1 text, f2 text);
2023-06-24 15:23:12.488 MSK [1741232] LOG: ExecEvalFieldStoreDeForm()
before heap_deform_tuple| tupDesc: 0x7ff12f045640, tupDesc->natts: 16427
2023-06-24 15:23:12.488 MSK [1741232] STATEMENT: UPDATE tbl_5 SET f.f2 =
'b';
TRAP: failed Assert("(thisatt->attalign) == TYPALIGN_SHORT"), File:
"heaptuple.c", Line: 1312, PID: 1741232

So it looks like *tupDesc memory was overridden due to relcache invalidation
during ExecEvalFieldStoreDeForm() execution after the get_cached_rowtype()
call.

The pg_usleep(1000) call added inside ExecEvalFieldStoreDeForm() after
get_cached_rowtype() greatly increases the probability of the crash.

Reproduced on REL_11_STABLE (since 96e38fa5e) .. master.

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: PG Bug reporting form (#1)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

PG Bug reporting form <noreply@postgresql.org> writes:

The following script:
...
invokes errors or a server crash:

Thanks for the report! The good news is that the case shown here
is easily fixed, as attached. get_cached_rowtype() explicitly
specifies that its result doesn't hold good across database
access if you don't increment the tupdesc's refcount. Most of
the callers get this right, but ExecEvalFieldStoreDeForm() is
doing things in the wrong order by fetching the tupdesc before
detoasting the input datum. We could fix that by temporarily
incrementing the refcount, but I don't see any reason we can't
just reorder the steps to make it safe.

The bad news is that while investigating this I came across
another hazard that seems much messier to fix. To wit, if
you have a tuple with "missing" pass-by-ref columns, then
some of the columns output by heap_deform_tuple() will
contain pointers into the tupdesc (cf. getmissingattr()).
That's not sufficient lifespan in the scenario we're dealing
with here: if the tupdesc gets trashed anytime before the
eventual ExecEvalFieldStoreForm(), we'll have garbage in the
result tuple.

Worse, I fear there may be many other places with latent bugs of the
same ilk. Before the attmissingval code was added, one could assume
that the result of heap_deform_tuple would hold good as long as you
had a pin on the source tuple. But now it depends on metadata as
well, and I don't think we have a coherent story about that.

Any thoughts what to do? I considered making getmissingattr
apply datumCopy, but that would probably lead to unpleasant
memory leaks.

regards, tom lane

Andres Freund

andres@anarazel.de

almost 3 years ago

In reply to: Tom Lane (#2)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Hi,

On 2023-06-28 14:49:34 -0400, Tom Lane wrote:

The bad news is that while investigating this I came across
another hazard that seems much messier to fix. To wit, if
you have a tuple with "missing" pass-by-ref columns, then
some of the columns output by heap_deform_tuple() will
contain pointers into the tupdesc (cf. getmissingattr()).
That's not sufficient lifespan in the scenario we're dealing
with here: if the tupdesc gets trashed anytime before the
eventual ExecEvalFieldStoreForm(), we'll have garbage in the
result tuple.

What are the scenarios that could lead to the tupledesc being released before
ExecEvalFieldStoreForm()? I guess a function call that does an ALTER TABLE
might do the trick? But shouldn't such cases be prohibited by
CheckTableNotInUse()? If other sessions caused the tupledesc to be changed,
we should already hang onto the old definition via the
RememberToFreeTupleDescAtEOX() mechanism?

What about erroring out if the get_cached_rowtype() in
ExecEvalFieldStoreForm() indicates the tupledesc changed? Hm, I guess that
could cause spurious errors, our mechanism for determining whether tupledescs
changed is pretty coarse.

Worse, I fear there may be many other places with latent bugs of the
same ilk. Before the attmissingval code was added, one could assume
that the result of heap_deform_tuple would hold good as long as you
had a pin on the source tuple. But now it depends on metadata as
well, and I don't think we have a coherent story about that.

Any thoughts what to do? I considered making getmissingattr
apply datumCopy, but that would probably lead to unpleasant
memory leaks.

Ugh. This is pretty nasty.

The only real thing I can think of is to use refcounted tupledescs more
widely. Part of the problem would be dealing with releasing the refcounts,
another what to do about non-refcounted descs. WRT the former, I guess we
could add a variant of ResourceOwnerRememberTupleDesc() that doesn't warn on
commit if there are unreleased references.

Greetings,

Andres Freund

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andres Freund (#3)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andres Freund <andres@anarazel.de> writes:

On 2023-06-28 14:49:34 -0400, Tom Lane wrote:

That's not sufficient lifespan in the scenario we're dealing
with here: if the tupdesc gets trashed anytime before the
eventual ExecEvalFieldStoreForm(), we'll have garbage in the
result tuple.

What are the scenarios that could lead to the tupledesc being released before
ExecEvalFieldStoreForm()? I guess a function call that does an ALTER TABLE
might do the trick? But shouldn't such cases be prohibited by
CheckTableNotInUse()?

Any old cache flush could do it; you don't need a local trigger event.
(Alexander's original test case uses CREATE PUBLICATION to trigger
the cache flush, but that's just an easy way to make it more-or-less
deterministic.)

If other sessions caused the tupledesc to be changed,
we should already hang onto the old definition via the
RememberToFreeTupleDescAtEOX() mechanism?

I believe the tupdesc in question is actually in the typcache,
which doesn't have anything like RememberToFreeTupleDescAtEOX
(which is a horrid hack anyway if you ask me).

We could probably make things better for this specific case by
teaching the typcache not to replace a cached tupdesc unless its
contents actually change. But that just makes it harder to get
to a bug instance; it's not a cure-all.

regards, tom lane

Andres Freund

andres@anarazel.de

almost 3 years ago

In reply to: Tom Lane (#4)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Hi,

On 2023-06-28 15:52:28 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

If other sessions caused the tupledesc to be changed,
we should already hang onto the old definition via the
RememberToFreeTupleDescAtEOX() mechanism?

I believe the tupdesc in question is actually in the typcache,
which doesn't have anything like RememberToFreeTupleDescAtEOX
(which is a horrid hack anyway if you ask me).

It's in the typecache, but that just uses the relcache's tupledesc for
non-record composites. But looks like it doesn't suffice, because
TypeCacheRelCallback() releases the refcount the typecache held, regardless of
the tupledesc having changed meaningfully or not.

So even if there can't have been "important" changes to the tupledesc due to
locking, we end up with the newer tupledesc on a second lookup...

I agree that the RememberToFreeTupleDescAtEOX thing is a ugly hack, but I
don't think it's easy to come up with something good...

We could probably make things better for this specific case by
teaching the typcache not to replace a cached tupdesc unless its
contents actually change. But that just makes it harder to get
to a bug instance; it's not a cure-all.

Yea :(.

Greetings,

Andres Freund

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Andres Freund (#5)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-28 We 16:54, Andres Freund wrote:

Hi,

On 2023-06-28 15:52:28 -0400, Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

If other sessions caused the tupledesc to be changed,
we should already hang onto the old definition via the
RememberToFreeTupleDescAtEOX() mechanism?

I believe the tupdesc in question is actually in the typcache,
which doesn't have anything like RememberToFreeTupleDescAtEOX
(which is a horrid hack anyway if you ask me).

It's in the typecache, but that just uses the relcache's tupledesc for
non-record composites. But looks like it doesn't suffice, because
TypeCacheRelCallback() releases the refcount the typecache held, regardless of
the tupledesc having changed meaningfully or not.

So even if there can't have been "important" changes to the tupledesc due to
locking, we end up with the newer tupledesc on a second lookup...

I agree that the RememberToFreeTupleDescAtEOX thing is a ugly hack, but I
don't think it's easy to come up with something good...

We could probably make things better for this specific case by
teaching the typcache not to replace a cached tupdesc unless its
contents actually change. But that just makes it harder to get
to a bug instance; it's not a cure-all.

Yea :(.

:-(

I thought about whether the datumCopy() idea might be manageable, but
getmissingattr() is inlined by heap_getttr() which has distressingly
large number of call sites, including in third party code.

Could we maybe cache missing values elsewhere in a way that's less
volatile? That's an extremely vague idea, just thinking out loud.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Andrew Dunstan (#6)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-28 We 17:57, Andrew Dunstan wrote:

On 2023-06-28 We 16:54, Andres Freund wrote:

Hi,

On 2023-06-28 15:52:28 -0400, Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

If other sessions caused the tupledesc to be changed,
we should already hang onto the old definition via the
RememberToFreeTupleDescAtEOX() mechanism?

I believe the tupdesc in question is actually in the typcache,
which doesn't have anything like RememberToFreeTupleDescAtEOX
(which is a horrid hack anyway if you ask me).

It's in the typecache, but that just uses the relcache's tupledesc for
non-record composites. But looks like it doesn't suffice, because
TypeCacheRelCallback() releases the refcount the typecache held, regardless of
the tupledesc having changed meaningfully or not.

So even if there can't have been "important" changes to the tupledesc due to
locking, we end up with the newer tupledesc on a second lookup...

I agree that the RememberToFreeTupleDescAtEOX thing is a ugly hack, but I
don't think it's easy to come up with something good...

We could probably make things better for this specific case by
teaching the typcache not to replace a cached tupdesc unless its
contents actually change. But that just makes it harder to get
to a bug instance; it's not a cure-all.

Yea :(.

:-(

I thought about whether the datumCopy() idea might be manageable, but
getmissingattr() is inlined by heap_getttr() which has distressingly
large number of call sites, including in third party code.

Could we maybe cache missing values elsewhere in a way that's less
volatile? That's an extremely vague idea, just thinking out loud.

After (not) sleeping on this overnight, and discussing it with a
colleague this morning, here's a suggestion. We have a hash table, keyed
by (tdtypeid, attnum) where we store a datumCopy'd version of the value.
If it's present just return the value instead of getting it from the
tupdesc. The hash table is blown away at the end of the transaction.
Assuming that's workable I think it would not be a large patch.

My colleague did ask if we had live reproducible case.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andrew Dunstan (#7)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andrew Dunstan <andrew@dunslane.net> writes:

After (not) sleeping on this overnight, and discussing it with a
colleague this morning, here's a suggestion. We have a hash table, keyed
by (tdtypeid, attnum) where we store a datumCopy'd version of the value.
If it's present just return the value instead of getting it from the
tupdesc. The hash table is blown away at the end of the transaction.
Assuming that's workable I think it would not be a large patch.

That sounds possibly workable. I'm a bit concerned about added
overhead, and about whether the hashtable needs invalidation support.
It might be better to key it off (relfilenode, attnum).

My colleague did ask if we had live reproducible case.

I have not attempted to build one, but it'd probably be fairly
easy to do if you turn on debug_discard_caches.

regards, tom lane

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Tom Lane (#8)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-29 Th 10:26, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

After (not) sleeping on this overnight, and discussing it with a
colleague this morning, here's a suggestion. We have a hash table, keyed
by (tdtypeid, attnum) where we store a datumCopy'd version of the value.
If it's present just return the value instead of getting it from the
tupdesc. The hash table is blown away at the end of the transaction.
Assuming that's workable I think it would not be a large patch.

That sounds possibly workable.

OK, good, we have a plan.

I'm a bit concerned about added
overhead, and about whether the hashtable needs invalidation support.
It might be better to key it off (relfilenode, attnum).

re overhead: getmissingattr isn't called at all except for "off the end"
attributes. Setting up the hash table at most once per txn doesn't seem
likely to cost much, and even that won't be done if getmissingattr isn't
called.

re relfilenode: we don't have it in getmissingattr, so that would
involve looking it up or whacking around the API. Neither of those seem
good. Do you have any reason for thinking tdtypeid will not be sufficient?

re invalidation: that seems to suggest that the missing value could
change under us. I don't think it can, but if it can then more than just
this is broken. If not, how would invalidation affect us?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andrew Dunstan (#9)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-06-29 Th 10:26, Tom Lane wrote:

I'm a bit concerned about added
overhead, and about whether the hashtable needs invalidation support.
It might be better to key it off (relfilenode, attnum).

re relfilenode: we don't have it in getmissingattr, so that would
involve looking it up or whacking around the API.

Yeah, I was afraid of that.

re invalidation: that seems to suggest that the missing value could
change under us. I don't think it can, but if it can then more than just
this is broken. If not, how would invalidation affect us?

The scenario I'm afraid of is that we cache a missingval for table X,
then X gets dropped, then a new table Y gets created that by bad luck
has the same type OID as X did, then we add a column to Y that
requires a missingval, and now we have an entry in the hashtable that
matches Y but contains the wrong value. Admittedly, it seems very
low probability that this would all happen within the span of one
transaction, so maybe we can get away with ignoring the case. But
if we used relfilenode, we'd have at least a little more protection
because of the tombstone files that prevent immediate re-use of a
relfilenode OID. I'm not sure that it'd be bulletproof even with
relfilenode, though. Maybe we should bite the bullet and provide
invalidation based on a pg_type inval callback.

regards, tom lane

#11

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Tom Lane (#10)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-29 Th 15:25, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-06-29 Th 10:26, Tom Lane wrote:

I'm a bit concerned about added
overhead, and about whether the hashtable needs invalidation support.
It might be better to key it off (relfilenode, attnum).

re relfilenode: we don't have it in getmissingattr, so that would
involve looking it up or whacking around the API.

Yeah, I was afraid of that.

re invalidation: that seems to suggest that the missing value could
change under us. I don't think it can, but if it can then more than just
this is broken. If not, how would invalidation affect us?

The scenario I'm afraid of is that we cache a missingval for table X,
then X gets dropped, then a new table Y gets created that by bad luck
has the same type OID as X did, then we add a column to Y that
requires a missingval, and now we have an entry in the hashtable that
matches Y but contains the wrong value. Admittedly, it seems very
low probability that this would all happen within the span of one
transaction, so maybe we can get away with ignoring the case. But
if we used relfilenode, we'd have at least a little more protection
because of the tombstone files that prevent immediate re-use of a
relfilenode OID. I'm not sure that it'd be bulletproof even with
relfilenode, though. Maybe we should bite the bullet and provide
invalidation based on a pg_type inval callback.

Yeah, Robert has just convinced me, so I'll do it like that. It doesn't
look too hard.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#12

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andrew Dunstan (#11)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-06-29 Th 15:25, Tom Lane wrote:

Maybe we should bite the bullet and provide
invalidation based on a pg_type inval callback.

Yeah, Robert has just convinced me, so I'll do it like that. It doesn't
look too hard.

Oh, I have a better idea. We're only going to need all this for
pass-by-ref types, right? Why not make the hash key be the value
itself? Wrap it in a bytea perhaps to avoid needing a bespoke
hash function.

regards, tom lane

#13

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Tom Lane (#12)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-29 Th 18:41, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-06-29 Th 15:25, Tom Lane wrote:

Maybe we should bite the bullet and provide
invalidation based on a pg_type inval callback.

Yeah, Robert has just convinced me, so I'll do it like that. It doesn't
look too hard.

Oh, I have a better idea. We're only going to need all this for
pass-by-ref types, right?

Yes, the value we get back for byval types isn't a pointer that might
disappear.

Why not make the hash key be the value
itself? Wrap it in a bytea perhaps to avoid needing a bespoke
hash function.

Not sure I understand.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#14

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andrew Dunstan (#13)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-06-29 Th 18:41, Tom Lane wrote:

Why not make the hash key be the value
itself? Wrap it in a bytea perhaps to avoid needing a bespoke
hash function.

Not sure I understand.

Say the missingval for a particular column is text 'abc'.
We don't actually care which column it is, all we need is a
copy of that datum that will stay put for the rest of the
transaction. So I'm thinking that the lookup key for the
hash table should actually be the contents of the datum,
and we don't need to store anything else at all. (If we
happen to have two columns with the same missingval, they
can perfectly well share this hash entry.) Then there's
no question of invalidation, or at least the existing
invalidation mechanisms for tupdescs do all we need.

regards, tom lane

#15

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Tom Lane (#14)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-30 Fr 08:46, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-06-29 Th 18:41, Tom Lane wrote:

Why not make the hash key be the value
itself? Wrap it in a bytea perhaps to avoid needing a bespoke
hash function.

Not sure I understand.

Say the missingval for a particular column is text 'abc'.
We don't actually care which column it is, all we need is a
copy of that datum that will stay put for the rest of the
transaction. So I'm thinking that the lookup key for the
hash table should actually be the contents of the datum,
and we don't need to store anything else at all. (If we
happen to have two columns with the same missingval, they
can perfectly well share this hash entry.) Then there's
no question of invalidation, or at least the existing
invalidation mechanisms for tupdescs do all we need.

OK, I get it. Do we have a routine to wrap a Datum in a bytea, or do I
need to write one? It's slightly amusing that the original patch for
fast defaults stored the missing value as a bytea, but someone (Andres
IIRC) preferred that we store it as a one element array, so that's what
we went with.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#16

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andrew Dunstan (#15)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andrew Dunstan <andrew@dunslane.net> writes:

OK, I get it. Do we have a routine to wrap a Datum in a bytea, or do I
need to write one?

I don't recall that functionality being exported anywhere,
but it'd only be a couple lines of code in any case.

(Whether it's actually worth doing, I'm not sure about.
It only helps if it lets you use existing hashtable support
routines, but I'm not quite sure that any of those exist
for bytea either.)

regards, tom lane

#17

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Tom Lane (#16)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-30 Fr 09:43, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

OK, I get it. Do we have a routine to wrap a Datum in a bytea, or do I
need to write one?

I don't recall that functionality being exported anywhere,
but it'd only be a couple lines of code in any case.

(Whether it's actually worth doing, I'm not sure about.
It only helps if it lets you use existing hashtable support
routines, but I'm not quite sure that any of those exist
for bytea either.)

Me either. I think this might call for too much invention so I'm going
to revert to plan A. The invalidation code won't be very much, and it
should be a fairly rare event, so it doesn't need to be very clever.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#18

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andrew Dunstan (#17)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Andrew Dunstan <andrew@dunslane.net> writes:

Me either. I think this might call for too much invention so I'm going
to revert to plan A. The invalidation code won't be very much, and it
should be a fairly rare event, so it doesn't need to be very clever.

The problem with rarely-executed code is that it's also hard to test.

regards, tom lane

#19

Andrew Dunstan

andrew@dunslane.net

almost 3 years ago

In reply to: Tom Lane (#18)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

On 2023-06-30 Fr 15:08, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

Me either. I think this might call for too much invention so I'm going
to revert to plan A. The invalidation code won't be very much, and it
should be a fairly rare event, so it doesn't need to be very clever.

The problem with rarely-executed code is that it's also hard to test.

Your Plan B in the end proved less difficult than I thought, and
certainly seems more robust than having to tangle with invalidations. I
didn't try to do anything to wrap the values in a bytea, it didn't seem
necessary. Here's a patch - it's not terribly long or invasive. I
haven't tried backpatching it yet.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#20

Andres Freund

andres@anarazel.de

almost 3 years ago

In reply to: Andrew Dunstan (#19)

Re: BUG #17994: Invalidating relcache corrupts tupDesc inside ExecEvalFieldStoreDeForm()

Hi,

On 2023-07-01 09:14:39 -0400, Andrew Dunstan wrote:

On 2023-06-30 Fr 15:08, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

Me either. I think this might call for too much invention so I'm going
to revert to plan A. The invalidation code won't be very much, and it
should be a fairly rare event, so it doesn't need to be very clever.

The problem with rarely-executed code is that it's also hard to test.

Your Plan B in the end proved less difficult than I thought, and certainly
seems more robust than having to tangle with invalidations. I didn't try to
do anything to wrap the values in a bytea, it didn't seem necessary. Here's
a patch - it's not terribly long or invasive. I haven't tried backpatching
it yet.

Hm. Is tying this to top-level transactions the right choice? What about
sequences like

SAVEPOINT x; ALTER TABLE ... ADD COLUMN ... DEFAULT; SELECT use_default FROM ...; ROLLBACK TO SAVEPOINT x;
SAVEPOINT y; ALTER TABLE ... ADD COLUMN ... DEFAULT; SELECT use_default FROM ...; ROLLBACK TO SAVEPOINT y;

Isn't that going to break the assumption that the key is unique within a
transaction? I think at the very least you'd need to make

Separately, will this work correctly with procedures keeping values alive
across transactions? I don't feel like I have a good grasp at how that whole
area is supposed to work...

Greetings,

Andres Freund

#21

Tom Lane

tgl@sss.pgh.pa.us

almost 3 years ago

In reply to: Andres Freund (#20)

#22

Andres Freund

andres@anarazel.de

almost 3 years ago

In reply to: Tom Lane (#21)