TupleDescAttr bounds checks

Started by Robert Haas16 days ago12 messageshackers
Jump to latest
#1Robert Haas
robertmhaas@gmail.com

Hi,

Scrutiny of a recent test_plan_advice failure in the buildfarm
revealed a bug that had nothing to do with test_plan_advice or
pg_plan_advice; rather, it was a bug introduced by the virtual
generated columns feature, and specifically of that feature indexing
off of the beginning of a TupleDesc when whole-row attributes are
present. The first patch attached to this email fixes this issue, and
should be committed and back-patched to v18. I plan to do that soon
unless there are objections.

But that got me wondering why we don't have an assertion in
TupleDescAttr to catch this sort of thing, and it seems like that is
indeed something we can do, so patch #2 adds that and then cleans up
the resulting damage. By "damage" I mean correcting places where the
new Assert() either actually fails or could theoretically fail,
because we use TupleDescAttr() on a value that we don't know to be
within range. None of these seem to be actual bugs, because as the
commit message says, all TupleDescAttr() does is compute a pointer,
and we don't actually dereference that pointer in any of these code
paths until after we know that it's OK to do so. Nonetheless, these
all seem like good cleanups, so I do not see any of these changes as
arguments against adding the assertion. I propose to put this in
master.

Patch #3 adds a test case that would have caught the bug fixed by
patch #1 if we had already had the asserts added by patch #2. To my
surprise, we seem to have zero existing test coverage of creating an
index on a whole-row expression, so I think this is worth adding
mostly for that reason. One could also argue that it's worth adding as
a follow-up to #1 and #2, but we're unlikely to reintroduce that
specific bug. We might, however, add other bugs that this would also
catch.

Comments?

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments:

v1-0001-Prevent-spurious-indexes-on-virtual-generated-col.patchapplication/octet-stream; name=v1-0001-Prevent-spurious-indexes-on-virtual-generated-col.patchDownload+4-3
v1-0003-Add-a-test-for-creating-an-index-on-a-whole-row-e.patchapplication/octet-stream; name=v1-0003-Add-a-test-for-creating-an-index-on-a-whole-row-e.patchDownload+23-1
v1-0002-Bounds-check-access-to-TupleDescAttr-with-an-Asse.patchapplication/octet-stream; name=v1-0002-Bounds-check-access-to-TupleDescAttr-with-an-Asse.patchDownload+26-17
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#1)
Re: TupleDescAttr bounds checks

Robert Haas <robertmhaas@gmail.com> writes:

Scrutiny of a recent test_plan_advice failure in the buildfarm
revealed a bug that had nothing to do with test_plan_advice or
pg_plan_advice; rather, it was a bug introduced by the virtual
generated columns feature, and specifically of that feature indexing
off of the beginning of a TupleDesc when whole-row attributes are
present. The first patch attached to this email fixes this issue, and
should be committed and back-patched to v18. I plan to do that soon
unless there are objections.

I had just come to the same conclusion about why grison is failing.
+1 to all three of these patches. (I did not look to see if 0002
fixes every case that the Assert could trigger on, but as long as
you're only putting it in HEAD I'm not too concerned that we might
have missed some.)

regards, tom lane

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#2)
Re: TupleDescAttr bounds checks

On Fri, Mar 20, 2026 at 12:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Scrutiny of a recent test_plan_advice failure in the buildfarm
revealed a bug that had nothing to do with test_plan_advice or
pg_plan_advice; rather, it was a bug introduced by the virtual
generated columns feature, and specifically of that feature indexing
off of the beginning of a TupleDesc when whole-row attributes are
present. The first patch attached to this email fixes this issue, and
should be committed and back-patched to v18. I plan to do that soon
unless there are objections.

I had just come to the same conclusion about why grison is failing.
+1 to all three of these patches. (I did not look to see if 0002
fixes every case that the Assert could trigger on, but as long as
you're only putting it in HEAD I'm not too concerned that we might
have missed some.)

Hmm, I had a rougher version of this analysis (and an analysis of some
the other failures) on an email I sent yesterday on the pg_plan_advice
thread. Based on this email and another one you sent, I'm guessing you
either didn't see that email or maybe even didn't get a copy of it for
some reason.

Or maybe you just mean that you were checking over my analysis, but
just in case:

/messages/by-id/CA+TgmoZUN8FT1Ah=m6Uis5bHa4FUa+_hMDWtcABG17toEfpiUg@mail.gmail.com

--
Robert Haas
EDB: http://www.enterprisedb.com

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: TupleDescAttr bounds checks

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, Mar 20, 2026 at 12:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I had just come to the same conclusion about why grison is failing.

Hmm, I had a rougher version of this analysis (and an analysis of some
the other failures) on an email I sent yesterday on the pg_plan_advice
thread. Based on this email and another one you sent, I'm guessing you
either didn't see that email or maybe even didn't get a copy of it for
some reason.

I did see that, but it read to me that you were just guessing at that
time. This morning I put Asserts into indexcmds.c that verified that
it was trying to access the tupledesc for attno zero, and that proves
there is a bug there. It also seems like a plausible explanation for
why only one machine has exhibited the failure. (Your 0002 is a
better version of said Asserts.)

regards, tom lane

#5Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: TupleDescAttr bounds checks

On Fri, Mar 20, 2026 at 12:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I did see that, but it read to me that you were just guessing at that
time. This morning I put Asserts into indexcmds.c that verified that
it was trying to access the tupledesc for attno zero, and that proves
there is a bug there. It also seems like a plausible explanation for
why only one machine has exhibited the failure. (Your 0002 is a
better version of said Asserts.)

Ah, OK. Yeah, I wasn't completely sure at the time whether there was
some kind of TupleDesc out there that would allow zero or negative
indexes safely. It seems like there is not.

--
Robert Haas
EDB: http://www.enterprisedb.com

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#4)
Re: TupleDescAttr bounds checks

Oddly enough, "adder" just showed this same failure [1]https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&amp;dt=2026-03-23%2002%3A23%3A13:

@@ -66,11 +66,9 @@
 CREATE FUNCTION wrap_do_analyze(c INT) RETURNS INT IMMUTABLE LANGUAGE SQL
 	AS 'SELECT $1 FROM public.do_analyze()';
 CREATE INDEX ON vaccluster(wrap_do_analyze(i));
+ERROR:  indexes on virtual generated columns are not supported
 INSERT INTO vaccluster VALUES (1), (2);
 ANALYZE vaccluster;

and that's not in the test_plan_advice run at all, but pg_upgrade's
run of the core regression tests. I wonder if we recently made some
seemingly-unrelated change that has increased the probability of
having a 'v' in the right byte. Anyway, you should get this fix
pushed.

regards, tom lane

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=adder&amp;dt=2026-03-23%2002%3A23%3A13

#7Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: TupleDescAttr bounds checks

On Sun, Mar 22, 2026 at 11:54 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Anyway, you should get this fix pushed.

Done, and I'll plan to commit the other patches later today.

Also, if on any occasion you happen to feel that I'm not being
aggressive enough in committing something I've previously posted, feel
free to take matters into your own hands. I often wait a bit to see if
anybody will object to things or comment on them, and in this case
there were compounding factors like (1) the weekend and (2) being very
busy looking into other problems that test_plan_advice turned up.
Since this was such a simple fix and you'd +1'd it, I would have felt
comfortable putting it in right away, but I simply haven't had a
moment to spare until now, and I use that term loosely given that I do
not normally use the time between 6am and 7am for to commit patches.
Anyway, the point is: I'm virtually always happy when someone else
decides to commit one of my patches; it saves me a non-trivial amount
of time and I'm not offended.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

#8Alexander Lakhin
exclusion@gmail.com
In reply to: Robert Haas (#7)
Re: TupleDescAttr bounds checks

Hello Robert,

24.03.2026 12:45, Robert Haas wrote:

On Sun, Mar 22, 2026 at 11:54 PM Tom Lane<tgl@sss.pgh.pa.us> wrote:

Anyway, you should get this fix pushed.

Done, and I'll plan to commit the other patches later today.

I've found a way to trigger the Assert added in c98ad086a:
CREATE TABLE t(i int);
COPY t FROM stdin WHERE tableoid > 0;
server closed the connection unexpectedly

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007066d0e4527e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007066d0e288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000609680cc7fcf in ExceptionalCondition (
    conditionName=conditionName@entry=0x609680d2c104 "i >= 0 && i < tupdesc->natts",
    fileName=fileName@entry=0x609680d84568 "../../../src/include/access/tupdesc.h", lineNumber=lineNumber@entry=182)
    at assert.c:65
#6  0x00006096808c8f68 in TupleDescAttr (tupdesc=<optimized out>, i=<optimized out>)
    at ../../../src/include/access/tupdesc.h:182
#7  TupleDescAttr (i=<optimized out>, tupdesc=<optimized out>) at ../../../src/include/access/tupdesc.h:178
#8  DoCopy (pstate=0x6096a749dc90, stmt=0x6096a74ca120, stmt_location=0, stmt_len=40, processed=0x7ffd3b09b450) at
copy.c:180
#9  0x0000609680b6b82a in standard_ProcessUtility (pstmt=0x6096a74ca1f0,
    queryString=0x6096a74c9520 "COPY t FROM stdin WHERE tableoid > 0;", readOnlyTree=<optimized out>,
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x6096a74ca5b0, qc=0x7ffd3b09b6f0) at utility.c:743
#10 0x0000609680b6990c in PortalRunUtility (portal=portal@entry=0x6096a756a070, pstmt=pstmt@entry=0x6096a74ca1f0,
    isTopLevel=isTopLevel@entry=true, setHoldSnapshot=setHoldSnapshot@entry=false, dest=dest@entry=0x6096a74ca5b0,
    qc=qc@entry=0x7ffd3b09b6f0) at pquery.c:1148
...

Best regards,
Alexander

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Lakhin (#8)
Re: TupleDescAttr bounds checks

Alexander Lakhin <exclusion@gmail.com> writes:

I've found a way to trigger the Assert added in c98ad086a:
CREATE TABLE t(i int);
COPY t FROM stdin WHERE tableoid > 0;
server closed the connection unexpectedly

Good catch. I bet it's possible to trigger the Assert just above,
too, with a WHERE expression using "t.*".

regards, tom lane

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#9)
Re: TupleDescAttr bounds checks

I wrote:

Good catch. I bet it's possible to trigger the Assert just above,
too, with a WHERE expression using "t.*".

Oh, scratch that, I didn't read the code just above this loop.
But I bet this loop should throw an error for system columns, too,
since we surely won't have computed those either.

regards, tom lane

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#1)
Re: TupleDescAttr bounds checks

On 04.04.26 17:38, Tom Lane wrote:

I wrote:

But I bet this loop should throw an error for system columns, too,
since we surely won't have computed those either.

After poking at that: testing tableoid does sort of work, in that it
reads as the OID of the target table named in COPY. But I think any
rational use for a test on tableoid here would be in connection with
a partitioned target table, and the user would wish it to read as the
OID of the destination partition. So I think we should disallow
tableoid along with the other system columns, pending somebody having
the ambition to make that work.

So I propose the attached for HEAD. (I couldn't resist the temptation
to clean up adjacent comments.) In the back branches it might be
better to just ignore system columns here, on the tiny chance that
somebody thinks they do something useful.

I think this is the same issue that was discussed here:

/messages/by-id/30c39ee8-bb11-4b8f-9697-45f7e018a8d3@eisentraut.org

There was no conclusion there, but I agree with the proposal to prohibit
this use.

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#11)
Re: TupleDescAttr bounds checks

Peter Eisentraut <peter@eisentraut.org> writes:

On 04.04.26 17:38, Tom Lane wrote:

After poking at that: testing tableoid does sort of work, in that it
reads as the OID of the target table named in COPY. But I think any
rational use for a test on tableoid here would be in connection with
a partitioned target table, and the user would wish it to read as the
OID of the destination partition. So I think we should disallow
tableoid along with the other system columns, pending somebody having
the ambition to make that work.

I think this is the same issue that was discussed here:
/messages/by-id/30c39ee8-bb11-4b8f-9697-45f7e018a8d3@eisentraut.org
There was no conclusion there, but I agree with the proposal to prohibit
this use.

Ah, indeed. Jian's patch in that thread seems rough but potentially
workable to me, but seemingly the thread tailed off for lack of
interest. I don't want to revive it now as part of a bug fix.

Disallowing tableoid for now, and then re-allowing it if someone
picks up that patch down the road, seems like a good solution.
For one thing, since that patch changes the semantics of tableoid
in COPY WHERE, I think it'd be a good idea to have a release or
two in between where we throw error. That'd be helpful to flush
out any field usages that might be affected.

regards, tom lane