pg_upgrade fails to detect unsupported arrays and ranges
While composing the release note entry for commits 8d48e6a72 et al
(handle recursive type dependencies while checking for unsupported
types in pg_upgrade), I realized that there's a huge hole in
pg_upgrade's test for such cases. It looks for domains containing
the unsupported type, and for composites containing it, but not
for arrays or ranges containing it. It's definitely possible to
create tables containing arrays of lines, or arrays of composites
containing line, etc etc. A range over line is harder for lack of
a btree opclass, but a range over sql_identifier is possible.
The attached patches fix this. 0001 refactors the code in question
so that we have only one copy not three-and-growing. The only
difference between the three copies was that one case didn't bother
to search indexes, but I judged that that wasn't an optimization we
need to preserve. (Note: this patch is shown with --ignore-space-change
to make it more reviewable, but I did re-pgindent the code.) Then
0002 actually adds the array and range cases.
Although this is a really straightforward patch and I've tested it
against appropriate old versions (9.1 and 9.2), I'm very hesitant
to shove it in so soon before a release wrap. Should I do that, or
let it wait till after the wrap?
regards, tom lane
On 10 Nov 2019, at 20:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
0001 refactors the code in question
so that we have only one copy not three-and-growing. The only
difference between the three copies was that one case didn't bother
to search indexes, but I judged that that wasn't an optimization we
need to preserve.
A big +1 on this refactoring.
(Note: this patch is shown with --ignore-space-change
to make it more reviewable, but I did re-pgindent the code.) Then
0002 actually adds the array and range cases.
Was the source pgindented, but not committed, before generating the patches? I
fail to apply them on master (or REL_12_STABLE) on what seems to be only
whitespace changes.
Although this is a really straightforward patch and I've tested it
against appropriate old versions (9.1 and 9.2), I'm very hesitant
to shove it in so soon before a release wrap. Should I do that, or
let it wait till after the wrap?
Having read the patch I agree that it's trivial enough that I wouldn't be
worried to let it slip through. However, given that we've lacked the check for
a few releases, is it worth rushing with the potential for a last-minute
"oh-shit"?
+ /* arrays over any type selected so far */ + " SELECT t.oid FROM pg_catalog.pg_type t, x WHERE typelem = x.oid AND typtype = 'b' "
No need to check typlen?
cheers ./daniel
Daniel Gustafsson <daniel@yesql.se> writes:
On 10 Nov 2019, at 20:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Although this is a really straightforward patch and I've tested it
against appropriate old versions (9.1 and 9.2), I'm very hesitant
to shove it in so soon before a release wrap. Should I do that, or
let it wait till after the wrap?
Having read the patch I agree that it's trivial enough that I wouldn't be
worried to let it slip through. However, given that we've lacked the check for
a few releases, is it worth rushing with the potential for a last-minute
"oh-shit"?
Probably not, really --- the main argument for that is just that it'd fit
well with the fixes Tomas already made.
+ /* arrays over any type selected so far */ + " SELECT t.oid FROM pg_catalog.pg_type t, x WHERE typelem = x.oid AND typtype = 'b' "
No need to check typlen?
Yeah, that's intentional. A fixed-length array type over a problematic
type would be just as much of a problem as a varlena array type.
The case shouldn't apply to any of the existing problematic types,
but I was striving for generality.
regards, tom lane
Daniel Gustafsson <daniel@yesql.se> writes:
On 10 Nov 2019, at 20:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
(Note: this patch is shown with --ignore-space-change
to make it more reviewable, but I did re-pgindent the code.) Then
0002 actually adds the array and range cases.
Was the source pgindented, but not committed, before generating the patches? I
fail to apply them on master (or REL_12_STABLE) on what seems to be only
whitespace changes.
Hm, I suppose it might be hard to apply the combination of the patches
(maybe something involving patch -l would work). For simplicity, here's
the complete patch for HEAD. I fixed a missing schema qualification.
regards, tom lane
Attachments:
fix-pg_upgrade-unsupported-type-search-2.patchtext/x-diff; charset=us-ascii; name=fix-pg_upgrade-unsupported-type-search-2.patchDownload+100-234
On 10 Nov 2019, at 22:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
+ /* arrays over any type selected so far */ + " SELECT t.oid FROM pg_catalog.pg_type t, x WHERE typelem = x.oid AND typtype = 'b' "No need to check typlen?
Yeah, that's intentional. A fixed-length array type over a problematic
type would be just as much of a problem as a varlena array type.
The case shouldn't apply to any of the existing problematic types,
but I was striving for generality.
That makes a lot of sense, thanks for the explanation.
cheers ./daniel
On 10 Nov 2019, at 22:12, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 10 Nov 2019, at 20:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
(Note: this patch is shown with --ignore-space-change
to make it more reviewable, but I did re-pgindent the code.) Then
0002 actually adds the array and range cases.Was the source pgindented, but not committed, before generating the patches? I
fail to apply them on master (or REL_12_STABLE) on what seems to be only
whitespace changes.Hm, I suppose it might be hard to apply the combination of the patches
(maybe something involving patch -l would work). For simplicity, here's
the complete patch for HEAD. I fixed a missing schema qualification.
Applies, builds clean and passes light testing. I can see the appeal of
including it before the wrap, even though I personally would've held off.
cheers ./daniel
Daniel Gustafsson <daniel@yesql.se> writes:
Applies, builds clean and passes light testing.
Thanks for checking!
I can see the appeal of
including it before the wrap, even though I personally would've held off.
Nah, I'm not gonna risk it at this stage. I concur with your point
that this is an ancient bug, and one that is unlikely to bite many
people. I'll push it Wednesday or so.
regards, tom lane
[ blast-from-the-past department ]
I wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
I can see the appeal of
including it before the wrap, even though I personally would've held off.
Nah, I'm not gonna risk it at this stage. I concur with your point
that this is an ancient bug, and one that is unlikely to bite many
people. I'll push it Wednesday or so.
I happened across a couple of further pg_upgrade oversights in the
same vein as 29aeda6e4 et al:
* Those commits fixed the bugs in pg_upgrade/version.c about not
checking the contents of arrays/ranges/etc, but there are two
similar functions in pg_upgrade/check.c that I failed to notice
(probably due to the haste with which that patch was prepared).
* We really need to also reject user tables that contain instances
of system-defined composite types (i.e. catalog rowtypes), because
except for a few bootstrap catalogs, those type OIDs are assigned by
genbki.pl not by hand, so they aren't stable across major versions.
For example, in HEAD I get
regression=# select 'pg_enum'::regtype::oid;
oid
-------
13045
(1 row)
but the same OID was 12022 in v13, 11551 in v11, etc. So if you
had a column of type pg_enum, you'd likely get no-such-type-OID
failures when reading the values after an upgrade. I don't see
much use-case for doing such a thing, so it seems saner to just
block off the possibility rather than trying to support it.
(We'd have little choice in the back branches anyway, as their
OID values are locked down now.)
The attached proposed patch fixes these cases too. I generalized
the recursive query a little more so that it could start from an
arbitrary query yielding pg_type OIDs, rather than just one type
name; otherwise it's pretty straightforward.
Barring objections I'll apply and back-patch this soon.
regards, tom lane
Attachments:
fix-pg_upgrade-data-type-checks-2.patchtext/x-diff; charset=us-ascii; name=fix-pg_upgrade-data-type-checks-2.patchDownload+126-149
On 28 Apr 2021, at 17:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
[ blast-from-the-past department ]
I wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
I can see the appeal of
including it before the wrap, even though I personally would've held off.Nah, I'm not gonna risk it at this stage. I concur with your point
that this is an ancient bug, and one that is unlikely to bite many
people. I'll push it Wednesday or so.I happened across a couple of further pg_upgrade oversights in the
same vein as 29aeda6e4 et al:
Nice find, this makes a lot of sense.
..the same OID was 12022 in v13, 11551 in v11, etc. So if you
had a column of type pg_enum, you'd likely get no-such-type-OID
failures when reading the values after an upgrade. I don't see
much use-case for doing such a thing, so it seems saner to just
block off the possibility rather than trying to support it.
Agreed. Having implemented basically this for Greenplum I think it’s wise to
avoid it unless we really have to, it gets very complicated once the layers of
worms are peeled back.
The attached proposed patch fixes these cases too. I generalized
the recursive query a little more so that it could start from an
arbitrary query yielding pg_type OIDs, rather than just one type
name; otherwise it's pretty straightforward.Barring objections I'll apply and back-patch this soon.
Patch LGTM on reading, +1 on applying. Being on parental leave I don’t have my
dev env ready to go so I didn’t perform testing; sorry about that.
+ pg_fatal("Your installation contains system-defined composite type(s) in user tables.\n" + "These type OIDs are not stable across PostgreSQL versions,\n" + "so this cluster cannot currently be upgraded. You can\n" + "remove the problem tables and restart the upgrade.\n" + "A list of the problem columns is in the file:\n"
Would it be helpful to inform the user that they can alter/drop just the
problematic columns as a potentially less scary alternative to dropping the
entire table?
- * The type of interest might be wrapped in a domain, array, + * The types of interest might be wrapped in a domain, array,
Shouldn't this be "type(s)” as in the other changes here?
--
Daniel Gustafsson https://vmware.com/
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Apr 2021, at 17:09, Tom Lane <tgl@sss.pgh.pa.us> wrote: + pg_fatal("Your installation contains system-defined composite type(s) in user tables.\n" + "These type OIDs are not stable across PostgreSQL versions,\n" + "so this cluster cannot currently be upgraded. You can\n" + "remove the problem tables and restart the upgrade.\n" + "A list of the problem columns is in the file:\n"
Would it be helpful to inform the user that they can alter/drop just the
problematic columns as a potentially less scary alternative to dropping the
entire table?
This wording is copied-and-pasted from the other similar tests. I agree
that it's advocating a solution that might be overkill, but if we change
it we should also change the existing messages. I don't mind doing
that in HEAD; less sure about the back branches, as (I think) these
are translatable strings.
Thoughts?
- * The type of interest might be wrapped in a domain, array, + * The types of interest might be wrapped in a domain, array,
Shouldn't this be "type(s)” as in the other changes here?
Fair enough.
regards, tom lane
On 28 Apr 2021, at 22:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Apr 2021, at 17:09, Tom Lane <tgl@sss.pgh.pa.us> wrote: + pg_fatal("Your installation contains system-defined composite type(s) in user tables.\n" + "These type OIDs are not stable across PostgreSQL versions,\n" + "so this cluster cannot currently be upgraded. You can\n" + "remove the problem tables and restart the upgrade.\n" + "A list of the problem columns is in the file:\n"Would it be helpful to inform the user that they can alter/drop just the
problematic columns as a potentially less scary alternative to dropping the
entire table?This wording is copied-and-pasted from the other similar tests. I agree
that it's advocating a solution that might be overkill, but if we change
it we should also change the existing messages.
Good point.
I don't mind doing that in HEAD; less sure about the back branches, as
I think it would be helpful for users to try and give slightly more expanded
advice while (obviously) still always being safe. I’m happy to take a crack at
that once back unless someone beats me to it.
(I think) these are translatable strings.
If they aren't I think we should try and make them so to as far as we can
reduce language barrier problems in such important messages.
--
Daniel Gustafsson https://vmware.com/
Daniel Gustafsson <daniel@yesql.se> writes:
On 28 Apr 2021, at 22:47, Tom Lane <tgl@sss.pgh.pa.us> wrote:
This wording is copied-and-pasted from the other similar tests. I agree
that it's advocating a solution that might be overkill, but if we change
it we should also change the existing messages.
Good point.
I don't mind doing that in HEAD; less sure about the back branches, as
I think it would be helpful for users to try and give slightly more expanded
advice while (obviously) still always being safe. I’m happy to take a crack at
that once back unless someone beats me to it.
Seems like s/remove the problem tables/drop the problem columns/
is easy and sufficient.
(I think) these are translatable strings.
If they aren't I think we should try and make them so to as far as we can
reduce language barrier problems in such important messages.
Checking, I see they do appear in pg_upgrade's po files. So I propose
that we change the existing messages in HEAD but not the back branches.
regards, tom lane