Extracting cross-version-upgrade knowledge from buildfarm client
This is a followup to the discussion at [1]/messages/by-id/951602.1673535249@sss.pgh.pa.us, in which we agreed that
it's time to fix the buildfarm client so that knowledge about
cross-version discrepancies in pg_dump output can be moved into
the community git tree, making it feasible for people other than
Andrew to fix problems when we change things of that sort.
The idea is to create helper files that live in the git tree and
are used by the BF client to perform the activities that are likely
to need tweaking.
Attached are two patches, one for PG git and one for the buildfarm
client, that create a working POC for this approach. I've only
carried this as far as making a helper file for HEAD, but I believe
that helper files for the back branches would mostly just need to
be cut-down versions of this one. I've tested it successfully with
cross-version upgrade tests down to 9.3. (9.2 would need some more
work, and I'm not sure if it's worth the trouble --- are we going to
retire 9.2 soon?)
I'm a very mediocre Perl programmer, so I'm sure there are stylistic
and other problems, but I'm encouraged that this seems feasible.
Also, I wonder if we can't get rid of
src/bin/pg_upgrade/upgrade_adapt.sql in favor of using this code.
I tried to write adjust_database_contents() in such a way that it
could be pointed at a database by some other Perl code that's
not the buildfarm client.
regards, tom lane
On 2023-01-13 Fr 19:48, Tom Lane wrote:
This is a followup to the discussion at [1], in which we agreed that
it's time to fix the buildfarm client so that knowledge about
cross-version discrepancies in pg_dump output can be moved into
the community git tree, making it feasible for people other than
Andrew to fix problems when we change things of that sort.
The idea is to create helper files that live in the git tree and
are used by the BF client to perform the activities that are likely
to need tweaking.Attached are two patches, one for PG git and one for the buildfarm
client, that create a working POC for this approach. I've only
carried this as far as making a helper file for HEAD, but I believe
that helper files for the back branches would mostly just need to
be cut-down versions of this one. I've tested it successfully with
cross-version upgrade tests down to 9.3. (9.2 would need some more
work, and I'm not sure if it's worth the trouble --- are we going to
retire 9.2 soon?)I'm a very mediocre Perl programmer, so I'm sure there are stylistic
and other problems, but I'm encouraged that this seems feasible.Also, I wonder if we can't get rid of
src/bin/pg_upgrade/upgrade_adapt.sql in favor of using this code.
I tried to write adjust_database_contents() in such a way that it
could be pointed at a database by some other Perl code that's
not the buildfarm client.regards, tom lane
OK, we've been on parallel tracks (sorry about that). Let's run with
yours, as it covers more ground.
One thing I would change is that your adjust_database_contents tries to
make the adjustments rather than passing back a set of statements. We
could make that work, although your attempt won't really work for the
buildfarm, but I would just make actually performing the adjustments the
client's responsibility. That would make for much less disturbance in
the buildfarm code.
I also tried to remove a lot of the ugly release tag processing,
leveraging our PostgreSQL::Version gadget. I think that's worthwhile too.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-01-13 Fr 19:48, Tom Lane wrote:
Attached are two patches, one for PG git and one for the buildfarm
client, that create a working POC for this approach.
OK, we've been on parallel tracks (sorry about that). Let's run with
yours, as it covers more ground.
Cool.
One thing I would change is that your adjust_database_contents tries to
make the adjustments rather than passing back a set of statements.
Agreed. I'd thought maybe adjust_database_contents would need to
actually interact with the target DB; but experience so far says
that IF EXISTS conditionality is sufficient, so we can just build
a static list of statements to issue. It's definitely a simpler
API that way.
I also tried to remove a lot of the ugly release tag processing,
leveraging our PostgreSQL::Version gadget. I think that's worthwhile too.
OK, I'll take a look at that and make a new draft.
regards, tom lane
I wrote:
OK, I'll take a look at that and make a new draft.
Here's version 2, incorporating your suggestions and with some
further work to make it handle 9.2 fully. I think this could
be committable so far as HEAD is concerned, though I still
need to make versions of AdjustUpgrade.pm for the back branches.
I tried to use this to replace upgrade_adapt.sql, but failed so
far because I couldn't figure out exactly how you're supposed
to use 002_pg_upgrade.pl with an old source installation.
It's not terribly well documented. In any case I think we
need a bit more thought about that, because it looks like
002_pg_upgrade.pl thinks that you can supply any random dump
file to serve as the initial state of the old installation;
but neither what I have here nor any likely contents of
upgrade_adapt.sql or the "custom filter" rules are going to
work on databases that aren't just the standard regression
database(s) of the old version.
I assume we should plan on reverting 9814ff550 (Add custom filtering
rules to the TAP tests of pg_upgrade)? Does that have any
plausible use that's not superseded by this patchset?
regards, tom lane
On 2023-01-14 Sa 15:06, Tom Lane wrote:
I wrote:
OK, I'll take a look at that and make a new draft.
Here's version 2, incorporating your suggestions and with some
further work to make it handle 9.2 fully. I think this could
be committable so far as HEAD is concerned, though I still
need to make versions of AdjustUpgrade.pm for the back branches.
This looks pretty good to me.
I'll probably change this line
my $adjust_cmds = adjust_database_contents($oversion, %dbnames);
so it's only called if the old and new versions are different. Is there
any case where a repo shouldn't be upgradeable to its own version
without adjustment?
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-01-14 Sa 15:06, Tom Lane wrote:
Here's version 2, incorporating your suggestions and with some
further work to make it handle 9.2 fully.
This looks pretty good to me.
Great! I'll work on making back-branch versions.
I'll probably change this line
my $adjust_cmds = adjust_database_contents($oversion, %dbnames);
so it's only called if the old and new versions are different. Is there
any case where a repo shouldn't be upgradeable to its own version
without adjustment?
Makes sense. I'd keep the check for $oversion eq 'HEAD' in the
subroutines, but that's mostly just to protect the version
conversion code below it.
Another thing I was just thinking about was not bothering to run
"diff" if the fixed dump strings are equal in-memory. You could
take that even further and not write out the fixed files at all,
but that seems like a bad idea for debuggability of the adjustment
subroutines. However, I don't see why we need to write an
empty diff file, nor parse it.
One other question before I continue --- do the adjustment
subroutines need to worry about Windows newlines in the strings?
It's not clear to me whether Perl will automatically make "\n"
in a pattern match "\r\n", or whether it's not a problem because
something upstream will have stripped \r's.
regards, tom lane
On Sat, Jan 14, 2023 at 03:06:06PM -0500, Tom Lane wrote:
I tried to use this to replace upgrade_adapt.sql, but failed so
far because I couldn't figure out exactly how you're supposed
to use 002_pg_upgrade.pl with an old source installation.
It's not terribly well documented.
As in pg_upgrade's TESTING or the comments of the tests?
In any case I think we
need a bit more thought about that, because it looks like
002_pg_upgrade.pl thinks that you can supply any random dump
file to serve as the initial state of the old installation;
but neither what I have here nor any likely contents of
upgrade_adapt.sql or the "custom filter" rules are going to
work on databases that aren't just the standard regression
database(s) of the old version.
Yeah, this code needs an extra push that I have not been able to
figure out yet, as we could recommend the creation of a dump with
installcheck-world and USE_MODULE_DB=1. Perhaps a module is just
better at the end.
I assume we should plan on reverting 9814ff550 (Add custom filtering
rules to the TAP tests of pg_upgrade)? Does that have any
plausible use that's not superseded by this patchset?
Nope, this could just be removed if we finish by adding a module that
does exactly the same work. If you are planning on committing the
module you have, i'd be happy to take care of a revert for this part.
+ # Can't upgrade aclitem in user tables from pre 16 to 16+.
+ _add_st($result, 'regression',
+ 'alter table public.tab_core_types drop column aclitem');
Could you just use a DO block here to detect tables with such
attributes, like in upgrade_adapt.sql, rather than dropping the table
from the core tests? That's more consistent with the treatment of
WITH OIDS.
Is this module pluggable with 002_pg_upgrade.pl?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Sat, Jan 14, 2023 at 03:06:06PM -0500, Tom Lane wrote: + # Can't upgrade aclitem in user tables from pre 16 to 16+. + _add_st($result, 'regression', + 'alter table public.tab_core_types drop column aclitem');
Could you just use a DO block here to detect tables with such
attributes, like in upgrade_adapt.sql, rather than dropping the table
from the core tests? That's more consistent with the treatment of
WITH OIDS.
I guess, but it seems like make-work as long as there's just the one
column.
Is this module pluggable with 002_pg_upgrade.pl?
I did find that 002_pg_upgrade.pl could load it. I got stuck at
the point of trying to test things, because I didn't understand
what the test process is supposed to be for an upgrade from a
back branch. For some reason I thought that 002_pg_upgrade.pl
could automatically create the old regression database, but
now I see that's not implemented.
regards, tom lane
On 2023-01-15 Su 11:01, Tom Lane wrote:
Another thing I was just thinking about was not bothering to run
"diff" if the fixed dump strings are equal in-memory. You could
take that even further and not write out the fixed files at all,
but that seems like a bad idea for debuggability of the adjustment
subroutines. However, I don't see why we need to write an
empty diff file, nor parse it.
Yeah, that makes sense.
One other question before I continue --- do the adjustment
subroutines need to worry about Windows newlines in the strings?
It's not clear to me whether Perl will automatically make "\n"
in a pattern match "\r\n", or whether it's not a problem because
something upstream will have stripped \r's.
I don't think we need to worry about them, but I will have a closer
look. Those replacement lines are very difficult to read. I think use of
extended regexes and some multi-part replacements would help. I'll have
a go at that tomorrow.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
Those replacement lines are very difficult to read. I think use of
extended regexes and some multi-part replacements would help. I'll have
a go at that tomorrow.
Yeah, after I wrote that code I remembered about \Q ... \E, which would
eliminate the need for most of the backslashes and probably make things
better that way. I didn't get around to improving it yet though, so
feel free to have a go.
regards, tom lane
On Sun, Jan 15, 2023 at 06:02:07PM -0500, Tom Lane wrote:
I guess, but it seems like make-work as long as there's just the one
column.
Well, the query is already written, so I would use that, FWIW.
I did find that 002_pg_upgrade.pl could load it. I got stuck at
the point of trying to test things, because I didn't understand
what the test process is supposed to be for an upgrade from a
back branch. For some reason I thought that 002_pg_upgrade.pl
could automatically create the old regression database, but
now I see that's not implemented.
test.sh did that, until I noticed that we need to worry about
pg_regress from the past branches to be compatible in the script
itself because we need to run it in the old source tree. This makes
the whole much more complicated to maintain, especially with the
recent removal of input/ and output/ folders in the regression tests
:/
--
Michael
On 2023-01-15 Su 18:37, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
Those replacement lines are very difficult to read. I think use of
extended regexes and some multi-part replacements would help. I'll have
a go at that tomorrow.Yeah, after I wrote that code I remembered about \Q ... \E, which would
eliminate the need for most of the backslashes and probably make things
better that way. I didn't get around to improving it yet though, so
feel free to have a go.
OK, here's my version. It tests clean against all of crake's dump files
back to 9.2.
To some extent it's a matter of taste, but I hate very long regex lines
- it makes it very hard to see what's actually changing, so I broke up
most of those.
Given that we are looking at newlines in some places I decided that
after all it was important to convert CRLF to LF.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments:
adjustupgrade-3.patchtext/x-patch; charset=UTF-8; name=adjustupgrade-3.patchDownload+550-0
Andrew Dunstan <andrew@dunslane.net> writes:
OK, here's my version. It tests clean against all of crake's dump files
back to 9.2.
To some extent it's a matter of taste, but I hate very long regex lines
- it makes it very hard to see what's actually changing, so I broke up
most of those.
I don't mind breaking things up, but I'm not terribly excited about
making the patterns looser, as you've done in some places like
if ($old_version < 14)
{
# Remove mentions of extended hash functions.
- $dump =~
- s/^(\s+OPERATOR 1 =\(integer,integer\)) ,\n\s+FUNCTION 2 \(integer, integer\) public\.part_hashint4_noop\(integer,bigint\);/$1;/mg;
- $dump =~
- s/^(\s+OPERATOR 1 =\(text,text\)) ,\n\s+FUNCTION 2 \(text, text\) public\.part_hashtext_length\(text,bigint\);/$1;/mg;
+ $dump =~ s {(^\s+OPERATOR\s1\s=\((?:integer,integer|text,text)\))\s,\n
+ \s+FUNCTION\s2\s.*?public.part_hash.*?;}
+ {$1;}mxg;
}
I don't think that's any easier to read, and it risks masking
diffs that we'd wish to know about.
regards, tom lane
On 2023-01-16 Mo 14:34, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
OK, here's my version. It tests clean against all of crake's dump files
back to 9.2.
To some extent it's a matter of taste, but I hate very long regex lines
- it makes it very hard to see what's actually changing, so I broke up
most of those.I don't mind breaking things up, but I'm not terribly excited about
making the patterns looser, as you've done in some places likeif ($old_version < 14) { # Remove mentions of extended hash functions. - $dump =~ - s/^(\s+OPERATOR 1 =\(integer,integer\)) ,\n\s+FUNCTION 2 \(integer, integer\) public\.part_hashint4_noop\(integer,bigint\);/$1;/mg; - $dump =~ - s/^(\s+OPERATOR 1 =\(text,text\)) ,\n\s+FUNCTION 2 \(text, text\) public\.part_hashtext_length\(text,bigint\);/$1;/mg; + $dump =~ s {(^\s+OPERATOR\s1\s=\((?:integer,integer|text,text)\))\s,\n + \s+FUNCTION\s2\s.*?public.part_hash.*?;} + {$1;}mxg; }I don't think that's any easier to read, and it risks masking
diffs that we'd wish to know about.
OK, I'll make another pass and tighten things up.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
On 2023-01-16 Mo 14:34, Tom Lane wrote:
I don't think that's any easier to read, and it risks masking
diffs that we'd wish to know about.
OK, I'll make another pass and tighten things up.
Don't sweat it, I'm just working the bugs out of a new version.
I'll have something to post shortly, I hope.
regards, tom lane
OK, here's a v4:
* It works with 002_pg_upgrade.pl now. The only substantive change
I had to make for that was to define the $old_version arguments as
being always PostgreSQL::Version objects not strings, because
otherwise I got complaints like
Argument "HEAD" isn't numeric in numeric comparison (<=>) at /home/postgres/pgsql/src/bin/pg_upgrade/../../../src/test/perl/PostgreSQL/Version.pm line 130.
So now TestUpgradeXversion.pm is responsible for performing that
conversion, and also for not doing any conversions on HEAD (which
Andrew wanted anyway).
* I improved pg_upgrade's TESTING directions after figuring out how
to get it to work for contrib modules.
* Incorporated (most of) Andrew's stylistic improvements.
* Simplified TestUpgradeXversion.pm's use of diff, as discussed.
I think we're about ready to go, except for cutting down
AdjustUpgrade.pm to make versions to put in the back branches.
I'm slightly tempted to back-patch 002_pg_upgrade.pl so that there
is an in-tree way to verify back-branch AdjustUpgrade.pm files.
On the other hand, it's hard to believe that testing that in
HEAD won't be sufficient; I doubt the back-branch copies will
need to change much.
regards, tom lane
I wrote:
I think we're about ready to go, except for cutting down
AdjustUpgrade.pm to make versions to put in the back branches.
Hmmm ... so upon trying to test in the back branches, I soon
discovered that PostgreSQL/Version.pm isn't there before v15.
I don't see a good reason why we couldn't back-patch it, though.
Any objections?
regards, tom lane
On 2023-01-16 Mo 18:11, Tom Lane wrote:
I wrote:
I think we're about ready to go, except for cutting down
AdjustUpgrade.pm to make versions to put in the back branches.Hmmm ... so upon trying to test in the back branches, I soon
discovered that PostgreSQL/Version.pm isn't there before v15.I don't see a good reason why we couldn't back-patch it, though.
Any objections?
No, that seems perfectly reasonable.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
I've pushed the per-branch AdjustUpgrade.pm files and tested by performing
a fresh round of buildfarm runs with the patched TestUpgradeXversion.pm
file. I think we're in good shape with this project.
I dunno if we want to stretch buildfarm owners' patience with yet
another BF client release right now. On the other hand, I'm antsy
to see if we can un-revert 1b4d280ea after doing a little more
work in AdjustUpgrade.pm.
regards, tom lane
On Mon, Jan 16, 2023 at 04:48:28PM -0500, Tom Lane wrote:
I'm slightly tempted to back-patch 002_pg_upgrade.pl so that there
is an in-tree way to verify back-branch AdjustUpgrade.pm files.
On the other hand, it's hard to believe that testing that in
HEAD won't be sufficient; I doubt the back-branch copies will
need to change much.
Backpatching 002_pg_upgrade.pl requires a bit more than the test:
there is one compatibility gotcha as of dc57366. I did not backpatch
it because nobody has complained about it until I found out about it,
but the test would require it.
By the way, thanks for your work on this stuff :)
--
Michael