pgsql: Catalog changes preparing for builtin collation provider.

Started by Jeff Davisover 2 years ago12 messagescomitters
Jump to latest
#1Jeff Davis
pgsql@j-davis.com

Catalog changes preparing for builtin collation provider.

Rename pg_collation.colliculocale to colllocale, and
pg_database.daticulocale to datlocale. These names reflects that the
fields will be useful for the upcoming builtin provider as well, not
just for ICU.

This is purely a rename; no changes to the meaning of the fields.

Discussion: /messages/by-id/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel@j-davis.com
Reviewed-by: Peter Eisentraut

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/f696c0cd5f299f1b51e214efc55a22a782cc175d

Modified Files
--------------
doc/src/sgml/bki.sgml | 2 +-
doc/src/sgml/catalogs.sgml | 26 +++++++---
src/backend/catalog/pg_collation.c | 10 ++--
src/backend/commands/collationcmds.c | 34 ++++++-------
src/backend/commands/dbcommands.c | 68 +++++++++++++-------------
src/backend/utils/adt/pg_locale.c | 4 +-
src/backend/utils/init/postinit.c | 12 ++---
src/bin/initdb/initdb.c | 34 ++++++-------
src/bin/pg_dump/pg_dump.c | 56 +++++++++++----------
src/bin/pg_upgrade/info.c | 31 +++++++-----
src/bin/pg_upgrade/pg_upgrade.c | 34 +++++++++----
src/bin/pg_upgrade/pg_upgrade.h | 2 +-
src/bin/pg_upgrade/t/002_pg_upgrade.pl | 29 +++++++----
src/bin/psql/describe.c | 20 +++++---
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_collation.dat | 2 +-
src/include/catalog/pg_collation.h | 4 +-
src/include/catalog/pg_database.dat | 2 +-
src/include/catalog/pg_database.h | 2 +-
src/test/regress/expected/collate.icu.utf8.out | 4 +-
src/test/regress/expected/psql.out | 18 +++----
src/test/regress/sql/collate.icu.utf8.sql | 4 +-
22 files changed, 231 insertions(+), 169 deletions(-)

#2Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#1)
Re: pgsql: Catalog changes preparing for builtin collation provider.

On Sat, 2024-03-09 at 22:50 +0000, Jeff Davis wrote:

Catalog changes preparing for builtin collation provider.

This is causing problems on a couple buildfarm members (mantid,
lapwing, snakefly) that all look like this:

# Running: pg_upgrade --no-sync -d
/home/postgres/buildroot/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/
t_002_pg_upgrade_old_node_data/pgdata -D
/home/postgres/buildroot/HEAD/pgsql.build/src/bin/pg_upgrade/tmp_check/
t_002_pg_upgrade_new_node_data/pgdata -b
/home/postgres/buildroot/HEAD/pgsql.build/tmp_install/home/postgres/bui
ldroot/HEAD/inst/bin -B
/home/postgres/buildroot/HEAD/pgsql.build/tmp_install/home/postgres/bui
ldroot/HEAD/inst/bin -s /tmp/3b07YyaACZ -p 52828 -P 52829 --copy --
check
[01:08:08.992](1.053s) ok 9 - invalid database causes failure status
(got 1 vs expected 1)
[01:08:08.992](0.001s) ok 10 - invalid database causes failure stdout
/(?^:invalid)/
[01:08:08.993](0.001s) not ok 11 - invalid database causes failure
stderr /(?^:)/
[01:08:08.993](0.000s)
[01:08:08.993](0.000s) # Failed test 'invalid database causes failure
stderr /(?^:)/'
[01:08:08.993](0.000s) # at t/002_pg_upgrade.pl line 370.
[01:08:08.994](0.000s) # ''
# doesn't match '(?^:)'

That's a little confusing to me: the '(?^:)' is just qr//, which should
match anything, right?

It's expecting stderr to be empty, and the test failure seems to
indicate it is empty, so again I don't really see the problem.

Am I missing something obvious here?

Quite a few other members are passing and I didn't find an obvious
pattern yet.

Regards,
Jeff Davis

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#2)
Re: pgsql: Catalog changes preparing for builtin collation provider.

Jeff Davis <pgsql@j-davis.com> writes:

On Sat, 2024-03-09 at 22:50 +0000, Jeff Davis wrote:

Catalog changes preparing for builtin collation provider.

This is causing problems on a couple buildfarm members (mantid,
lapwing, snakefly) that all look like this:
...
That's a little confusing to me: the '(?^:)' is just qr//, which should
match anything, right?

Yeah. It looks to me like it's somehow Perl-version-related.
The oldest Perl versions in the buildfarm (as of last month,
last time I scraped these results) are

longfin | 2024-02-08 11:16:05 | configure: using perl 5.14.0
lapwing | 2024-02-08 12:40:20 | configure: using perl 5.14.2
arowana | 2024-02-08 04:08:57 | configure: using perl 5.16.3
boa | 2024-02-07 16:10:12 | configure: using perl 5.16.3
buri | 2024-02-07 20:08:23 | configure: using perl 5.16.3
clam | 2024-02-08 15:00:10 | configure: using perl 5.16.3
dhole | 2024-02-08 06:09:13 | configure: using perl 5.16.3
mantid | 2024-02-08 14:07:10 | configure: using perl 5.16.3
parula | 2024-02-08 11:15:02 | configure: using perl 5.16.3
rhinoceros | 2024-02-08 11:52:11 | configure: using perl 5.16.3
siskin | 2024-02-07 20:16:39 | configure: using perl 5.16.3
snakefly | 2024-02-08 11:15:03 | configure: using perl 5.16.3

Of these, lapwing, mantid, and snakefly have failed; arowana
succeeded, but it didn't run the pg_upgrade test; longfin succeeded,
which I'm confused about; and the rest did not run yet. Not sure
what to make of that, but I counsel sleeping on it and seeing what
the remaining animals say.

regards, tom lane

#4Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#3)
Re: pgsql: Catalog changes preparing for builtin collation provider.

On Sun, 2024-03-10 at 00:05 -0500, Tom Lane wrote:

Yeah.  It looks to me like it's somehow Perl-version-related.

I found it. On perl 5.16.3 on a failing instance:

use strict;
die '"" does not match qr// before substitution' unless '' =~ qr//;
my $foo = ('|a|' =~ s/a//r);
die '"" does not match qr// after substitution' unless '' =~ qr//;
print "$foo\n";

dies with:

"" does not match qr// after substitution at - line 4.

My commit adds a line using the s/// operator.

That's so bizarre that I have to guess that it's a perl bug. I poked
around in the perl docs to see if I'm missing something, but I didn't
find anything indicating that a substitution would have crazy side
effects. Please correct me if I'm wrong; I'm far from a perl expert.

Assuming that it is a perl bug, there are two potential workarounds:

1. Use qr/^$/ for the test rather than qr//.

2. Don't use s/// anywhere. Find another way to transform devel
versions into numbers.

Either one has the potential to leave traps for later. New tests might
either rely on s/// or expect qr// to work. I am inclined toward #1,
because if we use qr/^$/, other tests are likely to copy it and they
will be safe as well.

Though I'm still not sure what's going on with longfin.

Thoughts?

Regards,
Jeff Davis

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#4)
Re: pgsql: Catalog changes preparing for builtin collation provider.

Jeff Davis <pgsql@j-davis.com> writes:

I found it. On perl 5.16.3 on a failing instance:
use strict;
die '"" does not match qr// before substitution' unless '' =~ qr//;
my $foo = ('|a|' =~ s/a//r);
die '"" does not match qr// after substitution' unless '' =~ qr//;
print "$foo\n";
dies with:
"" does not match qr// after substitution at - line 4.

Wow.

That's so bizarre that I have to guess that it's a perl bug.

I agree.

Assuming that it is a perl bug, there are two potential workarounds:
1. Use qr/^$/ for the test rather than qr//.
2. Don't use s/// anywhere. Find another way to transform devel
versions into numbers.
Either one has the potential to leave traps for later. New tests might
either rely on s/// or expect qr// to work. I am inclined toward #1,
because if we use qr/^$/, other tests are likely to copy it and they
will be safe as well.

I like #1 as well, because qr// imposes zero constraints on what
the command's stderr is. Checking qr/^$/ would express what we
really want to express, namely no stderr output.

Though I'm still not sure what's going on with longfin.

Me too, not least because your minimal example fails just as above
with longfin's 5.14.0 build. The fact that longfin nonetheless
passes the whole test indicates that something in between those
steps manages to clean up whatever the broken state is.

I also see that none of the other animals running 5.16.3 ever
showed the failure. This could easily indicate some bizarre
platform-specificity in the bug. It could also mean that the
Linux vendors started carrying patches for this bug, but the
three machines that failed are too old to have any such patch.
Digging in perl's revision history might be interesting if we
really felt the need to identify the bug precisely, but I don't.
Let's just go with your #1.

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#5)
Re: pgsql: Catalog changes preparing for builtin collation provider.

I wrote:

Jeff Davis <pgsql@j-davis.com> writes:

That's so bizarre that I have to guess that it's a perl bug.

I agree.

Oh! No, it's a "feature": man perlop quoth

The empty pattern "//"
If the PATTERN evaluates to the empty string, the last
successfully matched regular expression is used instead. In
this case, only the "g" and "c" flags on the empty pattern are
honored; the other flags are taken from the original pattern.
If no match has previously succeeded, this will (silently) act
instead as a genuine empty pattern (which will always match).

So we actually need to find and nuke all of these, not just the one
that's causing trouble.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#4)
Re: pgsql: Catalog changes preparing for builtin collation provider.

Jeff Davis <pgsql@j-davis.com> writes:

My commit adds a line using the s/// operator.

Oh, independently of the empty-pattern problem: what you added is

+# Numeric major version of old cluster, ignoring "devel" suffix.
+# Needed for testing upgrades from development version to itself.
+my $old_major_version = int($oldnode->pg_version =~ s/devel//rg);

Surely this will break the moment we enter beta. Rather than
trying to fix that directly, what we should be doing is realizing
that you've reinvented -- badly -- the facilities provided by
the PostgreSQL::Version package. The code you changed,

if ($oldnode->pg_version >= 15 && $ENV{with_icu} eq 'yes')

was perfectly correct as it stood, because pg_version is a
PostgreSQL::Version object. Why did you feel a need to
editorialize on that?

regards, tom lane

#8Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#6)
Re: pgsql: Catalog changes preparing for builtin collation provider.

On Mon, 2024-03-11 at 16:21 -0400, Tom Lane wrote:

Oh!  No, it's a "feature": man perlop quoth

Wow.

So we actually need to find and nuke all of these, not just the one
that's causing trouble.

For now I committed the one change to fix the buildfarm.

I am still confused on a couple of points here, such as: why does my
example work fine on newer versions of perl?

But I agree: if the empty pattern is magical, we should get rid of it,
even if we don't understand the exact conditions under which it behaves
magically.

Reagrds,
Jeff Davis

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#8)
Re: pgsql: Catalog changes preparing for builtin collation provider.

Jeff Davis <pgsql@j-davis.com> writes:

I am still confused on a couple of points here, such as: why does my
example work fine on newer versions of perl?

I'm not sure about that either. The discussions of this that I found
on-line say that it's not something the Perl gurus want to get rid of,
but it sure seems like they must have tightened the behavior to some
extent.

Another question is why it didn't fail on all the animals with similar
Perl vintage. My guess about that is that there's some configuration
difference causing the Perl script to take slightly different code
paths, but it's just a guess.

But I agree: if the empty pattern is magical, we should get rid of it,
even if we don't understand the exact conditions under which it behaves
magically.

Yeah. I was dismayed to find that there's no perlcritic check for
this, because it sure seems like the kind of thing you don't want
to invoke by accident.

regards, tom lane

#10Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#7)
Re: pgsql: Catalog changes preparing for builtin collation provider.

On Mon, 2024-03-11 at 17:08 -0400, Tom Lane wrote:

Jeff Davis <pgsql@j-davis.com> writes:

was perfectly correct as it stood, because pg_version is a
PostgreSQL::Version object.  Why did you feel a need to
editorialize on that?

The goal was to do a version check for 17 that's inclusive of
development versions.

Patch attached, following the example in AdjustUpgrade.pm. It feels a
bit inconsistent to sometimes use $oldnode->pg_version and sometimes
use $old_major_version, but it's certainly better than what I had done
in f696c0cd5f.

Regards,
Jeff Davis

Attachments:

v1-0001-Another-fix-to-002_pg_upgrade.pl.patchtext/x-patch; charset=UTF-8; name=v1-0001-Another-fix-to-002_pg_upgrade.pl.patchDownload+4-5
#11Jeff Davis
pgsql@j-davis.com
In reply to: Tom Lane (#9)
Re: pgsql: Catalog changes preparing for builtin collation provider.

On Mon, 2024-03-11 at 18:04 -0400, Tom Lane wrote:

Another question is why it didn't fail on all the animals with
similar
Perl vintage.  My guess about that is that there's some configuration
difference causing the Perl script to take slightly different code
paths, but it's just a guess.

Sounds plausible.

Yeah.  I was dismayed to find that there's no perlcritic check for
this, because it sure seems like the kind of thing you don't want
to invoke by accident.

Additionally, the delimiters can many characters, which makes simple
grepping harder.

I found a few cases -- attached a patch. Something is better than
nothing. I'm not sure that I got the /.*/ cases right, but they don't
seem to be expecting any output, so /.*/ seemed like the right pattern.

Regards,
Jeff Davis

Attachments:

v1-0001-perl-avoid-empty-regex-patterns.patchtext/x-patch; charset=UTF-8; name=v1-0001-perl-avoid-empty-regex-patterns.patchDownload+9-10
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Davis (#10)
Re: pgsql: Catalog changes preparing for builtin collation provider.

Jeff Davis <pgsql@j-davis.com> writes:

Patch attached, following the example in AdjustUpgrade.pm. It feels a
bit inconsistent to sometimes use $oldnode->pg_version and sometimes
use $old_major_version, but it's certainly better than what I had done
in f696c0cd5f.

You're still doing it the hard way. I suggest the attached.

regards, tom lane

Attachments:

simpler-version-check.patchtext/x-diff; charset=us-ascii; name=simpler-version-check.patchDownload+2-6