Move definition of standard collations from initdb to pg_collation.dat

Started by Peter Eisentraut about 3 years ago | 3 messages | hackers
#1 Peter Eisentraut
peter_e@gmx.net

While working on [0], I was wondering why the collations ucs_basic and
unicode are not in pg_collation.dat. I traced this back through
history, and I think this was just lost in a game of telephone.

The initial commit for pg_collation.h (414c5a2ea6) has only the default
collation in pg_collation.h (pre .dat), with initdb handling everything
else. Over time, additional collations "C" and "POSIX" were moved to
pg_collation.h, and other logic was moved from initdb to
pg_import_system_collations(). But ucs_basic was untouched. Commit
0b13b2a771 rearranged the relative order of operations in initdb and
added the current comment "We don't want to pin these", but looking at
the email[1], I think this was more a guess about the previous intent.

I suggest we fix this now; see attached patch.

[0]: /messages/by-id/1293e382-2093-a2bf-a397-c04e8f83d3c2@enterprisedb.com

[1]: /messages/by-id/28195.1498172402@sss.pgh.pa.us

Attachments:

0001-Move-definition-of-standard-collations-from-initdb-t.patch (text/plain; charset=UTF-8), +8 -15
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: Move definition of standard collations from initdb to pg_collation.dat

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

> While working on [0], I was wondering why the collations ucs_basic and
> unicode are not in pg_collation.dat. I traced this back through
> history, and I think this was just lost in a game of telephone.
>
> The initial commit for pg_collation.h (414c5a2ea6) has only the default
> collation in pg_collation.h (pre .dat), with initdb handling everything
> else. Over time, additional collations "C" and "POSIX" were moved to
> pg_collation.h, and other logic was moved from initdb to
> pg_import_system_collations(). But ucs_basic was untouched. Commit
> 0b13b2a771 rearranged the relative order of operations in initdb and
> added the current comment "We don't want to pin these", but looking at
> the email[1], I think this was more a guess about the previous intent.

Yeah, I was just loath to change the previous behavior in that
patch. I can't see any strong reason not to pin these entries.

> I suggest we fix this now; see attached patch.

While we're here, do we want to adopt some other spelling of "the
root locale" than "und", in view of recent discoveries about the
instability of that on old ICU versions?

regards, tom lane

#3 Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#2)
Re: Move definition of standard collations from initdb to pg_collation.dat

On 28.03.23 13:33, Tom Lane wrote:

> While we're here, do we want to adopt some other spelling of "the
> root locale" than "und", in view of recent discoveries about the
> instability of that on old ICU versions?

That issue was fixed by 3b50275b12, so we can keep using the "und" spelling.