BUG #18196: Databases Created in Turkish Language Will Not Run on the Latest Version of Windows
The following bug has been logged on the website:
Bug reference: 18196
Logged by: Halil Han Badem
Email address: halilhanbadem@gmail.com
PostgreSQL version: 15.4
Operating system: Windows 11 22H2 22621.2715
Description:
Hello,
This morning, I noticed that my computer at the office and my computer at
home had received automatic updates. Subsequently, I received reports of
some errors. Upon investigation, I found that the PostgreSQL service was not
running, and the error indicated that it was due to a problem in the
postgresql.conf file. After extensive research, I discovered that in the
Windows 11 22H2 22621.2715 update, the term "Turkey" had been changed to
"Türkiye." As a result, Turkish_Turkey.1254 is no longer present in the
operating system. Instead of Turkish_Turkey.1254, the Windows system now
uses Turkish_Türkiye.1254.
Databases created with Turkish_Turkey.1254 cannot find the corresponding
locale, resulting in errors and service disruptions. After a fresh
installation, I observed the following expressions in the postgresql.conf
file.
# These settings are initialized by initdb, but they can be changed.
lc_messages = 'Turkish_Türkiye.1254' # locale for system error message
# strings
lc_monetary = 'Turkish_Türkiye.1254' # locale for monetary formatting
lc_numeric = 'Turkish_Türkiye.1254' # locale for number formatting
lc_time = 'Turkish_Türkiye.1254' # locale for time formatting
Therefore, there is a conflict, and I request your assistance in addressing
this matter.
Thank you,
Best regards.
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
Operating system: Windows 11 22H2 22621.2715
This morning, I noticed that my computer at the office and my computer at
home had received automatic updates. Subsequently, I received reports of
some errors. Upon investigation, I found that the PostgreSQL service was not
running, and the error indicated that it was due to a problem in the
postgresql.conf file. After extensive research, I discovered that in the
Windows 11 22H2 22621.2715 update, the term "Turkey" had been changed to
"Türkiye." As a result, Turkish_Turkey.1254 is no longer present in the
operating system. Instead of Turkish_Turkey.1254, the Windows system now
uses Turkish_Türkiye.1254.Databases created with Turkish_Turkey.1254 cannot find the corresponding
locale, resulting in errors and service disruptions. After a fresh
installation, I observed the following expressions in the postgresql.conf
file.# These settings are initialized by initdb, but they can be changed.
lc_messages = 'Turkish_Türkiye.1254' # locale for system error message
# strings
lc_monetary = 'Turkish_Türkiye.1254' # locale for monetary formatting
lc_numeric = 'Turkish_Türkiye.1254' # locale for number formatting
lc_time = 'Turkish_Türkiye.1254' # locale for time formattingTherefore, there is a conflict, and I request your assistance in addressing
this matter.
This has been reported before [1]/messages/by-id/18181-fa5143d18a467fe5@postgresql.org, but your report explains the problem clearly.
Can you show the exact error message?
This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.
I cannot think of a possible solution except a hard-coded hack specifically
for Turkish locale names...
Yours,
Laurenz Albe
I can convey it to you in more detail. Now I will explain the steps to
avoid repeating this error.
1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish,
Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select
Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice that
the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating
that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15
10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program
Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is now
Turkish_Türkiye.1254.
In this case, the solution I implemented was as follows:
1. I copied the existing "data" folder of the database.
2. I obtained a Windows version with a lower version number than 22H2
22621.2715.
3. I defined the "data" folder for the PostgreSQL server in the relevant
version and restarted the services.
4. I accessed the database, took a backup.
5. I reinstalled PostgreSQL on the problematic server.
6. I restored the backup to the server. Before restoring, I marked the
local option for the created database as "C". This option worked smoothly
and did not cause any data loss.
Note: After updating Windows, if you reinstall PostgreSQL, you will see
that all Turkish_Turkey.1254 values in the postgresql.conf file have
changed to Turkish_Türkiye.1254. If you manipulate the value in this file
(try changing it back to Turkish_Turkey.1254), you will encounter the error
`The database was initialized with LC_COLLATE “Turkish_Turkey.1254”, which
is not recognized by setlocale()` because Turkish_Turkey.1254 locale no
longer exists; its name has been changed.
The decision of Windows to update in this regard will be a bad experience
for users in Turkey or anyone using the Turkish language option. I hope the
instant solution I found is helpful. I sincerely look forward to any
permanent solutions you may find.
Best regards.
Halil Han BADEM
Yazılım & Sistem Uzmanı
[image: linkedin icon] <https://tr.linkedin.com/in/halilhanbadem> [image:
instagram icon] <https://www.instagram.com/halilhanbadem.dev/>
E-Posta: halilhanbadem@gmail.com | work@halilhanbadem.dev
VD. - VNo.: 1290500711 - HİTİT | Telefon: (850) 309 4875
Balgat Mahallesi, Ceyhun Atuf Kansu Caddesi 36/6 06520, Çankaya, Ankara
Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 20:48 tarihinde
şunu yazdı:
Show quoted text
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
Operating system: Windows 11 22H2 22621.2715
This morning, I noticed that my computer at the office and my computer at
home had received automatic updates. Subsequently, I received reports of
some errors. Upon investigation, I found that the PostgreSQL service wasnot
running, and the error indicated that it was due to a problem in the
postgresql.conf file. After extensive research, I discovered that in the
Windows 11 22H2 22621.2715 update, the term "Turkey" had been changed to
"Türkiye." As a result, Turkish_Turkey.1254 is no longer present in the
operating system. Instead of Turkish_Turkey.1254, the Windows system now
uses Turkish_Türkiye.1254.Databases created with Turkish_Turkey.1254 cannot find the corresponding
locale, resulting in errors and service disruptions. After a fresh
installation, I observed the following expressions in the postgresql.conf
file.# These settings are initialized by initdb, but they can be changed.
lc_messages = 'Turkish_Türkiye.1254' # locale forsystem error message
# strings
lc_monetary = 'Turkish_Türkiye.1254' # locale formonetary formatting
lc_numeric = 'Turkish_Türkiye.1254' # locale for
number formatting
lc_time = 'Turkish_Türkiye.1254' # locale
for time formatting
Therefore, there is a conflict, and I request your assistance in
addressing
this matter.
This has been reported before [1], but your report explains the problem
clearly.Can you show the exact error message?
This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.I cannot think of a possible solution except a hard-coded hack specifically
for Turkish locale names...Yours,
Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
[ Microsoft decided to rename their Turkish locales ]
This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.
Yeah, that's a mess. The postgresql.conf entries could be fixed by hand,
but if you've got "Turkish_Turkey.1254" in pg_database.datcollate or
datctype then there's no easy way around that.
I think really the right fix is a whole bunch of WTFs directed at
Microsoft. I can understand wanting to support the Turkish_Türkiye
spelling, but not having a backward-compatibility entry for the old
spelling is a colossal blunder -- especially in a minor OS release.
Aren't these the same people who prize backward compatibility at
any cost?
regards, tom lane
On Wed, 2023-11-15 at 21:11 +0300, Halil Han Badem wrote:
7. Later, in the Windows Event Logs, you will see the error
"FATAL: configuration file "C:/Program Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is
now Turkish_Türkiye.1254.
If you edit "postgresql.conf" and change all occurrences of "Turkish_Turkey.1254"
to "Turkish_Türkiye.1254", can you start the service?
If yes, can you connect to a Turkish database?
Yours,
Laurenz Albe
In this case, the service starts, but you cannot establish a connection to
the database. In other words, it gives the following error. This error is
expected because the database was created with Turkish_Turkey.1254. Due to
the name change to Turkish_Türkiye.1254, it cannot find the local name.
failed: FATAL: database locale is incompatible with operating system
DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254",
which is not recognized by setlocale(). HINT: Recreate the database with
another locale or install the missing locale.
Regards,
Halil Han.
Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 21:29 tarihinde
şunu yazdı:
Show quoted text
On Wed, 2023-11-15 at 21:11 +0300, Halil Han Badem wrote:
7. Later, in the Windows Event Logs, you will see the error
"FATAL: configuration file "C:/ProgramFiles/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time underTurkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is
now Turkish_Türkiye.1254.If you edit "postgresql.conf" and change all occurrences of
"Turkish_Turkey.1254"
to "Turkish_Türkiye.1254", can you start the service?
If yes, can you connect to a Turkish database?Yours,
Laurenz Albe
Hi
st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <halilhanbadem@gmail.com>
napsal:
I can convey it to you in more detail. Now I will explain the steps to
avoid repeating this error.1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish,
Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select
Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice
that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating
that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15
10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program
Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is now
Turkish_Türkiye.1254.
This is a similar issue to rename the Czech locale "Czech_Czech
Republic.1250" to "Czech_Czechia.1250" six years ago.
Regards
Pavel
The more significant issue here is that PostgreSQL does not allow
connections because, during installation, it was created with the
Turkish_Turkey.1254 option. I attempted to connect to the PostgreSQL user
using psql via the command line to take a backup, but it didn't even allow
a connection to the main database (postgres). The error it provided is
again as follows:
failed: FATAL: database locale is incompatible with the operating system
DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254",
which is not recognized by setlocale(). HINT: Recreate the database with
another locale or install the missing locale.
In this section, there are two definite solutions. The first one is the
solution I explained [1]/messages/by-id/CA+hfYg-rGc-O_LnS5M59HGbYufaGsQ6Sc4yBKucBQZLhoZE_7w@mail.gmail.com, and the second one is to uninstall the Windows
Update. However, removing this update will not be a permanent solution
because, eventually, a Windows Update will be received for security or any
dependency reasons.
[1]: /messages/by-id/CA+hfYg-rGc-O_LnS5M59HGbYufaGsQ6Sc4yBKucBQZLhoZE_7w@mail.gmail.com
/messages/by-id/CA+hfYg-rGc-O_LnS5M59HGbYufaGsQ6Sc4yBKucBQZLhoZE_7w@mail.gmail.com
Regards.
Halil Han Badem <halilhanbadem@gmail.com>, 15 Kas 2023 Çar, 21:47 tarihinde
şunu yazdı:
Show quoted text
In this case, the service starts, but you cannot establish a connection to
the database. In other words, it gives the following error. This error is
expected because the database was created with Turkish_Turkey.1254. Due to
the name change to Turkish_Türkiye.1254, it cannot find the local name.failed: FATAL: database locale is incompatible with operating system
DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254",
which is not recognized by setlocale(). HINT: Recreate the database with
another locale or install the missing locale.Regards,
Halil Han.Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 21:29 tarihinde
şunu yazdı:On Wed, 2023-11-15 at 21:11 +0300, Halil Han Badem wrote:
7. Later, in the Windows Event Logs, you will see the error
"FATAL: configuration file "C:/ProgramFiles/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time underTurkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no
longer
present in the Windows operating system; the correct locale name is
now Turkish_Türkiye.1254.If you edit "postgresql.conf" and change all occurrences of
"Turkish_Turkey.1254"
to "Turkish_Türkiye.1254", can you start the service?
If yes, can you connect to a Turkish database?Yours,
Laurenz Albe
Thank you Pavel. This is indeed a good solution if you have a few servers.
We have users in hundreds across all major cities in Turkey, and each user
is using our database on their dedicated server (we are developing a
desktop application). In this case, providing support for all databases
will be very challenging. We will document the solution you mentioned and
add it to our documentation.
Best regards.
Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:03 tarihinde
şunu yazdı:
Show quoted text
Hi
st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <
halilhanbadem@gmail.com> napsal:I can convey it to you in more detail. Now I will explain the steps to
avoid repeating this error.1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish,
Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select
Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice
that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error stating
that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15
10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program
Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is now
Turkish_Türkiye.1254.This is a similar issue to rename the Czech locale "Czech_Czech
Republic.1250" to "Czech_Czechia.1250" six years ago.Regards
Pavel
Hi
st 15. 11. 2023 v 20:10 odesílatel Halil Han Badem <halilhanbadem@gmail.com>
napsal:
Thank you Pavel. This is indeed a good solution if you have a few servers.
We have users in hundreds across all major cities in Turkey, and each user
is using our database on their dedicated server (we are developing a
desktop application). In this case, providing support for all databases
will be very challenging. We will document the solution you mentioned and
add it to our documentation.
maybe it can be scripted, automatized. If I remember correctly, one of my
customers wrote a small application that did this.
Sure - it is a workaround - dirty, but some clients can be fixed tomorrow.
Show quoted text
Best regards.
Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:03 tarihinde
şunu yazdı:Hi
st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <
halilhanbadem@gmail.com> napsal:I can convey it to you in more detail. Now I will explain the steps to
avoid repeating this error.1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish,
Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select
Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice
that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error
stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15
10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program
Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is now
Turkish_Türkiye.1254.This is a similar issue to rename the Czech locale "Czech_Czech
Republic.1250" to "Czech_Czechia.1250" six years ago.Regards
Pavel
On Wed, 2023-11-15 at 21:47 +0300, Halil Han Badem wrote:
In this case, the service starts, but you cannot establish a connection to the database.
In other words, it gives the following error. This error is expected because the
database was created with Turkish_Turkey.1254. Due to the name change to
Turkish_Türkiye.1254, it cannot find the local name.failed: FATAL: database locale is incompatible with operating system
DETAIL: The database was initialized with LC_COLLATE "Turkish_Turkey.1254", which is not recognized by setlocale().
HINT: Recreate the database with another locale or install the missing locale.
That was to be expected...
I cannot think of a good solution except to be Microsoft to unbreak their
software. If you have not performed the Windows update on a system yet,
and you are bold enough, you could try the following hack:
- connect as superuser
- modify the catalog table with
UPDATE pg_database
SET datcollate = 'Turkish_Türkiye.1254',
datctype = 'Turkish_Türkiye.1254'
WHERE datcollate = 'Turkish_Turkey.1254';
- stop the PostgreSQL service
- edit "postgresql.conf" and change "Turkish_Turkey.1254" to "Turkish_Türkiye.1254" everywhere
- install the Windows update
- start the service
- in case the collation changed, REINDEX all indexes on string expressions
A safer option would be to dump the database before the upgrade
and restore the dump to a newly created cluster after the upgrade.
If you have used "Turkish_Turkey.1254" as explicit column collation
or somewhere else in your code, you will still run into problems.
This should really be fixed by Microsoft.
Yours,
Laurenz Albe
Thank you, Pavel. With the feature in the Locale Builder application, we
can create an MSI build. With your solution, an MSI build can be created
and easily configured later.
Thanks for the tip.
Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:19 tarihinde
şunu yazdı:
Show quoted text
Hi
st 15. 11. 2023 v 20:10 odesílatel Halil Han Badem <
halilhanbadem@gmail.com> napsal:Thank you Pavel. This is indeed a good solution if you have a few
servers. We have users in hundreds across all major cities in Turkey, and
each user is using our database on their dedicated server (we are
developing a desktop application). In this case, providing support for all
databases will be very challenging. We will document the solution you
mentioned and add it to our documentation.maybe it can be scripted, automatized. If I remember correctly, one of my
customers wrote a small application that did this.Sure - it is a workaround - dirty, but some clients can be fixed
tomorrow.Best regards.
Pavel Stehule <pavel.stehule@gmail.com>, 15 Kas 2023 Çar, 22:03
tarihinde şunu yazdı:Hi
st 15. 11. 2023 v 19:53 odesílatel Halil Han Badem <
halilhanbadem@gmail.com> napsal:I can convey it to you in more detail. Now I will explain the steps to
avoid repeating this error.1. Obtain a Windows version prior to version 22H2 22621.2715.
2. During the PostgreSQL installation in that version, choose Turkish,
Turkey as the locale (during installation).
3. Create a database using PgAdmin or any other tool and select
Turkish_Turkey.1254 as the local option.
4. Update your operating system to version 22H2 22621.2715.
5. After updating the operating system and restarting, you will notice
that the PostgreSQL service is not running.
6. When you try to start the service, you will encounter an error
stating that the Windows service cannot be started.
7. Later, in the Windows Event Logs, you will see the error "2023-11-15
10:47:14.877 +03 [1644] FATAL: configuration file "C:/Program
Files/PostgreSQL/15/data/postgresql.conf" contains errors."
8. The cause of the error in this file is the values specified as
lc_messages, lc_monetary, lc_numeric, lc_time under Turkish_Turkey.1254.
This is because, with the new update, Turkish_Turkey.1254 is no longer
present in the Windows operating system; the correct locale name is now
Turkish_Türkiye.1254.This is a similar issue to rename the Czech locale "Czech_Czech
Republic.1250" to "Czech_Czechia.1250" six years ago.Regards
Pavel
Yes, we're paying the price for Microsoft's mistake. It's really
unacceptable. Thank you very much for the tips and solution suggestions.
Let's discuss the situation and take action with the solutions as needed.
Thank you so much again.
Laurenz Albe <laurenz.albe@cybertec.at>, 15 Kas 2023 Çar, 22:41 tarihinde
şunu yazdı:
Show quoted text
On Wed, 2023-11-15 at 21:47 +0300, Halil Han Badem wrote:
In this case, the service starts, but you cannot establish a connection
to the database.
In other words, it gives the following error. This error is expected
because the
database was created with Turkish_Turkey.1254. Due to the name change to
Turkish_Türkiye.1254, it cannot find the local name.failed: FATAL: database locale is incompatible with operating system
DETAIL: The database was initialized with LC_COLLATE"Turkish_Turkey.1254", which is not recognized by setlocale().
HINT: Recreate the database with another locale or install the missing
locale.
That was to be expected...
I cannot think of a good solution except to be Microsoft to unbreak their
software. If you have not performed the Windows update on a system yet,
and you are bold enough, you could try the following hack:- connect as superuser
- modify the catalog table with
UPDATE pg_database
SET datcollate = 'Turkish_Türkiye.1254',
datctype = 'Turkish_Türkiye.1254'
WHERE datcollate = 'Turkish_Turkey.1254';- stop the PostgreSQL service
- edit "postgresql.conf" and change "Turkish_Turkey.1254" to
"Turkish_Türkiye.1254" everywhere- install the Windows update
- start the service
- in case the collation changed, REINDEX all indexes on string expressions
A safer option would be to dump the database before the upgrade
and restore the dump to a newly created cluster after the upgrade.If you have used "Turkish_Turkey.1254" as explicit column collation
or somewhere else in your code, you will still run into problems.This should really be fixed by Microsoft.
Yours,
Laurenz Albe
On Thu, Nov 16, 2023 at 7:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
[ Microsoft decided to rename their Turkish locales ]
This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.Yeah, that's a mess. The postgresql.conf entries could be fixed by hand,
but if you've got "Turkish_Turkey.1254" in pg_database.datcollate or
datctype then there's no easy way around that.I think really the right fix is a whole bunch of WTFs directed at
Microsoft. I can understand wanting to support the Turkish_Türkiye
spelling, but not having a backward-compatibility entry for the old
spelling is a colossal blunder -- especially in a minor OS release.
Aren't these the same people who prize backward compatibility at
any cost?
For the record, the Windows manual tells us very, very clearly not to
store those "display" style names anywhere, and to use BCP 47 locale
names instead, which are based on ISO standards. I had patches to
teach initdb to use BCP 47, and I remain convinced that we must do
that, but I have failed to attract the attention of enough people who
use and understand Windows to be able to commit anything. In
particular, Juan José pointed out some problems that I just couldn't
understand.
/messages/by-id/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw@mail.gmail.com
PG Bug reporting form wrote:
As a result, Turkish_Turkey.1254 is no longer present in the
operating system. Instead of Turkish_Turkey.1254, the Windows system
now uses Turkish_Türkiye.1254.
Databases created with Turkish_Turkey.1254 cannot find the
corresponding locale, resulting in errors and service disruptions
Aside from the disappearance of the previous name, this "ü" in the new
name also looks troublesome. It might be encoded as a single byte in
an encoding of the ISO-8859-* family, in which case the byte would be
0xFC it seems, but in an UTF-8 context, it would be represented by the
sequence 0xC3 0xBC.
This locale name ends up in pg_database.datcollate and datctype, and
since pg_database is shared among all databases of the instance, it's
supposed to be encoded in a way that is compatible with all possible
db encodings, which in fact implies not having characters outside
of the US-ASCII character set.
In practice, in a cluster initialized with the locale
"Turkish_Türkiye.1254" and a monobyte encoding, assuming that
an UTF-8 database can be successfully created in it , how
would "select * from pg_database" from inside that database
not fail with an UTF-8 decoding error?
Best regards,
--
Daniel Vérité
https://postgresql.verite.pro/
Twitter: @DanielVerite
"Daniel Verite" <daniel@manitou-mail.org> writes:
Aside from the disappearance of the previous name, this "ü" in the new
name also looks troublesome. It might be encoded as a single byte in
an encoding of the ISO-8859-* family, in which case the byte would be
0xFC it seems, but in an UTF-8 context, it would be represented by the
sequence 0xC3 0xBC.
Indeed. That jogged my memory that we've dealt with this sort of
thing before; search the commit log for mentions of "Norwegian (Bokmål)".
I guess we could add mapping entry(s) similar to what we use for that,
although that won't help anybody who's already stuck with an unstartable
database.
regards, tom lane
On Thu, Nov 16, 2023 at 9:00 AM Halil Han Badem <halilhanbadem@gmail.com> wrote:
Yes, we're paying the price for Microsoft's mistake. It's really unacceptable.
I am certainly no Windows apologist, but I think on this one, it feels
like PostgreSQL is at least partially in the wrong here. Windows
documentation tells everyone that these names might change in every OS
update, and that we shouldn't store them anywhere, and what to do
instead ("tr-TR"). This has happened again and again over the past
decades and win32locale.c is full of bizarro hacks to deal with it.
If something good can come of this round of
"WTF-how-could-they-do-this-to-us", I hope it can be some more
eyeballs on the thread that I posted to fix this problem (see link in
my earlier message).
(Full disclosure: I work for Azure Data, a branch of Microsoft, but I
have nothing to do with the Windows operating system and don't use it,
I'm employed to work on open source stuff. I'm speaking with my
PostgreSQL hacker hat on, here.)
On Thu, 2023-11-16 at 11:54 +1300, Thomas Munro wrote:
On Thu, Nov 16, 2023 at 7:26 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Wed, 2023-11-15 at 09:04 +0000, PG Bug reporting form wrote:
[ Microsoft decided to rename their Turkish locales ]
This seems to be a really nasty problem that Microsoft has created for us.
Even if the server started, you could not connect to the database if the
collation persisted in pg_database is no longer accepted by the operating
system.Yeah, that's a mess. The postgresql.conf entries could be fixed by hand,
but if you've got "Turkish_Turkey.1254" in pg_database.datcollate or
datctype then there's no easy way around that.I think really the right fix is a whole bunch of WTFs directed at
Microsoft. I can understand wanting to support the Turkish_Türkiye
spelling, but not having a backward-compatibility entry for the old
spelling is a colossal blunder -- especially in a minor OS release.
Aren't these the same people who prize backward compatibility at
any cost?For the record, the Windows manual tells us very, very clearly not to
store those "display" style names anywhere, and to use BCP 47 locale
names instead, which are based on ISO standards. I had patches to
teach initdb to use BCP 47, and I remain convinced that we must do
that, but I have failed to attract the attention of enough people who
use and understand Windows to be able to commit anything. In
particular, Juan José pointed out some problems that I just couldn't
understand./messages/by-id/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw@mail.gmail.com
I don't have Windows to test, but wouldn't the problem be avoided if
people created their cluster with "--locale=tr-TR"? If yes, EDB's
Windows installer should be modified to use the correct locale names.
Is anybody from EDB reading this?
Yours,
Laurenz Albe
On Mon, Nov 20, 2023 at 10:54 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2023-11-16 at 11:54 +1300, Thomas Munro wrote:
/messages/by-id/CA+hUKGJ=XThErgAQRoqfCy1bKPxXVuF0=2zDbB+SxDs59pv7Fw@mail.gmail.com
I don't have Windows to test, but wouldn't the problem be avoided if
people created their cluster with "--locale=tr-TR"? If yes, EDB's
Windows installer should be modified to use the correct locale names.
Is anybody from EDB reading this?
That would have the same effect as that patch. I believe that is the
right thing to do, but I am not sure about one detail: the lack of
encoding on the end of the name. Does the encoding remain the same as
the traditional one for that language, and stable, or when you haven't
explicitly named it, might it depend on the registry/control panel
"ACP", that could in theory change? If it does, what happens? Should
we put it on the end to pin it down? When we tried that it had some
effect, but didn't seem to have the expected [by me] effect on ctype
(eg case conversion of the famous Turkish i); it appears that having
ACP != LC_TYPE gives some Frankenstein behaviour, but I don't
understand it, and I think you'd also want to determine in which cases
strcoll_l() is behaving sensibly with various combinations. This may
all be pre-existing stuff well understood by people who worked on the
Windows port? I don't have Windows to test.
(How nice it would be to use ICU by default!)