Re: Solaris versus our NLS files
On 2025-Dec-09, Tom Lane wrote:
If you're right about Sun not doing transcoding, then I guess we would
only need to create symlinks matching the encodings used in our .po
files, which'd remove the symlink bloat problem and replace it with
how-do-we-extract-that-encoding-name ... although it looks like all
but one is in UTF-8, so maybe we should just decree they have to be
in UTF-8? The lone exception is src/bin/pg_config/po/nb.po, which
seems not to have been touched since 2013.
Hmm, where do you see that file? It was removed by commit 3c70de2e12b9
from branch 12 in 2019, and has never existed since.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Los cuentos de hadas no dan al niño su primera idea sobre los monstruos.
Lo que le dan es su primera idea de la posible derrota del monstruo."
(G. K. Chesterton)
Álvaro Herrera <alvherre@kurilemu.de> writes:
On 2025-Dec-09, Tom Lane wrote:
If you're right about Sun not doing transcoding, then I guess we would
only need to create symlinks matching the encodings used in our .po
files, which'd remove the symlink bloat problem and replace it with
how-do-we-extract-that-encoding-name ... although it looks like all
but one is in UTF-8, so maybe we should just decree they have to be
in UTF-8? The lone exception is src/bin/pg_config/po/nb.po, which
seems not to have been touched since 2013.
Hmm, where do you see that file? It was removed by commit 3c70de2e12b9
from branch 12 in 2019, and has never existed since.
That translation commit was on the REL_12_STABLE branch, after it was
cut from master (after rc1, even). Looking more closely, the
post-branch translation updates deleted it from version 12, 13, 14, and
15, but not 16 onwards, and the file is still there in master:
https://git.postgresql.org/cgit/postgresql.git/tree/src/bin/pg_config/po/nb.po
- ilmari
On 2025-Dec-10, Dagfinn Ilmari Mannsåker wrote:
Álvaro Herrera <alvherre@kurilemu.de> writes:
Hmm, where do you see that file? It was removed by commit 3c70de2e12b9
from branch 12 in 2019, and has never existed since.
That translation commit was on the REL_12_STABLE branch, after it was
cut from master (after rc1, even). Looking more closely, the
post-branch translation updates deleted it from version 12, 13, 14, and
15, but not 16 onwards, and the file is still there in master:
Oh. Well, that's clearly a process failure, and the fix will require us
deleting that file on all branches from 16 and up anyway, so I have no
issues with the plan of requiring all message catalogs to be UTF-8.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"I can see support will not be a problem. 10 out of 10." (Simon Wittber)
(http://archives.postgresql.org/pgsql-general/2004-12/msg00159.php)
Álvaro Herrera <alvherre@kurilemu.de> writes:
On 2025-Dec-10, Dagfinn Ilmari Mannsåker wrote:
That translation commit was on the REL_12_STABLE branch, after it was
cut from master (after rc1, even). Looking more closely, the
post-branch translation updates deleted it from version 12, 13, 14, and
15, but not 16 onwards, and the file is still there in master:
Hah, yeah, I failed to notice that there's a gap in which branches
have that file. But it's definitely there in master.
Oh. Well, that's clearly a process failure, and the fix will require us
deleting that file on all branches from 16 and up anyway, so I have no
issues with the plan of requiring all message catalogs to be UTF-8.
Shall I just go delete those files, or is there more process that
ought to be observed here?
regards, tom lane
Álvaro Herrera <alvherre@kurilemu.de> writes:
On 2025-Dec-10, Dagfinn Ilmari Mannsåker wrote:
Álvaro Herrera <alvherre@kurilemu.de> writes:
Hmm, where do you see that file? It was removed by commit 3c70de2e12b9
from branch 12 in 2019, and has never existed since.
That translation commit was on the REL_12_STABLE branch, after it was
cut from master (after rc1, even). Looking more closely, the
post-branch translation updates deleted it from version 12, 13, 14, and
15, but not 16 onwards, and the file is still there in master:
Oh. Well, that's clearly a process failure, and the fix will require us
deleting that file on all branches from 16 and up anyway, so I have no
issues with the plan of requiring all message catalogs to be UTF-8.
Digging a bit more in the history of **/nb.po, there seems to be a
policy that files that are less than 80% translated are removed¹, and I
guess this file was just below the threshold on the 12-15 branches, but
just above the threshold on master and 16+. The Norwegian translation
seems unmaintained², so I vote³ for removing it completely.
- ilmari
[1]: https://git.postgresql.org/cgit/postgresql.git/commit/?id=a6667d96c5e4aca92612295d549541146dd6e74a
[2]: https://git.postgresql.org/cgit/pgtranslation/messages.git/log/nb
[3]: I am Norwegian, but I prefer to use computers in English
Tom Lane <tgl@sss.pgh.pa.us> writes:
Álvaro Herrera <alvherre@kurilemu.de> writes:
Oh. Well, that's clearly a process failure, and the fix will require us
deleting that file on all branches from 16 and up anyway, so I have no
issues with the plan of requiring all message catalogs to be UTF-8.
Shall I just go delete those files, or is there more process that
ought to be observed here?
regards, tom lane
Looking at the translations repo, there's 30 .po files (out of 530) that
are not UTF-8, but I guess only nb/pg_config.po meets the 80% threshold
and makes it into the main repo. To avoid future breakage, should we
ask the translation team to convert those to UTF-8?
- ilmari
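One way to find the catalogs that are not UTF-8 is to read the charset declared in each .po header. The following is only a minimal sketch in Python, assuming the usual gettext header line of the form "Content-Type: text/plain; charset=..."; the names po_charset and is_utf8_catalog are hypothetical, and this is not necessarily how the count above was obtained.

```python
import re
from typing import Optional

# Matches the charset in a PO header line such as:
#   "Content-Type: text/plain; charset=UTF-8\n"
# (assumption: the header follows the standard gettext layout)
_CHARSET_RE = re.compile(
    rb'"Content-Type:[^"\n]*charset=([A-Za-z0-9_.:-]+)', re.IGNORECASE
)


def po_charset(data: bytes) -> Optional[str]:
    """Return the charset declared in the PO header, or None if absent."""
    match = _CHARSET_RE.search(data)
    return match.group(1).decode("ascii") if match else None


def is_utf8_catalog(data: bytes) -> bool:
    """True if the declared charset is some spelling of UTF-8."""
    charset = po_charset(data)
    if charset is None:
        return False
    return charset.lower().replace("_", "-") in ("utf-8", "utf8")


if __name__ == "__main__":
    import sys
    from pathlib import Path

    # Usage: python check_po.py path/to/*.po
    for name in sys.argv[1:]:
        raw = Path(name).read_bytes()
        if not is_utf8_catalog(raw):
            print(f"{name}: {po_charset(raw)}")
```

Reading the file as bytes avoids having to guess the encoding before the header has been parsed.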
Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:
Looking at the translations repo, there's 30 .po files (out of 530) that
are not UTF-8, but I guess only nb/pg_config.po meets the 80% threshold
and makes it into the main repo. To avoid future breakage, should we
ask the translation team to convert those to UTF-8?
+1, they'd have to be on board with any all-UTF8 policy anyway.
regards, tom lane
Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:
Álvaro Herrera <alvherre@kurilemu.de> writes:
Oh. Well, that's clearly a process failure, and the fix will require us
deleting that file on all branches from 16 and up anyway, so I have no
issues with the plan of requiring all message catalogs to be UTF-8.
Digging a bit more in the history of **/nb.po, there seems to be a
policy that files that are less than 80% translated are removed¹, and I
guess this file was just below the threshold on the 12-15 branches, but
just above the threshold on master and 16+.
There may actually be an update-process bug here, because according to
[1]: https://babel.postgresql.org
So src/bin/pg_config/po/nb.po should not be propagated to gitmaster
in any branch later than v12, yet here it is in the more recent
branches. I'm suspecting a logic bug that fails to delete files
that should be deleted.
regards, tom lane
Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> writes:
Digging a bit more in the history of **/nb.po, there seems to be a
policy that files that are less than 80% translated are removed¹,
BTW, while the wiki page does still say that, I have a vague idea
that the policy might have been changed later. I dug in the archives
and could find only this inconclusive discussion:
/messages/by-id/CAECtzeV6dyu4jTOrorFW=B=EicyejWO7_Seew3Ch0=0wO+M-RQ@mail.gmail.com
However, the actual state of affairs doesn't seem to match the 80%
rule. I see in src/backend/po in the v18 branch:
de.po 99%
es.po 93%
fr.po 78%
id.po 45%
it.po 81%
ja.po 99%
ka.po 79%
ko.po 99%
pl.po 56%
pt_BR.po 77%
ru.po 99%
sv.po 99%
tr.po 60%
uk.po 90%
zh_CN.po 67%
I annotated these with translation percentages from
babel.postgresql.org, which are probably up-to-the-minute and so not
reflective of where things stood at the 18.0 release. But there's no way
that id.po went from >= 80% to 45% since release, and there are
others that are well under 80%.
So I'm not sure what the active policy really is, but it's not 80%.
regards, tom lane
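The mismatch with an 80% rule can be checked mechanically from the list above. The following is only a sketch in Python: the percentages are the ones quoted in this message, while the THRESHOLD constant and the below_threshold helper are hypothetical names, not part of any PostgreSQL tooling.

```python
# Percentages as quoted above for src/backend/po in the v18 branch.
coverage = {
    "de.po": 99, "es.po": 93, "fr.po": 78, "id.po": 45, "it.po": 81,
    "ja.po": 99, "ka.po": 79, "ko.po": 99, "pl.po": 56, "pt_BR.po": 77,
    "ru.po": 99, "sv.po": 99, "tr.po": 60, "uk.po": 90, "zh_CN.po": 67,
}

THRESHOLD = 80  # the percentage stated on the wiki page (assumed cutoff)


def below_threshold(stats, threshold=THRESHOLD):
    """Return the catalogs whose translated percentage is under the threshold."""
    return [name for name, pct in stats.items() if pct < threshold]


if __name__ == "__main__":
    for name in below_threshold(coverage):
        print(f"{name}: {coverage[name]}%")
```

With the figures above, seven of the fifteen catalogs fall under 80%.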