EDB builds Postgres 13 with an obsolete ICU version
Hi,
As a follow-up to bug #16570 [1] and other previous discussions
on the mailing-lists, I'm checking out PG13 beta for Windows
from:
https://www.enterprisedb.com/postgresql-early-experience
and it ships with the same obsolete ICU 53 that was used
for PG 10,11,12.
Besides not having the latest Unicode features and fixes, ICU 53
ignores the BCP 47 tags syntax in collations used as examples
in Postgres documentation, which leads to confusion and
false bug reports.
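To make the syntax difference concrete, here is a minimal sketch that puts the two locale spellings side by side. The helper name and the cutoff of ICU 54 are assumptions for illustration, not something stated in this thread; the two locale strings are the German phonebook examples from the Postgres documentation.

```python
def create_collation_sql(name: str, icu_major: int) -> str:
    """Return a CREATE COLLATION statement suited to the given ICU major version.

    Hypothetical helper for illustration only.
    """
    # Assumption: ICU releases before 54 do not honor BCP 47 collation keywords.
    if icu_major >= 54:
        locale = "de-u-co-phonebk"          # BCP 47 language tag
    else:
        locale = "de@collation=phonebook"   # traditional ICU locale syntax
    return f"CREATE COLLATION {name} (provider = icu, locale = '{locale}');"


if __name__ == "__main__":
    print(create_collation_sql("german_phonebook", 53))
    print(create_collation_sql("german_phonebook", 67))
```

With an ICU 53 build only the second, older spelling would be expected to select the phonebook ordering.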
The current version is ICU 67.
I don't see where the suggestion to upgrade it before the
next PG release should be addressed but maybe some people on
this list do know or have the leverage to make it happen?
[1]: /messages/by-id/16570-58cc04e1a6ef3c3f@postgresql.org
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
On Mon, Aug 3, 2020 at 08:56:06PM +0200, Daniel Verite wrote:
> I don't see where the suggestion to upgrade it before the
> next PG release should be addressed but maybe some people on
> this list do know or have the leverage to make it happen?
Well, you can ask EDB about this, but perhaps they have kept the same ICU
version so indexes will not need to be reindexed.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Tue, Aug 4, 2020 at 1:04 AM Bruce Momjian <bruce@momjian.us> wrote:
> Well, you can ask EDB about this, but perhaps they have kept the same ICU
> version so indexes will not need to be reindexed.
Correct - updating ICU would mean a reindex is required following any
upgrade, major or minor.
I would really like to find an acceptable solution to this however as it
really would be good to be able to update ICU.
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
Dave Page wrote on 04.08.2020 at 10:06:
> Correct - updating ICU would mean a reindex is required following any
> upgrade, major or minor.
> I would really like to find an acceptable solution to this however as
> it really would be good to be able to update ICU.
What about providing a newer ICU version as kind of an "add-on" download containing only the needed DLLs (assuming it's as easy as only replacing the DLLs)?
Then everyone who wishes to use a newer ICU version can manually install them.
If that download carries a big "ATTENTION: reindex required" I don't think this would be a big risk.
Thomas
On Tue, Aug 4, 2020 at 10:07 AM Dave Page <dpage@pgadmin.org> wrote:
> Correct - updating ICU would mean a reindex is required following any
> upgrade, major or minor.
> I would really like to find an acceptable solution to this however as it
> really would be good to be able to update ICU.
It certainly couldn't and shouldn't be done in a minor.
But doing so in v13 doesn't seem entirely unreasonable, especially given
that I believe we will detect the requirement to reindex thanks to the
versioning, and not just start returning invalid results (like, say, with
those glibc updates).
Would it be possible to have the installer even check if there are any ICU
indexes in the database? If there aren't, just put in the new version of
ICU. If there are, give the user a choice of the old version or new version
and reindex?
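The versioning mentioned above can be checked by hand. As a rough sketch: the catalog column pg_collation.collversion records the collation version seen when the collation was created, and pg_collation_actual_version(oid) reports what the linked library provides now. The Python helper below and the sample version strings are made up for illustration; it only post-processes rows fetched with the query.

```python
# Query to run against the server with any client; the column and the function
# are available in PostgreSQL 10 and later.
VERSION_QUERY = """
SELECT collname, collversion, pg_collation_actual_version(oid) AS actual
FROM pg_collation
WHERE collprovider = 'i'
"""


def mismatched_collations(rows):
    """Return names of collations whose recorded version differs from the actual one.

    rows: iterable of (collname, recorded_version, actual_version) tuples,
    as returned by VERSION_QUERY. Hypothetical helper, not part of Postgres.
    """
    return [name for name, recorded, actual in rows
            if recorded is not None and recorded != actual]


if __name__ == "__main__":
    # Sample rows with invented version strings.
    sample = [("de-x-icu", "153.14", "153.14"),
              ("fr-x-icu", "153.14", "153.80")]
    print(mismatched_collations(sample))
```

Any collation reported here is a candidate for reindexing the indexes that depend on it.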
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Tue, Aug 4, 2020 at 10:29 AM Magnus Hagander <magnus@hagander.net> wrote:
> Would it be possible to have the installer even check if there are any icu
> indexes in the database. If there aren't, just put in the new version of
> icu. If there are, give the user a choice of the old version or new version
> and reindex?
That would require fairly large changes to the installer to allow it to
log in to the database server (whether that would work would be dependent on
how pg_hba.conf is configured), and also assumes that the ICU ABI hasn't
changed between releases. It would also require some hacky renaming of
DLLs, as they have the version number in them.
The chances of designing, building and testing that thoroughly before v13
are released are about zero I'd say.
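As an illustration of the versioned DLL names: ICU's Windows libraries are commonly named along the lines of icuuc53.dll, icuin53.dll and icudt53.dll, so the major version of an installation can be read off the file names. The sketch below assumes that naming pattern and uses a hypothetical helper name; it does not look at the DLL contents.

```python
import re

# Assumed naming pattern: "icu" + component (uc, in, dt, io, tu) + major version.
_ICU_DLL = re.compile(r"^icu(?:uc|in|dt|io|tu)(\d+)\.dll$", re.IGNORECASE)


def icu_major_versions(filenames):
    """Return the set of ICU major versions implied by a list of file names."""
    versions = set()
    for fname in filenames:
        match = _ICU_DLL.match(fname)
        if match:
            versions.add(int(match.group(1)))
    return versions


if __name__ == "__main__":
    print(icu_major_versions(["icuuc53.dll", "icuin53.dll", "libpq.dll"]))
```

Two different major versions showing up in one bin directory would indicate the kind of side-by-side install discussed here.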
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
On Tue, Aug 4, 2020 at 11:42 AM Dave Page <dpage@pgadmin.org> wrote:
> That would require fairly large changes to the installer to allow it to
> login to the database server (whether that would work would be dependent on
> how pg_hba.conf is configured), and also assumes that the ICU ABI hasn't
> changed between releases. It would also require some hacky renaming of
> DLLs, as they have the version number in them.
I assumed it had code for that stuff already. Mainly because I assumed it
supported doing pg_upgrade, which requires similar things, no?
> The chances of designing, building and testing that thoroughly before v13
> is released is about zero I'd say.
Yeah, I can see how it would be for 13 -- unfortunately. But I definitely
think it's something that should go high on the list of things to get fixed
for 14.
//Magnus
On Mon, 3 Aug 2020 at 13:56, Daniel Verite <daniel@manitou-mail.org> wrote:
> and it ships with the same obsolete ICU 53 that was used
> for PG 10,11,12.
> Besides not having the latest Unicode features and fixes, ICU 53
> ignores the BCP 47 tags syntax in collations used as examples
> in Postgres documentation, which leads to confusion and
> false bug reports.
> The current version is ICU 67.
Hi,
Sadly, that is managed by EDB and not by the community.
You can try https://www.2ndquadrant.com/en/resources/postgresql-installer-2ndquadrant/
which uses ICU 62.2. It is not the latest, but it should allow you to follow
the examples in the documentation.
--
Jaime Casanova www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Jaime Casanova wrote on 11.08.2020 at 20:39:
> Sadly, that is managed by EDB and not by the community.
> You can try https://www.2ndquadrant.com/en/resources/postgresql-installer-2ndquadrant/
> which uses ICU-62.2, is not the latest but should allow you to follow
> the examples in the documentation.
One of the reasons I prefer the EDB builds is that they provide a ZIP file without the installer overhead.
Any chance 2ndQuadrant can supply something like that as well?
Thomas
On Tue, 11 Aug 2020 at 13:45, Thomas Kellerer <shammat@gmx.net> wrote:
> One of the reasons I prefer the EDB builds is, that they provide a ZIP file without the installer overhead.
> Any chance 2ndQuadrant can supply something like that as well?
I don't think so; an unattended install mode is the closest option.
--
Jaime Casanova www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Tue, Aug 11, 2020 at 02:58:30PM +0200, Magnus Hagander wrote:
> On Tue, Aug 4, 2020 at 11:42 AM Dave Page <dpage@pgadmin.org> wrote:
> > That would require fairly large changes to the installer to allow it to
> > login to the database server (whether that would work would be dependent on
> > how pg_hba.conf is configured), and also assumes that the ICU ABI hasn't
> > changed between releases. It would also require some hacky renaming of
> > DLLs, as they have the version number in them.
> I assumed it had code for that stuff already. Mainly because I assumed it
> supported doing pg_upgrade, which requires similar things no?
While pg_upgrade requires having the old and new cluster software in
place, I don't think it helps allowing different ICU versions for each
cluster. I guess you can argue that if you know the user is _not_ going
to be using pg_upgrade, then a new ICU version should be used for the
new cluster.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Fri, Aug 14, 2020 at 09:00:06AM -0400, Bruce Momjian wrote:
> On Tue, Aug 11, 2020 at 02:58:30PM +0200, Magnus Hagander wrote:
> > I assumed it had code for that stuff already. Mainly because I assumed it
> > supported doing pg_upgrade, which requires similar things no?
> While pg_upgrade requires having the old and new cluster software in
> place, I don't think it helps allowing different ICU versions for each
> cluster. I guess you can argue that if you know the user is _not_ going
> to be using pg_upgrade, then a new ICU version should be used for the
> new cluster.
We have nothing in core, yet, that helps with this kind of problem
with binary upgrades. In the last year, Julien and I worked on an
upgrade case where a glibc upgrade was involved with pg_upgrade used
for PG, and it could not afford the use of a new host to allow a
logical dump/restore to rebuild the indexes from scratch. You can
always run a "reindex -a" after the upgrade to be sure that no indexes
are broken because of the changes with collation versions, but once
you have to give the guarantee that an upgrade does not take longer
than a certain amount of time, the reindex easily becomes the
bottleneck. That's one motivation behind the recent work to add
collation versions to pg_depend entries, which would lead to more
filtering facilities for REINDEX on the backend to get for example the
option to only reindex collation-sensitive indexes (imagine just a
reindexdb --jobs with the collation filtering done at table-level,
that would be fast, or a script doing this work generated by
pg_upgrade).
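Until such filtering exists in core, something similar can be approximated from the catalogs. The following is a rough sketch, not what pg_upgrade does: the query lists indexes that have at least one key column using an ICU-provided collation (via pg_index.indcollation), and the hypothetical Python helper turns the resulting names into a REINDEX script. It is an assumption that this covers the interesting cases; indexes that depend on a collation only through an expression or predicate would be missed.

```python
# Indexes with at least one key column using an ICU collation.
ICU_INDEX_QUERY = """
SELECT DISTINCT s.indexrelid::regclass::text AS index_name
FROM (SELECT indexrelid, indcollation[i] AS coll
      FROM pg_index, generate_subscripts(indcollation, 1) AS g(i)) AS s
JOIN pg_collation c ON c.oid = s.coll
WHERE c.collprovider = 'i'
"""


def reindex_script(index_names):
    """Build a REINDEX script from index names as returned by ICU_INDEX_QUERY.

    The names come from regclass::text, so they are already quoted and
    schema-qualified where needed. Hypothetical helper for illustration.
    """
    return "\n".join(f"REINDEX INDEX {name};" for name in sorted(set(index_names)))


if __name__ == "__main__":
    print(reindex_script(["public.b_idx", "public.a_idx", "public.a_idx"]))
```

The generated statements could then be split across several sessions to get the parallelism that reindexdb --jobs would provide.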
--
Michael
On Fri, Aug 14, 2020 at 10:23:27PM +0900, Michael Paquier wrote:
> That's one motivation behind the recent work to add
> collation versions to pg_depend entries, which would lead to more
> filtering facilities for REINDEX on the backend to get for example the
> option to only reindex collation-sensitive indexes (imagine just a
> reindexdb --jobs with the collation filtering done at table-level,
> that would be fast, or a script doing this work generated by
> pg_upgrade).
Agreed --- only a small percentage of indexes are affected by
collations, and it would be great if we could tell users how to easily
identify them.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Fri, Aug 14, 2020 at 3:00 PM Bruce Momjian <bruce@momjian.us> wrote:
> While pg_upgrade requires having the old and new cluster software in
> place, I don't think it helps allowing different ICU versions for each
> cluster.
Depends on where they are installed (and disclaimer, I don't know how the
Windows installers do that). But as long as the ICU libraries are installed
in separate locations for the two versions, which I *think* they are or at
least used to be, it shouldn't have an effect on this in either direction.
That argument really only holds for different versions, not for different
clusters of the same version. But I don't think the installers (natively)
support multiple clusters of the same version anyway.
The tricky thing is if you want to allow the user to *choose* which ICU
version should be used with postgres version <x>. Because then the user
might also expect an upgrade-path wherein they only upgrade the icu library
on an existing install...
> I guess you can argue that if you know the user is _not_ going
> to be using pg_upgrade, then a new ICU version should be used for the
> new cluster.
Yes, that's exactly the argument I meant :) If the user got the option to
"pick version of ICU: <old>, <new>", with a comment saying "pick old only
if you plan to do a pg_upgrade based upgrade of a different cluster, or if
this instance should participate in replication with a node using <old>",
that would probably help for the vast majority of cases. And of course, if
the installer through other options can determine with certainty that it's
going to be running pg_upgrade for the user, then it can reword the dialog
based on that (that is, it should still allow the user to pick the new
version, as long as they know that their indexes are going to need
reindexing)
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Mon, Aug 17, 2020 at 11:19 AM Magnus Hagander <magnus@hagander.net> wrote:
> I assumed it had code for that stuff already. Mainly because I assumed it
> supported doing pg_upgrade, which requires similar things no?
No, the installers don't support pg_upgrade directly. They ship it of
course, and the user can manually run it, but the installers won't do that,
and have no ability to log in to a cluster except during the post-initdb
phase.
> Depends on where they are installed (and disclaimer, I don't know how the
> windows installers do that). But as long as the ICU libraries are installed
> in separate locations for the two versions, which I *think* they are or at
> least used to be, it shouldn't have an effect on this in either direction.
They are.
> That argument really only holds for different versions, not for different
> clusters of the same version. But I don't think the installers (natively)
> supports multiple clusters of the same version anyway.
They don't. You'd need to manually init a new cluster and register a new
server instance. The installer only has any knowledge of the cluster it
sets up.
> The tricky thing is if you want to allow the user to *choose* which ICU
> version should be used with postgres version <x>. Because then the user
> might also expect an upgrade-path wherein they only upgrade the icu library
> on an existing install...
> > I guess you can argue that if you know the user is _not_ going
> > to be using pg_upgrade, then a new ICU version should be used for the
> > new cluster.
> Yes, that's exactly the argument I meant :) If the user got the option to
> "pick version of ICU: <old>, <new>", with a comment saying "pick old only
> if you plan to do a pg_upgrade based upgrade of a different cluster, or if
> this instance should participate in replication with a node using <old>",
> that would probably help for the vast majority of cases. And of course, if
> the installer through other options can determine with certainty that it's
> going to be running pg_upgrade for the user, then it can reword the dialog
> based on that (that is, it should still allow the user to pick the new
> version, as long as they know that their indexes are going to need
> reindexing)
That seems like a very hacky and extremely user-unfriendly approach. How
many users are going to understand options in the installer to deal with
that, or want to go decode the ICU filenames on their existing
installations (which our installers may not actually know about) to figure
out what their current version is?
I would suggest that the better way to handle this would be for pg_upgrade
to (somehow) check the ICU version on the old and new clusters and if
there's a mismatch perform a reindex of any ICU based indexes. I suspect
that may require that the server exposes the ICU version though. That way,
the installers could freely upgrade the ICU version with a new major
release.
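A check of this kind could be sketched as follows, assuming the collation versions have already been collected from each cluster (for example with pg_collation_actual_version(oid) for the ICU collations, which serves here as a stand-in for "the server exposes the ICU version"). The helper name and the sample version strings are hypothetical; this is not how pg_upgrade works today.

```python
def changed_collations(old_versions, new_versions):
    """Return collations present in both clusters whose version strings differ.

    old_versions / new_versions: dicts mapping collation name -> version
    string, gathered separately from the old and the new cluster.
    """
    common = old_versions.keys() & new_versions.keys()
    return sorted(name for name in common
                  if old_versions[name] != new_versions[name])


if __name__ == "__main__":
    # Invented version strings for illustration.
    old = {"de-x-icu": "153.14", "fr-x-icu": "153.14"}
    new = {"de-x-icu": "153.80", "fr-x-icu": "153.14"}
    print(changed_collations(old, new))
```

A non-empty result would be the signal to reindex the ICU-based indexes after the upgrade.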
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
On Mon, Aug 17, 2020 at 1:44 PM Dave Page <dpage@pgadmin.org> wrote:
> No, the installers don't support pg_upgrade directly. They ship it of
> course, and the user can manually run it, but the installers won't do that,
> and have no ability to login to a cluster except during the post-initdb
> phase.
Oh, I just assumed it did :)
If it doesn't, I think shipping with a modern ICU is a much smaller problem
really...
> > But as long as the ICU libraries are installed
> > in separate locations for the two versions, which I *think* they are or at
> > least used to be, it shouldn't have an effect on this in either direction.
> They are.
Good. So putting both in wouldn't break things.
> > But I don't think the installers (natively)
> > supports multiple clusters of the same version anyway.
> They don't. You'd need to manually init a new cluster and register a new
> server instance. The installer only has any knowledge of the cluster it
> sets up.
I'd say that's "unsupported enough" to not be a scenario one has to
consider.
> That seems like a very hacky and extremely user-unfriendly approach. How
> many users are going to understand options in the installer to deal with
> that, or want to go decode the ICU filenames on their existing
> installations (which our installers may not actually know about) to figure
> out what their current version is?
That was more if the installer actually handles the whole chain. It clearly
doesn't today (since it doesn't support upgrades), so I agree this might
definitely be overkill. But then also I don't really see the problem with
just putting a new version of ICU in with the newer versions of PostgreSQL.
That just puts the user in the same position as they are with any other
platform wrt manual pg_upgrade runs.
> I would suggest that the better way to handle this would be for pg_upgrade
> to (somehow) check the ICU version on the old and new clusters and if
> there's a mismatch perform a reindex of any ICU based indexes. I suspect
> that may require that the server exposes the ICU version though. That way,
> the installers could freely upgrade the ICU version with a new major
> release.
Having pg_upgrade spit out a script that does reindex specifically on the
indexes required would certainly be useful in the generic case, and help
others as well.
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
On Mon, Aug 17, 2020 at 4:14 PM Magnus Hagander <magnus@hagander.net> wrote:
On Mon, Aug 17, 2020 at 1:44 PM Dave Page <dpage@pgadmin.org> wrote:
On Mon, Aug 17, 2020 at 11:19 AM Magnus Hagander <magnus@hagander.net>
wrote:On Fri, Aug 14, 2020 at 3:00 PM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Aug 11, 2020 at 02:58:30PM +0200, Magnus Hagander wrote:
On Tue, Aug 4, 2020 at 11:42 AM Dave Page <dpage@pgadmin.org> wrote:
That would require fairly large changes to the installer to allowit to
login to the database server (whether that would work would
be dependent on
how pg_hba.conf is configured), and also assumes that the ICU ABI
hasn't
changed between releases. It would also require some hacky
renaming of
DLLs, as they have the version number in them.
I assumed it had code for that stuff already. Mainly because I
assumed it
supported doing pg_upgrade, which requires similar things no?
No, the installers don't support pg_upgrade directly. They ship it of
course, and the user can manually run it, but the installers won't do that,
and have no ability to login to a cluster except during the post-initdb
phase.Oh, I just assumed it did :)
If it doesn't, I think shipping with a modern ICU is a much smaller
problem really...While pg_upgrade requires having the old and new cluster software in
place, I don't think it helps allowing different ICU versions for each
cluster.Depends on where they are installed (and disclaimer, I don't know how
the windows installers do that). But as long as the ICU libraries are
installed in separate locations for the two versions, which I *think* they
are or at least used to be, it shouldn't have an effect on this in either
direction.They are.
Good. So putting both in wouldn't break things.
That argument really only holds for different versions, not for different
clusters of the same version. But I don't think the installers (natively)
supports multiple clusters of the same version anyway.They don't. You'd need to manually init a new cluster and register a new
server instance. The installer only has any knowledge of the cluster it
sets up.I'd say that's "unsupported enough" to not be a scenario one has to
consider.
Agreed. Plus it's not really any different from running multiple clusters
on other OSs where we're likely to be using a vendor supplied ICU that the
user also couldn't change easily.
The tricky thing is if you want to allow the user to *choose* which ICU
version should be used with postgres version <x>. Because then the user
might also expect an upgrade-path wherein they only upgrade the icu library
on an existing install...I guess you can argue that if you know the user is _not_ going
to be using pg_upgrade, then a new ICU version should be used for the
new cluster.Yes, that's exactly the argument I meant :) If the user got the option
to "pick version of ICU: <old>, <new>", with a comment saying "pick old
only if you plan to do a pg_upgrade based upgrade of a different cluster,
or if this instance should participate in replication with a node using
<old>", that would probably help for the vast majority of cases. And of
course, if the installer through other options can determine with certainty
that it's going to be running pg_upgrade for the user, then it can reword
the dialog based on that (that is, it should still allow the user to pick
the new version, as long as they know that their indexes are going to need
reindexing)That seems like a very hacky and extremely user-unfriendly approach. How
many users are going to understand options in the installer to deal with
that, or want to go decode the ICU filenames on their existing
installations (which our installers may not actually know about) to figure
out what their current version is?
That was more if the installer actually handles the whole chain. It
clearly doesn't today (since it doesn't support upgrades), I agree this
might definitely be overkill. But then also I don't really see the problem
with just putting a new version of ICU in with the newer versions of
PostgreSQL. That just puts the user in the same position as they are with
any other platform wrt manual pg_upgrade runs.
Well we can certainly do that if everyone is happy in the knowledge that
it'll mean pg_upgrade users will need to reindex if they've used ICU
collations.
Sandeep; can you have someone do a test build with the latest ICU please
(for background, this would be with the Windows and Mac installers)? If
no one objects, we can push that into the v13 builds before GA. We'd also
need to update the README if we do so.
I would suggest that the better way to handle this would be for
pg_upgrade to (somehow) check the ICU version on the old and new clusters
and if there's a mismatch perform a reindex of any ICU based indexes. I
suspect that may require that the server exposes the ICU version though.
That way, the installers could freely upgrade the ICU version with a new
major release.
Having pg_upgrade spit out a script that does reindex specifically on the
indexes required would certainly be useful in the generic case, and help
others as well.
+1
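To make the "spit out a script" idea concrete, here is a minimal sketch of the filtering step only. It assumes some catalog query has already told us which collation providers each index depends on; the `IndexInfo` record and `build_reindex_script` helper are hypothetical names for illustration and are not part of pg_upgrade.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical description of one index, as a catalog query might return it."""
    schema: str
    name: str
    collation_providers: FrozenSet[str]  # e.g. {"icu"}, {"libc"}, or empty


def quote_ident(ident: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + ident.replace('"', '""') + '"'


def build_reindex_script(indexes: Iterable[IndexInfo]) -> List[str]:
    """Return REINDEX statements only for indexes that depend on an ICU collation."""
    statements = []
    for idx in indexes:
        if "icu" in idx.collation_providers:
            statements.append(
                f"REINDEX INDEX {quote_ident(idx.schema)}.{quote_ident(idx.name)};"
            )
    return statements


if __name__ == "__main__":
    # Illustrative input only; these index names are made up.
    sample = [
        IndexInfo("public", "users_name_idx", frozenset({"icu"})),
        IndexInfo("public", "users_pkey", frozenset()),
    ]
    print("\n".join(build_reindex_script(sample)))
```

Indexes with no ICU dependency are left out of the script, which is what would keep the post-upgrade work proportional to actual ICU usage.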
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
On Mon, Aug 17, 2020 at 04:55:13PM +0100, Dave Page wrote:
That was more if the installer actually handles the whole chain. It clearly
doesn't today (since it doesn't support upgrades), I agree this might
definitely be overkill. But then also I don't really see the problem with
just putting a new version of ICU in with the newer versions of PostgreSQL.
That just puts the user in the same position as they are with any other
platform wrt manual pg_upgrade runs.
Well we can certainly do that if everyone is happy in the knowledge that it'll
mean pg_upgrade users will need to reindex if they've used ICU collations.
Sandeep; can you have someone do a test build with the latest ICU please (for
background, this would be with the Windows and Mac installers)? If no one
objects, we can push that into the v13 builds before GA. We'd also need to
update the README if we do so.
Woh, we don't have any support in pg_upgrade to reindex just indexes
that use ICU collations, and frankly, if they have to reindex, they
might decide that they should just do pg_dump/reload of their cluster at
that point because pg_upgrade is going to be very slow, and they will be
surprised. I can see a lot more people being disappointed by this than
will be happy to have Postgres using a newer ICU library.
Also, is it the ICU library version we should be tracking for reindex,
or each _collation_ version? If the latter, do we store the collation
version for each index?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Mon, Aug 17, 2020 at 02:23:57PM -0400, Bruce Momjian wrote:
Also, is it the ICU library version we should be tracking for reindex,
or each _collation_ version? If the latter, do we store the collation
version for each index?
You need to store the collation version(s) for each index. This
thread deals with the problem:
https://commitfest.postgresql.org/29/2367/
/messages/by-id/CAEepm=0uEQCpfq_+LYFBdArCe4Ot98t1aR4eYiYTe=yavQygiQ@mail.gmail.com
That's not all of it as you would still need some filtering
capabilities in the backend to reindex only the collation-sensitive
indexes with a reindex, but that's one step forward into being able to
do that.
--
Michael
On Tue, Aug 18, 2020 at 09:44:35AM +0900, Michael Paquier wrote:
On Mon, Aug 17, 2020 at 02:23:57PM -0400, Bruce Momjian wrote:
Also, is it the ICU library version we should be tracking for reindex,
or each _collation_ version? If the latter, do we store the collation
version for each index?
You need to store the collation version(s) for each index. This
thread deals with the problem:
https://commitfest.postgresql.org/29/2367/
/messages/by-id/CAEepm=0uEQCpfq_+LYFBdArCe4Ot98t1aR4eYiYTe=yavQygiQ@mail.gmail.com
That's not all of it as you would still need some filtering
capabilities in the backend to reindex only the collation-sensitive
indexes with a reindex, but that's one step forward into being able to
do that.
Oh, we don't even have the version in the system catalogs yet? I guess
when pg_upgrade runs create_index we could grab it then, and for the
pg_upgrade _after_ that, do the checks. This seems like it is years
away from being useful.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee
On Mon, Aug 17, 2020 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Aug 17, 2020 at 04:55:13PM +0100, Dave Page wrote:
That was more if the installer actually handles the whole chain. It clearly
doesn't today (since it doesn't support upgrades), I agree this might
definitely be overkill. But then also I don't really see the problem with
just putting a new version of ICU in with the newer versions of PostgreSQL.
That just puts the user in the same position as they are with any other
platform wrt manual pg_upgrade runs.
Well we can certainly do that if everyone is happy in the knowledge that it'll
mean pg_upgrade users will need to reindex if they've used ICU collations.
Sandeep; can you have someone do a test build with the latest ICU please (for
background, this would be with the Windows and Mac installers)? If no one
objects, we can push that into the v13 builds before GA. We'd also need to
update the README if we do so.
Woh, we don't have any support in pg_upgrade to reindex just indexes
that use ICU collations, and frankly, if they have to reindex, they
might decide that they should just do pg_dump/reload of their cluster at
that point because pg_upgrade is going to be very slow, and they will be
surprised.
Not necessarily. It's likely that not all indexes use ICU collations, and
you still save time loading what may be large amounts of data.
I agree though, that it *could* be slow.
I can see a lot more people being disappointed by this than
will be happy to have Postgres using a newer ICU library.
Quite possibly, hence my hesitation to push ahead with anything more than a
simple test build at this time.
Also, is it the ICU library version we should be tracking for reindex,
or each _collation_ version? If the latter, do we store the collation
version for each index?
I wasn't aware that ICU had the concept of collation versions internally
(which Michael seems to have confirmed downthread). That would potentially
make the number of users needing a reindex even smaller, but as you point
out won't help us for years as we don't store it anyway.
--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake
On Tue, Aug 18, 2020 at 11:24 AM Dave Page <dpage@pgadmin.org> wrote:
On Mon, Aug 17, 2020 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Aug 17, 2020 at 04:55:13PM +0100, Dave Page wrote:
That was more if the installer actually handles the whole chain. It clearly
doesn't today (since it doesn't support upgrades), I agree this might
definitely be overkill. But then also I don't really see the problem with
just putting a new version of ICU in with the newer versions of PostgreSQL.
That just puts the user in the same position as they are with any other
platform wrt manual pg_upgrade runs.
Well we can certainly do that if everyone is happy in the knowledge that it'll
mean pg_upgrade users will need to reindex if they've used ICU collations.
Sandeep; can you have someone do a test build with the latest ICU please (for
background, this would be with the Windows and Mac installers)? If no one
objects, we can push that into the v13 builds before GA. We'd also need to
update the README if we do so.
Woh, we don't have any support in pg_upgrade to reindex just indexes
that use ICU collations, and frankly, if they have to reindex, they
might decide that they should just do pg_dump/reload of their cluster at
that point because pg_upgrade is going to be very slow, and they will be
surprised.
Not necessarily. It's likely that not all indexes use ICU collations, and
you still save time loading what may be large amounts of data.
I agree though, that it *could* be slow.
I agree it definitely could, but I'm not sure I see any case where it would
actually be slower than the alternative (which would be dump/reload).
I'm also willing to say that given that (1) the windows installers don't
provide a way to do it automatically, and (2) the "commandline challenge"
of running pg_upgrade on Windows in general, I bet there's a larger
percentage of users who are using dump/reload rather than pg_upgrade on
Windows than on other platforms already in the first place.
I can see a lot more people being disappointed by this than
will be happy to have Postgres using a newer ICU library.
Quite possibly, hence my hesitation to push ahead with anything more than
a simple test build at this time.
My guess would be in the other direction :) But in particular, the vast
majority probably don't care at all, because they're not using ICU
collations.
It might be a slightly larger percentage on Windows who use it, but I'm
willing to bet it's still quite low.
Also, is it the ICU library version we should be tracking for reindex,
or each _collation_ version? If the latter, do we store the collation
version for each index?
I wasn't aware that ICU had the concept of collation versions internally
(which Michael seems to have confirmed downthread). That would potentially
make the number of users needing a reindex even smaller, but as you point
out won't help us for years as we don't store it anyway.
It does -- and we track it in pg_collation at this point.
I think the part that Michael is referring to is we don't track enough
details on a per-index basis. The suggested changes (in the separate
thread) are to get rid of it from pg_collation and move it to a per-object
dependency.
(And fwiw contains a patch to pg_upgrade to at least give it the ability to
for all old indexes say "i know that my icu is compatible". But yeah, the
full functionality won't be available until upgrading *from* 14)
--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Magnus Hagander schrieb am 18.08.2020 um 11:38:
It might be a slightly larger percentage on Windows who use it, but
I'm willing to bet it's still quite low.
I have seen increasingly more questions around ICU collations on Windows due to the fact that people that migrate from SQL Server to Postgres very often keep Windows as the operating system and they want to get SQL Server's case-insensitivity (at least partially) using ICU collations.
Thomas
On Tue, Aug 18, 2020 at 11:39 AM Magnus Hagander <magnus@hagander.net> wrote:
On Tue, Aug 18, 2020 at 11:24 AM Dave Page <dpage@pgadmin.org> wrote:
On Mon, Aug 17, 2020 at 7:23 PM Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Aug 17, 2020 at 04:55:13PM +0100, Dave Page wrote:
I wasn't aware that ICU had the concept of collation versions internally (which Michael seems to have confirmed downthread). That would potentially make the number of users needing a reindex even smaller, but as you point out won't help us for years as we don't store it anyway.
It does -- and we track it in pg_collation at this point.
I think the part that Michael is referring to is we don't track enough details on a per-index basis. The suggested changes (in the separate thread) are to get rid of it from pg_collation and move it to a per-object dependency.
(And fwiw contains a patch to pg_upgrade to at least give it the ability to for all old indexes say "i know that my icu is compatible". But yeah, the full functionality won't be available until upgrading *from* 14)
Indeed, when upgrading from something older than 14, all indexes would
be marked as depending on an unknown collation version as in possibly
corrupted.
On Tue, Aug 18, 2020 at 11:38:38AM +0200, Magnus Hagander wrote:
On Tue, Aug 18, 2020 at 11:24 AM Dave Page <dpage@pgadmin.org> wrote:
Not necessarily. It's likely that not all indexes use ICU collations, and
you still save time loading what may be large amounts of data.
I agree though, that it *could* be slow.
I agree it definitely could, but I'm not sure I see any case where it would
actually be slower than the alternative (which would be dump/reload).
Well, given that pg_upgrade is more complex to run than pg_dump/reload,
you then have to weigh the complexity of using pg_upgrade with index
rebuild vs. the simpler pg_dump. Right now, you know pg_upgrade in link
mode is going to be fast, but with the reindex, you have a much more
complex decision to make.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee