glibc 2.35-2.39 upgrade requirements

Started by Kamen Kalchev | about 1 year ago | 5 messages | general

kalchev035@gmail.com

about 1 year ago

Hi everyone, we're planning to upgrade the OS running Postgres from ubuntu
jammy to ubuntu noble. As part of the OS change, the glibc version will be
changed from glibc 2.35 to glibc 2.39.

Can someone confirm if changing the glibc between those versions will
require a full reindex of the Postgres cluster?

Thanks in advance.

Ron

ronljohnsonjr@gmail.com

about 1 year ago

In reply to: Kamen Kalchev (#1)

Re: glibc 2.35-2.39 upgrade requirements

On Fri, Jan 17, 2025 at 1:12 AM Kamen Kalchev <kalchev035@gmail.com> wrote:

Hi everyone, we're planning to upgrade the OS running Postgres from ubuntu
jammy to ubuntu noble. As part of the OS change, the glibc version will be
changed from glibc 2.35 to glibc 2.39.

Can someone confirm if changing the glibc between those versions will
require a full reindex of the Postgres cluster?

You never have to reindex _everything_. Only (for some definition of
"only") indices with text/char/varchar/name columns need to be rebuilt.

This is how I find such indices:
create or replace view dba.all_indices_types as
    select tbcl.relnamespace::regnamespace::text||'.'||tbcl.relname as table_name
         , ndcl.relname as index_name
         -- one entry per index column, in column order
         , array_agg(ty.typname order by att.attnum) as index_types
    from pg_class ndcl
    inner join pg_index nd
        -- 'i' = index, 'I' = partitioned index
        on (ndcl.oid = nd.indexrelid and ndcl.relkind in ('i', 'I'))
    inner join pg_class tbcl
        -- 'r' = table, 'p' = partitioned table, 'm' = materialized view
        on (nd.indrelid = tbcl.oid and tbcl.relkind in ('r', 'p', 'm'))
    inner join pg_attribute att
        on att.attrelid = nd.indexrelid
    inner join pg_type ty
        on att.atttypid = ty.oid
    where tbcl.relnamespace::regnamespace::text != 'pg_catalog'
    group by tbcl.relnamespace::regnamespace::text||'.'||tbcl.relname
           , ndcl.relname
    order by 1, 2;

-- char(n) columns appear in pg_type as "bpchar"
select * from dba.all_indices_types
where index_types && '{"text","varchar","bpchar","name"}';
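Once the affected indices are known, the rebuild statements can be generated rather than typed by hand. The following is only a sketch: the function name emit_reindex and its input format (one "schema<TAB>index" pair per line) are assumptions made for illustration and are not part of the view above. REINDEX ... CONCURRENTLY needs PostgreSQL 12 or later and cannot run inside a transaction block.

```shell
#!/bin/sh
# Hypothetical helper: read "schema<TAB>index" lines on stdin and print
# one REINDEX statement per line on stdout.
emit_reindex() {
    while IFS="$(printf '\t')" read -r schema index; do
        # Skip blank or incomplete lines.
        [ -n "$index" ] || continue
        # Double any embedded double quotes so the identifiers stay valid
        # once they are wrapped in double quotes.
        schema=$(printf '%s' "$schema" | sed 's/"/""/g')
        index=$(printf '%s' "$index" | sed 's/"/""/g')
        printf 'REINDEX INDEX CONCURRENTLY "%s"."%s";\n' "$schema" "$index"
    done
}
```

Review the generated statements before feeding them to psql; on a large cluster you may prefer to run them in batches.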

Karsten Hilbert

Karsten.Hilbert@gmx.net

about 1 year ago

In reply to: Ron (#2)

Re: glibc 2.35-2.39 upgrade requirements

You will want to ingest

https://www.joeconway.com/presentations/glibc-PostgresConfSEA-2024.pdf

Karsten

Tom Lane

tgl@sss.pgh.pa.us

about 1 year ago

In reply to: Kamen Kalchev (#1)

Re: glibc 2.35-2.39 upgrade requirements

Kamen Kalchev <kalchev035@gmail.com> writes:

Hi everyone, we're planning to upgrade the OS running Postgres from ubuntu
jammy to ubuntu noble. As part of the OS change, the glibc version will be
changed from glibc 2.35 to glibc 2.39.
Can someone confirm if changing the glibc between those versions will
require a full reindex of the Postgres cluster?

Maybe, maybe not. According to [1], the last glibc collation change
that the PG community really noticed was in glibc 2.28. So maybe
there weren't any significant changes between 2.35 and 2.39. The
conservative path would certainly be to reindex all textual columns
(though you can skip any that have collation "C").

If you feel a need to try to avoid that, you could dump some of your
textual columns into files and sort those using sort(1) on both
old and new systems. (Be sure that the LANG/LC_xxx environment
matches what you use for the database.) If the results are different
then you definitely need to reindex; if they are the same then maybe
you're okay. Pay particular attention to columns containing
punctuation or non-ASCII characters, as those are the areas most
likely to see changes.
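
The sort(1) comparison described above might be sketched as follows. The function name sort_like_db, the file names, and the en_US.UTF-8 locale are assumptions for illustration; substitute the locale your database actually uses, and check with `locale -a` that it is installed on both hosts, since sort may otherwise quietly fall back to C ordering.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of FILE sorted under LOCALE.
# Usage: sort_like_db FILE LOCALE
sort_like_db() {
    # LC_ALL overrides LANG and every other LC_* variable for this command.
    LC_ALL="$2" sort -- "$1"
}

# Possible workflow (table and column names are placeholders):
#   psql -XAt -c "select distinct some_col from some_table" > col.txt
#   sort_like_db col.txt en_US.UTF-8 > col.sorted.old    # on the old host
#   sort_like_db col.txt en_US.UTF-8 > col.sorted.new    # on the new host
#   cmp col.sorted.old col.sorted.new && echo "same order"
```

Use the same exported col.txt on both hosts. Values containing embedded newlines will not survive a line-based sort, so pick columns without them or filter such rows out first.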

regards, tom lane

[1]: https://wiki.postgresql.org/wiki/Locale_data_changes

Jeremy Schneider

schneider@ardentperf.com

about 1 year ago

In reply to: Tom Lane (#4)

Re: glibc 2.35-2.39 upgrade requirements

On Fri, 17 Jan 2025 10:27:04 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kamen Kalchev <kalchev035@gmail.com> writes:

Hi everyone, we're planning to upgrade the OS running Postgres from
ubuntu jammy to ubuntu noble. As part of the OS change, the glibc
version will be changed from glibc 2.35 to glibc 2.39.
Can someone confirm if changing the glibc between those versions
will require a full reindex of the Postgres cluster?

Maybe, maybe not. According to [1], the last glibc collation change
that the PG community really noticed was in glibc 2.28. So maybe
there weren't any significant changes between 2.35 and 2.39. The
conservative path would certainly be to reindex all textual columns
(though you can skip any that have collation "C").

I haven't run 2.39 through the scan yet [1]; I should do that because
someone was asking the same question on postgres slack. But note that
every single ubuntu LTS and every single RHEL major release in the last
10 years has had collation changes, except for ubuntu 14.04 ... so it's
worth being cautious. Collations are a bit like time zones - small
changes are always happening, but you might not always notice.

Jeff Davis and I did a talk at the last pgconf about this, the
recording is online [2].

Personally I would recommend using the builtin C collation as database
default starting in pg17, and using ICU to do linguistic collation at
the table or query level when needed. With ICU there's at least the
option to rebuild old versions on new operating system majors, if
needed. (Though rebuilding objects - not just indexes, but anything
depending on the collation - is the best course.)

And be careful about hot standbys, FDWs, and other places where you can
get little surprises with different OS majors. The YouTube recording
has lots of info.

-Jeremy

[1]: https://github.com/ardentperf/glibc-unicode-sorting
[2]: https://www.youtube.com/watch?v=KTA6oau7tl8