Thread: glibc update 2.31 to 2.38

glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi,

we have SLES 15.5 which has glibc 2.31. Our admin told us that he's about to install the SLES 15.6 update which
contains glibc 2.38.

I have built our PostgreSQL software from source on SLES 15.5, because we have some special requirements which the
packages cannot fulfill. So I have questions:

1) Do I have to build it again on 15.6?

2) Does the glibc update have any impact? I recall having to have everything reindexed when the 2.28 update came due to
major locale changes, but I haven't had to do it since then.

3) Where and how can I find out if it is necessary to reindex? And how can I find out which indexes would be affected?

I'd really appreciate your comments. Thanks very much in advance.

Paul


Re: glibc update 2.31 to 2.38

From
Adrian Klaver
Date:
On 9/19/24 07:37, Paul Foerster wrote:
> Hi,
> 
> we have SLES 15.5 which has glibc 2.31. Our admin told us that he's about to install the SLES 15.6 update which
> contains glibc 2.38.
> 
> I have built our PostgreSQL software from source on SLES 15.5, because we have some special requirements which the
> packages cannot fulfill. So I have questions:
> 
> 1) Do I have to build it again on 15.6?
> 
> 2) Does the glibc update have any impact? I recall having to have everything reindexed when the 2.28 update came due
> to major locale changes, but I didn't have to do it since then.
> 
> 3) Where and how can I find out if it is necessary to reindex? And how can I find out what indexes would be
> affected.
> 
> I'd really appreciate your comments. Thanks very much in advance.

I would take a look at:

https://wiki.postgresql.org/wiki/Locale_data_changes

It refers to the glibc 2.28 change in particular, but includes some
generic tips that could prove useful.


The glibc change log below might also be useful:

https://sourceware.org/glibc/wiki/Release

> 
> Paul
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: glibc update 2.31 to 2.38

From
Tom Lane
Date:
Paul Foerster <paul.foerster@gmail.com> writes:
> we have SLES 15.5 which has glibc 2.31. Our admin told us that he's about to install the SLES 15.6 update which
> contains glibc 2.38.
> I have built our PostgreSQL software from source on SLES 15.5, because we have some special requirements which the
> packages cannot fulfill. So I have questions:

> 1) Do I have to build it again on 15.6?

No, I wouldn't expect that to be necessary.

> 2) Does the glibc update have any impact?

Maybe.  We don't really track glibc changes, so I can't say for sure,
but it might be advisable to reindex indexes on string columns.

            regards, tom lane



Re: glibc update 2.31 to 2.38

From
Joe Conway
Date:
On 9/19/24 11:14, Tom Lane wrote:
> Paul Foerster <paul.foerster@gmail.com> writes:
>> we have SLES 15.5 which has glibc 2.31. Our admin told us that
>> he's about to install the SLES 15.6 update which contains glibc
>> 2.38.

>> 2) Does the glibc update have any impact?
> Maybe.  We don't really track glibc changes, so I can't say for sure,
> but it might be advisable to reindex indexes on string columns.


Every glibc major version change potentially impacts the sorting of some 
strings, which would require reindexing. Whether your actual data trips 
into any of these changes is another matter.

You could check by doing something equivalent to this on every 
collatable column with an index built on it, in every table:

8<-----------
WITH t(s) AS (SELECT <collatable_col> FROM <some_table> ORDER BY 1)
  SELECT md5(string_agg(t.s, NULL)) FROM t;
8<-----------

Check the before and after glibc upgrade result -- if it is the same, 
you are good to go. If not, rebuild the index before *any* DML is done 
to the table.
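
One way to script the check above for many columns is to generate the query text per column. The following is only a sketch: the helper names `quote_ident` and `checksum_query` and the example column are made up for illustration, and the generated SQL is simply the query shown above with quoted identifiers filled in.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def checksum_query(schema: str, table: str, column: str) -> str:
    """Return the ordering-checksum query for one collatable column."""
    col = quote_ident(column)
    rel = f"{quote_ident(schema)}.{quote_ident(table)}"
    return (
        f"WITH t(s) AS (SELECT {col} FROM {rel} ORDER BY 1)\n"
        f"  SELECT md5(string_agg(t.s, NULL)) FROM t;"
    )


if __name__ == "__main__":
    # Hypothetical column list; in practice it would come from the catalogs.
    for target in [("public", "customers", "last_name")]:
        print(checksum_query(*target))
```

The output can be fed to psql or any other client before and after the upgrade. Keep in mind that each query sorts the whole column, so on large tables it is far from free.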

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: glibc update 2.31 to 2.38

From
Joe Conway
Date:
On 9/19/24 13:07, Joe Conway wrote:
> On 9/19/24 11:14, Tom Lane wrote:
>> Paul Foerster <paul.foerster@gmail.com> writes:
>>> we have SLES 15.5 which has glibc 2.31. Our admin told us that
>>> he's about to install the SLES 15.6 update which contains glibc
>>> 2.38.
> 
>>> 2) Does the glibc update have any impact?
>> Maybe.  We don't really track glibc changes, so I can't say for sure,
>> but it might be advisable to reindex indexes on string columns.
> 
> 
> Every glibc major version change potentially impacts the sorting of some
> strings, which would require reindexing. Whether your actual data trips
> into any of these changes is another matter.
> 
> You could check by doing something equivalent to this on every
> collatable column with an index built on it, in every table:
> 
> 8<-----------
> WITH t(s) AS (SELECT <collatable_col> FROM <some_table> ORDER BY 1)
>    SELECT md5(string_agg(t.s, NULL)) FROM t;
> 8<-----------
> 
> Check the before and after glibc upgrade result -- if it is the same,
> you are good to go. If not, rebuild the index before *any* DML is done
> to the table.


... and I should have mentioned that in a similar way, if you have any 
tables that are partitioned by range on collatable columns, the 
partition boundaries potentially are affected. Similarly, constraints 
involving expressions on collatable columns may be affected.
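
To illustrate why range partition boundaries matter here, the toy example below routes the same value under two different orderings. Both key functions are stand-ins chosen for illustration (plain code point order and a case-folded order); they are not the actual glibc 2.31 or 2.38 rules, and `route` is a hypothetical helper, not a PostgreSQL API.

```python
import bisect


def route(value, upper_bounds, key):
    """Return the index of the first range partition whose upper bound sorts after value."""
    bound_keys = sorted(key(b) for b in upper_bounds)
    return bisect.bisect_right(bound_keys, key(value))


def codepoint_key(s):
    # Stand-in for the "old" collation: plain code point order.
    return s


def casefold_key(s):
    # Stand-in for the "new" collation: case-insensitive first, code points as tie-break.
    return (s.casefold(), s)


bounds = ["m"]  # partition 0 holds values < 'm', partition 1 holds the rest

old_partition = route("Zebra", bounds, codepoint_key)
new_partition = route("Zebra", bounds, casefold_key)
print(old_partition, new_partition)
```

Under the first ordering "Zebra" lands in partition 0, under the second in partition 1. A row stored under one ordering would be looked for in the wrong partition under the other, which is the kind of mismatch described above.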

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi Adrian,

> On 19 Sep 2024, at 17:00, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> I would take a look at:
>
> https://wiki.postgresql.org/wiki/Locale_data_changes
>
> It refers to the glibc 2.28 change in particular, but includes some generic tips that could prove useful.
>
>
> The glibc change log below might also be useful:
>
> https://sourceware.org/glibc/wiki/Release

I've seen those before, but since the article only refers to 2.28 and SUSE 15.3, and I couldn't find anything in the
glibc release notes, I thought I'd ask.

Cheers,
Paul




Re: glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi Tom,

> On 19 Sep 2024, at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> No, I wouldn't expect that to be necessary.

I was hoping one of the pros would say that. 🤣

> Maybe.  We don't really track glibc changes, so I can't say for sure,
> but it might be advisable to reindex indexes on string columns.

Advisable is a word I unfortunately can't do much with. We have terabytes and terabytes of data in hundreds of
databases, each having potentially hundreds of columns that are candidates. Just reindexing and taking down applications
during that time is not an option in a 24x7 high availability environment.

Cheers,
Paul




Re: glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi Joe,

> On 19 Sep 2024, at 19:07, Joe Conway <mail@joeconway.com> wrote:
>
> Every glibc major version change potentially impacts the sorting of some strings, which would require reindexing.
> Whether your actual data trips into any of these changes is another matter.
>
> You could check by doing something equivalent to this on every collatable column with an index built on it, in every
> table:
>
> 8<-----------
> WITH t(s) AS (SELECT <collatable_col> FROM <some_table> ORDER BY 1)
> SELECT md5(string_agg(t.s, NULL)) FROM t;
> 8<-----------
>
> Check the before and after glibc upgrade result -- if it is the same, you are good to go. If not, rebuild the index
> before *any* DML is done to the table.

I like the neatness of this one. I'll have to think about how to implement this on hundreds of databases with hundreds of
columns. That'll be a challenge, but at least it's a start.
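
For the bookkeeping side of such a rollout, one possible shape is to store the checksums per column before the upgrade and diff them afterwards. This is a minimal sketch under assumptions: the function name `changed_columns`, the key layout (database, schema, table, column), and the checksum strings are placeholders, not real results.

```python
def changed_columns(before, after):
    """Return the keys whose checksum differs or is missing after the upgrade."""
    return sorted(
        key for key, checksum in before.items()
        if after.get(key) != checksum
    )


# Placeholder results keyed by (database, schema, table, column).
before = {
    ("sales", "public", "customers", "last_name"): "9e107d9d372bb6826bd81d3542a419d6",
    ("sales", "public", "orders", "reference"): "e4d909c290d0fb1ca068ffaddf22cbd0",
}
after = {
    ("sales", "public", "customers", "last_name"): "9e107d9d372bb6826bd81d3542a419d6",
    ("sales", "public", "orders", "reference"): "0cc175b9c0f1b6a831c399e269772661",
}

for key in changed_columns(before, after):
    print("reindex candidate:", ".".join(key))
```

Only the columns reported this way would need a closer look, which may help keep the reindexing effort bounded.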

Thanks very much for this one.

Cheers,
Paul


Re: glibc update 2.31 to 2.38

From
Joe Conway
Date:
On 9/19/24 13:56, Paul Foerster wrote:
>> On 19 Sep 2024, at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Maybe.  We don't really track glibc changes, so I can't say for sure,
>> but it might be advisable to reindex indexes on string columns.

> Advisable is a word I unfortunately can't do much with. We have
> terabytes and terabytes of data in hundreds of databases each having
> potentially hundreds of columns that are candidates. Just reindexing
> and taking down applications during that time is not an option in a
> 24x7 high availability environment.

See my thread-adjacent email, but suffice to say that if there are 
collation differences that do affect your tables/data, and you allow any 
inserts or updates, you may wind up with corrupted data (e.g. duplicate 
data in your otherwise unique indexes/primary keys).
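
The duplicate-key risk can be shown with a toy model of an index: a sorted list searched by binary search. The two key functions below are illustrative stand-ins (code point order and case-folded order), not real glibc collations, and `contains` is a hypothetical helper rather than anything PostgreSQL exposes.

```python
import bisect


def old_key(s):
    # Stand-in for the collation the index was built with: code point order.
    return s


def new_key(s):
    # Stand-in for the collation after the upgrade: case-folded order.
    return (s.casefold(), s)


def contains(index_entries, value, key):
    """Binary-search index_entries for value, comparing with the given key."""
    keys = [key(e) for e in index_entries]
    pos = bisect.bisect_left(keys, key(value))
    return pos < len(index_entries) and index_entries[pos] == value


# Entries in the order the "old" collation put them.
index_entries = sorted(["apple", "Zebra", "mango", "Apple"], key=old_key)

found_before = contains(index_entries, "Zebra", old_key)
found_after = contains(index_entries, "Zebra", new_key)
print(found_before, found_after)
```

With the old ordering the lookup finds "Zebra"; with the new ordering the search walks the wrong way and misses it. A uniqueness check that misses an existing entry would accept a second "Zebra", which is how duplicates can end up in a unique index.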

For more examples about that see 
https://joeconway.com/presentations/glibc-SCaLE21x-2024.pdf

A potential alternative for you (discussed at the end of that
presentation) would be to create a new branch based on your original
SLES 15.5 glibc RPM equivalent to this:

https://github.com/awslabs/compat-collation-for-glibc/tree/2.17-326.el7

There is likely a non-trivial amount of work involved (the port from the
AL2 rpm to the RHEL7 rpm took me the better part of a couple of days),
but once done your collation is frozen to the specific version you had
on 15.5.


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi Peter,

> On 19 Sep 2024, at 19:43, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> I wrote a small script[1] which prints all unicode code points and a few
> selected[2] longer strings in order. If you run that before and after
> the upgrade and the output doesn't change, you are probably fine.
> (It checks only the default collation, though: If you have indexes using
> a different collation you would have to modify the script accordingly.)
>
> If there are differences, closer inspection might show that the changes
> don't affect you. But I would reindex all indexes on text (etc.) columns
> just to be sure.
>
>        hp
>
> [1] https://git.hjp.at:3000/hjp/pgcollate
> [2] The selection is highly subjective and totally unscientific.
>    Additions are welcome.

I'm not a Python specialist but I take it that the script needs psycopg2, which we probably don't have. So I'd have to
build some sort of venv around that like I had to do to get Patroni working on our systems.
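
If installing psycopg2 is a hurdle, a rougher check of the operating system's collation alone is possible with the Python standard library. The sketch below is not Peter's script and does not talk to PostgreSQL at all: it sorts a range of code points with `locale.strcoll`, which uses the C library of the Python process, and hashes the result. The function name `collation_fingerprint`, the default code point range, and the locale name mentioned in the comment are assumptions for illustration. It only tells you something if Python and PostgreSQL on that host use the same glibc and the same locale definition.

```python
import functools
import hashlib
import locale


def collation_fingerprint(locale_name, code_points=range(1, 0x3000)):
    """Hash the order in which the C library sorts the given code points."""
    # Skip surrogates; U+0000 must not be passed to strcoll at all.
    chars = [chr(cp) for cp in code_points
             if cp != 0 and not 0xD800 <= cp <= 0xDFFF]
    saved = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        ordered = sorted(chars, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)
    return hashlib.sha256("".join(ordered).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # "C" always exists; substitute the locale your databases use, e.g. "en_US.UTF-8".
    print(collation_fingerprint("C"))
```

Running it before and after the upgrade and comparing the two hashes gives a quick signal. A changed hash says that something in the tested range sorts differently, not that your data is affected, and an unchanged hash says nothing about code points outside the tested range or about multi-character strings.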

Well, we'll see.

Thanks for this script.

Cheers,
Paul




Re: glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi Joe,

> On 19 Sep 2024, at 20:09, Joe Conway <mail@joeconway.com> wrote:
>
> See my thread-adjacent email, but suffice to say that if there are collation differences that do affect your
> tables/data, and you allow any inserts or updates, you may wind up with corrupted data (e.g. duplicate data in your
> otherwise unique indexes/primary keys).

Yes, I know that.

> For more examples about that see https://joeconway.com/presentations/glibc-SCaLE21x-2024.pdf

A very interesting PDF. Thanks very much.

> A potential alternative for you (discussed at the end of that presentation) would be to create a new branch based on
> your original SLES 15.5 glibc RPM equivalent to this:
>
> https://github.com/awslabs/compat-collation-for-glibc/tree/2.17-326.el7
>
> There is likely a non-trivial amount of work involved (the port from the AL2 rpm to the RHEL7 rpm took me the better
> part of a couple of days), but once done your collation is frozen to the specific version you had on 15.5.

I'm not a developer. I have one machine which is equivalent to all other servers except that it has gcc, make and some
other things for me to build PostgreSQL. I can't make the admins install an RPM on all servers. I can obviously put a library
into the /path/2/postgres/software/lib64 directory but not into the system.

Also, my build server does not have internet access. So things like git clone would be an additional show stopper.
Unfortunately, I'm pretty limited.

Cheers,
Paul




Re: glibc update 2.31 to 2.38

From
Paul Foerster
Date:
Hi Peter,

> On 21 Sep 2024, at 00:33, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> I don't use SLES but I would expect it to have an RPM for it.
>
> If you have any test machine which you can upgrade before the production
> servers (and given the amount of data and availability requirements you
> have, I really hope you do) you should be set.

One of our admins did me a favor and upgraded my build server ahead of schedule. So I can both test our current
PostgreSQL version as well as rebuild it if necessary.

I can't test all of our data. That'd take quite a few months or more. I can only try to identify some crucial databases
and columns. When those tests are done, I can only pray and hope for the best.

I already expressed the idea of changing all locales to ICU. The problem there is that I'd have to create new instances
and then move each database individually. I wish I could convert already running databases… This also takes time. Still,
I think I'm going to try this route. It's always a gamble if reindexing is needed or not with any glibc change.

Cheers,
Paul


Re: glibc update 2.31 to 2.38

From
Joe Conway
Date:
On 9/21/24 15:19, Paul Foerster wrote:
> I already expressed the idea of changing all locales to ICU. The
> problem there is that I'd have to create new instances and then move
> each database individually. I wish I could convert already running
> databases… This also takes time. Still, I think I'm going to try
> this route. It's always a gamble if reindexing is needed or not with
> any glibc change.


Note that moving to ICU might improve things, but there are similar 
potential issues with ICU as well. The trick there would be to get your 
OS distro provider to maintain the same ICU version across major 
versions of the distro, which is not the case currently. Nor does the 
PGDG repo do that.


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: glibc update 2.31 to 2.38

From
Shaheed Haque
Date:

I've been working on Unix-like systems for decades and thought I understood most of the issues to do with i18n/l10n. I've only just started using Postgres, though, and what I don't understand is why these changes ONLY seem to affect Postgres. Or is it more that they also affect text editors and the like, but we just tend to ignore that?



Re: glibc update 2.31 to 2.38

From
Ron Johnson
Date:
Shaheed,

How often do you sort words in text editors?
How often do you have your text editor care whether the word you just typed is the only instance of that word in the document?

Not too often.  So... yes, we ignore the problem.

The real question is why nobody notices it in other RDBMSs like Oracle, SQL Server and MySQL.


--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!

Re: glibc update 2.31 to 2.38

From
Karsten Hilbert
Date:
Am Sun, Sep 22, 2024 at 02:59:34PM +0100 schrieb Shaheed Haque:

> I've been working on Unix-like systems for decades and though I thought I
> understood most of the issues to do with i18n/l10n, I've only just started
> using Postgres and what I don't understand is why these changes ONLY seem to
> affect Postgres. Or is it more that it also affects text editors and the
> like, but we just tend to ignore that?

Text editors for example do not persist ordering based on locale.

I'm sure there's software ignoring the issue, too.

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B



Re: glibc update 2.31 to 2.38

From
Paul Förster
Date:
Hi Joe

> On 22. Sep, 2024, at 15:47, Joe Conway <mail@joeconway.com> wrote:
>
> Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick
> there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro,
> which is not the case currently. Nor does the PGDG repo do that.

Then I strongly suggest that the PostgreSQL developers develop a fail-safe sorting mechanism that holds for generations
of locale changes.

Cheers,
Paul




Re: glibc update 2.31 to 2.38

From
Paul Förster
Date:
Hi Ron,

> On 22. Sep, 2024, at 16:11, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> The real question is why nobody notices it in other RDBMSs like Oracle, SQL Server and MySQL.

The answer is simple for Oracle: It includes a whole zoo of locale mappings and uses each one as it is needed. This is
one of the many things with Oracle that only grows over time but never gets smaller again.

I suspect it's similar with MariaDB, MySQL, SQL Server and others. Only PostgreSQL has no such thing as a local
inventory and relies on either glibc or ICU.

Cheers,
Paul


Re: glibc update 2.31 to 2.38

From
Adrian Klaver
Date:
On 9/22/24 09:48, Paul Förster wrote:
> Hi Joe
> 
>> On 22. Sep, 2024, at 15:47, Joe Conway <mail@joeconway.com> wrote:
>>
>> Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick
>> there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro,
>> which is not the case currently. Nor does the PGDG repo do that.
> 
> Then I strongly suggest that the PostgreSQL developers develop a fail safe sorting mechanism that holds for
> generations of locale changes.

https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-HIGHLIGHTS

Add a builtin platform-independent collation provider (Jeff Davis)

This supports C and C.UTF-8 collations.
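
What makes such collations stable is that, as far as I understand the builtin provider, they order strings by code point instead of by linguistic rules that live in an external library. The small sketch below only mimics that ordering in Python for illustration; the word list is made up and nothing here calls PostgreSQL.

```python
words = ["apple", "Zebra", "Äpfel", "zebra", "Apple"]

# Sorting by UTF-8 bytes gives the same result as sorting by code point.
by_bytes = sorted(words, key=lambda w: w.encode("utf-8"))
by_code_point = sorted(words)

print(by_bytes)
```

All uppercase ASCII letters come before all lowercase ones and "Äpfel" comes last, so the order is not what a dictionary would use. The trade-off is that it cannot change with a glibc or ICU update.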

> 
> Cheers,
> Paul
> 
> 
> 

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: glibc update 2.31 to 2.38

From
Joe Conway
Date:
On 9/22/24 12:53, Adrian Klaver wrote:
> On 9/22/24 09:48, Paul Förster wrote:
>> Hi Joe
>> 
>>> On 22. Sep, 2024, at 15:47, Joe Conway <mail@joeconway.com> wrote:
>>>
>>> Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick
>>> there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro,
>>> which is not the case currently. Nor does the PGDG repo do that.
>> 
>> Then I strongly suggest that the PostgreSQL developers develop a fail safe sorting mechanism that holds for
>> generations of locale changes.
> 
> https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-HIGHLIGHTS
> 
> Add a builtin platform-independent collation provider (Jeff Davis)
> 
> This supports C and C.UTF-8 collations.


Yep, what he said

-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com