Thread: glibc update 2.31 to 2.38
Hi,
we have SLES 15.5 which has glibc 2.31. Our admin told us that he's about to install the SLES 15.6 update which contains glibc 2.38.
I have built our PostgreSQL software from source on SLES 15.5, because we have some special requirements which the packages cannot fulfill. So I have questions:
1) Do I have to build it again on 15.6?
2) Does the glibc update have any impact? I recall having to have everything reindexed when the 2.28 update came due to major locale changes, but I didn't have to do it since then.
3) Where and how can I find out if it is necessary to reindex? And how can I find out what indexes would be affected?
I'd really appreciate your comments. Thanks very much in advance.
Paul
On 9/19/24 07:37, Paul Foerster wrote:
> Hi,
>
> we have SLES 15.5 which has glibc 2.31. Our admin told us that he's about to install the SLES 15.6 update which contains glibc 2.38.
>
> I have built our PostgreSQL software from source on SLES 15.5, because we have some special requirements which the packages cannot fulfill. So I have questions:
>
> 1) Do I have to build it again on 15.6?
>
> 2) Does the glibc update have any impact? I recall having to have everything reindexed when the 2.28 update came due to major locale changes, but I didn't have to do it since then.
>
> 3) Where and how can I find out if it is necessary to reindex? And how can I find out what indexes would be affected.
>
> I'd really appreciate your comments. Thanks very much in advance.
I would take a look at:
https://wiki.postgresql.org/wiki/Locale_data_changes
It refers to the glibc 2.28 change in particular, but includes some generic tips that could prove useful.
The glibc change log below might also be useful:
https://sourceware.org/glibc/wiki/Release
>
> Paul
--
Adrian Klaver
adrian.klaver@aklaver.com
Paul Foerster <paul.foerster@gmail.com> writes:
> we have SLES 15.5 which has glibc 2.31. Our admin told us that he's about to install the SLES 15.6 update which contains glibc 2.38.
> I have built our PostgreSQL software from source on SLES 15.5, because we have some special requirements which the packages cannot fulfill. So I have questions:
> 1) Do I have to build it again on 15.6?
No, I wouldn't expect that to be necessary.
> 2) Does the glibc update have any impact?
Maybe. We don't really track glibc changes, so I can't say for sure, but it might be advisable to reindex indexes on string columns.
regards, tom lane
On 9/19/24 11:14, Tom Lane wrote:
> Paul Foerster <paul.foerster@gmail.com> writes:
>> we have SLES 15.5 which has glibc 2.31. Our admin told us that
>> he's about to install the SLES 15.6 update which contains glibc
>> 2.38.
>> 2) Does the glibc update have any impact?
> Maybe. We don't really track glibc changes, so I can't say for sure,
> but it might be advisable to reindex indexes on string columns.
Every glibc major version change potentially impacts the sorting of some strings, which would require reindexing. Whether your actual data trips into any of these changes is another matter.
You could check by doing something equivalent to this on every collatable column with an index built on it, in every table:
8<-----------
WITH t(s) AS (SELECT <collatable_col> FROM <some_table> ORDER BY 1)
SELECT md5(string_agg(t.s, NULL)) FROM t;
8<-----------
Check the before and after glibc upgrade result -- if it is the same, you are good to go. If not, rebuild the index before *any* DML is done to the table.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
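A minimal sketch of how the check above could be scripted, assuming you already have a list of (schema, table, column) triples to test. The helper names `quote_ident`, `build_check_query` and `changed_columns` are hypothetical and not from the thread; the generated SQL is the query quoted above with the placeholders filled in. Running the queries against a server (with psql or any driver) is left out on purpose.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_check_query(schema: str, table: str, column: str) -> str:
    """Fill the placeholders of the ordered-md5 check query for one column."""
    col = quote_ident(column)
    rel = f"{quote_ident(schema)}.{quote_ident(table)}"
    return (
        f"WITH t(s) AS (SELECT {col} FROM {rel} ORDER BY 1) "
        "SELECT md5(string_agg(t.s, NULL)) FROM t;"
    )


def changed_columns(before: dict, after: dict) -> list:
    """Return the keys whose fingerprint differs between the two runs.

    A key missing from either run is reported as changed, so that a
    column that could not be checked is never silently treated as safe.
    """
    keys = set(before) | set(after)
    return sorted(k for k in keys if before.get(k) != after.get(k))
```

The idea would be to store the md5 results per column before the upgrade, run the same queries afterwards, and reindex whatever `changed_columns` reports before any DML touches those tables.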
On 9/19/24 13:07, Joe Conway wrote:
> On 9/19/24 11:14, Tom Lane wrote:
>> Paul Foerster <paul.foerster@gmail.com> writes:
>>> we have SLES 15.5 which has glibc 2.31. Our admin told us that
>>> he's about to install the SLES 15.6 update which contains glibc
>>> 2.38.
>
>>> 2) Does the glibc update have any impact?
>> Maybe. We don't really track glibc changes, so I can't say for sure,
>> but it might be advisable to reindex indexes on string columns.
>
> Every glibc major version change potentially impacts the sorting of some
> strings, which would require reindexing. Whether your actual data trips
> into any of these changes is another matter.
>
> You could check by doing something equivalent to this on every
> collatable column with an index built on it, in every table:
>
> 8<-----------
> WITH t(s) AS (SELECT <collatable_col> FROM <some_table> ORDER BY 1)
> SELECT md5(string_agg(t.s, NULL)) FROM t;
> 8<-----------
>
> Check the before and after glibc upgrade result -- if it is the same,
> you are good to go. If not, rebuild the index before *any* DML is done
> to the table.
... and I should have mentioned that in a similar way, if you have any tables that are partitioned by range on collatable columns, the partition boundaries potentially are affected. Similarly, constraints involving expressions on collatable columns may be affected.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi Adrian,
> On 19 Sep 2024, at 17:00, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> I would take a look at:
>
> https://wiki.postgresql.org/wiki/Locale_data_changes
>
> It refers to the glibc 2.28 change in particular, but includes some generic tips that could prove useful.
>
> The glibc change log below might also be useful:
>
> https://sourceware.org/glibc/wiki/Release
I've seen those before, but since the article only refers to 2.28 and SUSE 15.3, and I couldn't find anything in the glibc release notes, I thought I'd ask.
Cheers,
Paul
Hi Tom,
> On 19 Sep 2024, at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> No, I wouldn't expect that to be necessary.
I was hoping one of the pros would say that. 🤣
> Maybe. We don't really track glibc changes, so I can't say for sure,
> but it might be advisable to reindex indexes on string columns.
Advisable is a word I unfortunately can't do much with. We have terabytes and terabytes of data in hundreds of databases, each having potentially hundreds of columns that are candidates. Just reindexing and taking down applications during that time is not an option in a 24x7 high availability environment.
Cheers,
Paul
Hi Joe,
> On 19 Sep 2024, at 19:07, Joe Conway <mail@joeconway.com> wrote:
>
> Every glibc major version change potentially impacts the sorting of some strings, which would require reindexing. Whether your actual data trips into any of these changes is another matter.
>
> You could check by doing something equivalent to this on every collatable column with an index built on it, in every table:
>
> 8<-----------
> WITH t(s) AS (SELECT <collatable_col> FROM <some_table> ORDER BY 1)
> SELECT md5(string_agg(t.s, NULL)) FROM t;
> 8<-----------
>
> Check the before and after glibc upgrade result -- if it is the same, you are good to go. If not, rebuild the index before *any* DML is done to the table.
I like the neatness of this one. I'm thinking about how to implement this on hundreds of databases with hundreds of columns. That'll be a challenge, but at least it's a start.
Thanks very much for this one.
Cheers,
Paul
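For the scale problem described above, one possible starting point is to let the system catalogs list the candidate columns in each database and to keep the "before" fingerprints in a file that survives the upgrade. The following is only a sketch under assumptions: the catalog query and the names `CANDIDATES_SQL`, `dump_snapshot` and `load_snapshot` are illustrative and not from the thread. The query over-selects (it also returns columns using the C collation) and misses expression indexes and indexes declared with an explicit COLLATE, so treat its output as a worklist to review, not as a definitive answer.

```python
import json

# Assumed catalog query: plain table columns of a collatable type that are
# part of at least one index. Run it once per database.
CANDIDATES_SQL = """
SELECT DISTINCT n.nspname, c.relname, a.attname
FROM pg_index i
JOIN pg_class c ON c.oid = i.indrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid AND a.attnum = ANY (i.indkey)
WHERE a.attcollation <> 0
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY 1, 2, 3;
"""


def dump_snapshot(fingerprints: dict) -> str:
    """Serialize {(db, schema, table, column): md5_or_None} to JSON text."""
    rows = [list(key) + [digest] for key, digest in sorted(fingerprints.items())]
    return json.dumps(rows, indent=1)


def load_snapshot(text: str) -> dict:
    """Inverse of dump_snapshot: rebuild the dict with tuple keys."""
    return {tuple(row[:4]): row[4] for row in json.loads(text)}
```

A digest of `None` is kept as-is, since the md5 query returns NULL for an empty table. Comparing the snapshot taken before the upgrade with one taken afterwards then shows which columns need their indexes rebuilt.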
On 9/19/24 13:56, Paul Foerster wrote:
>> On 19 Sep 2024, at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Maybe. We don't really track glibc changes, so I can't say for sure,
>> but it might be advisable to reindex indexes on string columns.
> Advisable is a word I unfortunately can't do much with. We have
> terabytes and terabytes of data in hundreds of databases each having
> potentially hundreds of columns that are candidates. Just reindexing
> and taking down applications during that time is not an option in a
> 24x7 high availability environment.
See my thread-adjacent email, but suffice it to say that if there are collation differences that do affect your tables/data, and you allow any inserts or updates, you may wind up with corrupted data (e.g. duplicate data in your otherwise unique indexes/primary keys).
For more examples about that see https://joeconway.com/presentations/glibc-SCaLE21x-2024.pdf
A potential alternative for you (discussed at the end of that presentation) would be to create a new branch based on your original SLES 15.5 glibc RPM equivalent to this:
https://github.com/awslabs/compat-collation-for-glibc/tree/2.17-326.el7
There is likely a nontrivial amount of work involved (the port from the AL2 rpm to the RHEL7 rpm took me the better part of a couple of days), but once done your collation is frozen to the specific version you had on 15.5.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
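To illustrate how a changed sort order can let a duplicate slip into a unique index, here is a toy model. It is not PostgreSQL code: a sorted Python list stands in for a B-tree, and the two key functions `old_key` and `new_key` are made-up stand-ins for "collation before the upgrade" and "collation after the upgrade". Real glibc changes are far subtler, but the mechanism is the same: the structure was ordered under one rule and is searched under another.

```python
import bisect


def old_key(s: str) -> str:
    """Stand-in for the old collation: plain code point order."""
    return s


def new_key(s: str) -> str:
    """Stand-in for the new collation: case-insensitive order."""
    return s.casefold()


def insert_unique(index: list, value: str, key) -> bool:
    """Insert value unless a binary search under `key` finds it.

    Returns True if the value was inserted, False if it was rejected
    as a duplicate. The search trusts that `index` is sorted by `key`.
    """
    keys = [key(v) for v in index]
    pos = bisect.bisect_left(keys, key(value))
    if pos < len(index) and index[pos] == value:
        return False
    index.insert(pos, value)
    return True


def demo() -> list:
    """Build the index under the old order, then insert under the new one."""
    index = sorted(["apple", "mango", "Zebra"], key=old_key)
    insert_unique(index, "Zebra", old_key)  # rejected: found at position 0
    insert_unique(index, "Zebra", new_key)  # accepted: search looks in the wrong place
    return index
```

Under the old order the second "Zebra" is found and rejected. Under the new order the binary search walks to the end of the list, never sees the existing entry, and the "unique" index ends up holding the value twice. Reindexing corresponds to re-sorting the list under the new key before anything is inserted.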
Hi Peter,
> On 19 Sep 2024, at 19:43, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> I wrote a small script[1] which prints all unicode code points and a few
> selected[2] longer strings in order. If you run that before and after
> the upgrade and the output doesn't change, you are probably fine.
> (It checks only the default collation, though: If you have indexes using
> a different collation you would have to modify the script accordingly.)
>
> If there are differences, closer inspection might show that the changes
> don't affect you. But I would reindex all indexes on text (etc.) columns
> just to be sure.
>
> hp
>
> [1] https://git.hjp.at:3000/hjp/pgcollate
> [2] The selection is highly subjective and totally unscientific.
> Additions are welcome.
I'm not a Python specialist, but I take it that the script needs psycopg2, which we probably don't have. So I'd have to build some sort of venv around that, like I had to do to get Patroni working on our systems. Well, we'll see.
Thanks for this script.
Cheers,
Paul
Hi Joe,
> On 19 Sep 2024, at 20:09, Joe Conway <mail@joeconway.com> wrote:
>
> See my thread-adjacent email, but suffice it to say that if there are collation differences that do affect your tables/data, and you allow any inserts or updates, you may wind up with corrupted data (e.g. duplicate data in your otherwise unique indexes/primary keys).
Yes, I know that.
> For more examples about that see https://joeconway.com/presentations/glibc-SCaLE21x-2024.pdf
A very interesting PDF. Thanks very much.
> A potential alternative for you (discussed at the end of that presentation) would be to create a new branch based on your original SLES 15.5 glibc RPM equivalent to this:
>
> https://github.com/awslabs/compat-collation-for-glibc/tree/2.17-326.el7
>
> There is likely a nontrivial amount of work involved (the port from the AL2 rpm to the RHEL7 rpm took me the better part of a couple of days), but once done your collation is frozen to the specific version you had on 15.5.
I'm not a developer. I have one machine which is equivalent to all other servers except that it has gcc, make and some other things for me to build PostgreSQL. I can't make the admins install an RPM on all servers. I can obviously put a library into the /path/2/postgres/software/lib64 directory but not into the system.
Also, my build server does not have internet access. So things like git clone would be an additional show stopper. Unfortunately, I'm pretty limited.
Cheers,
Paul
Hi Peter,
> On 21 Sep 2024, at 00:33, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> I don't use SLES but I would expect it to have an RPM for it.
>
> If you have any test machine which you can upgrade before the production
> servers (and given the amount of data and availability requirements you
> have, I really hope you do) you should be set.
One of our admins did me a favor and upgraded my build server ahead of schedule. So I can both test our current PostgreSQL version as well as rebuild it if necessary.
I can't test all of our data. That'd take quite a few months or more. I can only try to identify some crucial databases and columns. When those tests are done, I can only pray and hope for the best.
I already expressed the idea of changing all locales to ICU. The problem there is that I'd have to create new instances and then move each database individually. I wish I could convert already running databases… This also takes time. Still, I think I'm going to try this route.
It's always a gamble if reindexing is needed or not with any glibc change.
Cheers,
Paul
On 9/21/24 15:19, Paul Foerster wrote:
> I already expressed the idea of changing all locales to ICU. The
> problem there is that I'd have to create new instances and then move
> each database individually. I wish I could convert already running
> databases… This also takes time. Still, I think I'm going to try
> this route. It's always a gamble if reindexing is needed or not with
> any glibc change.
Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro, which is not the case currently. Nor does the PGDG repo do that.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
I've been working on Unix-like systems for decades, and though I thought I understood most of the issues to do with i18n/l10n, I've only just started using Postgres, and what I don't understand is why these changes ONLY seem to affect Postgres. Or is it more that it also affects text editors and the like, but we just tend to ignore that?
Shaheed,
How often do you sort words in text editors?
How often do you have your text editor care whether the word you just typed is the only instance of that word in the document?
Not too often. So... yes, we ignore the problem.
The real question is why nobody notices it in other RDBMSs like Oracle, SQL Server and MySQL.
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> crustacean!
On Sun, Sep 22, 2024 at 02:59:34PM +0100, Shaheed Haque wrote:
> I've been working on Unix-like systems for decades and though I thought I
> understood most of the issues to do with i18n/l10n, I've only just started
> using Postgres and what I don't understand is why these changes ONLY seem to
> affect Postgres. Or is it more that it also affects text editors and the
> like, but we just tend to ignore that?
Text editors, for example, do not persist ordering based on locale.
I'm sure there's software ignoring the issue, too.
Karsten
--
GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Hi Joe,
> On 22. Sep, 2024, at 15:47, Joe Conway <mail@joeconway.com> wrote:
>
> Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro, which is not the case currently. Nor does the PGDG repo do that.
Then I strongly suggest that the PostgreSQL developers develop a fail-safe sorting mechanism that holds for generations of locale changes.
Cheers,
Paul
Hi Ron,
> On 22. Sep, 2024, at 16:11, Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> The real question is why nobody notices it in other RDBMSs like Oracle, SQL Server and MySQL.
The answer is simple for Oracle: It includes a whole zoo of locale mappings and uses each one as it is needed. This is one of the many things with Oracle that only grows over time and never gets smaller again.
I suspect it's similar with MariaDB, MySQL, SQL Server and others. Only PostgreSQL has no such thing as a local inventory and relies on either glibc or ICU.
Cheers,
Paul
On 9/22/24 09:48, Paul Förster wrote:
> Hi Joe
>
>> On 22. Sep, 2024, at 15:47, Joe Conway <mail@joeconway.com> wrote:
>>
>> Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro, which is not the case currently. Nor does the PGDG repo do that.
>
> Then I strongly suggest that the PostgreSQL developers develop a fail safe sorting mechanism that holds for generations of locale changes.
https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-HIGHLIGHTS
Add a builtin platform-independent collation provider (Jeff Davis)
This supports C and C.UTF-8 collations.
>
> Cheers,
> Paul
--
Adrian Klaver
adrian.klaver@aklaver.com
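As a rough illustration of what a C / C.UTF-8 style ordering means in practice, the sketch below sorts strings purely by Unicode code point. The helper name `codepoint_sort` is hypothetical and this is not how the server implements it; it only shows the resulting order. Because code point values do not depend on locale data, this order does not move when glibc or ICU is upgraded. The price is that it is not a linguistic order.

```python
def codepoint_sort(strings):
    """Sort strings by the numeric value of each Unicode code point."""
    return sorted(strings, key=lambda s: [ord(ch) for ch in s])


def utf8_byte_sort(strings):
    """Sort strings by their UTF-8 encoded bytes.

    UTF-8 preserves code point order, so this gives the same result
    as codepoint_sort.
    """
    return sorted(strings, key=lambda s: s.encode("utf-8"))
```

With this ordering, all uppercase ASCII letters sort before all lowercase ones, and accented letters sort after the whole ASCII range, so "B" comes before "a" and "\u00e4" comes after "b". Whether that is acceptable depends on whether the application relies on the database for human-friendly ordering.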
On 9/22/24 12:53, Adrian Klaver wrote:
> On 9/22/24 09:48, Paul Förster wrote:
>> Hi Joe
>>
>>> On 22. Sep, 2024, at 15:47, Joe Conway <mail@joeconway.com> wrote:
>>>
>>> Note that moving to ICU might improve things, but there are similar potential issues with ICU as well. The trick there would be to get your OS distro provider to maintain the same ICU version across major versions of the distro, which is not the case currently. Nor does the PGDG repo do that.
>>
>> Then I strongly suggest that the PostgreSQL developers develop a fail safe sorting mechanism that holds for generations of locale changes.
>
> https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-HIGHLIGHTS
>
> Add a builtin platform-independent collation provider (Jeff Davis)
>
> This supports C and C.UTF-8 collations.
Yep, what he said
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi Adrian,
> On 22 Sep 2024, at 18:53, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
>
> https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-HIGHLIGHTS
>
> Add a builtin platform-independent collation provider (Jeff Davis)
>
> This supports C and C.UTF-8 collations.
I must admit that I haven't read the readme fully yet, but this is definitely great news. Thanks very much.
Cheers,
Paul