long analyze, libc bug and libicu - Mailing list pgsql-bugs

From Grigory Smolkin
Subject long analyze, libc bug and libicu
Date
Msg-id dc49b0d2-d20d-adeb-ecdb-0ad25461386d@postgrespro.ru
Whole thread Raw
Responses Re: long analyze, libc bug and libicu
List pgsql-bugs
Hello, everyone!

One of your clients with PostgreSQL 10.4 recently noticed that ANALYZE 
started to take unproportionally long amount of time. Backtrace:

#0  0x00007f244b68ca06 in strcoll_l () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f244e37802d in varstrfastcmp_locale (x=139794027137284, 
y=139794027967812, ssup=<optimized out>) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/utils/adt/varlena.c:2185
#2  0x00007f244e0d6b15 in ApplySortComparator (isNull1=0 '\000', 
isNull2=0 '\000', ssup=0x7fffa958de80, datum2=<optimized out>, 
datum1=<optimized out>) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/include/utils/sortsupport.h:225
#3  compare_scalars (a=<optimized out>, b=<optimized out>, 
arg=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/commands/analyze.c:2821
#4  0x00007f244e3e1023 in med3 (a=0x7f24578f6bd8 "\224__U$\177", 
b=0x7f24578f6c78 "\004ITU$\177", c=0x7f24578f6d18 "D\365`U$\177", 
cmp=0x7f244e0d6af0 <compare_scalars>, arg=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:108
#5  0x00007f244e3e13ec in qsort_arg (a=a@entry=0x7f24578f6bd8, 
n=<optimized out>, n@entry=82, es=es@entry=16, 
cmp=cmp@entry=0x7f244e0d6af0 <compare_scalars>, 
arg=arg@entry=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:157
#6  0x00007f244e3e1475 in qsort_arg (a=a@entry=0x7f24578f6248, 
n=<optimized out>, n@entry=235, es=es@entry=16, 
cmp=cmp@entry=0x7f244e0d6af0 <compare_scalars>, 
arg=arg@entry=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:217
#7  0x00007f244e3e1475 in qsort_arg (a=a@entry=0x7f24578f5348, 
n=<optimized out>, es=es@entry=16, cmp=cmp@entry=0x7f244e0d6af0 
<compare_scalars>, arg=arg@entry=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:217
#8  0x00007f244e3e13ac in qsort_arg (a=a@entry=0x7f24578f5348, 
n=<optimized out>, es=es@entry=16, cmp=cmp@entry=0x7f244e0d6af0 
<compare_scalars>, arg=arg@entry=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:203
#9  0x00007f244e3e13ac in qsort_arg (a=0x7f24578f5348, 
a@entry=0x7f2457816928, n=<optimized out>, n@entry=109083, 
es=es@entry=16, cmp=cmp@entry=0x7f244e0d6af0 <compare_scalars>, 
arg=arg@entry=0x7fffa958de70) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/port/qsort_arg.c:203
#10 0x00007f244e0d604f in compute_scalar_stats (stats=0x7f244fd168f8, 
fetchfunc=<optimized out>, samplerows=110194, totalrows=110194) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/commands/analyze.c:2378
#11 0x00007f244e0d76f9 in do_analyze_rel 
(onerel=onerel@entry=0x7f244fc339d8, options=options@entry=99, 
params=params@entry=0x7f244f98b9f0, va_cols=va_cols@entry=0x0, 
acquirefunc=<optimized out>, relpages=11515, inh=0 '\000', 
in_outer_xact=0 '\000', elevel=13) at 
/build/postgresql-10-U6N320/postgresql-10-10.4/build/../src/backend/commands/analyze.c:545


After a bit of digging it became clear that they hit the libc bug:
https://sourceware.org/bugzilla/show_bug.cgi?id=18441

We decided to switch to libicu collation, but it became clear that 
ANALYZE ignore column level COLLATE and stubbornly use default collation.

It happens thanks to this piece of code:

static void
compute_scalar_stats(VacAttrStatsP stats,
                      AnalyzeAttrFetchFunc fetchfunc,
                      int samplerows,
                      double totalrows)
....
     /* We always use the default collation for statistics */
     ssup.ssup_collation = DEFAULT_COLLATION_OID;
....

Ok, we decided to set libicu collation as default for database, although 
it is a major pain.
But turns out it`s impossible to set libicu collation as default for 
database:

postgres=# create database icu LC_COLLATE = 'en-x-icu';
ERROR:  invalid locale name: "en-x-icu"

Why ANALYZE igrones column COLLATE? It there workaround to force ANALYZE 
to use libicu?

-- 
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



pgsql-bugs by date:

Previous
From: Jeff Janes
Date:
Subject: Re: NOTIFY does not work as expected
Next
From: Taiki Kondo
Date:
Subject: Typo in Japanese translation of psql.