Greetings,
* Grigory Smolkin (g.smolkin@postgrespro.ru) wrote:
> On 07/07/2018 10:10 AM, Peter Eisentraut wrote:
> >On 05.07.18 17:05, Grigory Smolkin wrote:
> >>Why ANALYZE ignores column COLLATE?
> >I think the statistics would be mostly the same independent of which
> >collation you use. This could possibly be refined, but I don't think
> >it's a major problem right now.
>
> Thank you for your interest in this problem!
>
> >I think the statistics would be mostly the same independent of which
> >collation you use.
>
> I assumed that one of the goals of using libicu is to be independent from
> libc collation and its bugs and inconsistencies, but currently ANALYZE is
> forced to use libc anyway, which undermines that goal.
I would have thought so too, especially in a case like you describe
below...
> > This could possibly be refined, but I don't think
> >it's a major problem right now.
>
> It's a major problem for people who use the Thai alphabet.
> Attached is a data sample (33MB on my machine). ANALYZE'ing it
> gives the following results:
>
> postgres=# ANALYZE t_icu_coll;
> ANALYZE
> Time: 2252086.648 ms
>
> 37 minutes on a 33MB table is painful. On big tables autovacuum ANALYZE runs
> for hours, starving autovacuum VACUUM of worker
> slots (autovacuum_max_workers).
> Another major problem is that while in strcoll_l() the backend process
> ignores pg_terminate_backend()/pg_cancel_backend().
>
> With the attached patch this problem goes away:
>
> postgres=# analyze t_icu_coll;
> ANALYZE
> Time: 161.419 ms
Wow, that's definitely an issue.
I haven't looked at the patch in any depth, but definitely a +1 from me
for figuring out how to fix this issue.
Thanks!
Stephen