Re: ICU integration - Mailing list pgsql-hackers

From	Doug Doole
Subject	Re: ICU integration
Date	September 7, 2016 20:32:50
Msg-id	CAP6UvaMTJYCxSBqhOnMwTS-vu=u7wvut-3k6TQ4eddtnSd4a1Q@mail.gmail.com
In response to	Re: ICU integration (Peter Geoghegan <pg@heroku.com>)
List	pgsql-hackers

> This isn't a problem for Postgres, or at least wouldn't be right now,
> because we don't have case insensitive collations.

I was wondering if Postgres might be that way. It does avoid the RI constraint problem, but there are still troubles with range-based predicates. (My previous project wanted case- and accent-insensitive collations, so we got to deal with it all.)

> So, we use a strcmp()/memcmp() tie-breaker when strcoll() indicates equality, while also making the general notion of text equality actually mean binary equality.

We used a similar tie-breaker in places. (For example, index keys needed to be identical, not just equal. We also broke ties in sort to make its behaviour more deterministic.)
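
A minimal sketch of the tie-breaker idea, for illustration only. It is not code from Postgres or from the project described above. The names collation_cmp, binary_cmp and cmp_with_tiebreak are hypothetical, and str.casefold() is only a crude stand-in for a case-insensitive collation of the kind strcoll() or ICU would provide.

```python
from functools import cmp_to_key


def collation_cmp(a: str, b: str) -> int:
    """Hypothetical stand-in for strcoll() under a case-insensitive collation."""
    ka, kb = a.casefold(), b.casefold()
    return (ka > kb) - (ka < kb)


def binary_cmp(a: str, b: str) -> int:
    """Stand-in for strcmp()/memcmp(): compare the UTF-8 encoded bytes."""
    ba, bb = a.encode("utf-8"), b.encode("utf-8")
    return (ba > bb) - (ba < bb)


def cmp_with_tiebreak(a: str, b: str) -> int:
    """Use collation order first; fall back to binary order only on a tie."""
    result = collation_cmp(a, b)
    return result if result != 0 else binary_cmp(a, b)


words = ["apple", "Banana", "APPLE", "banana", "Apple"]

# Values that the collation treats as equal still get one fixed relative
# order, so the sort result does not depend on the input order.
sorted_words = sorted(words, key=cmp_to_key(cmp_with_tiebreak))
print(sorted_words)
```

With the tie-breaker in place, the combined comparison returns zero only for byte-identical strings, which is the "equal means identical" assumption under discussion. A real case-insensitive collation would instead have to let collation_cmp alone decide equality, and keep the binary comparison only where identity matters, such as index keys and sort.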

> I would like to get case insensitive collations some day, and was
> really hoping that ICU would help. That being said, the need for a
> strcmp() tie-breaker makes that hard. Oh well.

Prior to adding ICU, my previous project also assumed that equal meant identical. It turned out to be a lot easier to break this assumption than I expected, but that code base had religiously used its own string comparison function for user data: strcmp()/memcmp() was never called on user data. (I don't know if the same can be said for Postgres.) We found that very few places needed to be aware of values that were equal but not identical. (Index and sort were the big two.)

Hopefully Postgres will be the same.

Doug Doole
