Re: [PATCH] Completed unaccent dictionary with many missing characters - Mailing list pgsql-hackers

From	Michael Paquier
Subject	Re: [PATCH] Completed unaccent dictionary with many missing characters
Date	June 28, 2022 08:14:53
Msg-id	YrqOTZBFoRNOOuz5@paquier.xyz
In response to	Re: [PATCH] Completed unaccent dictionary with many missing characters (Przemysław Sztoch <przemyslaw@sztoch.pl>)
Responses	Re: [PATCH] Completed unaccent dictionary with many missing characters (Przemysław Sztoch <przemyslaw@sztoch.pl>) Re: [PATCH] Completed unaccent dictionary with many missing characters (Michael Paquier <michael@paquier.xyz>)
List	pgsql-hackers

On Thu, Jun 23, 2022 at 02:10:42PM +0200, Przemysław Sztoch wrote:
> The only division that is probably possible is the one attached.

Well, the addition of Cyrillic does not make it necessary to remove
SOUND RECORDING COPYRIGHT or the DEGREEs; that implies the use of a
dictionary when manipulating the set of codepoints, but that's me
being too picky.  Just to say that I am fine with what you are
proposing here.

By the way, could you add a couple of regression tests for each
patch, with a sample of the characters added?  U+210C is a particularly
sensitive case, as we should really make sure that it maps to what we
want even if Latin-ASCII.xml tells a different story.  This requires
the addition of a couple of queries in unaccent.sql with the expected
output updated in unaccent.out.
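
As a rough aid for picking the expected output of such a test, here is a
minimal Python sketch that prints what the Unicode Character Database
itself records for a codepoint such as U+210C.  The helper name
describe() is hypothetical, and treating the compatibility decomposition
as the mapping "we want" is an assumption, not something the patch or
Latin-ASCII.xml confirms.

```python
import unicodedata


def describe(codepoint: int) -> tuple[str, str, str]:
    """Return (name, raw decomposition, NFKD form) for one codepoint.

    Hypothetical helper for illustration only; it reads Python's bundled
    Unicode Character Database and does not consult Latin-ASCII.xml or
    unaccent.rules.
    """
    ch = chr(codepoint)
    return (
        unicodedata.name(ch),               # official character name
        unicodedata.decomposition(ch),      # decomposition field from the UCD
        unicodedata.normalize("NFKD", ch),  # compatibility-decomposed text
    )


if __name__ == "__main__":
    name, decomposition, nfkd = describe(0x210C)
    print(f"U+210C {name}: decomposition={decomposition!r}, NFKD={nfkd!r}")
```

The same call can be repeated for a sample of the characters a patch
adds, to cross-check the rules file before writing the queries in
unaccent.sql.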
--
Michael
