Re: BUG #18057: unaccent removes intentional spaces - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #18057: unaccent removes intentional spaces
Date
Msg-id ZN8NUpx2f9pB+F/g@paquier.xyz
In response to Re: BUG #18057: unaccent removes intentional spaces  (Michael Paquier <michael@paquier.xyz>)
Responses Re: BUG #18057: unaccent removes intentional spaces
List pgsql-bugs
On Wed, Aug 16, 2023 at 09:00:43AM +0900, Michael Paquier wrote:
> Agreed that this looks incorrect as-is.  This goes back as far as
> 9a206d0, when these were introduced, and it looks like the culprit is
> around initTrie() where the entries are loaded.  See around t_isspace,
> for example.

I was looking at the code, and my first impression was right.  All
whitespace before, between, and after the two characters listed on a
line of the rule file is discarded.  The thing is that we clearly
document the parsing rules for the sake of any custom files one can
feed to the extension:
https://www.postgresql.org/docs/devel/unaccent.html
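
To make the behavior concrete, here is a minimal sketch in Python of
whitespace-delimited rule parsing.  It is only an illustration, not the
C code in initTrie(): the function name is hypothetical, and the
example rule line is made up for the demonstration.

```python
def parse_rule_whitespace_only(line):
    """Hypothetical simplification: split a rule line on whitespace.

    Returns (source, target), or None for an empty line.  A line with a
    single field maps the source to the empty string.  Only the first
    two fields are looked at, which is an assumption of this sketch.
    """
    fields = line.split()  # drops leading, trailing and repeated whitespace
    if not fields:
        return None
    source = fields[0]
    target = fields[1] if len(fields) > 1 else ""
    return source, target


# A made-up rule whose target is meant to start with a space.
# The leading space is lost because it is treated as a separator.
rule = parse_rule_whitespace_only("\u00bc\t 1/4")
```

With this kind of parsing there is no way to write a target that
begins or ends with a space, since any space is consumed as a field
separator.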

I am not sure what we can do here.  Doing nothing is certainly an
option, but I am wondering if we could put in place an extra rule
where whitespace can be part of the translated character if it is
enclosed in double quotes, for example.  Thoughts?
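
As a rough sketch of what such a rule could look like (in Python for
brevity, not as a patch; the function name and the exact quoting
semantics are assumptions, and escaping of embedded double quotes is
deliberately left out):

```python
def parse_rule_with_quotes(line):
    """Hypothetical parser: a double-quoted target keeps its whitespace.

    Returns (source, target), or None for an empty line.  Unquoted
    targets behave as before: the first whitespace-delimited token.
    """
    parts = line.rstrip("\r\n").split(None, 1)  # source, then the rest
    if not parts:
        return None
    source = parts[0]
    rest = parts[1].strip() if len(parts) > 1 else ""
    if len(rest) >= 2 and rest[0] == '"' and rest[-1] == '"':
        # Quoted target: keep everything between the quotes verbatim.
        target = rest[1:-1]
    else:
        # Unquoted target: whitespace still acts as a delimiter.
        target = rest.split()[0] if rest else ""
    return source, target


# The same made-up rule, now with the target quoted.
quoted = parse_rule_with_quotes('\u00bc\t" 1/4"')
```

Existing rule files without quotes would parse the same way as today
under this scheme, which is the main reason to make the quoting
optional.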
--
Michael
