On 13.01.22 16:12, Peter Eisentraut wrote:
> On 11.01.22 23:19, Peter Geoghegan wrote:
>> On Tue, Jan 11, 2022 at 3:16 AM PG Bug reporting form
>> <noreply@postgresql.org> wrote:
>>> When I query a table containing certain Unicode data, using an index
>>> that has certain collation, I get error message "could not find block
>>> containing chunk". This is fully reproducible on CentOS 7 using the
>>> official RPM. (I could not reproduce this on Oracle Linux 8, though):
>>
>> It looks like you're probably not using utf8 as your database
>> encoding, based on the stacktrace -- even though I would expect that
>> on your ICU version. What does "show server_encoding;" show you when
>> run from psql?
>>
>> My guess is that you can temporarily work around the bug (which looks
>> like a bug in our !HAVE_UCOL_STRCOLLUTF8 ICU support) by making sure
>> to use UTF-8 as the server encoding.
>
> It looks like the reporter's system is rhel7, which does not have
> HAVE_UCOL_STRCOLLUTF8, so it will use the other code independent of the
> encoding.
>
> There could very well be some subtle memory counting bug or similar in
> icu_to_uchar() perhaps. Needs more analysis.

OK, I can reproduce this with the given test data and steps if I undef
HAVE_UCOL_STRCOLLUTF8 in my build.

Interestingly, it was not enough to recompile and rerun the query. For
the error to occur, I also had to rebuild the index.
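
For anyone following along, the path selection described above (the fallback code being used regardless of encoding when HAVE_UCOL_STRCOLLUTF8 is missing) can be sketched roughly as below. This is a simplified model and not the PostgreSQL source: choose_collation_path(), CollationPath and check_fallback_ignores_encoding() are hypothetical names made up for illustration, and the macro is deliberately left undefined to mimic a build like the reporter's.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the configure result.  It is left undefined
 * here to mimic a build where ucol_strcollUTF8() is not used.
 */
/* #define HAVE_UCOL_STRCOLLUTF8 1 */

/* Hypothetical labels for the two comparison strategies. */
typedef enum
{
    PATH_STRCOLL_UTF8,      /* compare the UTF-8 strings directly */
    PATH_UCHAR_CONVERSION   /* convert to UChar first, then compare */
} CollationPath;

/*
 * Simplified model of the dispatch: the direct UTF-8 path is only
 * compiled in when the macro is defined, and even then it is only
 * taken for a UTF-8 database.  Everything else falls through to the
 * conversion path.
 */
CollationPath
choose_collation_path(bool database_is_utf8)
{
#ifdef HAVE_UCOL_STRCOLLUTF8
    if (database_is_utf8)
        return PATH_STRCOLL_UTF8;
#endif
    (void) database_is_utf8;    /* unused when the macro is undefined */
    return PATH_UCHAR_CONVERSION;
}

/*
 * Self-check for the macro-undefined configuration: the server encoding
 * makes no difference, both cases end up on the conversion path.
 */
void
check_fallback_ignores_encoding(void)
{
    assert(choose_collation_path(true) == PATH_UCHAR_CONVERSION);
    assert(choose_collation_path(false) == PATH_UCHAR_CONVERSION);
}
```

If this model is right, it would explain why switching the server encoding to UTF-8 is not expected to avoid the problem on such a build: the conversion path is the only one compiled in.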