Thread: initdb failed to initialize full text search dictionaries
This is a follow-up on bug #17356. PostgreSQL version 15 is also affected: it cannot build dictionaries from hunspell packages.
How to reproduce:
# Start minimal virgin system
docker run -it --rm debian bash
apt update
apt install -y acl ca-certificates curl gzip libbsd0 libbz2-1.0 libc6 libedit2 libffi7 libnettle8 libicu67 libreadline8 libgcc1 libgmp10 libgnutls30 libhogweed6 libidn2-0 libldap-2.4-2 liblz4-1 liblzma5 \
libncurses6 libp11-kit0 libpcre3 libsasl2-2 libsqlite3-0 libssl1.1 libstdc++6 libtasn1-6 libtinfo6 libunistring2 libuuid1 libxml2 libxslt1.1 libzstd1 locales procps tar zlib1g gnupg dumb-init curl
# Install hunspell-hu and PostgreSQL
apt install -y hunspell hunspell-hu
curl -s https://salsa.debian.org/postgresql/postgresql-common/raw/master/pgdg/apt.postgresql.org.sh | bash
apt update
apt install -y postgresql-15
The last command "apt install -y postgresql-15" gives this error:
Building PostgreSQL dictionaries from installed myspell/hunspell packages...
hu_hu
iconv: illegal input sequence at position 131
ERROR: Conversion of /usr/share/hunspell/hu_HU.aff failed
Removing obsolete dictionary files:
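One way to see what iconv is objecting to is to dump the bytes around the reported offset. This is only a minimal sketch: it assumes that "position 131" is a byte offset from the start of the file, and `dump_around` is a hypothetical helper name, not part of any package. The window is wide enough that it does not matter whether the offset is zero-based or one-based.

```shell
# Hypothetical helper: print 32 bytes of FILE, starting 16 bytes before POS.
# od -A d prints decimal offsets, -c prints characters and octal escapes,
# so any non-ASCII byte near the reported position stands out.
dump_around() {
    file=$1
    pos=$2
    start=$((pos - 16))
    if [ "$start" -lt 0 ]; then
        start=0
    fi
    od -A d -c -j "$start" -N 32 "$file"
}

# Usage (path taken from the error message above):
#   dump_around /usr/share/hunspell/hu_HU.aff 131
```

A byte printed as an octal escape (for example `\351`) inside a file that declares UTF-8 would be consistent with the hunspell author's remark quoted below, but that is something to verify rather than assume.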
I'm not sure where the problem is. It may be in hunspell, hunspell-hu, iconv, or PostgreSQL. I have tried to find the root cause, but I failed. At least it seems that it is NOT a bug in hunspell or hunspell-hu, because the author of hunspell wrote this comment in 2018 at https://github.com/hunspell/hunspell/issues/559#issuecomment-446335091
> Not a bug: Hunspell's file format is not an UTF-8 encoded text file in the case of SET UTF-8 with the default 8-bit FLAG.
That hunspell issue is only open because "it is a valid request", but it is not a bug nonetheless (according to the author).
So it might be iconv, or it might be pg_updatedicts that calls iconv with the wrong parameters. I do not know enough to tell...
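To narrow down whether the wrong encoding is being handed to iconv, one can compare the encoding that the .aff file declares in its SET line with what iconv actually accepts. The following is a minimal sketch; both helper names are hypothetical, and how pg_updatedicts really invokes iconv is not shown here and would have to be checked in its source.

```shell
# Hypothetical helper: print the encoding named on the first SET line
# of an .aff file (for example "UTF-8" or "ISO8859-2").
aff_declared_encoding() {
    LC_ALL=C awk '$1 == "SET" { print $2; exit }' "$1" | tr -d '\r'
}

# Hypothetical helper: succeed only if iconv can read FILE as ENCODING
# and convert it to UTF-8 without hitting an illegal input sequence.
aff_converts_cleanly() {
    iconv -f "$2" -t UTF-8 "$1" > /dev/null 2>&1
}

# Usage (path taken from the error message above):
#   aff=/usr/share/hunspell/hu_HU.aff
#   enc=$(aff_declared_encoding "$aff")
#   if aff_converts_cleanly "$aff" "$enc"; then
#       echo "$aff converts cleanly as $enc"
#   else
#       echo "$aff is not valid $enc according to iconv"
#   fi
```

If the file fails to convert under its own declared encoding, the caller is probably passing the right parameter and the file content itself is what iconv rejects.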
The effect of this bug is that PostgreSQL is not able to use the dictionaries for full text search (https://www.postgresql.org/docs/15/textsearch-dictionaries.html). I have not tried ispell or myspell yet, but they are old (ancient, actually) and hunspell should be preferred. I think that this bug has been around for at least five years (since 2018).
Regards,
Laszlo Zsolt Nagy
Les <nagylzs@gmail.com> writes:
> The last command "apt install -y postgresql-15" gives this error:
> Building PostgreSQL dictionaries from installed myspell/hunspell packages...
> hu_hu
> iconv: illegal input sequence at position 131
> ERROR: Conversion of /usr/share/hunspell/hu_HU.aff failed

Sadly, I do not think any of the moving parts there are under the PG project's control. We certainly can't fix problems in either hunspell or iconv, and even the fact that iconv is being applied during install is not something the core project does. I gather that this is something the Debian packaging of postgres is attempting, so I'd suggest taking it up with those packagers.

It's possible that it's something easy like they have the wrong idea of what encoding that particular file is in. Or maybe the best answer is to skip any files that fail conversion, without aborting the package install entirely. But we here on pgsql-bugs can't help you.

regards, tom lane