On Sun, Jun 30, 2019 at 11:45:47AM +0200, Julien Rouhaud wrote:
>Hi,
>
>With glibc 2.28 coming, all users will have to reindex almost
>every index after a glibc upgrade to guarantee the lack of
>corruption. Unfortunately, reindexdb is not ideal for that as it
>processes everything using a single connection and isn't able to
>discard indexes that don't depend on a glibc collation.
>
>PFA a patchset to add parallelism to reindexdb (reusing the
>infrastructure in vacuumdb with some additions) and an option to
>discard indexes that don't depend on glibc (without any specific
>collation filtering or glibc version detection), with updated
>regression tests. Note that this should be applied on top of the
>existing reindexdb cleanup & refactoring patch
>(https://commitfest.postgresql.org/23/2115/).
>
>This was sponsored by VMware, and has been discussed internally with
>Kevin and Michael, in Cc.
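
For context, I assume the glibc-dependent filtering ultimately boils down
to a catalog query along these lines - a minimal sketch of mine, not taken
from the patch, which also misses indexes that only depend on the database
default collation (dependencies on pinned objects are not recorded in
pg_depend):

/*
 * Hypothetical sketch (not from the patch): list indexes recorded in
 * pg_depend as depending on a libc-provided collation. Indexes relying
 * only on the database default collation are missed, because
 * dependencies on pinned objects are not stored in pg_depend.
 *
 * Build with: gcc -o list_glibc_indexes list_glibc_indexes.c -lpq
 */
#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	const char *query =
		"SELECT DISTINCT dep.objid::regclass "
		"FROM pg_depend dep "
		"JOIN pg_collation coll ON coll.oid = dep.refobjid "
		"WHERE dep.classid = 'pg_class'::regclass "
		"  AND dep.refclassid = 'pg_collation'::regclass "
		"  AND coll.collprovider = 'c' "	/* libc, not ICU */
		"  AND coll.collname NOT IN ('C', 'POSIX')";
	PGconn	   *conn = PQconnectdb("");		/* uses PG* environment vars */
	PGresult   *res;
	int			i;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return 1;
	}

	res = PQexec(conn, query);
	if (PQresultStatus(res) != PGRES_TUPLES_OK)
	{
		fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
		PQclear(res);
		PQfinish(conn);
		return 1;
	}

	for (i = 0; i < PQntuples(res); i++)
		printf("%s\n", PQgetvalue(res, i, 0));

	PQclear(res);
	PQfinish(conn);
	return 0;
}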
I wonder why this is necessary:
    pg_log_error("cannot reindex glibc dependent objects and a subset of objects");
What's the reasoning behind that? It seems like a valid use case to me -
imagine you have a big database, but only a couple of tables are used by
the application regularly (the rest may be archive tables, for example).
Why not allow rebuilding glibc-dependent indexes on just the used tables,
so that the database can be opened for users sooner?
BTW now that we allow rebuilding only some of the indexes, it'd be great
to have a dry-run mode, where we just print which indexes would be rebuilt
without actually rebuilding them.
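
The dry run could be as simple as printing each statement instead of
sending it - a hedged sketch, assuming a --dry-run flag wired into the
option parsing (the flag and function names are mine, not the patch's):

#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical sketch of dry-run handling: with --dry-run we only print
 * the REINDEX statement that would be sent, instead of executing it.
 * Real code would quote the name with fmtId() and send the statement
 * over a libpq connection.
 */
static void
reindex_one_index(const char *idxname, bool dry_run)
{
	char		sql[1024];

	snprintf(sql, sizeof(sql), "REINDEX INDEX %s;", idxname);

	if (dry_run)
		printf("%s\n", sql);	/* just report, don't touch the index */
	else
	{
		/* execute the statement here */
	}
}

int
main(void)
{
	const char *indexes[] = {"t1_name_idx", "t2_name_idx"};

	for (int i = 0; i < 2; i++)
		reindex_one_index(indexes[i], true);
	return 0;
}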
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services