Re: Add parallelism and glibc dependent only options to reindexdb - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	Re: Add parallelism and glibc dependent only options to reindexdb
Date	July 1, 2019 13:51:12
Msg-id	20190701135112.GA6750@alvherre.pgsql
In response to	Re: Add parallelism and glibc dependent only options to reindexdb (Michael Paquier <michael@paquier.xyz>)
Responses	Re: Add parallelism and glibc dependent only options to reindexdb Re: Add parallelism and glibc dependent only options to reindexdb
List	pgsql-hackers

Now that we have REINDEX CONCURRENTLY, I think reindexdb is going to
gain more popularity.

Please don't reuse a file name as generic as "parallel.c" -- it's
annoying when navigating source.  Maybe conn_parallel.c, multiconn.c,
connscripts.c, admconnection.c ...?

If your server crashes or is stopped midway during the reindex, you
would have to start again from scratch, and it's tedious (if it's
possible at all) to determine which indexes were missed.  I think it
would be useful to have a two-phase mode: in the initial phase reindexdb
computes the list of indexes to be reindexed and saves them into a work
table somewhere.  In the second phase, it reads indexes from that table
and processes them, marking them as done in the work table.  If the
second phase crashes or is stopped, it can be restarted and will consult
the work table to pick up where it left off.  I would keep the work
table, as it provides a bit of an
audit trail.  It may be important to be able to run even if unable to
create such a work table (because of the <ironic>numerous</> users that
DROP DATABASE postgres).

Maybe we'd have two flags in the work table for each index:
"reindex requested", "reindex done".
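
To make the two-phase idea concrete, here is a rough sketch of the
bookkeeping only.  It is not reindexdb code: it uses Python's sqlite3
module as a stand-in for the work table, and the table name
reindex_work, the column names, and the functions plan(), pending() and
process() are all made up for illustration.  The reindex callback stands
in for whatever would actually issue REINDEX.

```python
import sqlite3


def plan(conn, index_names):
    """Phase 1: record every index to be rebuilt. Safe to run again."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reindex_work ("
        " index_name TEXT PRIMARY KEY,"
        " requested INTEGER NOT NULL DEFAULT 1,"  # "reindex requested" flag
        " done INTEGER NOT NULL DEFAULT 0)"       # "reindex done" flag
    )
    # INSERT OR IGNORE keeps existing rows, so a done flag is not reset.
    conn.executemany(
        "INSERT OR IGNORE INTO reindex_work (index_name) VALUES (?)",
        [(name,) for name in index_names],
    )
    conn.commit()


def pending(conn):
    """Return the indexes that were requested but are not yet done."""
    rows = conn.execute(
        "SELECT index_name FROM reindex_work"
        " WHERE requested = 1 AND done = 0 ORDER BY index_name"
    )
    return [row[0] for row in rows]


def process(conn, reindex):
    """Phase 2: rebuild each pending index, marking it done on success."""
    for name in pending(conn):
        reindex(name)  # if this raises, the row simply stays pending
        conn.execute(
            "UPDATE reindex_work SET done = 1 WHERE index_name = ?", (name,)
        )
        conn.commit()
```

If process() dies partway through, calling it again only touches the
rows still returned by pending(), and the finished rows stay behind as
the audit trail.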
    
The "glibc filter" thing (which I take to mean "indexes that depend on
collations") would apply to the first phase: it just skips adding other
indexes to the work table.  I suppose ICU collations are not affected,
so the filter would be for glibc collations only?  The --glibc-dependent
switch seems too ad-hoc.  Maybe "--exclude-rule=glibc"?  That way we can
add other rules later.  (Not "--exclude=foo" because we'll want to add
the possibility to ignore specific indexes by name.)
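
As a sketch of how named rules and by-name exclusion could sit side by
side, here is a small Python mock-up of the option handling.  Everything
in it is an assumption made for illustration: the --exclude-index flag
name, the RULES registry, the shape of the index records, and in
particular my reading that the "glibc" rule leaves out every index that
does not depend on a glibc collation.

```python
import argparse


def not_glibc_dependent(index):
    """Hypothetical rule: True for indexes with no glibc collation."""
    return "libc" not in index["collation_providers"]


# Rule name -> predicate that is True for indexes the rule leaves out.
# New rules would be added here without inventing new switches.
RULES = {"glibc": not_glibc_dependent}


def build_parser():
    parser = argparse.ArgumentParser(prog="reindexdb-sketch")
    # Repeatable; only names registered in RULES are accepted.
    parser.add_argument("--exclude-rule", action="append", default=[],
                        choices=sorted(RULES))
    # Hypothetical switch for skipping specific indexes by name.
    parser.add_argument("--exclude-index", action="append", default=[])
    return parser


def select_indexes(indexes, args):
    """Return the names of indexes that survive all exclusions."""
    skip_names = set(args.exclude_index)
    rules = [RULES[name] for name in args.exclude_rule]
    return [
        ix["name"]
        for ix in indexes
        if ix["name"] not in skip_names
        and not any(rule(ix) for rule in rules)
    ]
```

Keeping rules and names in separate switches means a rule called "foo"
can never collide with an index that happens to be called "foo".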

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
