Thread: BUG #15987: Improve REINDEX of all indexes of a table at once
The following bug has been logged on the website: Bug reference: 15987 Logged by: Jo Tzschenscher Email address: hellojo180@yahoo.es PostgreSQL version: 10.10 Operating system: RHEL Description: Hello, PLEASE ROUTE THIS PROPOSAL TO A DEVELOPMENT TEAM, IT PROPOSES AN IMPROVED FEATURE. I am just spending lots of time to rebuild the indexes of our DB one by one, although all indexes of the same table need a refresh at once, Therefore you need to open a cursor which contains the DB-columns which occur in at least one of the indexes. If the same columns occur in more than 1 index you need to read them only once. Then you run 1 and only 1 full table scan (FTS), and with the data of the cursor you add an entry into each of the indexes to rebuild. With just 1 FTS you could create all the indexes at once, and speeding up a full reindex task to use little more than the time the FTS takes. Thank you for listening. Have a nice day. Cheerio. Jo
On Sat, Aug 31, 2019 at 05:15:10PM +0000, PG Bug reporting form wrote: > I am just spending lots of time to rebuild the indexes of our DB one by one, > although all indexes of the same table need a refresh at once, Therefore you > need to open a cursor which contains the DB-columns which occur in at least > one of the indexes. If the same columns occur in more than 1 index you need > to read them only once. Then you run 1 and only 1 full table scan (FTS), and > with the data of the cursor you add an entry into each of the indexes to > rebuild. With just 1 FTS you could create all the indexes at once, and > speeding up a full reindex task to use little more than the time the FTS > takes. I don't quite understand your request. You have REINDEX TABLE which is able to reindex all the indexes of a table one by one, REINDEX SCHEMA which works on all the tables of the defined schema, and REINDEX DATABASE which works on all the indexes within a database which processes in a serializable fashion all its tables. If you have indexes are defined multiple times for the same columns and you'd like to just process these, you may not need those multiple definitions. Please note that if you have multiple indexes on the same table which partially map to the same columns, you still need to process them one-by-one. -- Michael
Attachment
>> one of the indexes. If the same columns occur in more than 1 index you need >> to read them only once. Then you run 1 and only 1 full table scan (FTS), and >> with the data of the cursor you add an entry into each of the indexes to >> rebuild. With just 1 FTS you could create all the indexes at once, and >> speeding up a full reindex task to use little more than the time the FTS >> takes. > > I don't quite understand your request. You have REINDEX TABLE which > is able to reindex all the indexes of a table one by one, […] I understood the request as the scan should be factored out, so that reindexing a table with multiple indexes would cost only one seq scan. -- Fabien.
Totally correct, Fabien!
En domingo, 1 de septiembre de 2019 11:45:57 CEST, Fabien COELHO <coelho@cri.ensmp.fr> escribió:
>> one of the indexes. If the same columns occur in more than 1 index you need
>> to read them only once. Then you run 1 and only 1 full table scan (FTS), and
>> with the data of the cursor you add an entry into each of the indexes to
>> rebuild. With just 1 FTS you could create all the indexes at once, and
>> speeding up a full reindex task to use little more than the time the FTS
>> takes.
>
> I don't quite understand your request. You have REINDEX TABLE which
> is able to reindex all the indexes of a table one by one, […]
I understood the request as the scan should be factored out, so that
reindexing a table with multiple indexes would cost only one seq scan.
--
Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> I don't quite understand your request. You have REINDEX TABLE which >> is able to reindex all the indexes of a table one by one, […] > I understood the request as the scan should be factored out, so that > reindexing a table with multiple indexes would cost only one seq scan. I'm not totally convinced that that would be an improvement. For starters, one would presumably have to divide up maintenance_work_mem per index, which would be a substantial penalty for some index types. Also, you'd be creating more randomized I/O on the index-writing side of the equation (for those index types that write the index during the scan). It might net out as a win anyway depending on context, but I don't think it's as clear-cut as the OP seems to think. regards, tom lane