Thread: Concurrently option for reindexdb
Hi all, Attached WIP patch adds "-C (--concurrently)" option for reindexdb command for concurrently reindexing. If we specify "-C" option with any table then reindexdb do reindexing concurrently with minimum lock necessary. Note that we cannot use '-s' option (for system catalog) and '-C' option at the same time. This patch use simple method as follows. 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition as target index 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts) 3. Swap old and new index 4. Drop old index 5. COMMIT These process are based on pg_repack(or pg_reorg) does, done via SQL. ToDo - Multi language support for log message. Regards, ------- Sawada Masahiko
Attachment
On Mon, Aug 25, 2014 at 3:36 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote: > Attached WIP patch adds "-C (--concurrently)" option for reindexdb > command for concurrently reindexing. > If we specify "-C" option with any table then reindexdb do reindexing > concurrently with minimum lock necessary. > Note that we cannot use '-s' option (for system catalog) and '-C' > option at the same time. > This patch use simple method as follows. > > 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition > as target index > 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts) > 3. Swap old and new index > 4. Drop old index > 5. COMMIT > > These process are based on pg_repack(or pg_reorg) does, done via SQL. This would be a useful for users, but I am not sure that you can call that --concurrently as the rename/swap phase requires an exclusive lock, and you would actually block a real implementation of REINDEX CONCURRENTLY (hm...). > ToDo > - Multi language support for log message. Why? I am not sure that's something you should deal with. -- Michael
On Mon, Aug 25, 2014 at 3:48 PM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Mon, Aug 25, 2014 at 3:36 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote: >> Attached WIP patch adds "-C (--concurrently)" option for reindexdb >> command for concurrently reindexing. >> If we specify "-C" option with any table then reindexdb do reindexing >> concurrently with minimum lock necessary. >> Note that we cannot use '-s' option (for system catalog) and '-C' >> option at the same time. >> This patch use simple method as follows. >> >> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition >> as target index >> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts) >> 3. Swap old and new index >> 4. Drop old index >> 5. COMMIT >> >> These process are based on pg_repack(or pg_reorg) does, done via SQL. > > This would be a useful for users, but I am not sure that you can call > that --concurrently as the rename/swap phase requires an exclusive > lock, and you would actually block a real implementation of REINDEX > CONCURRENTLY (hm...). > this might be difficult to call this as --concurrently. It might need to be change the name. >> ToDo >> - Multi language support for log message. > Why? I am not sure that's something you should deal with. The log message which has been existed already are supported multi language support using by .po file, But newly added message has not corresponded message in .po file, I thought. Regards, ------- Sawada Masahiko
On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote: > On Mon, Aug 25, 2014 at 3:48 PM, Michael Paquier > <michael.paquier@gmail.com> wrote: >> On Mon, Aug 25, 2014 at 3:36 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote: >>> Attached WIP patch adds "-C (--concurrently)" option for reindexdb >>> command for concurrently reindexing. >>> If we specify "-C" option with any table then reindexdb do reindexing >>> concurrently with minimum lock necessary. >>> Note that we cannot use '-s' option (for system catalog) and '-C' >>> option at the same time. >>> This patch use simple method as follows. >>> >>> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition >>> as target index >>> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts) >>> 3. Swap old and new index >>> 4. Drop old index >>> 5. COMMIT >>> >>> These process are based on pg_repack(or pg_reorg) does, done via SQL. +1. I have some shell scripts which do that reindex technique, and I'd be happy if I can replace them with this feature. Can this technique reindex the primary key index and the index which other objects depend on (e.g., foreign key)? >> This would be a useful for users, but I am not sure that you can call >> that --concurrently as the rename/swap phase requires an exclusive >> lock, and you would actually block a real implementation of REINDEX >> CONCURRENTLY (hm...). >> > > this might be difficult to call this as --concurrently. > It might need to be change the name. I'm OK to say that as --concurrently if the document clearly explains that restriction. Or --almost-concurrently? ;P >>> ToDo >>> - Multi language support for log message. >> Why? I am not sure that's something you should deal with. > > The log message which has been existed already are supported multi > language support using by .po file, > But newly added message has not corresponded message in .po file, I thought. I don't think that you need to add the update of .po file into the feature patch. Regards, -- Fujii Masao
On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote: >> this might be difficult to call this as --concurrently. >> It might need to be change the name. > > I'm OK to say that as --concurrently if the document clearly > explains that restriction. Or --almost-concurrently? ;P By reading that I am thinking as well about a wording with "lock", like --minimum-locks. -- Michael
Michael Paquier wrote: > On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote: > > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote: > >> this might be difficult to call this as --concurrently. > >> It might need to be change the name. > > > > I'm OK to say that as --concurrently if the document clearly > > explains that restriction. Or --almost-concurrently? ;P > By reading that I am thinking as well about a wording with "lock", > like --minimum-locks. Why not just finish up the REINDEX CONCURRENTLY patch. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On August 25, 2014 10:35:20 PM CEST, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >Michael Paquier wrote: >> On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com> >wrote: >> > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko ><sawada.mshk@gmail.com> wrote: >> >> this might be difficult to call this as --concurrently. >> >> It might need to be change the name. >> > >> > I'm OK to say that as --concurrently if the document clearly >> > explains that restriction. Or --almost-concurrently? ;P >> By reading that I am thinking as well about a wording with "lock", >> like --minimum-locks. > >Why not just finish up the REINDEX CONCURRENTLY patch. +many. Although I'm not sure if we managed to find a safe relation swap. If not: How about adding ALTER INDEX ... SWAP which requires an exclusive lock but is fast and O(1)? Than all indexes canbe created concurrently, swapped in a very short xact, and then dropped concurrently? 95% of all users would be happywith that and the remaining 5 would still be in a better position than today where the catalog needs to be hacked forthat (fkeys, pkeys et al). --- Please excuse brevity and formatting - I am writing this on my mobile phone.
On Tue, Aug 26, 2014 at 5:46 AM, Andres Freund <andres@anarazel.de> wrote: > On August 25, 2014 10:35:20 PM CEST, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >>Michael Paquier wrote: >>> On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com> >>wrote: >>> > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko >><sawada.mshk@gmail.com> wrote: >>> >> this might be difficult to call this as --concurrently. >>> >> It might need to be change the name. >>> > >>> > I'm OK to say that as --concurrently if the document clearly >>> > explains that restriction. Or --almost-concurrently? ;P >>> By reading that I am thinking as well about a wording with "lock", >>> like --minimum-locks. >> >>Why not just finish up the REINDEX CONCURRENTLY patch. +1 > +many. Although I'm not sure if we managed to find a safe relation swap. That safe relation swap is possible if an AccessExclusive lock is taken. Right? That means that REINDEX CONCURRENTLY is not completely-concurrently, but I think that many users are satisfied with even this feature. Regards, -- Fujii Masao
On Tue, Aug 26, 2014 at 12:28 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> +many. Although I'm not sure if we managed to find a safe relation swap. Well we didn't AFAIK. With the latest patch provided I could not really find any whole in the logic, and Andres felt that something may be wrong miles away. If I'd revisit the patch now with a rebased version maybe I may find smth... > That safe relation swap is possible if an AccessExclusive lock is taken. Right? > That means that REINDEX CONCURRENTLY is not completely-concurrently, but > I think that many users are satisfied with even this feature. This would block as well isolation tests on this feature, something not that welcome for a feature calling itself concurrently, but it would deadly simplify the patch and reduce deadlock occurrences if done right with the exclusive locks (no need to check for past snapshots necessary when using ShareUpdateExclusiveLock?). I left notes on the wiki the status of this patch: https://wiki.postgresql.org/wiki/Reindex_concurrently Reading this thread, the consensus would be to use an exclusive lock for swap and be done. Well if there are enough votes for this approach I wouldn't mind resending an updated patch for the next CF. Regards, -- Michael
On 2014-08-26 12:44:43 +0900, Michael Paquier wrote: > On Tue, Aug 26, 2014 at 12:28 PM, Fujii Masao <masao.fujii@gmail.com> wrote: > >> +many. Although I'm not sure if we managed to find a safe relation swap. > > Well we didn't AFAIK. With the latest patch provided I could not > really find any whole in the logic, and Andres felt that something may > be wrong miles away. If I'd revisit the patch now with a rebased > version maybe I may find smth... I don't think it was miles away, but I'll look into the rebased version. > > That safe relation swap is possible if an AccessExclusive lock is taken. Right? > > That means that REINDEX CONCURRENTLY is not completely-concurrently, but > > I think that many users are satisfied with even this feature. > > This would block as well isolation tests on this feature, something > not that welcome for a feature calling itself concurrently, Right. But it's much better than what we have now. Possibly we can rename the feature... :/ > but it > would deadly simplify the patch and reduce deadlock occurrences if > done right with the exclusive locks (no need to check for past > snapshots necessary when using ShareUpdateExclusiveLock?). I'm not sure if you really can get rid of the waiting for past snapshots without making the feature much more heavyweight htan necessary. > Reading this thread, the consensus would be to use an exclusive lock > for swap and be done. Well if there are enough votes for this approach > I wouldn't mind resending an updated patch for the next CF. I always was of the opinion that a exclusive lock is still *MUCH* better than what we have today. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Aug 26, 2014 at 5:12 PM, Andres Freund <andres@anarazel.de> wrote: > On 2014-08-26 12:44:43 +0900, Michael Paquier wrote: > I always was of the opinion that a exclusive lock is still *MUCH* better > than what we have today. Well, if somebody has some interest in that, here is a rebased patch with the approach using low-level locks: http://www.postgresql.org/message-id/CAB7nPqRkwKFgn4BFUybqU-Oo-=Gcbq0K-8H93Gr6fX-GGRPDXg@mail.gmail.com -- Michael
On Wed, Aug 27, 2014 at 11:02 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > On Tue, Aug 26, 2014 at 5:12 PM, Andres Freund <andres@anarazel.de> wrote: >> On 2014-08-26 12:44:43 +0900, Michael Paquier wrote: >> I always was of the opinion that a exclusive lock is still *MUCH* better >> than what we have today. > Well, if somebody has some interest in that, here is a rebased patch > with the approach using low-level locks: > http://www.postgresql.org/message-id/CAB7nPqRkwKFgn4BFUybqU-Oo-=Gcbq0K-8H93Gr6fX-GGRPDXg@mail.gmail.com My patch need to be improved doc and to be renamed option name (--minimum-locks?) Also I need to test, e.g., foreign key and primary key. Anyway, If REINDEX CONCURRENTLY patch Michael submitted is committed then I might need to rebase the patch (rather it's not necessary..?) So I will see how it goes for a while. Regards, ------- Sawada Masahiko
On 08/25/2014 02:36 PM, Sawada Masahiko wrote: > Hi all, > > Attached WIP patch adds "-C (--concurrently)" option for reindexdb > command for concurrently reindexing. > If we specify "-C" option with any table then reindexdb do reindexing > concurrently with minimum lock necessary. > Note that we cannot use '-s' option (for system catalog) and '-C' > option at the same time. > This patch use simple method as follows. > > 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition > as target index > 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts) > 3. Swap old and new index > 4. Drop old index > 5. COMMIT How do you handle indexes tied to constraints - PRIMARY KEY, UNIQUE, or EXCLUSION constraint indexes? My understanding was that this currently required some less than lovely catalog hacks. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Sep 1, 2014 at 10:43 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 08/25/2014 02:36 PM, Sawada Masahiko wrote: >> Hi all, >> >> Attached WIP patch adds "-C (--concurrently)" option for reindexdb >> command for concurrently reindexing. >> If we specify "-C" option with any table then reindexdb do reindexing >> concurrently with minimum lock necessary. >> Note that we cannot use '-s' option (for system catalog) and '-C' >> option at the same time. >> This patch use simple method as follows. >> >> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition >> as target index >> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts) >> 3. Swap old and new index >> 4. Drop old index >> 5. COMMIT > > How do you handle indexes tied to constraints - PRIMARY KEY, UNIQUE, or > EXCLUSION constraint indexes? > > My understanding was that this currently required some less than lovely > catalog hacks. > The currently patch dose not hack catalog, just create new index concurrently and swap them. So, It is supporting only UNIQUE index, I think. This patch contains some limitation. Also I'm thinking to implement to handle these cases. Regards, ------- Sawada Masahiko
On 09/02/2014 11:10 AM, Sawada Masahiko wrote: > The currently patch dose not hack catalog, just create new index > concurrently and > swap them. > So, It is supporting only UNIQUE index, I think. UNIQUE indexes, but not a UNIQUE constraint backed by a UNIQUE index, or a PRIMARY KEY constraint backed by a UNIQUE index. > This patch contains some limitation. > Also I'm thinking to implement to handle these cases. My understanding from the prior discussion is that any satisfactory solution to those problems would also make it possible to support REINDEX CONCURRENTLY natively. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 2, 2014 at 1:06 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 09/02/2014 11:10 AM, Sawada Masahiko wrote: >> The currently patch dose not hack catalog, just create new index >> concurrently and >> swap them. >> So, It is supporting only UNIQUE index, I think. > > UNIQUE indexes, but not a UNIQUE constraint backed by a UNIQUE index, or > a PRIMARY KEY constraint backed by a UNIQUE index. You can use "ALTER TABLE ... DROP CONSTRAINT ... ADD PRIMARY KEY USING INDEX ..." for them. I'm not sure how to rebuild the index which other object like foreign key depends on, though. >> This patch contains some limitation. >> Also I'm thinking to implement to handle these cases. > > My understanding from the prior discussion is that any satisfactory > solution to those problems would also make it possible to support > REINDEX CONCURRENTLY natively. Agreed. We will need to back to Sawada's proposal only when we fail to apply REINDEX CONCURRENTLY patch again. I hope that will not happen. Regards, -- Fujii Masao