Thread: Concurrently option for reindexdb

Concurrently option for reindexdb

From
Sawada Masahiko
Date:
Hi all,

Attached WIP patch adds "-C (--concurrently)" option for reindexdb
command for concurrently reindexing.
If we specify "-C" option with any table then reindexdb do reindexing
concurrently with minimum lock necessary.
Note that we cannot use '-s' option (for system catalog) and '-C'
option at the same time.
This patch use simple method as follows.

1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition
as target index
2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts)
3. Swap old and new index
4. Drop old index
5. COMMIT

These process are based on pg_repack(or pg_reorg) does, done via SQL.

ToDo
- Multi language support for log message.

Regards,

-------
Sawada Masahiko

Attachment

Re: Concurrently option for reindexdb

From
Michael Paquier
Date:
On Mon, Aug 25, 2014 at 3:36 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> Attached WIP patch adds "-C (--concurrently)" option for reindexdb
> command for concurrently reindexing.
> If we specify "-C" option with any table then reindexdb do reindexing
> concurrently with minimum lock necessary.
> Note that we cannot use '-s' option (for system catalog) and '-C'
> option at the same time.
> This patch use simple method as follows.
>
> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition
> as target index
> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts)
> 3. Swap old and new index
> 4. Drop old index
> 5. COMMIT
>
> These process are based on pg_repack(or pg_reorg) does, done via SQL.

This would be a useful for users, but I am not sure that you can call
that --concurrently as the rename/swap phase requires an exclusive
lock, and you would actually block a real implementation of REINDEX
CONCURRENTLY (hm...).

> ToDo
> - Multi language support for log message.
Why? I am not sure that's something you should deal with.
-- 
Michael



Re: Concurrently option for reindexdb

From
Sawada Masahiko
Date:
On Mon, Aug 25, 2014 at 3:48 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Mon, Aug 25, 2014 at 3:36 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>> Attached WIP patch adds "-C (--concurrently)" option for reindexdb
>> command for concurrently reindexing.
>> If we specify "-C" option with any table then reindexdb do reindexing
>> concurrently with minimum lock necessary.
>> Note that we cannot use '-s' option (for system catalog) and '-C'
>> option at the same time.
>> This patch use simple method as follows.
>>
>> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition
>> as target index
>> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts)
>> 3. Swap old and new index
>> 4. Drop old index
>> 5. COMMIT
>>
>> These process are based on pg_repack(or pg_reorg) does, done via SQL.
>
> This would be a useful for users, but I am not sure that you can call
> that --concurrently as the rename/swap phase requires an exclusive
> lock, and you would actually block a real implementation of REINDEX
> CONCURRENTLY (hm...).
>

this might be difficult to call this as --concurrently.
It might need to be change the name.

>> ToDo
>> - Multi language support for log message.
> Why? I am not sure that's something you should deal with.

The log message which has been existed already are supported multi
language support using by .po file,
But newly added message has not corresponded message in .po file, I thought.

Regards,

-------
Sawada Masahiko



Re: Concurrently option for reindexdb

From
Fujii Masao
Date:
On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> On Mon, Aug 25, 2014 at 3:48 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> On Mon, Aug 25, 2014 at 3:36 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>>> Attached WIP patch adds "-C (--concurrently)" option for reindexdb
>>> command for concurrently reindexing.
>>> If we specify "-C" option with any table then reindexdb do reindexing
>>> concurrently with minimum lock necessary.
>>> Note that we cannot use '-s' option (for system catalog) and '-C'
>>> option at the same time.
>>> This patch use simple method as follows.
>>>
>>> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition
>>> as target index
>>> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts)
>>> 3. Swap old and new index
>>> 4. Drop old index
>>> 5. COMMIT
>>>
>>> These process are based on pg_repack(or pg_reorg) does, done via SQL.

+1. I have some shell scripts which do that reindex technique,
and I'd be happy if I can replace them with this feature.

Can this technique reindex the primary key index and the index
which other objects depend on (e.g., foreign key)?

>> This would be a useful for users, but I am not sure that you can call
>> that --concurrently as the rename/swap phase requires an exclusive
>> lock, and you would actually block a real implementation of REINDEX
>> CONCURRENTLY (hm...).
>>
>
> this might be difficult to call this as --concurrently.
> It might need to be change the name.

I'm OK to say that as --concurrently if the document clearly
explains that restriction. Or --almost-concurrently? ;P

>>> ToDo
>>> - Multi language support for log message.
>> Why? I am not sure that's something you should deal with.
>
> The log message which has been existed already are supported multi
> language support using by .po file,
> But newly added message has not corresponded message in .po file, I thought.

I don't think that you need to add the update of .po file into
the feature patch.

Regards,

-- 
Fujii Masao



Re: Concurrently option for reindexdb

From
Michael Paquier
Date:
On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>> this might be difficult to call this as --concurrently.
>> It might need to be change the name.
>
> I'm OK to say that as --concurrently if the document clearly
> explains that restriction. Or --almost-concurrently? ;P
By reading that I am thinking as well about a wording with "lock",
like --minimum-locks.
-- 
Michael



Re: Concurrently option for reindexdb

From
Alvaro Herrera
Date:
Michael Paquier wrote:
> On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
> >> this might be difficult to call this as --concurrently.
> >> It might need to be change the name.
> >
> > I'm OK to say that as --concurrently if the document clearly
> > explains that restriction. Or --almost-concurrently? ;P
> By reading that I am thinking as well about a wording with "lock",
> like --minimum-locks.

Why not just finish up the REINDEX CONCURRENTLY patch.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Concurrently option for reindexdb

From
Andres Freund
Date:
On August 25, 2014 10:35:20 PM CEST, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>Michael Paquier wrote:
>> On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com>
>wrote:
>> > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko
><sawada.mshk@gmail.com> wrote:
>> >> this might be difficult to call this as --concurrently.
>> >> It might need to be change the name.
>> >
>> > I'm OK to say that as --concurrently if the document clearly
>> > explains that restriction. Or --almost-concurrently? ;P
>> By reading that I am thinking as well about a wording with "lock",
>> like --minimum-locks.
>
>Why not just finish up the REINDEX CONCURRENTLY patch.

+many. Although I'm not sure if we managed to find a safe relation swap.

If not: How about adding ALTER INDEX ... SWAP which requires an exclusive lock but is fast and O(1)? Than all indexes
canbe created concurrently, swapped in a very short xact, and then dropped concurrently? 95% of all users would be
happywith that and the remaining 5 would still be in a better position than today where the catalog needs to be hacked
forthat (fkeys, pkeys et al).
 

--- 
Please excuse brevity and formatting - I am writing this on my mobile phone.



Re: Concurrently option for reindexdb

From
Fujii Masao
Date:
On Tue, Aug 26, 2014 at 5:46 AM, Andres Freund <andres@anarazel.de> wrote:
> On August 25, 2014 10:35:20 PM CEST, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>>Michael Paquier wrote:
>>> On Tue, Aug 26, 2014 at 3:48 AM, Fujii Masao <masao.fujii@gmail.com>
>>wrote:
>>> > On Mon, Aug 25, 2014 at 4:33 PM, Sawada Masahiko
>><sawada.mshk@gmail.com> wrote:
>>> >> this might be difficult to call this as --concurrently.
>>> >> It might need to be change the name.
>>> >
>>> > I'm OK to say that as --concurrently if the document clearly
>>> > explains that restriction. Or --almost-concurrently? ;P
>>> By reading that I am thinking as well about a wording with "lock",
>>> like --minimum-locks.
>>
>>Why not just finish up the REINDEX CONCURRENTLY patch.

+1

> +many. Although I'm not sure if we managed to find a safe relation swap.

That safe relation swap is possible if an AccessExclusive lock is taken. Right?
That means that REINDEX CONCURRENTLY is not completely-concurrently, but
I think that many users are satisfied with even this feature.

Regards,

-- 
Fujii Masao



Re: Concurrently option for reindexdb

From
Michael Paquier
Date:
On Tue, Aug 26, 2014 at 12:28 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
>> +many. Although I'm not sure if we managed to find a safe relation swap.

Well we didn't AFAIK. With the latest patch provided I could not
really find any whole in the logic, and Andres felt that something may
be wrong miles away. If I'd revisit the patch now with a rebased
version maybe I may find smth...

> That safe relation swap is possible if an AccessExclusive lock is taken. Right?
> That means that REINDEX CONCURRENTLY is not completely-concurrently, but
> I think that many users are satisfied with even this feature.

This would block as well isolation tests on this feature, something
not that welcome for a feature calling itself concurrently, but it
would deadly simplify the patch and reduce deadlock occurrences if
done right with the exclusive locks (no need to check for past
snapshots necessary when using ShareUpdateExclusiveLock?).

I left notes on the wiki the status of this patch:
https://wiki.postgresql.org/wiki/Reindex_concurrently

Reading this thread, the consensus would be to use an exclusive lock
for swap and be done. Well if there are enough votes for this approach
I wouldn't mind resending an updated patch for the next CF.

Regards,
-- 
Michael



Re: Concurrently option for reindexdb

From
Andres Freund
Date:
On 2014-08-26 12:44:43 +0900, Michael Paquier wrote:
> On Tue, Aug 26, 2014 at 12:28 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> >> +many. Although I'm not sure if we managed to find a safe relation swap.
> 
> Well we didn't AFAIK. With the latest patch provided I could not
> really find any whole in the logic, and Andres felt that something may
> be wrong miles away. If I'd revisit the patch now with a rebased
> version maybe I may find smth...

I don't think it was miles away, but I'll look into the rebased version.

> > That safe relation swap is possible if an AccessExclusive lock is taken. Right?
> > That means that REINDEX CONCURRENTLY is not completely-concurrently, but
> > I think that many users are satisfied with even this feature.
> 
> This would block as well isolation tests on this feature, something
> not that welcome for a feature calling itself concurrently,

Right. But it's much better than what we have now. Possibly we can
rename the feature... :/

> but it
> would deadly simplify the patch and reduce deadlock occurrences if
> done right with the exclusive locks (no need to check for past
> snapshots necessary when using ShareUpdateExclusiveLock?).

I'm not sure if you really can get rid of the waiting for past snapshots
without making the feature much more heavyweight htan necessary.

> Reading this thread, the consensus would be to use an exclusive lock
> for swap and be done. Well if there are enough votes for this approach
> I wouldn't mind resending an updated patch for the next CF.

I always was of the opinion that a exclusive lock is still *MUCH* better
than what we have today.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



Re: Concurrently option for reindexdb

From
Michael Paquier
Date:
On Tue, Aug 26, 2014 at 5:12 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2014-08-26 12:44:43 +0900, Michael Paquier wrote:
> I always was of the opinion that a exclusive lock is still *MUCH* better
> than what we have today.
Well, if somebody has some interest in that, here is a rebased patch
with the approach using low-level locks:
http://www.postgresql.org/message-id/CAB7nPqRkwKFgn4BFUybqU-Oo-=Gcbq0K-8H93Gr6fX-GGRPDXg@mail.gmail.com
-- 
Michael



Re: Concurrently option for reindexdb

From
Sawada Masahiko
Date:
On Wed, Aug 27, 2014 at 11:02 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Tue, Aug 26, 2014 at 5:12 PM, Andres Freund <andres@anarazel.de> wrote:
>> On 2014-08-26 12:44:43 +0900, Michael Paquier wrote:
>> I always was of the opinion that a exclusive lock is still *MUCH* better
>> than what we have today.
> Well, if somebody has some interest in that, here is a rebased patch
> with the approach using low-level locks:
> http://www.postgresql.org/message-id/CAB7nPqRkwKFgn4BFUybqU-Oo-=Gcbq0K-8H93Gr6fX-GGRPDXg@mail.gmail.com

My patch need to be improved doc and to be renamed option name
(--minimum-locks?)
Also I need to test, e.g., foreign key and primary key.

Anyway, If REINDEX CONCURRENTLY patch Michael submitted is committed then
I might need to rebase the patch (rather it's not necessary..?)
So I will see how it goes for a while.

Regards,

-------
Sawada Masahiko



Re: Concurrently option for reindexdb

From
Craig Ringer
Date:
On 08/25/2014 02:36 PM, Sawada Masahiko wrote:
> Hi all,
> 
> Attached WIP patch adds "-C (--concurrently)" option for reindexdb
> command for concurrently reindexing.
> If we specify "-C" option with any table then reindexdb do reindexing
> concurrently with minimum lock necessary.
> Note that we cannot use '-s' option (for system catalog) and '-C'
> option at the same time.
> This patch use simple method as follows.
> 
> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition
> as target index
> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts)
> 3. Swap old and new index
> 4. Drop old index
> 5. COMMIT

How do you handle indexes tied to constraints - PRIMARY KEY, UNIQUE, or
EXCLUSION constraint indexes?

My understanding was that this currently required some less than lovely
catalog hacks.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Concurrently option for reindexdb

From
Sawada Masahiko
Date:
On Mon, Sep 1, 2014 at 10:43 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 08/25/2014 02:36 PM, Sawada Masahiko wrote:
>> Hi all,
>>
>> Attached WIP patch adds "-C (--concurrently)" option for reindexdb
>> command for concurrently reindexing.
>> If we specify "-C" option with any table then reindexdb do reindexing
>> concurrently with minimum lock necessary.
>> Note that we cannot use '-s' option (for system catalog) and '-C'
>> option at the same time.
>> This patch use simple method as follows.
>>
>> 1. Do "CREATE INDEX CONCURRENTLY" new index which has same definition
>> as target index
>> 2. Aquire ACCESS EXCLUSIVE LOCK to target table( and transaction starts)
>> 3. Swap old and new index
>> 4. Drop old index
>> 5. COMMIT
>
> How do you handle indexes tied to constraints - PRIMARY KEY, UNIQUE, or
> EXCLUSION constraint indexes?
>
> My understanding was that this currently required some less than lovely
> catalog hacks.
>

The currently patch dose not hack catalog, just create new index
concurrently and
swap them.
So, It is supporting only UNIQUE index, I think.
This patch contains some limitation.
Also I'm thinking to implement to handle these cases.

Regards,

-------
Sawada Masahiko



Re: Concurrently option for reindexdb

From
Craig Ringer
Date:
On 09/02/2014 11:10 AM, Sawada Masahiko wrote:
> The currently patch dose not hack catalog, just create new index
> concurrently and
> swap them.
> So, It is supporting only UNIQUE index, I think.

UNIQUE indexes, but not a UNIQUE constraint backed by a UNIQUE index, or
a PRIMARY KEY constraint backed by a UNIQUE index.

> This patch contains some limitation.
> Also I'm thinking to implement to handle these cases.

My understanding from the prior discussion is that any satisfactory
solution to those problems would also make it possible to support
REINDEX CONCURRENTLY natively.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: Concurrently option for reindexdb

From
Fujii Masao
Date:
On Tue, Sep 2, 2014 at 1:06 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 09/02/2014 11:10 AM, Sawada Masahiko wrote:
>> The currently patch dose not hack catalog, just create new index
>> concurrently and
>> swap them.
>> So, It is supporting only UNIQUE index, I think.
>
> UNIQUE indexes, but not a UNIQUE constraint backed by a UNIQUE index, or
> a PRIMARY KEY constraint backed by a UNIQUE index.

You can use "ALTER TABLE ... DROP CONSTRAINT ... ADD PRIMARY KEY USING
INDEX ..."
for them. I'm not sure how to rebuild the index which other object
like foreign key depends on, though.

>> This patch contains some limitation.
>> Also I'm thinking to implement to handle these cases.
>
> My understanding from the prior discussion is that any satisfactory
> solution to those problems would also make it possible to support
> REINDEX CONCURRENTLY natively.

Agreed. We will need to back to Sawada's proposal only when we fail to
apply REINDEX CONCURRENTLY patch again. I hope that will not happen.

Regards,

-- 
Fujii Masao