Thread: REINDEX and blocking SELECT queries

REINDEX and blocking SELECT queries

From
Satoshi Nagayasu
Date:
Hi all,

According to the manual, running REINDEX does not take any locks
on the parent table which block read operations.

> REINDEX locks out writes but not reads of the index's parent table.

REINDEX
https://www.postgresql.org/docs/9.5/static/sql-reindex.html

However, I heard some complains from a few people that this explanation
would lead some misunderstanding that REINDEX would not block
SELECT queries on the table.

Actually, REINDEX blocks SELECT queries, maybe in the planning phase.

This explanation is very critical for many people because they are planning
to run REINDEX online with believing in the explanation.

So, I think we should improve this explanation precisely not to lead
such misunderstanding.

What do you think? Any comments?

Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>


Re: REINDEX and blocking SELECT queries

From
Tom Lane
Date:
Satoshi Nagayasu <snaga@uptime.jp> writes:
> According to the manual, running REINDEX does not take any locks
> on the parent table which block read operations.
> Actually, REINDEX blocks SELECT queries, maybe in the planning phase.

Hm.  REINDEX does take out only ShareLock on the table, which would not
block DML, but it takes out AccessExclusiveLock on the index.  That
blocks the planner's attempts to acquire information about the table's
indexes.

In the case of an update query I think there's little we can do about
this; the executor would have to update the index anyway.  For a pure
SELECT, you could imagine having the planner do a conditional lock acquire
and ignore the index if that fails.  Would that be better than blocking?
Not sure.  You could end up with a really bad plan if the index was
critical for efficient processing of the query.

            regards, tom lane


Re: REINDEX and blocking SELECT queries

From
Vik Fearing
Date:
On 09/09/2016 03:41 PM, Tom Lane wrote:
> Satoshi Nagayasu <snaga@uptime.jp> writes:
>> According to the manual, running REINDEX does not take any locks
>> on the parent table which block read operations.
>> Actually, REINDEX blocks SELECT queries, maybe in the planning phase.
>
> Hm.  REINDEX does take out only ShareLock on the table, which would not
> block DML, but it takes out AccessExclusiveLock on the index.  That
> blocks the planner's attempts to acquire information about the table's
> indexes.
>
> In the case of an update query I think there's little we can do about
> this; the executor would have to update the index anyway.  For a pure
> SELECT, you could imagine having the planner do a conditional lock acquire
> and ignore the index if that fails.  Would that be better than blocking?
> Not sure.  You could end up with a really bad plan if the index was
> critical for efficient processing of the query.

I agree, things could get awful without certain indexes (I'm thinking of
partial indexes in particular).

Also, the very next sentence after the one cited says that a SELECT will
block if it tries to use the index in question, so I'm not even sure
there's anything at all to do here.
--
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support


Re: REINDEX and blocking SELECT queries

From
Satoshi Nagayasu
Date:
2016-09-09 22:41 GMT+09:00 Tom Lane <tgl@sss.pgh.pa.us>:
> Satoshi Nagayasu <snaga@uptime.jp> writes:
>> According to the manual, running REINDEX does not take any locks
>> on the parent table which block read operations.
>> Actually, REINDEX blocks SELECT queries, maybe in the planning phase.
>
> Hm.  REINDEX does take out only ShareLock on the table, which would not
> block DML, but it takes out AccessExclusiveLock on the index.  That
> blocks the planner's attempts to acquire information about the table's
> indexes.

Sorry if I didn't explain my consideration well.

In short, I would like to say, "REINDEX TABLE pgbench_accounts" would block
"SELECT count(*) FROM pgbench_accounts".
That's the situation what many of users don't expect from the manual.

For example,

SessionA> BEGIN;
SessionA> REINDEX TABLE pgbench_accounts;
SessionB> SELECT count(*) FROM pgbench_accounts; -- This statement
would be blocked by the REINDEX and the locks.

Many people understand that "REINDEX does not block read (SELECT) operations"
according to the manual. That seems misunderstanding.

So, I would like to improve the explanation of REINDEX and locks.
At least, I think we should add some explanation about the planning
phase would touch
the index, and it could be blocked by REINDEX.

> In the case of an update query I think there's little we can do about
> this; the executor would have to update the index anyway.  For a pure
> SELECT, you could imagine having the planner do a conditional lock acquire
> and ignore the index if that fails.  Would that be better than blocking?
> Not sure.  You could end up with a really bad plan if the index was
> critical for efficient processing of the query.

Well, I would not intend to modify the implementation for now.
I just notice the users that REINDEX could block even SELECT statements in
the official documents because it is very critical for web services.

Regards,
--
Satoshi Nagayasu <snaga@uptime.jp>