Thanks for your review!
On Wed, Feb 20, 2013 at 12:14 AM, Fujii Masao
<masao.fujii@gmail.com> wrote:
When I compiled the HEAD with the patches, I got the following warnings.
index.c:1273: warning: unused variable 'parentRel'
execUtils.c:1199: warning: 'return' with no value, in function
returning non-void
Oops, corrected.
When I ran REINDEX CONCURRENTLY for the same index from two different
sessions, I got the deadlock. The error log is:
ERROR: deadlock detected
DETAIL: Process 37121 waits for ShareLock on virtual transaction
2/196; blocked by process 36413.
Process 36413 waits for ShareUpdateExclusiveLock on relation 16457 of
database 12293; blocked by process 37121.
Process 37121: REINDEX TABLE CONCURRENTLY pgbench_accounts;
Process 36413: REINDEX TABLE CONCURRENTLY pgbench_accounts;
HINT: See server log for query details.
STATEMENT: REINDEX TABLE CONCURRENTLY pgbench_accounts;
And, after the REINDEX CONCURRENTLY that survived the deadlock finished,
I found that new index with another name was created. It was NOT marked as
INVALID. Are these behaviors intentional?
This happens because of the following scenario:
- session 1: REINDEX CONCURRENTLY, that has not yet reached phase 3 where indexes are validated. necessary ShareUpdateExclusiveLock locks are taken on relations rebuilt.
- session 2: REINDEX CONCURRENTLY, waits for a ShareUpdateExclusiveLock lock to be obtained, its transaction begins before session 1 reaches phase 3
- session 1: enters phase 3, and fails at WaitForOldSnapshots as session 2 has an older snapshot and is currently waiting for lock on session 1
- session 2: succeeds, but concurrent index created by session 1 still exists
A ShareUpdateExclusiveLock is taken on index or table that is going to be rebuilt just before calling ReindexRelationConcurrently. So the solution I have here is to make REINDEX CONCURRENTLY fail for session 2. REINDEX CONCURRENTLY is made to allow a table to run DML in parallel to the operation so it doesn't look strange to me to make session 2 fail if REINDEX CONCURRENTLY is done in parallel on the same relation.
This fixes the problem of the concurrent index *_cct appearing after session 1 failed due to the deadlock in Masao's report.
The patch correcting this problem is attached.
Error message could be improved, here is what it is now when session 2 fails:
postgres=# reindex table concurrently aa;
ERROR: could not obtain lock on relation "aa"
Comments?
--
Michael