Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY - Mailing list pgsql-hackers
From | Michail Nikolaev |
---|---|
Subject | Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY |
Date | |
Msg-id | CANtu0ohcga-N6wqDR+PEy7si8E7R7pdiUU-mSrgA5MP0uNreRg@mail.gmail.com Whole thread Raw |
In response to | Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY (Michail Nikolaev <michail.nikolaev@gmail.com>) |
Responses |
Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY
Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY |
List | pgsql-hackers |
Hello, everyone.
> I think It is even possible to see !alive index in the same situation (it is worse case), but I was unable to reproduce it so far.
Fortunately, it is not possible.
So, seems like I have found the source of the problem:
1) infer_arbiter_indexes calls RelationGetIndexList to get the list of candidates.
> I think It is even possible to see !alive index in the same situation (it is worse case), but I was unable to reproduce it so far.
Fortunately, it is not possible.
So, seems like I have found the source of the problem:
1) infer_arbiter_indexes calls RelationGetIndexList to get the list of candidates.
It does no lock selected indexes in any additional way which prevents index_concurrently_swap changing them (set and clear validity).
RelationGetIndexList relcache.c:4857
infer_arbiter_indexes plancat.c:780
make_modifytable createplan.c:7097 ---------- node->arbiterIndexes = infer_arbiter_indexes(root);
create_modifytable_plan createplan.c:2826
create_plan_recurse createplan.c:532
create_plan createplan.c:349
standard_planner planner.c:421
planner planner.c:282
pg_plan_query postgres.c:904
pg_plan_queries postgres.c:996
exec_simple_query postgres.c:1193
2) other backend marks some index as invalid and commits
index_concurrently_swap index.c:1600
ReindexRelationConcurrently indexcmds.c:4115
ReindexIndex indexcmds.c:2814
ExecReindex indexcmds.c:2743
ProcessUtilitySlow utility.c:1567
standard_ProcessUtility utility.c:1067
ProcessUtility utility.c:523
PortalRunUtility pquery.c:1158
PortalRunMulti pquery.c:1315
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
3) first backend invalidates catalog snapshot because transactional snapshot
InvalidateCatalogSnapshot snapmgr.c:426
GetTransactionSnapshot snapmgr.c:278
PortalRunMulti pquery.c:1244
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
4) first backend copies indexes selected using previous catalog snapshot
ExecInitModifyTable nodeModifyTable.c:4499 -------- resultRelInfo->ri_onConflictArbiterIndexes = node->arbiterIndexes;
ExecInitNode execProcnode.c:177
InitPlan execMain.c:966
standard_ExecutorStart execMain.c:261
ExecutorStart execMain.c:137
ProcessQuery pquery.c:155
PortalRunMulti pquery.c:1277
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
5) then reads indexes using new fresh snapshot
RelationGetIndexList relcache.c:4816
ExecOpenIndices execIndexing.c:175
ExecInsert nodeModifyTable.c:792 ------------- ExecOpenIndices(resultRelInfo, onconflict != ONCONFLICT_NONE);
ExecModifyTable nodeModifyTable.c:4059
ExecProcNodeFirst execProcnode.c:464
ExecProcNode executor.h:274
ExecutePlan execMain.c:1646
standard_ExecutorRun execMain.c:363
ExecutorRun execMain.c:304
ProcessQuery pquery.c:160
PortalRunMulti pquery.c:1277
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
5) and uses arbiter selected with stale snapshot with new index view (marked as invalid)
ExecInsert nodeModifyTable.c:1016 -------------- arbiterIndexes = resultRelInfo->ri_onConflictArbiterIndexes;
............
ExecInsert nodeModifyTable.c:1048 ---------------if (!ExecCheckIndexConstraints(resultRelInfo, slot, estate, conflictTid, arbiterIndexes))
ExecModifyTable nodeModifyTable.c:4059
ExecProcNodeFirst execProcnode.c:464
ExecProcNode executor.h:274
ExecutePlan execMain.c:1646
standard_ExecutorRun execMain.c:363
ExecutorRun execMain.c:304
ProcessQuery pquery.c:160
PortalRunMulti pquery.c:1277
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
I have attached an updated test for the issue (it fails on assert quickly and uses only 2 backends).
The same issue may happen in case of CREATE/DROP INDEX CONCURRENTLY as well.
RelationGetIndexList relcache.c:4857
infer_arbiter_indexes plancat.c:780
make_modifytable createplan.c:7097 ---------- node->arbiterIndexes = infer_arbiter_indexes(root);
create_modifytable_plan createplan.c:2826
create_plan_recurse createplan.c:532
create_plan createplan.c:349
standard_planner planner.c:421
planner planner.c:282
pg_plan_query postgres.c:904
pg_plan_queries postgres.c:996
exec_simple_query postgres.c:1193
2) other backend marks some index as invalid and commits
index_concurrently_swap index.c:1600
ReindexRelationConcurrently indexcmds.c:4115
ReindexIndex indexcmds.c:2814
ExecReindex indexcmds.c:2743
ProcessUtilitySlow utility.c:1567
standard_ProcessUtility utility.c:1067
ProcessUtility utility.c:523
PortalRunUtility pquery.c:1158
PortalRunMulti pquery.c:1315
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
3) first backend invalidates catalog snapshot because transactional snapshot
InvalidateCatalogSnapshot snapmgr.c:426
GetTransactionSnapshot snapmgr.c:278
PortalRunMulti pquery.c:1244
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
4) first backend copies indexes selected using previous catalog snapshot
ExecInitModifyTable nodeModifyTable.c:4499 -------- resultRelInfo->ri_onConflictArbiterIndexes = node->arbiterIndexes;
ExecInitNode execProcnode.c:177
InitPlan execMain.c:966
standard_ExecutorStart execMain.c:261
ExecutorStart execMain.c:137
ProcessQuery pquery.c:155
PortalRunMulti pquery.c:1277
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
5) then reads indexes using new fresh snapshot
RelationGetIndexList relcache.c:4816
ExecOpenIndices execIndexing.c:175
ExecInsert nodeModifyTable.c:792 ------------- ExecOpenIndices(resultRelInfo, onconflict != ONCONFLICT_NONE);
ExecModifyTable nodeModifyTable.c:4059
ExecProcNodeFirst execProcnode.c:464
ExecProcNode executor.h:274
ExecutePlan execMain.c:1646
standard_ExecutorRun execMain.c:363
ExecutorRun execMain.c:304
ProcessQuery pquery.c:160
PortalRunMulti pquery.c:1277
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
5) and uses arbiter selected with stale snapshot with new index view (marked as invalid)
ExecInsert nodeModifyTable.c:1016 -------------- arbiterIndexes = resultRelInfo->ri_onConflictArbiterIndexes;
............
ExecInsert nodeModifyTable.c:1048 ---------------if (!ExecCheckIndexConstraints(resultRelInfo, slot, estate, conflictTid, arbiterIndexes))
ExecModifyTable nodeModifyTable.c:4059
ExecProcNodeFirst execProcnode.c:464
ExecProcNode executor.h:274
ExecutePlan execMain.c:1646
standard_ExecutorRun execMain.c:363
ExecutorRun execMain.c:304
ProcessQuery pquery.c:160
PortalRunMulti pquery.c:1277
PortalRun pquery.c:791
exec_simple_query postgres.c:1274
I have attached an updated test for the issue (it fails on assert quickly and uses only 2 backends).
The same issue may happen in case of CREATE/DROP INDEX CONCURRENTLY as well.
The simplest possible fix is to use ShareLock instead ShareUpdateExclusiveLock in the index_concurrently_swap
oldClassRel = relation_open(oldIndexId, ShareLock);
newClassRel = relation_open(newIndexId, ShareLock);
newClassRel = relation_open(newIndexId, ShareLock);
But this is not a "concurrent" way. But such update should be fast enough as far as I understand.
Best regards,
Mikhail.
Attachment
pgsql-hackers by date: