Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT |
Date | |
Msg-id | 20170828.182807.98097766.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>) |
Responses |
Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT
Re: [HACKERS] Race between SELECT and ALTER TABLE NO INHERIT |
List | pgsql-hackers |
Hello, I'll add this to CF2017-09. At Tue, 27 Jun 2017 16:27:18 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in <75fe42df-b1d8-89ff-596d-d9da0749e66d@lab.ntt.co.jp> > On 2017/06/26 18:44, Kyotaro HORIGUCHI wrote: > > At Mon, 26 Jun 2017 18:16:42 +0900, Amit Langote wrote: > >> > >> I recall I had proposed a fix for the same thing some time ago [1]. > > > > Wow. About 1.5 years ago and stuck? I have a concrete case where > > this harms so I'd like to fix that this time. How can we move on? > > Agreed that this should be fixed. > > Your proposed approach #1 to recheck the inheritance after obtaining the > lock on the child table seems to be a good way forward. > > Approach #2 of reordering locking is a simpler solution, but does not > completely prevent the problem, because DROP TABLE child can still cause > it to occur, as you mentioned. > > >>> The cause is that NO INHERIT doesn't take an exlusive lock on the > >>> parent. This allows expand_inherited_rtentry to add the child > >>> relation into appendrel after removal from the inheritance but > >>> still exists. > >> > >> Right. > >> > >>> I see two ways to fix this. > >>> > >>> The first patch adds a recheck of inheritance relationship if the > >>> corresponding attribute is missing in the child in > >>> make_inh_translation_list(). The recheck is a bit complex but it > >>> is not performed unless the sequence above is happen. It checks > >>> duplication of relid (or cycles in inheritance) following > >>> find_all_inheritors (but doing a bit different) but I'm not sure > >>> it is really useful. > >> > >> I had proposed adding a syscache on pg_inherits(inhrelid, inhparent), but > >> I guess your hash table based solution will do the job as far as > >> performance of this check is concerned, although I haven't checked the > >> code closely. > > > > The hash table is not crucial in the patch. Substantially it just > > recurs down looking up pg_inherits for the child. I considered > > the new index but abandoned because I thought that such case > > won't occur so frequently. > > Agreed. BTW, the hash table in patch #1 does not seem to be really > helpful. In the while loop in is_descendant_of_internal(), does > hash_search() ever return found = true? AFAICS, it does not. > > >>> The second patch lets ALTER TABLE NO INHERIT to acquire locks on > >>> the parent first. > >>> > >>> Since the latter has a larger impact on the current behavior and > >>> we already treat "DROP TABLE child" case in the similar way, I > >>> suppose that the first approach would be preferable. > >> > >> That makes sense. So, I attached only the first patch, rebased on the current master (It actually failed to apply on it.) and fixed a typo in a comment. This still runs a closed-loop test using temporary hash but it seems a bit paranoic. (this is the same check with what find_all_inheritors is doing) << the following is another topic >> > >> BTW, in the partitioned table case, the parent is always locked first > >> using an AccessExclusiveLock. There are other considerations in that case > >> such as needing to recreate the partition descriptor upon termination of > >> inheritance (both the DETACH PARTITION and also DROP TABLE child cases). > > > > Apart from the degree of concurrency, if we keep parent->children > > order of locking, such recreation does not seem to be > > needed. Maybe I'm missing something. > > Sorry to have introduced that topic in this thread, but I will try to > explain anyway why things are the way they are currently: > > Once a table is no longer a partition of the parent (detached or dropped), > we must make sure that the next commands in the transaction don't see it > as one. That information is currently present in the relcache > (rd_partdesc), which is used by a few callers, most notably the > tuple-routing code. Next commands must recreate the entry so that the > correct thing happens based on the updated information. More precisely, > we must invalidate the current entry. RelationClearRelation() will either > delete the entry or rebuild it. If it's being referenced somewhere, it > will be rebuilt. The place holding the reference may also be looking at > the content of rd_partdesc, which we don't require them to make a copy of, > so we must preserve its content while other fields of RelationData are > being read anew from the catalog. We don't have to preserve it if there > has been any change (partition added/dropped), but to make such a change > one would need to take a strong enough lock on the relation (parent). We > assume here that anyone who wants to reference rd_partdesc takes at least > AccessShareLock lock on the relation, and anyone who wants to change its > content must take a lock that will conflict with it, so > AccessExclusiveLock. Note that in all of this, we are only talking about > one relation, that is the parent, so parent -> child ordering of taking > locks may be irrelevant. I think I understand this, anyway DropInherit and DropPartition is different-but-at-the-same-level operations so surely needs amendment for drop/detach cases. Is there already a solution? Or reproducing steps? regards, -- Kyotaro Horiguchi NTT Open Source Software Center *** a/src/backend/catalog/pg_inherits.c --- b/src/backend/catalog/pg_inherits.c *************** *** 42,47 **** typedef struct SeenRelsEntry --- 42,49 ---- ListCell *numparents_cell; /* corresponding list cell */ } SeenRelsEntry; + static bool is_descendent_of_internal(Oid parentId, Oid childId, + HTAB *seen_rels); /* * find_inheritance_children * *************** *** 400,402 **** typeInheritsFrom(Oid subclassTypeId, Oid superclassTypeId) --- 402,472 ---- return result; } + + /* + * Check if the child is really a descendent of the parent + */ + bool + is_descendent_of(Oid parentId, Oid childId) + { + HTAB *seen_rels; + HASHCTL ctl; + bool ischild = false; + + memset(&ctl, 0, sizeof(ctl)); + ctl.keysize = sizeof(Oid); + ctl.entrysize = sizeof(Oid); + ctl.hcxt = CurrentMemoryContext; + + seen_rels = hash_create("is_descendent_of temporary table", + 32, /* start small and extend */ + &ctl, + HASH_ELEM | HASH_BLOBS | HASH_CONTEXT); + + ischild = is_descendent_of_internal(parentId, childId, seen_rels); + + hash_destroy(seen_rels); + + return ischild; + } + + static bool + is_descendent_of_internal(Oid parentId, Oid childId, HTAB *seen_rels) + { + Relation inhrel; + SysScanDesc scan; + ScanKeyData key[1]; + bool ischild = false; + HeapTuple inheritsTuple; + + inhrel = heap_open(InheritsRelationId, AccessShareLock); + ScanKeyInit(&key[0], Anum_pg_inherits_inhparent, + BTEqualStrategyNumber, F_OIDEQ, ObjectIdGetDatum(parentId)); + scan = systable_beginscan(inhrel, InheritsParentIndexId, true, + NULL, 1, key); + + while ((inheritsTuple = systable_getnext(scan)) != NULL) + { + bool found; + Oid inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid; + + hash_search(seen_rels, &inhrelid, HASH_ENTER, &found); + + /* + * Recursively check into children. Although there can't theoretically + * be any cycles in the inheritance graph, check the cycles following + * find_all_inheritors. + */ + if (inhrelid == childId || + (!found && is_descendent_of_internal(inhrelid, childId, seen_rels))) + { + ischild = true; + break; + } + } + + systable_endscan(scan); + heap_close(inhrel, AccessShareLock); + + return ischild; + } *** a/src/backend/optimizer/prep/prepunion.c --- b/src/backend/optimizer/prep/prepunion.c *************** *** 100,109 **** static List *generate_append_tlist(List *colTypes, List *colCollations, static List *generate_setop_grouplist(SetOperationStmt*op, List *targetlist); static void expand_inherited_rtentry(PlannerInfo *root,RangeTblEntry *rte, Index rti); ! static void make_inh_translation_list(Relation oldrelation, Relation newrelation, ! Index newvarno, ! List **translated_vars); static Bitmapset *translate_col_privs(const Bitmapset *parent_privs, List *translated_vars); static Node *adjust_appendrel_attrs_mutator(Node *node, --- 100,108 ---- static List *generate_setop_grouplist(SetOperationStmt *op, List *targetlist); static void expand_inherited_rtentry(PlannerInfo*root, RangeTblEntry *rte, Index rti); ! static List *make_inh_translation_list(Relation oldrelation, Relation newrelation, ! Index newvarno); static Bitmapset *translate_col_privs(const Bitmapset *parent_privs, List *translated_vars); static Node *adjust_appendrel_attrs_mutator(Node *node, *************** *** 1508,1513 **** expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) --- 1507,1527 ---- */ if (childrte->relkind != RELKIND_PARTITIONED_TABLE) { + List *translated_vars = + make_inh_translation_list(oldrelation, newrelation, + childRTindex); + + if (!translated_vars) + { + /* + * This childrel is no longer a child of the parent. + * Close the child relation releasing locks. + */ + if (childOID != parentOID) + heap_close(newrelation, lockmode); + continue; + } + /* Remember if we saw a real child. */ if (childOID != parentOID) has_child =true; *************** *** 1517,1524 **** expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) appinfo->child_relid= childRTindex; appinfo->parent_reltype = oldrelation->rd_rel->reltype; appinfo->child_reltype= newrelation->rd_rel->reltype; ! make_inh_translation_list(oldrelation, newrelation, childRTindex, ! &appinfo->translated_vars); appinfo->parent_reloid = parentOID; appinfos = lappend(appinfos, appinfo); --- 1531,1537 ---- appinfo->child_relid = childRTindex; appinfo->parent_reltype = oldrelation->rd_rel->reltype; appinfo->child_reltype = newrelation->rd_rel->reltype; ! appinfo->translated_vars = translated_vars; appinfo->parent_reloid = parentOID; appinfos= lappend(appinfos, appinfo); *************** *** 1623,1636 **** expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti) /* * make_inh_translation_list * Build the list of translations from parent Vars to child Vars for ! * an inheritance child. * * For paranoia's sake, we match type/collation as well as attribute name. */ ! static void make_inh_translation_list(Relation oldrelation, Relation newrelation, ! Index newvarno, ! List **translated_vars) { List *vars = NIL; TupleDesc old_tupdesc = RelationGetDescr(oldrelation); --- 1636,1649 ---- /* * make_inh_translation_list * Build the list of translations from parent Vars to child Varsfor ! * an inheritance child. Returns NULL if the two relations are no longer in ! * a inheritance relationship. * * For paranoia's sake, we match type/collation as well as attribute name. */ ! static List * make_inh_translation_list(Relation oldrelation, Relation newrelation, ! Index newvarno) { List *vars = NIL; TupleDesc old_tupdesc = RelationGetDescr(oldrelation); *************** *** 1700,1707 **** make_inh_translation_list(Relation oldrelation, Relation newrelation, --- 1713,1730 ---- break; } if (new_attno >= newnatts) + { + /* + * It's possible that newrelation is no longer a child of the + * oldrelation. We just ingore it for the case. + */ + if (!is_descendent_of(oldrelation->rd_id, newrelation->rd_id)) + return NULL; + + /* If still a child, something's wrong. */ elog(ERROR, "could not find inherited attribute\"%s\" of relation \"%s\"", attname, RelationGetRelationName(newrelation)); + } } /* Found it, check type and collation match */ *************** *** 1720,1726 **** make_inh_translation_list(Relation oldrelation, Relation newrelation, 0)); } ! *translated_vars = vars; } /* --- 1743,1749 ---- 0)); } ! return vars; } /* *** a/src/include/catalog/pg_inherits_fn.h --- b/src/include/catalog/pg_inherits_fn.h *************** *** 23,27 **** extern List *find_all_inheritors(Oid parentrelId, LOCKMODE lockmode, --- 23,28 ---- extern bool has_subclass(Oid relationId); extern bool has_superclass(Oid relationId); extern bool typeInheritsFrom(OidsubclassTypeId, Oid superclassTypeId); + extern bool is_descendent_of(Oid parentId, Oid childId); #endif /* PG_INHERITS_FN_H */
pgsql-hackers by date: