Re: Huge memory consumption on partitioned table with FKs - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: Huge memory consumption on partitioned table with FKs |
Date | |
Msg-id | CA+HiwqECx2BVSw3+9k9pkMvNH7ymW3gEbPgw956TtSCWt5F2mA@mail.gmail.com Whole thread Raw |
In response to | Re: Huge memory consumption on partitioned table with FKs (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Huge memory consumption on partitioned table with FKs
|
List | pgsql-hackers |
Hello, On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote <amitlangote09@gmail.com> wrote in > > Hello, > > > > On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi > > <horikyota.ntt@gmail.com> wrote: > > > > > > At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in > > > > On 2020-Nov-26, Kyotaro Horiguchi wrote: > > > > > > > > > This shares RI_ConstraintInfo cache by constraints that shares the > > > > > same parent constraints. But you forgot that the cache contains some > > > > > members that can differ among partitions. > > > > > > > > > > Consider the case of attaching a partition that have experienced a > > > > > column deletion. > > > > > > > > I think this can be solved easily in the patch, by having > > > > ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if > > > > they are equal then use the parent's constaint_id, otherwise use the > > > > child constraint. That way, the cache entry is reused in the common > > > > case where they are identical. > > > > > > *I* think it's the direction. After an off-list discussion, we > > > confirmed that even in that case the patch works as is because > > > fk_attnum (or contuple.conkey) always stores key attnums compatible > > > to the topmost parent when conparent has a valid value (assuming the > > > current usage of fk_attnum), but I still feel uneasy to rely on that > > > unclear behavior. > > > > Hmm, I don't see what the problem is here, because it's not > > RI_ConstraintInfo that is being shared among partitions of the same > > parent, but the RI query (and the SPIPlanPtr for it) that is issued > > through SPI. The query (and the plan) turns out to be the same no > > matter what partition's RI_ConstraintInfo is first used to generate > > it. What am I missing? > > I don't think you're missing something. As I (tried to..) mentioned in > another branch of this thread, you're right. On the otherhand > fk_attnums is actually used to generate the query, thus *issue* > potentially affects the query. Although that might be paranoic, that > possibility is eliminated by checking the congruency(?) of fk_attnums > (or other members). We might even be able to share a riinfo among > such children. Just to be sure I've not misunderstood you, let me mention why I think the queries used by different partitions all turn out to be the same despite attribute number differences among partitions. The places that uses fk_attnums when generating a query are these among others: Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]); ... Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]); ... quoteOneName(attname, RIAttName(fk_rel, riinfo->fk_attnums[i])); For the queries on the referencing side ("check" side), type/collation/attribute name determined using the above are going to be the same for all partitions in a given tree irrespective of the attribute number, because they're logically the same column. On the referenced side ("restrict", "cascade", "set" side), as you already mentioned, fk_attnums refers to the top parent table of the referencing side, so no possibility of they being different in the various referenced partitions' RI_ConstraintInfos. On the topic of how we'd be able to share even the RI_ConstraintInfos among partitions, that would indeed look a bit more elaborate than the patch we have right now. > > BTW, one problem with Kuroda-san's patch is that it's using > > conparentid as the shared key, which can still lead to multiple > > queries being generated when using multi-level partitioning, because > > there would be many (intermediate) parent constraints in that case. > > We should really be using the "root" constraint OID as the key, > > because there would only be one root from which all constraints in a > > given partition hierarchy would've originated. Actually, I had > > written a patch a few months back to do exactly that, a polished > > version of which I've attached with this email. Please take a look. > > I don't like that the function self-recurses but > ri_GetParentConstrOid() actually climbs up to the root constraint, so > the patch covers the multilevel partitioning cases. Sorry, I got too easily distracted by the name Kuroda-san chose for the field in RI_ConstraintInfo and the function. > About your patch, it calculates the root constrid at the time an > riinfo is created, but when the root-partition is further attached to > another partitioned-table after the riinfo creation, > constraint_root_id gets stale. Of course that dones't matter > practically, though. Maybe we could also store the hash value of the root constraint OID as rootHashValue and check for that one too in InvalidateConstraintCacheCallBack(). That would take care of this unless I'm missing something. -- Amit Langote EDB: http://www.enterprisedb.com
pgsql-hackers by date: