Re: Memory leak in RelationBuildRowSecurity - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Memory leak in RelationBuildRowSecurity
Date
Msg-id 1937613.1601057234@sss.pgh.pa.us
Whole thread Raw
In response to Re: Memory leak in RelationBuildRowSecurity  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Memory leak in RelationBuildRowSecurity  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-bugs
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
> On 25.09.2020 16:48, Tom Lane wrote:
>> Hm ... this smells a whole lot like the issue we fixed recently for
>> partition expressions, namely: what the devil are we doing reloading
>> these expressions during RelationClearRelation?  We should delay
>> populating that cache until the value is requested.

> Sorry, this stack trace was obtained at 9.6 version of Postgres which 
> the customer is using.
> So may be the problem with cache invalidation is already fixed.

Not sure if we've changed anything much in that area.  I've occasionally
wondered if we should not increase the size of the sinval message queue,
but it hasn't happened yet.

Anyway, looking more closely at RelationBuildRowSecurity, it doesn't
seem to be doing anything more dangerous than RelationBuildRuleLock
is; it's definitely not risking recursion, as the partition stuff did.
So I take back the idea that we need to postpone it till the data is
referenced.  However, it is leaking memory.

Attached is a proposed rewrite that borrows some ideas from
RelationBuildRuleLock, and also makes use of MemoryContextSetParent
so we don't need a PG_TRY block.  In a very simple test
(using regress_rls_schema.rls_tbl from the regression database),
this doesn't seem to leak any memory in the caller context,
and it also doesn't make the rscxt any bigger than it was before.

> But at least calling pfree for a tree is very strange idea: we should 
> not do it all or allocate tree in separate memory context.

Yeah, that's pretty useless.  But also I think the string forms of
the expressions are leaking as much memory as the tree forms.

This is against HEAD but I think it wouldn't be hard to back-patch.
Are you in a position to see if it fixes the problem in your customer's
environment?

            regards, tom lane

diff --git a/src/backend/commands/policy.c b/src/backend/commands/policy.c
index 4b4e469493..d3f8e8f06c 100644
--- a/src/backend/commands/policy.c
+++ b/src/backend/commands/policy.c
@@ -187,159 +187,139 @@ policy_role_list_to_array(List *roles, int *num_roles)
 /*
  * Load row security policy from the catalog, and store it in
  * the relation's relcache entry.
+ *
+ * Note that caller should have verified that pg_class.relrowsecurity
+ * is true for this relation.
  */
 void
 RelationBuildRowSecurity(Relation relation)
 {
     MemoryContext rscxt;
     MemoryContext oldcxt = CurrentMemoryContext;
-    RowSecurityDesc *volatile rsdesc = NULL;
+    RowSecurityDesc *rsdesc;
+    Relation    catalog;
+    ScanKeyData skey;
+    SysScanDesc sscan;
+    HeapTuple    tuple;
 
     /*
      * Create a memory context to hold everything associated with this
      * relation's row security policy.  This makes it easy to clean up during
-     * a relcache flush.
+     * a relcache flush.  However, to cover the possibility of an error
+     * partway through, we don't make the context long-lived till we're done.
      */
-    rscxt = AllocSetContextCreate(CacheMemoryContext,
+    rscxt = AllocSetContextCreate(CurrentMemoryContext,
                                   "row security descriptor",
                                   ALLOCSET_SMALL_SIZES);
+    MemoryContextCopyAndSetIdentifier(rscxt,
+                                      RelationGetRelationName(relation));
+
+    rsdesc = MemoryContextAllocZero(rscxt, sizeof(RowSecurityDesc));
+    rsdesc->rscxt = rscxt;
 
     /*
-     * Since rscxt lives under CacheMemoryContext, it is long-lived.  Use a
-     * PG_TRY block to ensure it'll get freed if we fail partway through.
+     * Now scan pg_policy for RLS policies associated with this relation.
+     * Because we use the index on (polrelid, polname), we should consistently
+     * visit the rel's policies in name order, at least when system indexes
+     * aren't disabled.  This simplifies equalRSDesc().
      */
-    PG_TRY();
-    {
-        Relation    catalog;
-        ScanKeyData skey;
-        SysScanDesc sscan;
-        HeapTuple    tuple;
-
-        MemoryContextCopyAndSetIdentifier(rscxt,
-                                          RelationGetRelationName(relation));
+    catalog = table_open(PolicyRelationId, AccessShareLock);
 
-        rsdesc = MemoryContextAllocZero(rscxt, sizeof(RowSecurityDesc));
-        rsdesc->rscxt = rscxt;
+    ScanKeyInit(&skey,
+                Anum_pg_policy_polrelid,
+                BTEqualStrategyNumber, F_OIDEQ,
+                ObjectIdGetDatum(RelationGetRelid(relation)));
 
-        catalog = table_open(PolicyRelationId, AccessShareLock);
+    sscan = systable_beginscan(catalog, PolicyPolrelidPolnameIndexId, true,
+                               NULL, 1, &skey);
 
-        ScanKeyInit(&skey,
-                    Anum_pg_policy_polrelid,
-                    BTEqualStrategyNumber, F_OIDEQ,
-                    ObjectIdGetDatum(RelationGetRelid(relation)));
+    while (HeapTupleIsValid(tuple = systable_getnext(sscan)))
+    {
+        Form_pg_policy policy_form = (Form_pg_policy) GETSTRUCT(tuple);
+        RowSecurityPolicy *policy;
+        Datum        datum;
+        bool        isnull;
+        char       *str_value;
 
-        sscan = systable_beginscan(catalog, PolicyPolrelidPolnameIndexId, true,
-                                   NULL, 1, &skey);
+        policy = MemoryContextAllocZero(rscxt, sizeof(RowSecurityPolicy));
 
         /*
-         * Loop through the row level security policies for this relation, if
-         * any.
+         * Note: we must be sure that pass-by-reference data gets copied into
+         * rscxt.  We avoid making that context current over wider spans than
+         * we have to, though.
          */
-        while (HeapTupleIsValid(tuple = systable_getnext(sscan)))
-        {
-            Datum        value_datum;
-            char        cmd_value;
-            bool        permissive_value;
-            Datum        roles_datum;
-            char       *qual_value;
-            Expr       *qual_expr;
-            char       *with_check_value;
-            Expr       *with_check_qual;
-            char       *policy_name_value;
-            bool        isnull;
-            RowSecurityPolicy *policy;
-
-            /*
-             * Note: all the pass-by-reference data we collect here is either
-             * still stored in the tuple, or constructed in the caller's
-             * short-lived memory context.  We must copy it into rscxt
-             * explicitly below.
-             */
-
-            /* Get policy command */
-            value_datum = heap_getattr(tuple, Anum_pg_policy_polcmd,
-                                       RelationGetDescr(catalog), &isnull);
-            Assert(!isnull);
-            cmd_value = DatumGetChar(value_datum);
-
-            /* Get policy permissive or restrictive */
-            value_datum = heap_getattr(tuple, Anum_pg_policy_polpermissive,
-                                       RelationGetDescr(catalog), &isnull);
-            Assert(!isnull);
-            permissive_value = DatumGetBool(value_datum);
-
-            /* Get policy name */
-            value_datum = heap_getattr(tuple, Anum_pg_policy_polname,
-                                       RelationGetDescr(catalog), &isnull);
-            Assert(!isnull);
-            policy_name_value = NameStr(*(DatumGetName(value_datum)));
-
-            /* Get policy roles */
-            roles_datum = heap_getattr(tuple, Anum_pg_policy_polroles,
-                                       RelationGetDescr(catalog), &isnull);
-            /* shouldn't be null, but initdb doesn't mark it so, so check */
-            if (isnull)
-                elog(ERROR, "unexpected null value in pg_policy.polroles");
-
-            /* Get policy qual */
-            value_datum = heap_getattr(tuple, Anum_pg_policy_polqual,
-                                       RelationGetDescr(catalog), &isnull);
-            if (!isnull)
-            {
-                qual_value = TextDatumGetCString(value_datum);
-                qual_expr = (Expr *) stringToNode(qual_value);
-            }
-            else
-                qual_expr = NULL;
 
-            /* Get WITH CHECK qual */
-            value_datum = heap_getattr(tuple, Anum_pg_policy_polwithcheck,
-                                       RelationGetDescr(catalog), &isnull);
-            if (!isnull)
-            {
-                with_check_value = TextDatumGetCString(value_datum);
-                with_check_qual = (Expr *) stringToNode(with_check_value);
-            }
-            else
-                with_check_qual = NULL;
+        /* Get policy command */
+        policy->polcmd = policy_form->polcmd;
 
-            /* Now copy everything into the cache context */
-            MemoryContextSwitchTo(rscxt);
+        /* Get policy, permissive or restrictive */
+        policy->permissive = policy_form->polpermissive;
 
-            policy = palloc0(sizeof(RowSecurityPolicy));
-            policy->policy_name = pstrdup(policy_name_value);
-            policy->polcmd = cmd_value;
-            policy->permissive = permissive_value;
-            policy->roles = DatumGetArrayTypePCopy(roles_datum);
-            policy->qual = copyObject(qual_expr);
-            policy->with_check_qual = copyObject(with_check_qual);
-            policy->hassublinks = checkExprHasSubLink((Node *) qual_expr) ||
-                checkExprHasSubLink((Node *) with_check_qual);
+        /* Get policy name */
+        policy->policy_name =
+            MemoryContextStrdup(rscxt, NameStr(policy_form->polname));
 
-            rsdesc->policies = lcons(policy, rsdesc->policies);
+        /* Get policy roles */
+        datum = heap_getattr(tuple, Anum_pg_policy_polroles,
+                             RelationGetDescr(catalog), &isnull);
+        /* shouldn't be null, but let's check for luck */
+        if (isnull)
+            elog(ERROR, "unexpected null value in pg_policy.polroles");
+        MemoryContextSwitchTo(rscxt);
+        policy->roles = DatumGetArrayTypePCopy(datum);
+        MemoryContextSwitchTo(oldcxt);
 
+        /* Get policy qual */
+        datum = heap_getattr(tuple, Anum_pg_policy_polqual,
+                             RelationGetDescr(catalog), &isnull);
+        if (!isnull)
+        {
+            str_value = TextDatumGetCString(datum);
+            MemoryContextSwitchTo(rscxt);
+            policy->qual = (Expr *) stringToNode(str_value);
             MemoryContextSwitchTo(oldcxt);
+            pfree(str_value);
+        }
+        else
+            policy->qual = NULL;
 
-            /* clean up some (not all) of the junk ... */
-            if (qual_expr != NULL)
-                pfree(qual_expr);
-            if (with_check_qual != NULL)
-                pfree(with_check_qual);
+        /* Get WITH CHECK qual */
+        datum = heap_getattr(tuple, Anum_pg_policy_polwithcheck,
+                             RelationGetDescr(catalog), &isnull);
+        if (!isnull)
+        {
+            str_value = TextDatumGetCString(datum);
+            MemoryContextSwitchTo(rscxt);
+            policy->with_check_qual = (Expr *) stringToNode(str_value);
+            MemoryContextSwitchTo(oldcxt);
+            pfree(str_value);
         }
+        else
+            policy->with_check_qual = NULL;
 
-        systable_endscan(sscan);
-        table_close(catalog, AccessShareLock);
-    }
-    PG_CATCH();
-    {
-        /* Delete rscxt, first making sure it isn't active */
+        /* We want to cache whether there are SubLinks in these expressions */
+        policy->hassublinks = checkExprHasSubLink((Node *) policy->qual) ||
+            checkExprHasSubLink((Node *) policy->with_check_qual);
+
+        /*
+         * Add this object to list.  For historical reasons, the list is built
+         * in reverse order.
+         */
+        MemoryContextSwitchTo(rscxt);
+        rsdesc->policies = lcons(policy, rsdesc->policies);
         MemoryContextSwitchTo(oldcxt);
-        MemoryContextDelete(rscxt);
-        PG_RE_THROW();
     }
-    PG_END_TRY();
 
-    /* Success --- attach the policy descriptor to the relcache entry */
+    systable_endscan(sscan);
+    table_close(catalog, AccessShareLock);
+
+    /*
+     * Success.  Reparent the descriptor's memory context under
+     * CacheMemoryContext so that it will live indefinitely, then attach the
+     * policy descriptor to the relcache entry.
+     */
+    MemoryContextSetParent(rscxt, CacheMemoryContext);
+
     relation->rd_rsdesc = rsdesc;
 }


pgsql-bugs by date:

Previous
From: Brian Kanaga
Date:
Subject: Re: BUG #16627: union all with partioned table yields random aggregate results
Next
From: "David G. Johnston"
Date:
Subject: Re: BUG #16627: union all with partioned table yields random aggregate results