Thread: Large objects and out-of-memory

Large objects and out-of-memory

From: Konstantin Knizhnik
Hi,

The following sequence of commands causes the backend's memory usage to exceed 10GB:

CREATE DATABASE testmem;
CREATE ROLE alice WITH SUPERUSER LOGIN;
CREATE ROLE testlo WITH SUPERUSER LOGIN;
-- connect to the test database
\c testmem
\c - alice
CREATE TABLE image1 (raster oid);
INSERT INTO image1 SELECT lo_creat(-1) FROM generate_series(1,10000000);
REASSIGN OWNED BY alice TO testlo;

This memory is occupied in the top memory context by invalidation messages:

#0  AllocSetAlloc (context=0x19153a0, size=528) at aset.c:919
#1  0x0000000000b451d5 in MemoryContextAlloc (context=0x19153a0, size=528) at mcxt.c:809
#2  0x0000000000ae80c6 in AddInvalidationMessage (listHdr=0x19156d8, msg=0x7ffdca880600) at inval.c:241
#3  0x0000000000ae8473 in AddSnapshotInvalidationMessage (hdr=0x19156d0, dbId=0, relId=1214) at inval.c:437
#4  0x0000000000ae871c in RegisterSnapshotInvalidation (dbId=0, relId=1214) at inval.c:547
#5  0x0000000000ae915f in CacheInvalidateHeapTuple (relation=0x7f16afc35cb0, tuple=0x7ffdca8807b0, newtuple=0x198dba8) at inval.c:1162
#6  0x00000000004f2d41 in heap_update (relation=0x7f16afc35cb0, otid=0x198dbac, newtup=0x198dba8, cid=6, crosscheck=0x0, wait=true, tmfd=0x7ffdca880830, lockmode=0x7ffdca880828) at heapam.c:3729
#7  0x00000000004f3913 in simple_heap_update (relation=0x7f16afc35cb0, otid=0x198dbac, tup=0x198dba8) at heapam.c:3891
#8  0x00000000005c281d in CatalogTupleUpdate (heapRel=0x7f16afc35cb0, otid=0x198dbac, tup=0x198dba8) at indexing.c:309
#9  0x00000000005e7713 in shdepChangeDep (sdepRel=0x7f16afc35cb0, classid=2613, objid=11950311, objsubid=0, refclassid=1260, refobjid=32848, deptype=SHARED_DEPENDENCY_OWNER) at pg_shdepend.c:269
#10 0x00000000005e784c in changeDependencyOnOwner (classId=2613, objectId=11950311, newOwnerId=32848) at pg_shdepend.c:316
#11 0x0000000000669844 in AlterObjectOwner_internal (rel=0x7f16afc34c90, objectId=11950311, new_ownerId=32848) at alter.c:1050
#12 0x00000000005e9c8e in shdepReassignOwned (roleids=0x19aa138, newrole=32848) at pg_shdepend.c:1587
#13 0x0000000000718aa2 in ReassignOwnedObjects (stmt=0x18f1830) at user.c:1425
#14 0x000000000097bf7d in standard_ProcessUtility (pstmt=0x18f1b80, queryString=0x18f0cd0 "REASSIGN OWNED BY alice TO testlo;", context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x18f1c70, qc=0x7ffdca881180) at utility.c:890
#15 0x000000000097b508 in ProcessUtility (pstmt=0x18f1b80, queryString=0x18f0cd0 "REASSIGN OWNED BY alice TO testlo;", context=PROCESS_UTILITY_TOPLEVEL,

As far as I understand, invalidation messages are kept until the end of the
transaction:

/*
 * CommandEndInvalidationMessages
 *		Process queued-up invalidation messages at end of one command
 *		in a transaction.
 *
 * Here, we send no messages to the shared queue, since we don't know yet if
 * we will commit.  We do need to locally process the CurrentCmdInvalidMsgs
 * list, so as to flush our caches of any entries we have outdated in the
 * current command.  We then move the current-cmd list over to become part
 * of the prior-cmds list.
 *
 * Note:
 *		This should be called during CommandCounterIncrement(),
 *		after we have advanced the command ID.
 */

void
CommandEndInvalidationMessages(void)
{
    AppendInvalidationMessages(&transInvalInfo->PriorCmdInvalidMsgs,
                               &transInvalInfo->CurrentCmdInvalidMsgs);
}

So in PriorCmdInvalidMsgs we end up with 10 million invalidation message chunks.
Such a memory blow-up happens twice in this scenario: when the large objects
are created and when their owner is reassigned.
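A back-of-envelope check (a standalone sketch, not PostgreSQL code; the
528-byte figure is simply the allocation size visible in the stack trace
above, and I assume at least one such chunk survives per updated object) is
consistent with the observed footprint:

/*
 * Rough estimate only: assume every created/reassigned large object leaves
 * at least one ~528-byte chunk of invalidation messages in
 * PriorCmdInvalidMsgs until the transaction ends.
 */
#include <stdio.h>

int
main(void)
{
    const long   n_objects   = 10000000L;  /* lo_creat() calls in the test */
    const long   chunk_bytes = 528;        /* size seen in AllocSetAlloc above */
    const double gib = (double) n_objects * chunk_bytes / (1024.0 * 1024.0 * 1024.0);

    /* prints roughly: 4.9 GiB per pass, 9.8 GiB for create + reassign */
    printf("%.1f GiB per pass, %.1f GiB for create + reassign\n", gib, 2.0 * gib);
    return 0;
}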
It cannot be considered a memory leak, but I think it is a real problem that
has to be fixed.
I am not very familiar with invalidation message processing, so I would be
pleased if somebody could suggest a solution to the problem.

Thanks in advance,

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Large objects and out-of-memory

From: Tom Lane
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
> The following sequence of commands causes the backend's memory usage to exceed 10GB:

> INSERT INTO image1 SELECT lo_creat(-1) FROM generate_series(1,10000000);
> REASSIGN OWNED BY alice TO testlo;

[ shrug... ]  You're asking to change the ownership of 10000000 objects.
This is not going to be a cheap operation.  AFAIK it's not going to be
any more expensive than changing the ownership of 10000000 tables, or
any other kind of object.

The argument for allowing large objects to have per-object ownership and
permissions in the first place was that useful scenarios wouldn't have a
huge number of them (else you'd run out of disk space, if they're actually
"large"), so we needn't worry too much about the overhead.

We could possibly bound the amount of space used in the inval queue by
switching to an "invalidate all" approach once we got to an unreasonable
amount of space.  But this will do nothing for the other costs involved,
and I'm not really sure it's worth adding complexity for.
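Just to sketch the shape of that idea (a standalone sketch with made-up
names, nothing like a worked-out patch): keep queuing individual messages
only while the count stays below some limit; past that, drop the queue and
remember a single "invalidate everything" flag, which the end-of-command and
commit processing would then turn into a full cache reset:

/* Standalone sketch of "collapse to invalidate-all"; hypothetical types,
 * not the real inval.c structures. */
#include <stdbool.h>
#include <stdlib.h>

typedef struct InvalMsg
{
    int              cacheId;    /* which cache the message targets */
    unsigned int     hashValue;  /* key of the entry to flush */
    struct InvalMsg *next;
} InvalMsg;

typedef struct InvalQueue
{
    InvalMsg *head;
    long      nmessages;
    bool      invalidate_all;    /* set once the threshold is exceeded */
} InvalQueue;

#define INVAL_ALL_THRESHOLD 100000   /* 0 would mean "never collapse" */

static void
queue_invalidation(InvalQueue *q, int cacheId, unsigned int hashValue)
{
    if (q->invalidate_all)
        return;                  /* already collapsed, nothing more to record */

    if (INVAL_ALL_THRESHOLD > 0 && q->nmessages >= INVAL_ALL_THRESHOLD)
    {
        /* Collapse: drop the queued messages, keep only the flag */
        while (q->head != NULL)
        {
            InvalMsg *m = q->head;

            q->head = m->next;
            free(m);
        }
        q->nmessages = 0;
        q->invalidate_all = true;
        return;
    }

    InvalMsg *m = malloc(sizeof(InvalMsg));

    m->cacheId = cacheId;
    m->hashValue = hashValue;
    m->next = q->head;
    q->head = m;
    q->nmessages++;
}

/* At end of command/commit, a queue with invalidate_all set would be handled
 * as a single reset-all-caches event instead of replaying individual
 * messages. */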

            regards, tom lane



Re: Large objects and out-of-memory

From: Konstantin Knizhnik

On 21.12.2020 21:27, Tom Lane wrote:
> Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
>> The following sequence of commands causes the backend's memory usage to exceed 10GB:
>> INSERT INTO image1 SELECT lo_creat(-1) FROM generate_series(1,10000000);
>> REASSIGN OWNED BY alice TO testlo;
> [ shrug... ]  You're asking to change the ownership of 10000000 objects.
> This is not going to be a cheap operation.  AFAIK it's not going to be
> any more expensive than changing the ownership of 10000000 tables, or
> any other kind of object.
>
> The argument for allowing large objects to have per-object ownership and
> permissions in the first place was that useful scenarios wouldn't have a
> huge number of them (else you'd run out of disk space, if they're actually
> "large"), so we needn't worry too much about the overhead.
>
> We could possibly bound the amount of space used in the inval queue by
> switching to an "invalidate all" approach once we got to an unreasonable
> amount of space.  But this will do nothing for the other costs involved,
> and I'm not really sure it's worth adding complexity for.
>
>             regards, tom lane

It seems to me that several million large objects (documents, images, or any
other blobs) are much less exotic than several million tables.
In any case, I think that a REASSIGN command should not crash the server.

I have attached a small patch based on your idea: replace the individual
invalidation messages with an invalidate-all message.
If the GUC variable invalidate_all_threshold is set to a non-zero value,
there is no memory overflow in this scenario.

I am not sure whether the proposed approach of "collapsing" invalidation
messages is really good.
In particular, I do not like that there is no stored counter of invalidation
messages, so I have to traverse the whole chunk list to calculate it.
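What I would prefer (just a sketch with made-up field names, not what the
patch currently does) is to keep a running count in the list header, bumped
at the single place where a message is appended, so the threshold check is
O(1) instead of walking every chunk:

/* Sketch only; hypothetical names, not the actual inval.c structures. */
#include <stdbool.h>

typedef struct MsgChunk
{
    struct MsgChunk *next;
    int              nitems;
    /* the messages themselves would follow here */
} MsgChunk;

typedef struct MsgList
{
    MsgChunk *chunks;
    long      total_msgs;    /* maintained incrementally on every append */
} MsgList;

static void
count_appended_message(MsgList *list)
{
    list->total_msgs++;      /* bump the counter where the message is stored */
}

static bool
over_threshold(const MsgList *list, long threshold)
{
    /* O(1): no need to traverse the chunk list and sum up nitems */
    return threshold > 0 && list->total_msgs >= threshold;
}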

Also, I noticed a small inconsistency in inval.c:

     else if (msg->id == SHAREDINVALSNAPSHOT_ID)
     {
         /* We only care about our own database and shared catalogs */
         if (msg->rm.dbId == InvalidOid)
             InvalidateCatalogSnapshot();
         else if (msg->rm.dbId == MyDatabaseId)
             InvalidateCatalogSnapshot();
     }

Here we are processing a snapshot invalidation message, but we are using the
structure for relation map invalidation.
It doesn't cause problems, because only the dbId field is used and it has the
same offset in the SharedInvalRelmapMsg and SharedInvalSnapshotMsg structures.
But it should be fixed.
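The fix should just be to read dbId through the snapshot member of the
message union rather than the relation map one, i.e. something like:

    else if (msg->id == SHAREDINVALSNAPSHOT_ID)
    {
        /* We only care about our own database and shared catalogs */
        if (msg->sn.dbId == InvalidOid)
            InvalidateCatalogSnapshot();
        else if (msg->sn.dbId == MyDatabaseId)
            InvalidateCatalogSnapshot();
    }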

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

Re: Large objects and out-of-memory

From: Michael Paquier
On Thu, Dec 24, 2020 at 05:06:34PM +0300, Konstantin Knizhnik wrote:
> Also, I noticed a small inconsistency in inval.c:
>
>     else if (msg->id == SHAREDINVALSNAPSHOT_ID)
>     {
>         /* We only care about our own database and shared catalogs */
>         if (msg->rm.dbId == InvalidOid)
>             InvalidateCatalogSnapshot();
>         else if (msg->rm.dbId == MyDatabaseId)
>             InvalidateCatalogSnapshot();
>     }
>
> Here we are processing a snapshot invalidation message, but we are using the
> structure for relation map invalidation.
> It doesn't cause problems, because only the dbId field is used and it has the
> same offset in the SharedInvalRelmapMsg and SharedInvalSnapshotMsg structures.
> But it should be fixed.

Good catch.  It is very easy to miss that this is not "rm", but "sn".
I have checked the rest of this area and did not notice similar
errors.  I'll backpatch the attached if there are no objections.  This
may look cosmetic, but it would cause unnecessary conflicts if this
area gets touched, and the mistake would be easy to miss.  This comes
from 568d413.  There may also be a point in having a single call to
InvalidateCatalogSnapshot()..
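i.e. something like this, I suppose:

    else if (msg->id == SHAREDINVALSNAPSHOT_ID)
    {
        /* We only care about our own database and shared catalogs */
        if (msg->sn.dbId == InvalidOid ||
            msg->sn.dbId == MyDatabaseId)
            InvalidateCatalogSnapshot();
    }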
--
Michael

Attachment

Re: Large objects and out-of-memory

From: Michael Paquier
On Fri, Dec 25, 2020 at 10:51:00AM +0900, Michael Paquier wrote:
> Good catch.  It is very easy to miss that this is not "rm", but "sn".
> I have checked the rest of this area and did not notice similar
> errors.  I'll backpatch the attached if there are no objections.

Applied this one with 643428c & co.
--
Michael

Attachment