Re: BUG #15660: pg_dump memory leaks when dumping LOBs - Mailing list pgsql-bugs

From Michael Banck
Subject Re: BUG #15660: pg_dump memory leaks when dumping LOBs
Date
Msg-id 7da8823d83a2b66bdd917aa6cb2c5c2619d86011.camel@credativ.de
In response to Re: BUG #15660: pg_dump memory leaks when dumping LOBs  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Hi,

On Thursday, 2019-02-28 at 12:30 -0500, Tom Lane wrote:
> Yeah.  We've heard about this repeatedly since large objects were
> promoted to be full-fledged objects with individual owners and ACL
> info.  It was argued at the time that large objects would be, well,
> large, and therefore there wouldn't be *that* many of them, and so
> it'd be okay for pg_dump to treat them as full-fledged objects with
> their own dump-table-of-contents entries.  Evidently, not everybody
> uses them that way :-(.

To give another data point, we also encountered such a problem at a
customer recently.

They have around 70 tables and a database size of 24 GB, but over 20
million rows in pg_largeobject. The directory-format dump they are using
on an XFS file system contains 14.6 million files, leaving the directory
inode itself at over 450 MB:

|drwx------+ 2 postgres postgres 454M 14. Sep 07:10 xxxxxxxxxtst-20220913
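For scale, the entry count of such a dump directory can be checked directly. A tiny self-contained illustration (throwaway temp directory and 100 placeholder files standing in for the real millions):

```shell
# Hedged illustration of counting entries in a directory-format dump.
# A real dump directory would be passed to ls/wc the same way.
d=$(mktemp -d)
for i in $(seq 1 100); do : > "$d/blob_$i.dat"; done
n=$(ls -U "$d" | wc -l)   # -U skips sorting, which matters at millions of entries
echo "$n entries in $d"
```

With millions of entries, `ls` without `-U` can itself take a long time, since it sorts the whole listing in memory.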

The TOC file is larger than 800 MB and has roughly 14.6 million lines,
all but around 700 of which are BLOB entries.
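One way to get that count (a sketch; the three-line listing below is fabricated for illustration, standing in for the output of `pg_restore -l` on the real dump directory) is to list the TOC with `pg_restore -l` and count the BLOB lines:

```shell
# pg_restore -l prints one TOC line per dumped object; BLOB entries can
# be counted with grep. The heredoc is a made-up sample TOC listing
# standing in for `pg_restore -l /path/to/dumpdir`.
toc=$(mktemp)
cat > "$toc" <<'EOF'
214; 1259 16385 TABLE public t1 postgres
3150; 2613 16392 BLOB 16392 postgres
3151; 2613 16393 BLOB 16393 postgres
EOF
blobs=$(grep -c ' BLOB ' "$toc")
echo "$blobs BLOB entries"
```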

Dumping with pg_dump -Fd -j 2 leads to frequent out-of-memory situations
on this node due to the large amount of memory pg_dump needs (and also
because their TSM client apparently cannot cope with so many files
either and takes up 20% of the RAM as well):

|postgres 43844 73.9  8.8 11711000 11680472 ?   R    00:27 297:00 /srv/[...]/bin/pg_dump -Fd -b -f [...]/xxxxxxxxxtst-20220913 -j 2 -Z 1 XXXXXXXXXTST
|postgres 43832  0.0  8.8 11711000 11680236 ?   S    00:27   0:03 /srv/[...]/bin/pg_dump -Fd -b -f [...]/xxxxxxxxxtst-20220913 -j 2 -Z 1 XXXXXXXXXTST
|postgres 38251  0.3  8.8 11710680 11683100 ?   S    00:21   1:22 /srv/[...]/bin/pg_dump -Fd -b -f [...]/xxxxxxxxxtst-20220913 -j 2 -Z 1 XXXXXXXXXTST

> At some point I think we're going to have to revisit that decision in
> pg_dump.  Maybe we could merge things so that we have one TOC entry
> for all blobs that share the same owner and ACL?  But it comes up
> seldom enough that nobody's gotten motivated to do the work.

Yeah, it would be good if something could be done about this.
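The merging idea in the quoted paragraph can be sketched roughly as follows (not pg_dump's actual code; the tuple shape and sample OIDs/ACL strings are hypothetical): blobs sharing an owner and ACL collapse into one TOC entry per group.

```python
# Hedged sketch of "one TOC entry per (owner, ACL) group of blobs",
# as suggested in the quoted mail. Field names and values are invented
# for illustration, not pg_dump internals.
from collections import defaultdict

def group_blobs(blobs):
    """blobs: iterable of (oid, owner, acl) tuples.
    Returns {(owner, acl): [oids]} -- one key per would-be TOC entry."""
    groups = defaultdict(list)
    for oid, owner, acl in blobs:
        groups[(owner, acl)].append(oid)
    return groups

blobs = [
    (16385, "alice", "=r"),
    (16386, "alice", "=r"),
    (16387, "bob", None),
]
grouped = group_blobs(blobs)
# 3 blobs collapse into 2 groups, i.e. 2 TOC entries instead of 3
```

In the customer case above, where almost all 14.6 million entries presumably share a handful of owner/ACL combinations, this would shrink the TOC by several orders of magnitude.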

> In the short run, the only answer is to run pg_dump on beefier iron,
> or else split up the dump as Haribabu suggests (although TBH I'm
> afraid that might not work either --- if memory serves, pg_dump tends
> to pull in all the per-object info even if it's not going to dump it
> all ...)

In this case, we will try to advise the customer not to use large
objects if possible (so far this is only a test instance).


Michael

-- 
Michael Banck
Teamleiter PostgreSQL-Team
Projektleiter
Tel.: +49 2166 9901-171
E-Mail: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Geoff Richardson, Peter Lilley

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



