Re: BLOB updates -> database size explodes - Mailing list pgsql-general

From Stephen Scheck
Subject Re: BLOB updates -> database size explodes
Date
Msg-id CAKjnHz1V+SK7hHPgA8FHKDuo7PMekCui3=jevbYgGGJr05dX+Q@mail.gmail.com
Whole thread Raw
In response to BLOB updates -> database size explodes  (Dimitar Misev <dimitarmisev@gmail.com>)
Responses Re: BLOB updates -> database size explodes
List pgsql-general
This is just a guess (I haven't dug into the low-level page/disk access Postgres code for Large Objects yet but if I'm right, the LO-based project I'm working on will likely face the same issues you're seeing), but LOs enjoy transactional behavior just like anything else (as far as I can tell from my testing) and so are subject to MVCC effects. Since LOs are opaque to Postgres and it can't infer anything about their structure, even flipping a single bit in a LO causes whatever page that bit maps to be marked invalid (as if the page corresponded exactly to one row in a normal table), and the page copied to a new one along with your change(s).

If this hypothesis is correct, doing a vacuum should free up dead pages and your size expectations should be more accurate. And if that's the case putting more intelligence into the application could mitigate some of the update growth (predicting what page temporally similar updates will go to and grouping them into a single transaction, for instance).


On Tue, May 28, 2013 at 2:53 PM, Dimitar Misev <dimitarmisev@gmail.com> wrote:
I'm having some issue with BLOB updates (via ECPG). The total blobs size should be ~280MB, but after partially updating all of them for 150 times the size on disk grows up from 184MB to 18GB.

In more details:

There are 608 blobs of size 460800 bytes. All blobs are updated piecewise in 150 repetitions; so first all blobs are updated in bytes 0 - 3071, then 3072 - 6143, etc. In the end on the disk the database is 18GB.

Computing the size of pg_largeobject gives me 267MB:

   SELECT pg_size_pretty(count(loid) * 2048) FROM pg_largeobject;
     pg_size_pretty
   ----------------
     267 MB

On the other hand, this gives me 18GB, and du -sh on the disk also reports 18.4GB:

   SELECT tablename,
           pg_size_pretty(size) AS size_pretty,
           pg_size_pretty(total_size) AS total_size_pretty
   FROM (SELECT *, pg_relation_size(schemaname||'.'||tablename) AS size,
                    pg_total_relation_size(schemaname||'.'||tablename)
   AS total_size
          FROM pg_tables) AS TABLES
   WHERE TABLES.tablename = 'pg_largeobject'
   ORDER BY total_size DESC;
       tablename    | size_pretty | total_size_pretty
   ----------------+-------------+-------------------
     pg_largeobject | 18 GB       | 18 GB


Doing these updates takes 85 minutes on quad-core i7 with 6GB RAM and SSD hard disk. This is PostgreSQL 8.4.12 on Debian 64 bit.

Anyone knows what's going on here?


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Introduction
Next
From: Chris Travers
Date:
Subject: feedback request (Perl/Pg developers)