Re: Large rows number, and large objects - Mailing list pgsql-performance

From Jose Ildefonso Camargo Tolosa
Subject Re: Large rows number, and large objects
Msg-id CAETJ_S9UtECSQUezPL6a-ZgnLNBFC_o6-k7WFrYdnsuqYXgcRw@mail.gmail.com
In response to Re: Large rows number, and large objects  (Andrzej Nakonieczny <dzemik-pgsql-performance@e-list.pingwin.eu.org>)
List pgsql-performance
On Wed, Jul 20, 2011 at 3:03 PM, Andrzej Nakonieczny
<dzemik-pgsql-performance@e-list.pingwin.eu.org> wrote:
> On 20.07.2011 17:57, Jose Ildefonso Camargo Tolosa wrote:
>
> [...]
>
>>    Many of the advantages of partitioning have to do with maintenance
>>    tasks.  For example, if you gather data on a daily basis, it's faster
>>    to drop the partition that contains Thursday's data than it is to do a
>>    DELETE that finds the rows and deletes them one at a time.  And VACUUM
>>    can be a problem on very large tables as well, because only one VACUUM
>>    can run on a table at any given time.  If the frequency with which the
>>    table needs to be vacuumed is less than the time it takes for VACUUM
>>    to complete, then you've got a problem.
>>
>>
>> And.... pg_largeobject table doesn't get vacuumed? I mean, isn't that
>> table just as any other table?
>
> Vacuum is a real problem on big pg_largeobject table. I have 1.6 TB database
> mostly with large objects and vacuuming that table on fast SAN takes about 4
> hours:
>
>          now          |        start        |   time   |  datname   |                current_query
>  ---------------------+---------------------+----------+------------+----------------------------------------------
>  2011-07-20 20:12:03 | 2011-07-20 16:21:20 | 03:50:43 | bigdb      | autovacuum: VACUUM pg_catalog.pg_largeobject
> (1 row)
>
>
> LO generates a lot of dead tuples when objects are being added:
>
>     relname      | n_dead_tup
> ------------------+------------
>  pg_largeobject   |     246731
>
> Adding an LO is very fast when the table is vacuumed. But when there are a
> lot of dead tuples, adding an LO is very slow (50-100 times slower) and eats
> 100% of the CPU.
>
> It looks like the better way is writing objects directly as bytea on partitioned
> tables, although it's a bit slower than the LO interface on a vacuumed table.
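
The dead-tuple figure quoted above comes from the statistics collector; a query along these lines (using the standard pg_stat_all_tables view) shows the same counter and when autovacuum last ran on the table:

```sql
-- Dead-tuple estimate and last autovacuum run for pg_largeobject
SELECT relname, n_dead_tup, last_autovacuum
FROM pg_stat_all_tables
WHERE relname = 'pg_largeobject';
```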

Well... yes... I thought about that, but then: what happens when you
need to fetch the file from the DB?  Will it be fetched completely at
once?  I'm thinking about large files here; say (hypothetically
speaking) you have 1GB files stored.... if the system fetches the
whole 1GB at once, it would take 1GB of RAM (or not?), and that's what
I wanted to avoid by dividing the file into 2kB chunks (bytea chunks,
actually).... I don't quite remember where I got the 2kB size from...
but I decided I wanted to avoid using TOAST too.
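
The chunked-bytea scheme described above could be sketched like this (the table and column names are illustrative, not from the thread). Fetching through a cursor keeps client memory bounded instead of pulling the whole file in one result set:

```sql
-- Hypothetical chunk table: one row per ~2kB slice of a file
CREATE TABLE file_chunk (
    file_id  bigint  NOT NULL,
    chunk_no integer NOT NULL,
    data     bytea   NOT NULL,   -- ~2kB per row
    PRIMARY KEY (file_id, chunk_no)
);

-- Stream a file back without materializing it in full:
BEGIN;
DECLARE file_cur CURSOR FOR
    SELECT data FROM file_chunk
    WHERE file_id = 1
    ORDER BY chunk_no;
FETCH 100 FROM file_cur;   -- ~200kB per round trip; repeat until empty
CLOSE file_cur;
COMMIT;
```

Keeping each chunk around 2kB also keeps individual bytea values below PostgreSQL's default TOAST threshold (roughly 2kB), which matches the "avoid TOAST" goal mentioned above.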

>
>
> Regards,
> Andrzej
>
