Re: Large rows number, and large objects - Mailing list pgsql-performance

From Jose Ildefonso Camargo Tolosa
Subject Re: Large rows number, and large objects
Date
Msg-id CAETJ_S-2CwPD+oCzRmnM4i66S6o-ih_qBzdKDknSBCdJFwQvzA@mail.gmail.com
In response to Re: Large rows number, and large objects  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Large rows number, and large objects
Re: Large rows number, and large objects
List pgsql-performance


On Tue, Jul 19, 2011 at 3:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Jun 19, 2011 at 10:19 PM, Jose Ildefonso Camargo Tolosa wrote:
> So, the question is, if I were to store 8TB worth of data into large
> objects system, it would actually make the pg_largeobject table slow,
> unless it was automatically partitioned.

I think it's a bit of an oversimplification to say that large,
unpartitioned tables are automatically going to be slow.  Suppose you
had 100 tables that were each 80GB instead of one table that is 8TB.
The index lookups would be a bit faster on the smaller tables, but it
would take you some non-zero amount of time to figure out which index
to read in the first place.  It's not clear that you are really
gaining all that much.
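
(Back of the envelope, assuming roughly 2kB rows and a few hundred index
entries per 8kB B-tree page: an 8TB table holds on the order of four billion
rows, while an 80GB table holds around forty million, so the difference in
index depth is typically one level at most, i.e. about one extra page read
per lookup.)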

Certainly... but it is still quite unclear to me *when* it is better to partition and when it is not.
 

Many of the advantages of partitioning have to do with maintenance
tasks.  For example, if you gather data on a daily basis, it's faster
to drop the partition that contains Thursday's data than it is to do a
DELETE that finds the rows and deletes them one at a time.  And VACUUM
can be a problem on very large tables as well, because only one VACUUM
can run on a table at any given time.  If the interval at which the
table needs to be vacuumed is shorter than the time it takes for VACUUM
to complete, then you've got a problem.
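
As a rough sketch of what that looks like with the inheritance-style
partitioning available at the time (table and column names here are invented
for illustration):

    -- Parent table plus one child partition per day.
    CREATE TABLE measurements (
        taken_at  timestamptz NOT NULL,
        payload   bytea
    );

    CREATE TABLE measurements_2011_07_21 (
        CHECK (taken_at >= '2011-07-21' AND taken_at < '2011-07-22')
    ) INHERITS (measurements);

    -- Dropping Thursday's data is a cheap catalog operation:
    DROP TABLE measurements_2011_07_21;

    -- versus finding and removing the rows one at a time on an
    -- unpartitioned table:
    DELETE FROM measurements
    WHERE taken_at >= '2011-07-21' AND taken_at < '2011-07-22';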

And... doesn't the pg_largeobject table get vacuumed? I mean, isn't that table just like any other table?
 

But I think that if we want to optimize pg_largeobject, we'd probably
gain a lot more by switching to a different storage format than we
could ever gain by partitioning the table.  For example, we might
decide that any object larger than 16MB should be stored in its own
file.  Even somewhat smaller objects would likely benefit from being
stored in larger chunks - say, a bunch of 64kB chunks, with any
overage stored in the 2kB chunks we use now.  While this might be an
interesting project, it's probably not going to be anyone's top
priority, because it would be a lot of work for the amount of benefit
you'd get.  There's an easy workaround: store the files in the
filesystem, and a path to those files in the database.
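
That workaround could look something like the sketch below (the table and
column names are made up for the example); the file bytes live outside
PostgreSQL and only the metadata and path are stored in the database:

    CREATE TABLE stored_files (
        id         bigserial PRIMARY KEY,
        file_name  text NOT NULL,
        file_path  text NOT NULL,   -- e.g. '/srv/files/ab/cd/abcd1234'
        byte_size  bigint,
        added_at   timestamptz DEFAULT now()
    );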

Ok, one reason for storing a file *in* the DB is to be able to do PITR of wrongly deleted files (or overwritten ones, and that kind of thing); at the filesystem level you would need a versioning filesystem (and I don't yet know of any that is stable in the Linux world).

Also, you can use streaming replication: at the same time your data is streamed, your files are also streamed to a secondary server (yes, at the FS level you could use DRBD or similar).

Ildefonso.
