Re: copy command and blobs - Mailing list pgsql-performance

From Greg Spiegelberg
Subject Re: copy command and blobs
Date
Msg-id AANLkTimfbcCickrZfiWLLoDqiJGKquFYwki2f_FPPK53@mail.gmail.com
Whole thread Raw
In response to Re: copy command and blobs  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-performance
On Sat, Jan 22, 2011 at 8:41 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jan 21, 2011 at 5:10 PM, Madhu Ramachandran <iammadhu@gmail.com> wrote:
> i was looking at
> http://www.postgresql.org/files/documentation/books/aw_pgsql/node96.html
> when they talk about using OID type to store large blobs (in my case .jpg
> files )

It's probably worth noting that that document is 9 years old.  It
might be worth reading something a little more up-to-date.  Perhaps:

http://www.postgresql.org/docs/current/static/largeobjects.html


A bit late to respond, but better late than never!

As of my latest testing on 8.3, I've found that the lo_* functions, while adequate, are a bit slow.  The alternative we implemented, which leverages pg_read_file(), is significantly faster.  I believe this is because pg_read_file() lets the server read the file directly from the file system instead of streaming the data through the client connection.  From memory, I recall it being about 20% faster than the lo_* functions or simple INSERTs.

The downside to pg_read_file() is that the file must be 1) on the same system as the database server and 2) under the $PGDATA directory.  We opted to create a directory, $PGDATA/public, with restrictive system-level permissions, but open enough to allow the database owner to read the files.

For example (note the pg_stat_file() call should size the same file being read):
postgres=# SELECT pg_read_file('public/a_file', 0, (pg_stat_file('public/a_file')).size);

We use this method, in conjunction with additional checks, to store files in tables keyed by the MD5 hash of the file contents, which prevents storing duplicates.
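Roughly what that looks like in SQL, sketched from memory -- the table and column names here are illustrative, not our actual schema.  It assumes the file sits under $PGDATA/public as above, and note that pg_read_file() returns text in the server encoding, so truly binary data (like our .jpg case) needs extra escaping that I've left out:

```sql
-- Illustrative schema; names are not from our production system.
CREATE TABLE stored_files (
    md5      text PRIMARY KEY,   -- hash of the contents, used for dedup
    contents text NOT NULL       -- contents as read server-side
);

-- Read the whole file server-side (path is relative to $PGDATA),
-- then insert only if we have not seen that hash before.
INSERT INTO stored_files (md5, contents)
SELECT md5(f.contents), f.contents
FROM  (SELECT pg_read_file('public/a_file', 0,
              (pg_stat_file('public/a_file')).size) AS contents) AS f
WHERE NOT EXISTS
      (SELECT 1 FROM stored_files s WHERE s.md5 = md5(f.contents));
```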

HTH.
Greg
