Re: Databases compared at zend.com - Mailing list pgsql-general

From Doug McNaught
Subject Re: Databases compared at zend.com
Date
Msg-id m31yp43v79.fsf@belphigor.mcnaught.org
Whole thread Raw
In response to Re: Databases compared at zend.com  (<pgsql-general@commandprompt.com>)
List pgsql-general
<pgsql-general@commandprompt.com> writes:

> > Good article.  The only inaccuracy I saw was that he claims Postgres'
> > storage of large objects is "in the filesystem" and "inefficient".
> > >From my reading of the docs this is not true--large objects are stored
> > within the database just as with any other data.  The programming
> > interface to LOs may or may not be clunkier than that of other
> > databases, but the storage is no less efficient.
>
>
> As the person who wrote it :) It was my understanding that PostgreSQL
> stores large objects on the filesystem outside of the database tables.
> They may be indexed but I thought there was only an identifier within the
> table that pointed to the large object.

> If this is not true, could someone please describe the actual process, I
> can update the article.

You're close, but not completely accurate.

To quote the 7.1 docs:

[http://postgresql.readysetnet.com/users-lounge/docs/7.1/programmer/largeobjects.html]

    2.1. Historical Note

    Originally, Postgres 4.2 supported three standard implementations of
    large objects: as files external to Postgres, as external files
    managed by Postgres, and as data stored within the Postgres
    database. It causes considerable confusion among users. As a result,
    we only support large objects as data stored within the Postgres
    database in PostgreSQL. Even though it is slower to access, it
    provides stricter data integrity. For historical reasons, this storage
    scheme is referred to as Inversion large objects. (We will use
    Inversion and large objects interchangeably to mean the same thing in
    this section.) Since PostgreSQL 7.1 all large objects are placed in
    one system table called pg_largeobject.

It's interesting that the docs claim this table-level storage is
slower to access than using external files.  On one respect, though,
it's more efficient than storing LOs in files--if you have a lot of LO
files in a single directory, access to those files can get very slow
on many filesystems (Linux ext2, BSD ufs) due to linear searching of
the directory.

So, from my knowledge and reading of the docs (I've used LOs a bit
but not extensively), here's my understanding:

* LOs are stored within a single system table (as above) which is
  indexed by OID.
* User applications store references to LOs in other tables as column
  type "oid".
* Postgres provides file-descriptor-style access (open, read, write,
  seek, tell) to LOs through their OIDs.  This is something you want,
  since you don't want to be forced to read an entire LO into memory
  in order to work with it.

It might be nice also to have more "convenient" interfaces to
manipulate large objects in a way that approximates "normal" column
data, for LOs that are not too big.  You could probably do a lot of
this by writing custom functions.

-Doug
--
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time...          --Dylan

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Databases compared at zend.com
Next
From: Tom Lane
Date:
Subject: Re: Databases compared at zend.com