Memory efficient insertion/retrieval of bytea - Mailing list pgsql-general

From Tomasz Ostrowski
Subject Memory efficient insertion/retrieval of bytea
Date
Msg-id 20060920130628.GF30944@batory.org.pl
Whole thread Raw
Responses Re: Memory efficient insertion/retrieval of bytea  (Tomasz Ostrowski <tometzky@batory.org.pl>)
Re: Memory efficient insertion/retrieval of bytea  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I'd like to propose something which would make an easy way for memory
efficient insertion/retrieval of bytea data.

Now for inserting data to a bytea we need 5*data_size in a buffer for
escaped data representation. 6*data_size if we do PQescapeByteaConn
in one chunk. I've made some experiments with attached program and
came up that for inserting 1kB of data this program needs about
5120kB of virtual memory, 1MB - 10MB, 10MB - 55MB. This is a lot for
a low-end workstation, too much for embedded devices.

We can use large objects for this but:
- large objects do not have any security;
- it is hard to ensure data integrity - AFAIK it is not possible to
  disallow deletion of large object if there are references to it.

So we need a function that takes an oid of a large object as an
argument and returns bytea, so a client application could do for
example:
    Oid oid = lo_import(conn, "filename");
    PQexec("insert into tablename values (bytea_from_lo(%d))", oid);
    lo_unlink(oid);

For retrieval it could be possible to do something like this:
    PQexec("select lo_from_bytea(columnname) as oid from tablename where ...");
And then:
    lo_export(conn, oid, "filename");
    lo_unlink(oid);

This bytea_from_lo and lo_from_bytea functions could be very
efficient for bytea's - and will use much less memory and run on an
easily upgradeable server not client. And the basic versions should
be easy to implement for an experienced PostgreSQL hacker.


The hassle of unlinking this large objects could be eliminated by
introducing a temporary large object. This would be objects which
live for example only until end of session. Or just use
contrib/vacuum_lo.


This would also solve "Allow read/write into TOAST values like large
objects" TODO entry.

So - what do you think about this?

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh

pgsql-general by date:

Previous
From: Matthias.Pitzl@izb.de
Date:
Subject: Re: Strange database corruption with PostgreSQL 7.4.x o
Next
From: Tomasz Ostrowski
Date:
Subject: Re: Memory efficient insertion/retrieval of bytea