storing binary files / memory limit - Mailing list pgsql-php

From Tomas Vondra
Subject storing binary files / memory limit
Date
Msg-id 48F0D3CD.9020306@fuzzy.cz
Whole thread Raw
Responses Re: storing binary files / memory limit
List pgsql-php
Hi,

I'm kind of struggling with storing binary files inside the database.
I'd like to store them in BYTEA columns, but the problem is the files
are quite large (a few megabytes, usually), so the PHP memory_limit is
reached when escaping the data.

It does not matter whether I use the old-fashioned pg_* functions or the
new PDO extension - both behave the same with BYTEA columns. For example,
with a 16MB file (called input.data) and an 8MB memory_limit, the
following fails with an 'Allowed memory size exhausted' fatal error,
because all the data read from the file has to be escaped at once:

------------------------------------------------------------------------

$pdo = new PDO('pgsql: ... ');

// data_table (id INTEGER, data BYTEA)
$stmt = $pdo->prepare('INSERT INTO data_table VALUES (:id, :data)');

$id = 1;
$input = fopen('/tmp/input.data', 'rb');

$stmt->bindParam(':id', $id, PDO::PARAM_INT);
$stmt->bindValue(':data', $input, PDO::PARAM_LOB);

$stmt->execute();

------------------------------------------------------------------------

So far I've found two solutions, but neither meets all my
requirements:

1) using LOBs - with LOBs, streaming the data is possible, i.e. you
    can do this:

    ---------------------------------------------------------------------
    $pdo->beginTransaction();
    $input = fopen('/tmp/input.data', 'rb');
    $oid = $pdo->pgsqlLOBCreate();
    $stream = $pdo->pgsqlLOBOpen($oid, 'w');
    // copies in small chunks, so memory_limit is never hit
    stream_copy_to_stream($input, $stream);
    fclose($input);
    fclose($stream); // close the LOB stream before committing
    $pdo->commit();
    ---------------------------------------------------------------------

    This is nice because it saves memory, but a serious disadvantage (for
    me) is the lack of referential integrity - you can delete a LOB even
    if its OID is still referenced from some table. So it's basically the
    same as storing the data directly in a filesystem and storing just a
    path to it.
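
    For completeness, reading the LOB back can be streamed the same way.
    This is just a sketch - the data_table.data_oid column name is made
    up, and pgsqlLOBOpen() in 'r' mode must also run inside a
    transaction:

    ---------------------------------------------------------------------
    $pdo->beginTransaction();
    // look up the OID stored earlier (hypothetical column data_oid)
    $oid = $pdo->query('SELECT data_oid FROM data_table WHERE id = 1')
               ->fetchColumn();
    $stream = $pdo->pgsqlLOBOpen($oid, 'r');
    $output = fopen('/tmp/output.data', 'wb');
    // again copied in small chunks, so memory_limit is not an issue
    stream_copy_to_stream($stream, $output);
    fclose($stream);
    fclose($output);
    $pdo->commit();
    ---------------------------------------------------------------------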

2) using BYTEA columns and doing the 'streaming' on my own - store the
    data as usual, but upload it in pieces (say 100kB each). This is
    achieved by an initial INSERT (with the first 100kB of data),
    followed by a number of UPDATEs with 'data = data || '... new data
    ...'' appending the rest. When reading the data back, you have to do
    the same - fetch it piece by piece.

    Yes, it's kind of dirty (especially for large files, as it may cause
    a lot of queries), but it preserves all the nice BYTEA column
    features (no dangling / missing LOBs, etc.).
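
    To illustrate option 2, here's a rough sketch of the chunked upload
    and read-back. The 100kB chunk size and the table/column names are
    assumptions, and some driver versions return BYTEA values as stream
    resources rather than strings, which the read loop accounts for:

    ---------------------------------------------------------------------
    $chunk_size = 100 * 1024; // 100kB per query
    $input = fopen('/tmp/input.data', 'rb');

    $pdo->beginTransaction();

    // initial INSERT with the first chunk
    $stmt = $pdo->prepare('INSERT INTO data_table (id, data)
                           VALUES (:id, :data)');
    $stmt->bindValue(':id', 1, PDO::PARAM_INT);
    $stmt->bindValue(':data', fread($input, $chunk_size), PDO::PARAM_LOB);
    $stmt->execute();

    // append the remaining chunks, one UPDATE per chunk
    $append = $pdo->prepare('UPDATE data_table SET data = data || :chunk
                             WHERE id = :id');
    while (!feof($input)) {
        $append->bindValue(':chunk', fread($input, $chunk_size),
                           PDO::PARAM_LOB);
        $append->bindValue(':id', 1, PDO::PARAM_INT);
        $append->execute();
    }
    fclose($input);
    $pdo->commit();

    // reading back piece by piece with substring()
    $read = $pdo->prepare('SELECT substring(data FROM :off FOR :len)
                           FROM data_table WHERE id = :id');
    $offset = 1; // substring() is 1-based
    do {
        $read->bindValue(':off', $offset, PDO::PARAM_INT);
        $read->bindValue(':len', $chunk_size, PDO::PARAM_INT);
        $read->bindValue(':id', 1, PDO::PARAM_INT);
        $read->execute();
        $piece = $read->fetchColumn();
        if (is_resource($piece)) {
            $piece = stream_get_contents($piece);
        }
        // ... write $piece to the output ...
        $offset += $chunk_size;
    } while (strlen($piece) == $chunk_size);
    ---------------------------------------------------------------------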

Is there any other way to store large files in PostgreSQL? These are
the most important requirements for the solution:

- I do want to store the files inside the database, and I don't want to
   store just the paths. It's quite difficult to combine transactional
   (database) and non-transactional (filesystem) resources properly.

- I want to preserve as much 'nice' PostgreSQL features as possible (for
   example referential integrity is a nice feature).

- Storing / retrieving of the files has to work with a quite small
   memory_limit PHP option (say 8MB). I can't (and don't want to) modify
   this option (it's a shared server).

- Performance is a serious criterion, as is reliability.

Thanks for all your recommendations.

regards
Tomas
