Re: Large Result and Memory Limit - Mailing list pgsql-general

From: Bill Moran
Subject: Re: Large Result and Memory Limit
Date: 2007-10-04 17:03:48
Msg-id: 20071004170348.2dad6f9a.wmoran@potentialtech.com
In response to: Re: Large Result and Memory Limit ("Scott Marlowe" <scott.marlowe@gmail.com>)
List: pgsql-general
In response to "Scott Marlowe" <scott.marlowe@gmail.com>:

> On 10/4/07, Mike Ginsburg <mginsburg@collaborativefusion.com> wrote:
>
> > export it.  Memory limit is a major concern, but the query for one row
> > returns a result set too large and PHP fails.
>
> If this is true, and one single db row makes php exceed its memory
> limit just by returning it, you've done something very very wrong
> in your design.

I work with Mike, and I'm not sure there's something "very wrong" in
the design.

The problem we're looking at is an export.  This big TEXT field is
nothing more than a staging area.  The export itself can take several
hours to run, so we background the process, and store the result in
a big TEXT field (it ends up being a CSV).  Later, the user can log
in and download the exported file via the web interface.  But it's
unlikely that anyone is going to click "export", then wait 2 hours
for their browser to refresh. :)

The process is, web interface -> request export -> get an estimate
on completion time -> come back later and check the status -> download
if complete.

It's difficult (maybe impossible) to materialize the data on a schedule,
since the user has a number of options for how to export the data
(filters, etc.), so we chose to do it on-demand.

> You'd probably be better served using either a plain text file system
> to store these things, or large objects in postgresql.

We have multiple web servers with a load balancer, so saving the
result to a file doesn't really work.  We could put shared storage
in place, but I'm still not convinced that's the best fix for this.

Large objects are one option we were considering.  The problem is
they don't work with Slony, and our redundancy relies on Slony.  Granted,
we might want to just make this a non-replicated table, since it's
only a staging area anyway, but we're looking for a better solution.
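If we did go the large-object route, at least the streaming side looks
straightforward; I'd expect the download page to be something like this
(untested, all names invented):

<?php
// Untested sketch: stream an export stored as a large object, 64 KB
// at a time, so PHP never holds the whole CSV.  Assumes the export
// job saved the large object's OID in a (hypothetical) exports table.
$db = pg_connect('dbname=app');
$exportId = (int)$_GET['id'];

$res = pg_query($db, "SELECT csv_oid FROM exports WHERE id = $exportId");
$oid = (int)pg_fetch_result($res, 0, 0);

pg_query($db, 'BEGIN');             // LO reads must be in a transaction
$lo = pg_lo_open($db, $oid, 'r');
header('Content-Type: text/csv');
header('Content-Disposition: attachment; filename="export.csv"');
while (($buf = pg_lo_read($lo, 65536)) !== false && strlen($buf) > 0) {
    echo $buf;
}
pg_lo_close($lo);
pg_query($db, 'COMMIT');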

My (specific) question: has anyone experimented with storing something
like a CSV file in a table, one tuple per CSV row, and compared the
performance, etc. to putting it in a large object?
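In case it helps frame the question, here's roughly what I have in
mind (everything here is made up, and whether each tuple holds one CSV
line or a fixed-size slice is an open detail).  The cursor matters: a
plain SELECT of all the pieces would be buffered client-side by libpq,
which puts us right back at the memory limit.

<?php
// Rough sketch of the chunked-table idea (all names made up).  The
// export job INSERTs pieces of the CSV as it generates them; the
// download page walks them with a cursor so PHP only ever holds a
// few pieces at a time.
$db = pg_connect('dbname=app');
$exportId = (int)$_GET['id'];    // hypothetical request parameter

// Schema, roughly:
//   CREATE TABLE export_chunks (
//       export_id integer NOT NULL,
//       seq       integer NOT NULL,
//       data      text    NOT NULL,
//       PRIMARY KEY (export_id, seq)
//   );

pg_query($db, 'BEGIN');
pg_query($db, "DECLARE chunks CURSOR FOR
                   SELECT data FROM export_chunks
                   WHERE export_id = $exportId ORDER BY seq");
while (true) {
    $res = pg_query($db, 'FETCH 100 FROM chunks');
    if (pg_num_rows($res) == 0) {
        break;
    }
    while ($row = pg_fetch_row($res)) {
        echo $row[0];
    }
}
pg_query($db, 'CLOSE chunks');
pg_query($db, 'COMMIT');

Unlike a large object, this stays in a regular table, so Slony could
replicate it (or we could exclude it, same as today).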

> But if you're stuffing ~8 megs worth of csv text data into a single
> row*, you're probably not using a very relational layout of your data.
> And you're losing all the advantages (checking your data for
> consistency and such) that a relational db could give you.

Actually, we _are_ using lots of tables with lots of relations and
foreign keys and triggers and stored procedures and all sorts of other
stuff.  That's why it takes multiple hours to "flatten" everything (is
there a better term to describe the process of turning relational data
into a single flat file?)

> * Note that I'm assuming a few things up there.  1: php uses about 2:1
> memory to store data it's holding, roughly.  If you're set to 16 Meg
> max, I'm assuming your return set is 8Meg or larger.

I'm not sure, but that's probably correct.
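If we want to pin that ratio down, something quick like this (table
name made up) should tell us:

<?php
// Quick-and-dirty check of PHP's overhead for one big field: compare
// the process's memory growth to the byte length of the field itself.
$db = pg_connect('dbname=app');
$before = memory_get_usage();
$res = pg_query($db, 'SELECT csv_data FROM exports LIMIT 1'); // hypothetical table
$row = pg_fetch_row($res);
printf("field: %d bytes, PHP grew by %d bytes\n",
       strlen($row[0]), memory_get_usage() - $before);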

--
Bill Moran
http://www.potentialtech.com
