Re: Suspending SELECTs - Mailing list pgsql-performance

From Craig A. James
Subject Re: Suspending SELECTs
Date
Msg-id 43CDB97A.8050702@modgraph-usa.com
Whole thread Raw
In response to Re: Suspending SELECTs  (Alessandro Baretta <a.baretta@barettadeit.com>)
List pgsql-performance
Alessandro Baretta wrote:
>> I think you're trying to do something at the wrong layer of your
>> architecture.  This task normally goes in your middleware layer, not
>> your database layer.
>
> I am developing my applications in Objective Caml, and I have written
> the middleware layer myself. I could easily implement a cursor-pooling
> strategy...

You're trying to solve a very hard problem, and you're rewriting a lot of stuff that's been worked on for years by
teamsof people.  If there's any way you switch use something like JBOSS, it might save you a lot of grief and hard
work.

I eliminated this problem a different way, using what we call a "hitlist".  Basically, every query becomes a "select
into",something like this: 

  insert into hitlist_xxxx (select id from ...)

where "xxxx" is your user's id.  Once you do this, it's trivial to return each page to the user almost instantly using
offset/limit,or by adding a "ROW_NUM" column of some sort.  We manage very large hitlists -- millions of rows.  Going
frompage 1 to page 100,000 takes a fraction of a second. 

It also has the advantage that the user can come back in a week or a month and the results are still there.

The drawback are:

1. Before the user gets the first page, the entire query must complete.
2. You need a way to clean up old hitlists.
3. If you have tens of thousands of users, you'll have a large number of hitlists, and you have to use tablespaces to
ensurethat Linux filesystem directories don't get too large. 
4. It takes space to store everyone's data.  (But disk space is so cheap this isn't much of an issue.)

You can eliminate #3 by a single shared hitlist with a column of UserID's.  But experience shows that a big shared
hitlistdoesn't work very well:  Inserts get slower because the UserID column must be indexed, and you can truncate
individualhitlists but you have to delete from a shared hitlist. 

Craig

pgsql-performance by date:

Previous
From: Hari Warrier
Date:
Subject: Getting pg to use index on an inherited table (8.1.1)
Next
From: Tom Lane
Date:
Subject: Re: Suspending SELECTs