Re: Paged Query - Mailing list pgsql-performance

From Greg Spiegelberg
Subject Re: Paged Query
Date
Msg-id CAEtnbpVL_CGRnHfykt=3ihT++m_Ni=AskKTteeFJXomZHHq3QA@mail.gmail.com
Whole thread Raw
In response to Re: Paged Query  (Craig James <cjames@emolecules.com>)
List pgsql-performance
On Mon, Jul 9, 2012 at 8:16 AM, Craig James <cjames@emolecules.com> wrote:

A good solution to this general problem is "hitlists."  I wrote about this concept before:

http://archives.postgresql.org/pgsql-performance/2010-05/msg00058.php


I implemented  this exact strategy in our product years ago.  Our queries were once quite complicated involving many nested sub-SELECT's and several JOIN's per SELECT.  The basics of our implementation now consists of

 1. A table tracking all "cache" tables.  A cache table is a permanent table once represented as one of the former sub-SELECT's.  The table includes the MD5 hash of the query used to create the table, time created, query type (helps to determine expire time), and a comment field to help in debugging.
 2. Simple logic checking for the existence of the cache table and creating it if it does not.
 3. Using one or many of the named cache tables in the final query using ORDER BY-LIMIT-OFFSET in a CURSOR.
 4. One scheduled backend process to clear the "expired" cache tables based on the query type.

Reason for the CURSOR is to execute once to get a tally of records for pagination purposes then rewind and fetch the right "page".

Highly recommended.

-Greg

pgsql-performance by date:

Previous
From: Craig James
Date:
Subject: Re: Paged Query
Next
From: Jeff Janes
Date:
Subject: Re: Create tables performance