Re: Performance on large, append-only tables - Mailing list pgsql-performance

From Marti Raudsepp
Subject Re: Performance on large, append-only tables
Date
Msg-id CABRT9RBDXQDwjHSA1GGeAKd97UXRaSTK1B58V6rRmZvit8G2CQ@mail.gmail.com
Whole thread Raw
In response to Performance on large, append-only tables  (David Yeu <david.yeu@skype.net>)
List pgsql-performance
On Wed, Feb 8, 2012 at 20:03, David Yeu <david.yeu@skype.net> wrote:
>  * Š WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;
>  * Pages of twenty rows.

A good improvement for this sort of queries is the "scalable paging"
trick. Instead of increasing the OFFSET argument -- which means that
Postgres has to scan more and more rows -- you should remember an
index key where the last page ended.

In other words, you get the first page using:
WHERE group_id = ? ORDER BY created_at DESC LIMIT 20

Say, this page returns created_at values between 2012-01-01 and
2012-01-10. If the user clicks "next page", you run a query like this
instead:
WHERE group_id = ? AND created_at>'2012-01-10' ORDER BY created_at DESC LIMIT 20

Thus, every "next page" fetch always takes a constant time. Of course
there's a small problem when two rows have equal times. Then, you can
add primary key to the sort key to disambiguate those rows:

WHERE group_id = ? AND (created_at, pkey_col) > ('2012-01-10', 712)
ORDER BY created_at, pkey_col DESC LIMIT 20

Of course an index on (group_id, created_at) or (group_id, created_at,
pkey_col) is necessary for these to work well

Regards,
Marti

pgsql-performance by date:

Previous
From: Claudio Freire
Date:
Subject: Re: Performance on large, append-only tables
Next
From: Tom Lane
Date:
Subject: Re: Performance on large, append-only tables