Re: Performance on large, append-only tables - Mailing list pgsql-performance

From Claudio Freire
Subject Re: Performance on large, append-only tables
Date
Msg-id CAGTBQpY8ySbL0=b0GG75tsR-8dLaXFq-V3k5=WujXPDoorYJbg@mail.gmail.com
In response to Performance on large, append-only tables  (David Yeu <david.yeu@skype.net>)
List pgsql-performance
On Wed, Feb 8, 2012 at 3:03 PM, David Yeu <david.yeu@skype.net> wrote:
> Thankfully, the types of queries that we perform against this table are
> pretty constrained. We never update rows and we never join against other
> tables. The table essentially looks like this:
>
> | id | group_id | created_at | everything else...
...
> Our queries essentially fall into the following cases:
>
>  * ... WHERE group_id = ? ORDER BY created_at DESC LIMIT 20;
>  * ... WHERE group_id = ? AND id > ? ORDER BY created_at DESC;
>  * ... WHERE group_id = ? AND id < ? ORDER BY created_at DESC LIMIT 20;
>  * ... WHERE group_id = ? ORDER BY created_at DESC LIMIT 20 OFFSET ?;

I think you have something to gain from partitioning.
You could partition on group_id, which is akin to sharding, only on a
single server, and that would significantly decrease each partition's
index size. Since those queries' performance is highly dependent on
index size, and since you seem to have such a huge table, I would
imagine such partitioning would help keep the indices performant.
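
As a rough sketch of what partitioning on group_id could look like:
this assumes a PostgreSQL version with declarative hash partitioning
(11 or later), and the table name "messages", the column types, and
the partition count of 4 are all made up for illustration, not taken
from your schema.

```sql
-- Assumptions: PostgreSQL 11+, hypothetical table name and column types,
-- and an arbitrary partition count of 4.
CREATE TABLE messages (
    id         bigint      NOT NULL,
    group_id   integer     NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now()
    -- ... everything else
) PARTITION BY HASH (group_id);

-- Each partition holds the groups whose hashed group_id falls in its remainder.
CREATE TABLE messages_p0 PARTITION OF messages FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE messages_p1 PARTITION OF messages FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE messages_p2 PARTITION OF messages FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE messages_p3 PARTITION OF messages FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- Created on the parent, this cascades to one smaller index per partition.
CREATE INDEX messages_group_created_idx ON messages (group_id, created_at DESC);

-- An equality condition on group_id lets the planner prune to one partition.
EXPLAIN
SELECT * FROM messages
WHERE group_id = 42
ORDER BY created_at DESC
LIMIT 20;
```

Since every one of your queries filters on a single group_id, each of
them should only ever touch one partition and its index.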

Now, we do need statistics. How many groups are there? Do they grow
with your table, or is the number of groups constant? Which OFFSET
values do you use? (OFFSET is quite expensive: the skipped rows still
have to be read and discarded.)

And of course... EXPLAIN ANALYZE output for the slow queries.
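
For what it's worth, here is a minimal sketch of the kind of comparison
I have in mind. The demo table, its name, its size and the literal
values are assumptions for illustration only; the idea is to compare a
deep OFFSET page against the same page reached by key, and to look at
the row counts and buffers each plan reports.

```sql
-- Hypothetical stand-in for the real table: 200000 rows spread over 100 groups.
CREATE TEMP TABLE messages_demo AS
SELECT g                               AS id,
       g % 100                         AS group_id,
       now() - g * interval '1 second' AS created_at
FROM generate_series(1, 200000) AS g;

CREATE INDEX messages_demo_group_created_idx
    ON messages_demo (group_id, created_at DESC);
ANALYZE messages_demo;

-- Deep page via OFFSET: the first 1000 matching rows are fetched and thrown away.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM messages_demo
WHERE group_id = 42
ORDER BY created_at DESC
LIMIT 20 OFFSET 1000;

-- Paging by key instead: start just past the last row already seen.
-- The timestamp below is a made-up "last seen" value.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM messages_demo
WHERE group_id = 42
  AND created_at < now() - interval '100000 seconds'
ORDER BY created_at DESC
LIMIT 20;
```

Running the same EXPLAIN (ANALYZE, BUFFERS) against your real table and
posting the output would tell us much more than a toy table can.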
