Re: Hardware upgrade for a high-traffic database - Mailing list pgsql-performance

From Tom Lane
Subject Re: Hardware upgrade for a high-traffic database
Date
Msg-id 21682.1092266404@sss.pgh.pa.us
Whole thread Raw
In response to Re: Hardware upgrade for a high-traffic database  ("Jason Coene" <jcoene@gotfrag.com>)
Responses Re: Hardware upgrade for a high-traffic database
List pgsql-performance
"Jason Coene" <jcoene@gotfrag.com> writes:
> A good example, a comments table where users submit TEXT data.  A common
> query is to find the last 5 comments a user has submitted.  The scan, while
> using an index, takes a considerable amount of time (> 0.5 sec is about as
> good as it gets).  Again, it's using an index on the single WHERE clause
> (userid = int).  The field that's used to ORDER BY (timestamp) is also
> indexed.

You mean you are doing
    SELECT ... WHERE userid = 42 ORDER BY timestamp DESC LIMIT 5;
and hoping that separate indexes on userid and timestamp will get the
job done?  They won't.  There are only two possible plans for this,
neither very good: select all of user 42's posts and sort them, or
scan timewise backwards through *all* posts looking for the last 5 from
user 42.

If you do this enough to justify a specialized index, I would suggest a
two-column index on (userid, timestamp).  You will also need to tweak
the query, because the planner is not quite smart enough to deduce that
such an index is applicable to the given sort order:
    SELECT ... WHERE userid = 42 ORDER BY userid DESC, timestamp DESC LIMIT 5;
This should generate an index-scan-backwards plan that will execute nigh
instantaneously, because it will only fetch the rows you really want.

You might or might not be able to drop the separate indexes on userid
and timestamp, depending on what other queries you might have that need
them.  But you should be paying attention to what plans you are really
getting (see EXPLAIN) rather than just assuming that some indexes chosen
at random will do what you need.

            regards, tom lane

pgsql-performance by date:

Previous
From: "Jason Coene"
Date:
Subject: Re: Hardware upgrade for a high-traffic database
Next
From: Russell Smith
Date:
Subject: Re: Hardware upgrade for a high-traffic database