Re: speeding up big query lookup - Mailing list pgsql-general

From Tom Lane
Subject Re: speeding up big query lookup
Date
Msg-id 24414.1156692533@sss.pgh.pa.us
Whole thread Raw
In response to speeding up big query lookup  ("Silvela, Jaime \(Exchange\)" <JSilvela@Bear.com>)
List pgsql-general
"Silvela, Jaime \(Exchange\)" <JSilvela@Bear.com> writes:
> I have a very big table that catalogs measurements of some objects over
> time. Measurements can be of several (~10) types. It keeps the
> observation date in a field, and indicates the type of measurement in
> another field.

> I often need to get the latest measurement of type A for object X.

This is a pretty common requirement, and since plain SQL doesn't handle
it very well, different DBMSes have invented different extensions to
help.  For instance you can use LIMIT:

  SELECT * from object_val WHERE object_id = X and object_val_type_id = Y
    ORDER BY object_id DESC, object_val_type_id DESC, observation_date DESC
    LIMIT 1;

This will work very very fast if there is an index on (object_id,
object_val_type_id, observation_date) for it to use.  The only problem
with it is that there's no obvious way to extend it to fetch latest
measurements for several objects in one query.

Another way, which AFAIK is Postgres-only, is to use DISTINCT ON:

  SELECT DISTINCT ON (object_id, object_val_type_id) * from object_val
    ORDER BY object_id DESC, object_val_type_id DESC, observation_date DESC

This can give you all the latest measurements at once, or any subset
you need (just add a WHERE clause).  It's not amazingly fast but it
generally beats the bog-standard-SQL alternatives, which as you
mentioned require joining against subselects.

            regards, tom lane

pgsql-general by date:

Previous
From: "Shoaib Mir"
Date:
Subject: Re: Can't populate database using Copy
Next
From: "Dave Page"
Date:
Subject: Re: Cutting the Gborg throat