Re: Digesting explain analyze - Mailing list pgsql-performance

From Ron Mayer
Subject Re: Digesting explain analyze
Msg-id 4B44EC89.7060200@cheapcomplexdevices.com
In response to Digesting explain analyze  (Jesper Krogh <jesper@krogh.cc>)
Responses Re: Digesting explain analyze  (Jesper Krogh <jesper@krogh.cc>)
List pgsql-performance
Jesper Krogh wrote:
> I have a table with somewhere on the order of 100,000,000 rows,
> and all rows are tuples of this form:
>
> (id1,id2,evalue);
>
> Then I'd like to speed up a query like this:
>
> explain analyze select id from table where id1 = 2067 or id2 = 2067 order
> by evalue asc limit 100;
>
> ...The inner sets average 3,000 rows for
> both id1 and id2, and a typical limit would be 100, so if I could convince
> postgresql not to fetch all of them I would reduce the set retrieved
> by around a factor of 60. The dataset is quite large, so a random query
> is not very likely to hit the same part of the dataset again; there is
> going to be a fair amount of going to disk.

If disk seeks are killing you, a kinda crazy idea would be to
duplicate the table: cluster one copy by an index on (id1) and
the other by an index on (id2), and union the results of each.

Since each of these duplicates of the table will be clustered
by the column you're querying it on, it should just take one
seek in each table.
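
A minimal sketch of that setup (assuming the original table is
named t, and using the hypothetical names t1/t2 to match the
query below):

  -- Two physical copies of t, each laid out on disk in the
  -- order of the column it will be searched on.
  create table t1 as select * from t;
  create index t1_id1_idx on t1 (id1);
  cluster t1 using t1_id1_idx;

  create table t2 as select * from t;
  create index t2_id2_idx on t2 (id2);
  cluster t2 using t2_id2_idx;

  analyze t1;
  analyze t2;

Keep in mind that CLUSTER is a one-time physical sort; rows
inserted afterwards are not kept in order, so it would need to be
re-run periodically.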

Then your query could be something like

  select * from (
    (select * from t1 where id1=2067 order by evalue limit 100)
    union
    (select * from t2 where id2=2067 order by evalue limit 100)
  ) as foo order by evalue limit 100;

(The parentheses around each branch are needed; postgresql won't
accept an order by / limit directly before a UNION.)
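
One thing to watch: if a single row can match on both id1 and id2,
the plain UNION's duplicate-elimination is what removes the double
hit; if that can't happen, UNION ALL should be a bit cheaper, since
the outer order by / limit does the trimming anyway.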

Hmm..  and I wonder whether, if you put evalue into the clustering
criteria too (i.e. cluster on (id1,evalue)), you could make it so
the limit finds the right 100 evalues first for each table....
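
A sketch of that variant, again on the hypothetical t1/t2:

  -- With a composite index, rows for a given id are stored
  -- contiguously and already sorted by evalue, so the inner
  -- "order by evalue limit 100" can walk the index and stop
  -- after 100 rows, with no separate sort step.
  create index t1_id1_evalue_idx on t1 (id1, evalue);
  cluster t1 using t1_id1_evalue_idx;

  create index t2_id2_evalue_idx on t2 (id2, evalue);
  cluster t2 using t2_id2_evalue_idx;

If that works, each inner branch should show up in explain analyze
as an index scan that stops early rather than a sort over the full
~3,000-row set.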
