----- Original Message -----
From: "Richard Huxton" <dev@archonet.com>
To: "Sean Davis" <sdavis2@mail.nih.gov>
Cc: "PostgreSQL SQL" <pgsql-sql@postgresql.org>
Sent: Tuesday, March 22, 2005 3:59 PM
Subject: Re: [SQL] Self-referencing table question
> Sean Davis wrote:
>> I answer my own question, if only for my own records. The following
>> query is about 5-6 times faster than the original. Of course, if anyone
>> else has other ideas, I'd be happy to hear them.
>>
>> Sean
>>
>> explain analyze select from_id,to_id,val from exprsdb.correlation where
>> from_id in (select to_id from exprsdb.correlation where from_id=2424
>> order by val desc limit 100) and to_id in (select to_id from
>> exprsdb.correlation where from_id=2424 order by val desc limit 100) and
>> val>0.6 and to_id<from_id;
>
> Might not be any faster, but you can do this as a self-join with subquery:
>
> SELECT c1.from_id, c1.to_id, c1.val
> FROM
> correlation c1,
> (
> SELECT to_id FROM correlation WHERE from_id=2424
> ORDER BY val DESC LIMIT 100
> ) AS c2
> (
> SELECT to_id FROM correlation WHERE from_id=2424
> ORDER BY val DESC LIMIT 100
> ) AS c3
> WHERE
> c1.from_id = c2.to_id
> AND c1.to_id = c3.to_id
> AND c1.val > 0.5
> AND c1.to_id < from_id
> ;
>
> I think PG should be smart enough nowadays to figure out these two queries
> are basically the same.
Richard,
In another email, I posted what I did (which was what you suggest), along
with explain analyze output. It looks like the subquery is 4-6 times
faster, which is getting into the acceptible for my little web application.
Thanks for the help.
Sean