Huge query help - Mailing list pgsql-novice

From Cath Lawrence
Subject Huge query help
Msg-id 1390DF24-6B10-11D8-98CA-000A95DC17CC@anu.edu.au
List pgsql-novice
Hi,

I have a big table (10 million records, each quite small - half a dozen
text and numeric fields) which I need to (eek!) outer join with itself,
but in such a way as to actually rule out 99.9% of the table.

I need tips on how to do this without running out of memory and
crashing - how do I make it apply the "where" condition before
attempting the join?

Here's a stripped down version of the query:

SELECT a1.x as x1, a1.y as y1, a1.z as z1,
       a2.x as x2, a2.y as y2, a2.z as z2,
       r1.position as rpos1, r1.residue as res1,
       r2.position as rpos2, r2.residue as res2
  FROM atom a1, atom a2, residue r1, residue r2
 WHERE a1.pdb_id = a2.pdb_id
   AND a1.pdb_id = '1ABC'
   AND a1.res_id < a2.res_id
   AND a1.res_id = r1.id
   AND a2.res_id = r2.id;

Basically, the restriction on pdb_id reduces it to about 1 in 7000 of
the table entries, so the result will be big but not unmanageable, if I
can ever get it...

Can I organise my query somehow so the join is done on the subsets
rather than the full table?
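
To make "join on the subsets" concrete, here is one possible shape for
such a query. It is an untested sketch: it assumes only the table and
column names from the query above, the index name is made up for
illustration, and whether an index on atom.pdb_id already exists is not
known. Whether the planner treats this any differently from the original
form is also an open question; putting EXPLAIN in front of either
version shows the plan without running the query.

```sql
-- Hypothetical index (name invented here) so the pdb_id filter
-- does not need a full scan of the 10-million-row atom table.
CREATE INDEX atom_pdb_id_idx ON atom (pdb_id);

-- Filter each side of the self-join down to one pdb_id first,
-- then join the two small subsets and look up the residues.
SELECT a1.x AS x1, a1.y AS y1, a1.z AS z1,
       a2.x AS x2, a2.y AS y2, a2.z AS z2,
       r1.position AS rpos1, r1.residue AS res1,
       r2.position AS rpos2, r2.residue AS res2
  FROM (SELECT x, y, z, res_id
          FROM atom
         WHERE pdb_id = '1ABC') AS a1
  JOIN (SELECT x, y, z, res_id
          FROM atom
         WHERE pdb_id = '1ABC') AS a2
    ON a1.res_id < a2.res_id
  JOIN residue r1 ON r1.id = a1.res_id
  JOIN residue r2 ON r2.id = a2.res_id;
```

Since the original conditions a1.pdb_id = a2.pdb_id and
a1.pdb_id = '1ABC' together force both sides to '1ABC', this should
return the same rows as the stripped-down query.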

And while I'm at it, does anyone have advice on materials for learning
advanced SQL techniques? Everything I find on the web is basically
beginner stuff and assumes that you have a tiny dataset...


cheers
Cath
Cath Lawrence,                       Cath.Lawrence@anu.edu.au
Senior Scientific Programmer,  Centre for Bioinformation Science,
John Curtin School of Medical Research (room 4088)
Australian National University,  Canberra ACT 0200
ph: (02) 61257959   mobile: 0421-902694   fax: (02) 61252595

