Re: TimeOf(Subselects|Joins)FromLargeTables? - Mailing list pgsql-general
From | Dann Corbit |
---|---|
Subject | Re: TimeOf(Subselects|Joins)FromLargeTables? |
Date | |
Msg-id | D90A5A6C612A39408103E6ECDD77B8299CAB6A@voyager.corporate.connx.com Whole thread Raw |
In response to | TimeOf(Subselects|Joins)FromLargeTables? ("Hegedus, Tamas ." <Hegedus.Tamas@mayo.edu>) |
List | pgsql-general |
How does this query perform: SELECT p.name, p.seq FROM prots p, kwx k WHERE p.fid=k.fid AND k.kw_acc=812 ; > -----Original Message----- > From: Hegedus, Tamas . [mailto:Hegedus.Tamas@mayo.edu] > Sent: Thursday, June 03, 2004 6:48 PM > To: 'pgsql-general@postgresql.org' > Subject: [GENERAL] TimeOf(Subselects|Joins)FromLargeTables? > > > Dear All, > > I am a biologist and I do not know what to expect from an RDB > (PgSQL). I have large tables: 1215607 rows in prots, 2184596 > rows in kwx (see table details below). I would like to do > something like that: > > SELECT name, seq FROM prots WHERE fid in (SELECT fid FROM kwx > WHERE kw_acc=812); > > After executing this (either as a subquery or joins) the > best/fastest result what I had (SET enable_seqscan=off): > 83643.482 ms (see EXPLAIN ANALYZE below). > > The two (similar) parts of this query are executed much > faster: SELECT fid FROM kwx WHERE kw_acc=812 -- takes 302ms, > n(rows)=78050 SELECT name, seq FROM prots WHERE fid < 80000 > -- takes 1969.231 ms > > Is this realistic? OK? > If not: how can I increase the speed by fine tuning of the > RDB (indexes, run-time parameters) or my SQL query? (It came > now into my mind: if I decrease the number of columns in the > prots table (to have only 3 fields (fid, name, seq) instead > of 20 columns), than the prots table will have smaller file > size on disk, than this table may need less disk page > fetches, queries may be faster. Is this true?) > > Thanks for your help! > Tamas > > =============================================== > Table "public.prots" > Column | Type | Modifiers > -----------+----------------------+---------- > fid | integer | not null > name | character varying(10) | not null > [...other 17 columns...] > seq | text | > Indexes: > "prots_pkey" primary key, btree (fid) > "ix_prots_acc" unique, btree (acc) > "ix_prots_name" unique, btree (name) > "ix_prots_class" btree ("class") > =============================================== > Table "public.kwx" > Column | Type | Modifiers > --------+--------+---------- > fid | integer | > kw_acc | integer | > Indexes: > "ix_kwx_acc" btree (kw_acc) > "ix_kwx_fid" btree (fid) > Foreign-key constraints: > "fk_kws_acc" FOREIGN KEY (kw_acc) REFERENCES kw_ref(kw_acc) > "fk_kws_fid" FOREIGN KEY (fid) REFERENCES prots(fid) > =============================================== > > EXPLAIN ANALYZE SELECT name, seq from prots inner join kwx on > (prots.fid=kwx.fid) where kwx.kw_acc = 812; > > QUERY PLAN > > -------------------------------------------------------------- > -------------------------------------------------------------- > ------------------ > Merge Join (cost=0.00..160429.66 rows=84473 width=349) > (actual time=29.039..83505.629 rows=78050 loops=1) > Merge Cond: ("outer".fid = "inner".fid) > -> Index Scan using ix_kwx_fid on kwx > (cost=0.00..44987.55 rows=84473 width=4) (actual > time=18.893..5730.468 rows=78050 loops=1) > Filter: (kw_acc = 812) > -> Index Scan using prots_pkey on prots > (cost=0.00..112005.24 rows=981127 width=353) (actual > time=0.083..76059.235 rows=1210377 loops=1) Total runtime: > 83643.482 ms (6 rows) > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to > majordomo@postgresql.org >
pgsql-general by date: