Re: TimeOf(Subselects|Joins)FromLargeTables? - Mailing list pgsql-general
From | Duane Lee - EGOVX |
---|---|
Subject | Re: TimeOf(Subselects|Joins)FromLargeTables? |
Date | |
Msg-id | 64EDC403A1417B4299488BAE87CA7CBF01CD0E57@maricopa_xcng0 Whole thread Raw |
In response to | TimeOf(Subselects|Joins)FromLargeTables? ("Hegedus, Tamas ." <Hegedus.Tamas@mayo.edu>) |
List | pgsql-general |
Or this correlated subquery:
SELECT a.name, a.seq FROM prots as a
WHERE EXISTS (
SELECT '1'FROM kwx as b
WHERE b.kw_acc=812
and b.fid = a.fid
);
-----Original Message-----
From: Dann Corbit [mailto:DCorbit@connx.com]
Sent: Thursday, June 03, 2004 6:59 PM
To: Hegedus, Tamas .; pgsql-general@postgresql.org
Subject: Re: [GENERAL] TimeOf(Subselects|Joins)FromLargeTables?
How does this query perform:
SELECT p.name, p.seq
FROM prots p, kwx k
WHERE
p.fid=k.fid
AND
k.kw_acc=812
;
> -----Original Message-----
> From: Hegedus, Tamas . [mailto:Hegedus.Tamas@mayo.edu]
> Sent: Thursday, June 03, 2004 6:48 PM
> To: 'pgsql-general@postgresql.org'
> Subject: [GENERAL] TimeOf(Subselects|Joins)FromLargeTables?
>
>
> Dear All,
>
> I am a biologist and I do not know what to expect from an RDB
> (PgSQL). I have large tables: 1215607 rows in prots, 2184596
> rows in kwx (see table details below). I would like to do
> something like that:
>
> SELECT name, seq FROM prots WHERE fid in (SELECT fid FROM kwx
> WHERE kw_acc=812);
>
> After executing this (either as a subquery or joins) the
> best/fastest result what I had (SET enable_seqscan=off):
> 83643.482 ms (see EXPLAIN ANALYZE below).
>
> The two (similar) parts of this query are executed much
> faster: SELECT fid FROM kwx WHERE kw_acc=812 -- takes 302ms,
> n(rows)=78050 SELECT name, seq FROM prots WHERE fid < 80000
> -- takes 1969.231 ms
>
> Is this realistic? OK?
> If not: how can I increase the speed by fine tuning of the
> RDB (indexes, run-time parameters) or my SQL query? (It came
> now into my mind: if I decrease the number of columns in the
> prots table (to have only 3 fields (fid, name, seq) instead
> of 20 columns), than the prots table will have smaller file
> size on disk, than this table may need less disk page
> fetches, queries may be faster. Is this true?)
>
> Thanks for your help!
> Tamas
>
> ===============================================
> Table "public.prots"
> Column | Type | Modifiers
> -----------+----------------------+----------
> fid | integer | not null
> name | character varying(10) | not null
> [...other 17 columns...]
> seq | text |
> Indexes:
> "prots_pkey" primary key, btree (fid)
> "ix_prots_acc" unique, btree (acc)
> "ix_prots_name" unique, btree (name)
> "ix_prots_class" btree ("class")
> ===============================================
> Table "public.kwx"
> Column | Type | Modifiers
> --------+--------+----------
> fid | integer |
> kw_acc | integer |
> Indexes:
> "ix_kwx_acc" btree (kw_acc)
> "ix_kwx_fid" btree (fid)
> Foreign-key constraints:
> "fk_kws_acc" FOREIGN KEY (kw_acc) REFERENCES kw_ref(kw_acc)
> "fk_kws_fid" FOREIGN KEY (fid) REFERENCES prots(fid)
> ===============================================
>
> EXPLAIN ANALYZE SELECT name, seq from prots inner join kwx on
> (prots.fid=kwx.fid) where kwx.kw_acc = 812;
>
> QUERY PLAN
>
> --------------------------------------------------------------
> --------------------------------------------------------------
> ------------------
> Merge Join (cost=0.00..160429.66 rows=84473 width=349)
> (actual time=29.039..83505.629 rows=78050 loops=1)
> Merge Cond: ("outer".fid = "inner".fid)
> -> Index Scan using ix_kwx_fid on kwx
> (cost=0.00..44987.55 rows=84473 width=4) (actual
> time=18.893..5730.468 rows=78050 loops=1)
> Filter: (kw_acc = 812)
> -> Index Scan using prots_pkey on prots
> (cost=0.00..112005.24 rows=981127 width=353) (actual
> time=0.083..76059.235 rows=1210377 loops=1) Total runtime:
> 83643.482 ms (6 rows)
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
pgsql-general by date: