Re: Joining 2 tables with 300 million rows - Mailing list pgsql-performance

From Dmitri Bichko
Subject Re: Joining 2 tables with 300 million rows
Date
Msg-id F18A6F7CF1661F46920F2CF713122FED46CD28@mail.aveo.aveopharma.com
Whole thread Raw
In response to Joining 2 tables with 300 million rows  (Amit V Shah <ashah@tagaudit.com>)
List pgsql-performance
What's the problem?  You are joining two 300 million row tables in 0.15
of a second - seems reasonable.

Dmitri

> -----Original Message-----
> From: pgsql-performance-owner@postgresql.org
> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
> Amit V Shah
> Sent: Thursday, December 08, 2005 11:59 AM
> To: 'pgsql-performance@postgresql.org'
> Subject: [PERFORM] Joining 2 tables with 300 million rows
>
>
> Hi all,
>
> First of all, please pardon if the question is dumb! Is it
> even feasible or normal to do such a thing ! This query is
> needed by a webpage so needs to be lightning fast. Anything
> beyond 2-3 seconds is unacceptable performance.
>
> I have two tables
>
> CREATE TABLE runresult
> (
>   id_runresult int8 NOT NULL,
>   rundefinition_id_rundefinition int4 NOT NULL,
>   measure_id_measure int4 NOT NULL,
>   value float4 NOT NULL,
>   "sequence" varchar(20) NOT NULL,
>   CONSTRAINT pk_runresult_ars PRIMARY KEY (id_runresult),
> )
>
>
> CREATE TABLE runresult_has_catalogtable
> (
>   runresult_id_runresult int8 NOT NULL,
>   catalogtable_id_catalogtable int4 NOT NULL,
>   value int4 NOT NULL,
>   CONSTRAINT pk_runresult_has_catalogtable PRIMARY KEY
> (runresult_id_runresult, catalogtable_id_catalogtable, value)
>   CONSTRAINT fk_temp FOREIGN KEY (runresult_id_runresult) REFERENCES
> runresult(id_runresult) ON UPDATE RESTRICT ON DELETE RESTRICT
> )
>
> Each table has around 300 million records (will grow to
> probably billions). Below is the query and the explain analyze --
>
> explain analyze SELECT measure.description, runresult.value
> FROM ((((rundefinition INNER JOIN runresult ON
> rundefinition.id_rundefinition =
> runresult.rundefinition_id_rundefinition)
> INNER JOIN runresult_has_catalogtable ON runresult.id_runresult =
> runresult_has_catalogtable.runresult_id_runresult)
> INNER JOIN runresult_has_catalogtable AS
> runresult_has_catalogtable_1 ON runresult.id_runresult =
> runresult_has_catalogtable_1.runresult_id_runresult)
> INNER JOIN runresult_has_catalogtable AS
> runresult_has_catalogtable_2 ON runresult.id_runresult =
> runresult_has_catalogtable_2.runresult_id_runresult)
> INNER JOIN measure ON runresult.measure_id_measure =
> measure.id_measure WHERE
> (((runresult_has_catalogtable.catalogtable_id_catalogtable)=52)
> AND ((runresult_has_catalogtable_1.catalogtable_id_catalogtable)=54)
> AND ((runresult_has_catalogtable_2.catalogtable_id_catalogtable)=55)
> AND ((runresult_has_catalogtable.value)=15806)
> AND ((runresult_has_catalogtable_1.value)=1)
> AND ((runresult_has_catalogtable_2.value) In (21,22,23,24))
> AND ((rundefinition.id_rundefinition)=10106));
>
> 'Nested Loop  (cost=0.00..622582.70 rows=1 width=28) (actual
> time=25.221..150.563 rows=22 loops=1)' '  ->  Nested Loop
> (cost=0.00..622422.24 rows=2 width=52) (actual
> time=25.201..150.177 rows=22 loops=1)'
> '        ->  Nested Loop  (cost=0.00..622415.97 rows=2
> width=32) (actual
> time=25.106..149.768 rows=22 loops=1)'
> '              ->  Nested Loop  (cost=0.00..621258.54 rows=15
> width=24)
> (actual time=24.582..149.061 rows=30 loops=1)'
> '                    ->  Index Scan using pk_rundefinition on
> rundefinition
> (cost=0.00..3.86 rows=1 width=4) (actual time=0.125..0.147
> rows=1 loops=1)'
> '                          Index Cond: (id_rundefinition = 10106)'
> '                    ->  Nested Loop  (cost=0.00..621254.54 rows=15
> width=28) (actual time=24.443..148.784 rows=30 loops=1)'
> '                          ->  Index Scan using
> runresult_has_catalogtable_value on
> runresult_has_catalogtable (cost=0.00..575069.35 rows=14437
> width=8) (actual time=0.791..33.036 rows=10402 loops=1)'
> '                                Index Cond: (value = 15806)'
> '                                Filter:
> (catalogtable_id_catalogtable =
> 52)'
> '                          ->  Index Scan using pk_runresult_ars on
> runresult  (cost=0.00..3.19 rows=1 width=20) (actual
> time=0.007..0.007 rows=0 loops=10402)'
> '                                Index Cond: (runresult.id_runresult =
> "outer".runresult_id_runresult)'
> '                                Filter: (10106 =
> rundefinition_id_rundefinition)'
> '              ->  Index Scan using
> runresult_has_catalogtable_id_runresult
> on runresult_has_catalogtable runresult_has_catalogtable_1
> (cost=0.00..76.65 rows=41 width=8) (actual time=0.015..0.017
> rows=1 loops=30)'
> '                    Index Cond:
> (runresult_has_catalogtable_1.runresult_id_runresult =
> "outer".runresult_id_runresult)'
> '                    Filter: ((catalogtable_id_catalogtable =
> 54) AND (value
> = 1))'
> '        ->  Index Scan using pk_measure on measure
> (cost=0.00..3.12 rows=1
> width=28) (actual time=0.008..0.010 rows=1 loops=22)'
> '              Index Cond: ("outer".measure_id_measure =
> measure.id_measure)'
> '  ->  Index Scan using
> runresult_has_catalogtable_id_runresult on
> runresult_has_catalogtable runresult_has_catalogtable_2
> (cost=0.00..79.42 rows=65 width=8) (actual time=0.007..0.010
> rows=1 loops=22)'
> '        Index Cond:
> (runresult_has_catalogtable_2.runresult_id_runresult =
> "outer".runresult_id_runresult)'
> '        Filter: ((catalogtable_id_catalogtable = 55) AND
> ((value = 21) OR
> (value = 22) OR (value = 23) OR (value = 24)))'
> 'Total runtime: 150.863 ms'
>
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
The information transmitted is intended only for the person or entity to which it is addressed and may contain
confidentialand/or privileged material. Any review, retransmission, dissemination or other use of, or taking of any
actionin reliance upon, this information by persons or entities other than the intended recipient is prohibited. If you
receivedthis in error, please contact the sender and delete the material from any computer 

pgsql-performance by date:

Previous
From: "Jeffrey W. Baker"
Date:
Subject: Re: opinion on disk speed
Next
From: Amit V Shah
Date:
Subject: Re: Joining 2 tables with 300 million rows