Improving join performance over multiple moderately wide tables - Mailing list pgsql-performance
From | miller_2555 |
---|---|
Subject | Improving join performance over multiple moderately wide tables |
Date | |
Msg-id | 25932408.post@talk.nabble.com Whole thread Raw |
Responses |
Re: Improving join performance over multiple moderately
wide tables
|
List | pgsql-performance |
Hi - I'm stuck on a query performance issue, and I would sincerely appreciate any server setting/ query improvement suggestions. I am attempting to retrieve records stored in 20 tables that are related to a one or more records in a single table (let's call the 20 tables `data tables` and the single table the `source table`). Each data table is partitioned and inherits directly from a master table (each partition holds about 200,000 records and there are 60 partitions per master table -- the query is expected to return ~20 records/partition ). The source table is not partitioned. Each data table partition has a primary key that is identical to a column of the source table (each data table partition has a foreign key constraint on the source table's primary key -- Note that the foreign key constraint is implicit in certain cases -- i.e. data_table_01 has a foreign key on data_table_02 that, in turn, has a foreign key on the source_table). When I attempt to produce a result set from the data tables using a subquery of the source table, the query takes ~60-200sec to run for a single row returned from the source table (time increases exponentially with additional subquery results). I perform a VACUUM ANALYZE nightly, attempted to set the `join_collapse_limit` to '1' and increased the statistics collection on the data table partition primary keys to 1000. I am running Postgres 8.3 on Fedora 10 using an Intel chipset (four processor cores) and 6Gb of RAM -- there are a maximum of 10 concurrent connections on the database at any given time. I've attached some descriptive information below: The fastest query I have been able to manage is the following (abridged): SELECT * FROM ( SELECT "<column>" FROM <source_table> WHERE "<some_column>" = '<somecondition>' ) AS source_table LEFT JOIN <data_table_01> USING ("<primary_key>") LEFT JOIN <data_table_02> USING ("<primary_key>") ... LEFT JOIN <data_table_19> USING ("<primary_key>") LEFT JOIN <data_table_20> USING ("<primary_key>") ; The following are the non-default parameters in postgres.conf: max_locks_per_transaction = 2056 shared_buffers = 128MB max_fsm_pages = 204800 max_fsm_relations = 3000 constraint_exclusion = on I've attached an abridged query plan below (abridged for the index scan results for each of the partition tables -- the total time to scan the partition tables is embedded in the "append" actual cost time, which is generally <5ms excluding the subquery). To summarize, it appears that each data_table identifies the relevant rows very quickly (total of all `appends` are ~1.2 seconds , but each join is slow (2-5 seconds for *each* join -- making 20*[2-5 sec]= [40-100 sec] for the joins). Note that `join_collapse_limit` is set to 1 on this session (also note the estimated costs by the query plan appear a bit large). ---------------------------------------- Nested Loop Left Join (cost=0.00..3855698559605789324702789208529861500799141255212203379626060432086506635309749052112896.00 rows=311931683242245219905715698384821128847649568166127853072768749671189521273717781287665664 width=8884) (actual time=804.729..54752.177 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_20>."<primary_key>") -> Nested Loop Left Join (cost=0.00..107594618658011016223736297174365328662423744442156697061527790311709493896861974528.00 rows=8704464966056878513204190237608265494011078822923959805624806664887393418376971812864 width=8532) (actual time=785.966..52705.399 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_19>."<primary_key>") -> Nested Loop Left Join (cost=0.00..3002477536026971680044620678533152337542410774574698677807503124668875305648128.00 rows=242900749928754345845996721555781173233674623132666614323075798898771156715700224 width=8180) (actual time=739.681..50493.225 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_18>."<primary_key>") -> Nested Loop Left Join (cost=0.00..83787067396008347529564745762189126147117204941577957444468815071220334592.00 rows=6778239377396060082196394544307189661486516189900352952164276255864001658880 width=7828) (actual time=725.434..48401.074 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_17>."<primary_key>") -> Nested Loop Left Join (cost=0.00..2338137489489500761700457981791537754699538979985950749900036728946688.00 rows=189154115739814280777160516793297638614094710485086175961401125362466816 width=7476) (actual time=714.956..46243.094 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_16>."<primary_key>") -> Nested Loop Left Join (cost=0.00..65247852053209794932032080731091779649041507376730152396969213952.00 rows=5278474735096636176089465265387923568528412860735200833430607626240 width=7124) (actual time=693.772..43828.871 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_15>."<primary_key>") -> Nested Loop Left Join (cost=0.00..1820849620634231609086388526672297318981782459980059921874944.00 rows=147300242643928143660653163557330700833114104062501442583265280 width=6772) (actual time=683.211..41744.448 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_14>."<primary_key>") -> Nested Loop Left Join (cost=0.00..50814256409136028027737552880521987178900164286101848064.00 rows=4110679017025470954778089725832553293993488240233704062976 width=6388) (actual time=666.928..39803.912 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_13>."<primary_key>") -> Nested Loop Left Join (cost=0.00..1418011137688622742268484288103546481505440632930304.00 rows=114716154195056436845646872605995872908472110832156672 width=6004) (actual time=646.420..37732.637 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_12>."<primary_key>") -> Nested Loop Left Join (cost=0.00..39574524842926171903719303887283825536044892160.00 rows=3201237512747515109467771442088115336887389913088 width=5620) (actual time=617.556..32691.208 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_11>."<primary_key>") -> Nested Loop Left Join (cost=0.00..1104461658892753285177242217114991400583168.00 rows=89340995439021908020325536097950548241678336 width=5180) (actual time=605.788..30405.082 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_10>."<primary_key>") -> Nested Loop Left Join (cost=0.00..30823820972139181588887753190839156736.00 rows=2493369520530492561237378929345605664768 width=4740) (actual time=590.691..28256.296 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_09>."<primary_key>") -> Nested Loop Left Join (cost=0.00..860253694217667941856909279100928.00 rows=69586043117709763748119584889110528 width=4300) (actual time=562.560..23995.033 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_08>."<primary_key>") -> Nested Loop Left Join (cost=0.00..24008376616267968521992929280.00 rows=1942051706828399753701512708096 width=3860) (actual time=522.649..21813.935 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_07>."<primary_key>") -> Nested Loop Left Join (cost=0.00..670029782636971643895808.00 rows=54199746334871823453257728 width=3420) (actual time=499.865..18377.510 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_06>."<primary_key>") -> Nested Loop Left Join (cost=0.00..18700458482955792384.00 rows=1512614519963725594624 width=2980) (actual time=469.866..14939.783 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_05>."<primary_key>") -> Nested Loop Left Join (cost=0.00..521929130617911.38 rows=42217091058335856 width=2500) (actual time=457.911..12760.616 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_04>."<primary_key>") -> Nested Loop Left Join (cost=0.00..14566671439.99 rows=1178275124662 width=2020) (actual time=446.057..10476.572 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_03>."<primary_key>") -> Nested Loop Left Join (cost=0.00..407379.92 rows=32884816 width=1540) (actual time=412.273..5911.776 rows=953 loops=1) Join Filter: (<data_table_01>."<primary_key>" = <data_table_02>."<primary_key>") -> Nested Loop Left Join (cost=0.00..522.73 rows=918 width=244) (actual time=380.108..1176.906 rows=953 loops=1) Join Filter: (<source_table>."<column>" = <data_table_01>."<primary_key>") -> Index Scan using "<source_table_column_index>" on <source_table> (cost=0.00..8.27 rows=1 width=40) (actual time=0.104..0.106 rows=1 loops=1) Index Cond: ("<some_column>" = <somecondition>::bigint) -> Append (cost=0.00..513.69 rows=62 width=212) (actual time=379.923..1174.640 rows=953 loops=1) -> Append (cost=0.00..442.42 rows=62 width=1304) (actual time=4.720..4.917 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=488) (actual time=4.590..4.738 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=488) (actual time=2.181..2.341 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=488) (actual time=2.077..2.232 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=448) (actual time=3.380..3.550 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=448) (actual time=3.399..3.547 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=448) (actual time=2.060..2.230 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=448) (actual time=4.254..4.410 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=448) (actual time=2.025..2.193 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=448) (actual time=2.131..2.336 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=392) (actual time=5.043..5.228 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=392) (actual time=1.941..2.110 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=392) (actual time=1.788..1.973 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=360) (actual time=1.933..2.122 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=360) (actual time=2.318..2.469 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=360) (actual time=2.055..2.199 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=360) (actual time=.984..2.129 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=360) (actual time=2.14..2.256 rows=1 loops=953) -> Append (cost=0.00..442.17 rows=62 width=360) (actual time=1.91..2.081 rows=1 loops=953) Total runtime: 54819.860 ms Thanks in advance - Will -- View this message in context: http://www.nabble.com/Improving-join-performance-over-multiple-moderately-wide-tables-tp25932408p25932408.html Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
pgsql-performance by date: