Re: how to efficiently update tuple in many-to-many relationship? - Mailing list pgsql-performance
From | Drew Wilson |
---|---|
Subject | Re: how to efficiently update tuple in many-to-many relationship? |
Date | |
Msg-id | 22512E5C-A60F-43E7-A370-257F7AF956FD@gmail.com Whole thread Raw |
In response to | Re: how to efficiently update tuple in many-to-many relationship? (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: how to efficiently update tuple in many-to-many relationship?
|
List | pgsql-performance |
My apologies. That function call was some test code to verify that my subselect was only being called once. Let me try again, please. Here's the query plan for a SELECT statement that returns 1,207,161 rows in 6 seconds. MatchBox=# explain select count(translation_pair_id) from translation_pair_data join instance i using(translation_pair_id) join loc_submission ls using(loc_submission_id) where ls.is_public = true; QUERY PLAN ------------------------------------------------------------------------ ----------------------------------- Aggregate (cost=299276.72..299276.73 rows=1 width=4) -> Hash Join (cost=59962.72..294036.83 rows=2095954 width=4) Hash Cond: (i.translation_pair_id = translation_pair_data.translation_pair_id) -> Hash Join (cost=369.15..177405.01 rows=2095954 width=4) Hash Cond: (i.loc_submission_id = ls.loc_submission_id) -> Seq Scan on instance i (cost=0.00..99016.16 rows=5706016 width=8) -> Hash (cost=296.92..296.92 rows=5778 width=4) -> Index Scan using loc_submission_is_public on loc_submission ls (cost=0.00..296.92 rows=5778 width=4) Index Cond: (is_public = true) Filter: is_public -> Hash (cost=31861.92..31861.92 rows=1690292 width=4) -> Seq Scan on translation_pair_data (cost=0.00..31861.92 rows=1690292 width=4) And here's the query plan for the UPDATE query that seems to never complete. (Execution time > 30 minutes.) MatchBox=# explain update translation_pair_data set is_public = true where translation_pair_id in (select translation_pair_id from translation_pair_data join instance i using(translation_pair_id) join loc_submission ls using(loc_submission_id) where ls.is_public = true); QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------- Hash IN Join (cost=328000.49..453415.65 rows=1690282 width=90) Hash Cond: (public.translation_pair_data.translation_pair_id = public.translation_pair_data.translation_pair_id) -> Seq Scan on translation_pair_data (cost=0.00..31861.82 rows=1690282 width=90) -> Hash (cost=293067.74..293067.74 rows=2067660 width=8) -> Hash Join (cost=59958.35..293067.74 rows=2067660 width=8) Hash Cond: (i.translation_pair_id = public.translation_pair_data.translation_pair_id) -> Hash Join (cost=365.00..177117.92 rows=2067660 width=4) Hash Cond: (i.loc_submission_id = ls.loc_submission_id) -> Seq Scan on instance i (cost=0.00..99016.16 rows=5706016 width=8) -> Hash (cost=293.75..293.75 rows=5700 width=4) -> Index Scan using loc_submission_is_public on loc_submission ls (cost=0.00..293.75 rows=5700 width=4) Index Cond: (is_public = true) Filter: is_public -> Hash (cost=31861.82..31861.82 rows=1690282 width=4) -> Seq Scan on translation_pair_data (cost=0.00..31861.82 rows=1690282 width=4) I figure I must be doing something wrong here. Thanks for the help, Drew On Apr 9, 2007, at 2:43 PM, Tom Lane wrote: > Drew Wilson <drewmwilson@gmail.com> writes: >> I have 2 tables (A,B) joined in a many-to-many relationship via a >> join table ("membership"), where updating table A based on table B >> takes a very long time. >> ... >> -> Function Scan on a (cost=0.00..12.50 rows=1000 width=4) > > I think you've left out some relevant details ... there's nothing > in what you said about a set-returning function ... > > regards, tom lane
pgsql-performance by date: