Re: how to efficiently update tuple in many-to-many relationship? - Mailing list pgsql-performance

From Drew Wilson
Subject Re: how to efficiently update tuple in many-to-many relationship?
Date
Msg-id 22512E5C-A60F-43E7-A370-257F7AF956FD@gmail.com
Whole thread Raw
In response to Re: how to efficiently update tuple in many-to-many relationship?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: how to efficiently update tuple in many-to-many relationship?
List pgsql-performance
My apologies. That function call was some test code to verify that my
subselect was only being called once.

Let me try again, please.

Here's the query plan for a SELECT statement that returns 1,207,161
rows in 6 seconds.
MatchBox=# explain select count(translation_pair_id) from
translation_pair_data
         join instance i using(translation_pair_id)
         join loc_submission ls using(loc_submission_id)
         where ls.is_public = true;
                                                           QUERY PLAN
------------------------------------------------------------------------
-----------------------------------
Aggregate  (cost=299276.72..299276.73 rows=1 width=4)
    ->  Hash Join  (cost=59962.72..294036.83 rows=2095954 width=4)
          Hash Cond: (i.translation_pair_id =
translation_pair_data.translation_pair_id)
          ->  Hash Join  (cost=369.15..177405.01 rows=2095954 width=4)
                Hash Cond: (i.loc_submission_id = ls.loc_submission_id)
                ->  Seq Scan on instance i  (cost=0.00..99016.16
rows=5706016 width=8)
                ->  Hash  (cost=296.92..296.92 rows=5778 width=4)
                      ->  Index Scan using loc_submission_is_public
on loc_submission ls  (cost=0.00..296.92 rows=5778 width=4)
                            Index Cond: (is_public = true)
                            Filter: is_public
          ->  Hash  (cost=31861.92..31861.92 rows=1690292 width=4)
                ->  Seq Scan on translation_pair_data
(cost=0.00..31861.92 rows=1690292 width=4)


And here's the query plan for the UPDATE query that seems to never
complete. (Execution time > 30 minutes.)
MatchBox=# explain update translation_pair_data set is_public = true
         where translation_pair_id in
             (select translation_pair_id from translation_pair_data
                  join instance i using(translation_pair_id)
                  join loc_submission ls using(loc_submission_id)
                  where ls.is_public = true);
                                                              QUERY PLAN
------------------------------------------------------------------------
-------------------------------------------------------------
Hash IN Join  (cost=328000.49..453415.65 rows=1690282 width=90)
    Hash Cond: (public.translation_pair_data.translation_pair_id =
public.translation_pair_data.translation_pair_id)
    ->  Seq Scan on translation_pair_data  (cost=0.00..31861.82
rows=1690282 width=90)
    ->  Hash  (cost=293067.74..293067.74 rows=2067660 width=8)
          ->  Hash Join  (cost=59958.35..293067.74 rows=2067660 width=8)
                Hash Cond: (i.translation_pair_id =
public.translation_pair_data.translation_pair_id)
                ->  Hash Join  (cost=365.00..177117.92 rows=2067660
width=4)
                      Hash Cond: (i.loc_submission_id =
ls.loc_submission_id)
                      ->  Seq Scan on instance i
(cost=0.00..99016.16 rows=5706016 width=8)
                      ->  Hash  (cost=293.75..293.75 rows=5700 width=4)
                            ->  Index Scan using
loc_submission_is_public on loc_submission ls  (cost=0.00..293.75
rows=5700 width=4)
                                  Index Cond: (is_public = true)
                                  Filter: is_public
                ->  Hash  (cost=31861.82..31861.82 rows=1690282 width=4)
                      ->  Seq Scan on translation_pair_data
(cost=0.00..31861.82 rows=1690282 width=4)


I figure I must be doing something wrong here. Thanks for the help,

Drew

On Apr 9, 2007, at 2:43 PM, Tom Lane wrote:

> Drew Wilson <drewmwilson@gmail.com> writes:
>> I have 2 tables (A,B) joined in a many-to-many relationship via a
>> join table ("membership"), where updating table A based on table B
>> takes a very long time.
>> ...
>>     ->  Function Scan on a  (cost=0.00..12.50 rows=1000 width=4)
>
> I think you've left out some relevant details ... there's nothing
> in what you said about a set-returning function ...
>
>             regards, tom lane


pgsql-performance by date:

Previous
From: "s d"
Date:
Subject: Re: Beginner Question
Next
From: Tom Lane
Date:
Subject: Re: how to efficiently update tuple in many-to-many relationship?