Update with Subquery Performance - Mailing list pgsql-performance

From Linux Guru
Subject Update with Subquery Performance
Date
Msg-id 3caa866c0802110306rcb5e72dyd264ff330c61f8ae@mail.gmail.com
Whole thread Raw
Responses Re: Update with Subquery Performance
Re: Update with Subquery Performance
List pgsql-performance
We have a large datawarehouse stored in postgres and temp tables are created based on user query. The process of temp table creation involves selecting data from main fact table, this includes several select and update statements and one of the following update statement is having performance issues.

The newly temp table created for this scenario contains 22712 rows. Here is the query

alter table dummy add column gp numeric(40,15);
update dummy set gp=(select (
case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end  )   from dummy as temp
where temp.product=dummy.product)

Now this query basically updates a table using values within itself in the subquery but it takes tooooo much time i.e. approx 5 mins. The whole temp table creation process is stucked in this query (there are 4 additional such updates with same problem). Index creation is useless here since its only a one time process.

Here is the strip down version (the part making performance issue) of above query i.e. only select statement
-------------------------------
select (case when sum(temp.pd) <> 0 then sum(temp.gd)/sum(temp.pd)*100 else 0 end  )   from dummy as temp, dummy as temp2
where temp.product=temp2.product group by temp.product

"HashAggregate  (cost=1652480.98..1652481.96 rows=39 width=39)"
"  ->  Hash Join  (cost=1636.07..939023.13 rows=71345785 width=39)"
"        Hash Cond: (("temp".product)::text = (temp2.product)::text)"
"        ->  Seq Scan on dummy "temp"  (cost=0.00..1311.03 rows=26003 width=39)"
"        ->  Hash  (cost=1311.03..1311.03 rows=26003 width=21)"
"              ->  Seq Scan on dummy temp2  (cost=0.00..1311.03 rows=26003 width=21)"
-------------------------------


Whats the solution of this problem, or any alternate way to write this query?


pgsql-performance by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Optimizer : query rewrite and execution plan ?
Next
From: Tom Lane
Date:
Subject: Re: Optimizer : query rewrite and execution plan ?