Thread: Slow update

Slow update

From

Abu Mushayeed

Date:

31 January 2007, 23:57:43

Following is one of the update query and it's explain plan which takes about 6 mins to execute. I am trying to find a way to execute it faster. The functions used in the update statement are if then else test and then return one value or the other.
=====================================================
update mdc_upc
set total_curoh = ownedgoods(k.gmmid,k.divid, loc1_oh, loc2_oh,loc3_oh,loc120_oh, loc15_oh,chesh_curoh),
      total_curoo =orderedgoods(k.gmmid,k.divid, loc1_oo, loc2_oo,loc3_oo,loc120_oo, loc15_oo,chesh_curoo),
      total_oh = ownedgoods(k.gmmid,k.divid, 0, 0,loc3_oh,loc120_oh, loc15_oh,chesh_oh),
      total_oo =orderedgoods(k.gmmid,k.divid, 0, 0,loc3_oo,loc120_oo, loc15_oo,chesh_oo)
from mdc_products p LEFT OUTER JOIN
kst k on p.dvm_d = k.dept
where p.keyp_products = mdc_upc.keyf_products;
---------------------------------------------------------------------------
Hash Join (cost=48602.07..137331.77 rows=695899 width=391)
Hash Cond: ("outer".keyf_products = "inner".keyp_products)
-> Seq Scan on mdc_upc (cost=0.00..59153.99 rows=695899 width=383)
-> Hash (cost=47274.60..47274.60 rows=530990 width=12)
        -> Hash Left Join (cost=43.85..47274.60 rows=530990 width=12)
              Hash Cond: ("outer".dvm_d = "inner".dept)
              -> Seq Scan on mdc_products p (cost=0.00..39265.90 rows=530990 width=8)
              -> Hash (cost=41.48..41.48 rows=948 width=12)
                    -> Seq Scan on kst k (cost=0.00..41.48 rows=948 width=12)

======================================================
I have seen that the updates are very slow on our system. What parameter should I test in order to find out why is it slow during update.

Thanks
Abu

It's here! Your new message!
Get new email alerts with the free Yahoo! Toolbar.

Re: Slow update

From

Tomas Vondra

Date:

01 February 2007, 00:32:52

> Following is one of the update query and it's explain plan which takes
> about 6 mins to execute. I am trying to find a way to execute it faster.
> The functions used in the update statement are if then else test and
> then return one value or the other.
> =====================================================
> update mdc_upc
>   set total_curoh = ownedgoods(k.gmmid,k.divid, loc1_oh,
> loc2_oh,loc3_oh,loc120_oh, loc15_oh,chesh_curoh),
>       total_curoo =orderedgoods(k.gmmid,k.divid, loc1_oo,
> loc2_oo,loc3_oo,loc120_oo, loc15_oo,chesh_curoo),
>       total_oh = ownedgoods(k.gmmid,k.divid, 0, 0,loc3_oh,loc120_oh,
> loc15_oh,chesh_oh),
>       total_oo =orderedgoods(k.gmmid,k.divid, 0, 0,loc3_oo,loc120_oo,
> loc15_oo,chesh_oo)
> from mdc_products p LEFT OUTER JOIN
> kst k on p.dvm_d = k.dept
> where  p.keyp_products = mdc_upc.keyf_products;
>
---------------------------------------------------------------------------
> Hash Join  (cost=48602.07..137331.77 rows=695899 width=391)
>   Hash Cond: ("outer".keyf_products = "inner".keyp_products)
>   ->  Seq Scan on mdc_upc  (cost=0.00..59153.99 rows=695899 width=383)
>   ->  Hash  (cost=47274.60..47274.60 rows=530990 width=12)
>         ->  Hash Left Join  (cost=43.85..47274.60 rows=530990 width=12)
>               Hash Cond: ("outer".dvm_d = "inner".dept)
>               ->  Seq Scan on mdc_products p  (cost=0.00..39265.90
> rows=530990 width=8)
>               ->  Hash  (cost=41.48..41.48 rows=948 width=12)
>                     ->  Seq Scan on kst k  (cost=0.00..41.48 rows=948
> width=12)
>
> ======================================================
> I have seen that the updates are very slow on our system. What parameter
> should I test in order to find out why is it slow during update.
>
> Thanks
> Abu

Obviously the update is slow because of sequential scans of the table,
with about 1GB od data to read - this may seem as 'not too much of data'
but there may be a lot of seeks, knocking the performance down. You can
use iowait / dstat to see how much time is spent waiting for the data.

I have to admin I don't fully uderstand that query as I've never used
the UPDATE ... FROM query, but it seems to me you could set up some
indexes to speed things up. I'd definitely start with indexes on the
columns used in join conditions, namely

CREATE INDEX mdc_products_keyp_idx ON mdc_products(keyp_products);
CREATE INDEX mdc_upc_keyf_idx ON mdc_upc(keyf_products);
CREATE INDEX mdc_products_dvm_idx ON mdc_products(dvm_d);
CREATE INDEX kst_dept_idx ON kst(dept);

but this is just a guess as I really know NOTHING about those tables
(structure, size, statistical features, etc.) Btw. don't forget to
analyze the tables.

Another 'uncertainty' is related to the functions used in your query,
namely ownedgoods() / orderedgoods(). If these procedures do something
nontrivial (searches in tables, etc.) it might have severe impact on the
query - but the parser / optimizer knows nothing about these procedures
so it can't optimize them.

If this won't help, we'll need some more information - for example what
does the 'our system' mean - how much memory doest it have? What are the
important settings in your postgresql.conf? Especially what is the value
of effective_cache_size, work_mem, and some others (for example number
of checkpoint segments as it seems like a write-intensive query).

And last but not least info of the structure and size of the tables
(columns, indexes, number of rows, number of occupied blocks, etc).

Tomas