Thread: update slow
Hello list, I have 7.3.4 on RH 8, server hardware is a dual processor Intel Xeon 2.4 Ghz, 2G RAM, 27Gb HD available on postgres partition. Table cost2est2003 with 99350 recs is described as: code char(4) -- index tcos numeric(12,4) mes char(6) -- index Table estprod with 355513 recs is described as: pk_estprod -- index epr_periodo char(6) -- Index epr_venta numeric(12,4) epr_costo numeric(12,4) pro_code char(4) This is the update command: update estprod set epr_costo=(select tcos from cost2est2003 where code=pro_code and mes=epr_periodo) where epr_periodo >='200301' The above filters the records to affect only 99157 rows, the update takes hours and don't get done (I did cancel it), I changed the filter to "epr_periodo ='200301'" to update only 9756 rows but still has more than 10 minutes working. I wonder if there is something I'm doing wrong, any help will be appreciated. BTW the server is not in production is not doing anything else. Thanks in advance -- Josué Maldonado.
On Tuesday 11 November 2003 19:46, Josué Maldonado wrote: > This is the update command: > update estprod set epr_costo=(select tcos from cost2est2003 where > code=pro_code and mes=epr_periodo) where epr_periodo >='200301' > > The above filters the records to affect only 99157 rows, the update > takes hours and don't get done (I did cancel it), I changed the filter > to "epr_periodo ='200301'" to update only 9756 rows but still has more > than 10 minutes working. Can you post EXPLAIN ANALYSE <query> on this shorter one - that will show what PG is doing. One thing that might be worth looking at is using PG's non-standard UPDATE t1 SET c1=... FROM t2 ... -- Richard Huxton Archonet Ltd
Hi Richard, Here are both ways: dbmund=# EXPLAIN ANALYSE update estprod set dbmund-# epr_costo=(select tcos from cost2est2003 where code=pro_code and mes=epr_periodo) dbmund-# where epr_periodo='200311'; Seq Scan on estprod (cost=0.00..9177.91 rows=8080 width=163) (actual time=440.99..627139.23 rows=6867 loops=1) Filter: (epr_periodo = '200311'::bpchar) SubPlan -> Index Scan using c2emes on cost2est2003 (cost=0.00..1532.53 rows=2 width=16) (actual time=66.60..91.19 rows=1 loops=6867) Index Cond: (mes = $1) Filter: (code = $0) Total runtime: 628557.56 msec (7 rows) dbmund=# explain analyze update estprod set dbmund-# epr_costo= tcos dbmund-# from cost2est2003 dbmund-# where code=pro_code and mes=epr_periodo dbmund-# and epr_periodo='200311'; Merge Join (cost=10080.76..15930.98 rows=316 width=197) (actual time=1191.89..4704.49 rows=6851 loops=1) Merge Cond: ("outer".code = "inner".pro_code) Join Filter: ("outer".mes = "inner".epr_periodo) -> Index Scan using c2ecode on cost2est2003 (cost=0.00..4614.85 rows=99350 width=34) (actual time=0.16..2974.96 rows=99350 loops=1) -> Sort (cost=10080.76..10100.96 rows=8080 width=163) (actual time=1191.62..1235.32 rows=55216 loops=1) Sort Key: estprod.pro_code -> Seq Scan on estprod (cost=0.00..9177.91 rows=8080 width=163) (actual time=396.88..1126.28 rows=6867 loops=1) Filter: (epr_periodo = '200311'::bpchar) Total runtime: 5990.34 msec (9 rows) Wow, update from is pretty faster than the first update, can't understand why. Thanks Richard Huxton wrote: > On Tuesday 11 November 2003 19:46, Josué Maldonado wrote: > >>This is the update command: >>update estprod set epr_costo=(select tcos from cost2est2003 where >>code=pro_code and mes=epr_periodo) where epr_periodo >='200301' >> >>The above filters the records to affect only 99157 rows, the update >>takes hours and don't get done (I did cancel it), I changed the filter >>to "epr_periodo ='200301'" to update only 9756 rows but still has more >>than 10 minutes working. > > > Can you post EXPLAIN ANALYSE <query> on this shorter one - that will show what > PG is doing. > > One thing that might be worth looking at is using PG's non-standard > UPDATE t1 SET c1=... FROM t2 ... > -- Josué Maldonado.
On Tuesday 11 November 2003 20:55, Josué Maldonado wrote: > Hi Richard, > > Here are both ways: Excellent - OK, here's a crash course in understanding the output of EXPLAIN ANALYSE... > dbmund=# EXPLAIN ANALYSE update estprod set > dbmund-# epr_costo=(select tcos from cost2est2003 where code=pro_code > and mes=epr_periodo) > dbmund-# where epr_periodo='200311'; > > Seq Scan on estprod (cost=0.00..9177.91 rows=8080 width=163) (actual > time=440.99..627139.23 rows=6867 loops=1) PG predicted 8080 rows to update (each of 163 bytes). It was actually 6867 rows with a total time of 627139.23 milliseconds (over 10 minutes). > Filter: (epr_periodo = '200311'::bpchar) > SubPlan > -> Index Scan using c2emes on cost2est2003 (cost=0.00..1532.53 > rows=2 width=16) (actual time=66.60..91.19 rows=1 loops=6867) Each sub-select took 91.19ms but looping through 6867 calls, that's 626 seconds - almost all of your query time. > Index Cond: (mes = $1) > Filter: (code = $0) > Total runtime: 628557.56 msec The only thing that might speed it up is to build an index on (mes,code) for cost2est2003, and even then I wouldn't expect wonders. > dbmund=# explain analyze update estprod set > dbmund-# epr_costo= tcos > dbmund-# from cost2est2003 > dbmund-# where code=pro_code and mes=epr_periodo > dbmund-# and epr_periodo='200311'; > > Merge Join (cost=10080.76..15930.98 rows=316 width=197) (actual > time=1191.89..4704.49 rows=6851 loops=1) > Merge Cond: ("outer".code = "inner".pro_code) > Join Filter: ("outer".mes = "inner".epr_periodo) > -> Index Scan using c2ecode on cost2est2003 (cost=0.00..4614.85 > rows=99350 width=34) (actual time=0.16..2974.96 rows=99350 loops=1) > -> Sort (cost=10080.76..10100.96 rows=8080 width=163) (actual > time=1191.62..1235.32 rows=55216 loops=1) > Sort Key: estprod.pro_code > -> Seq Scan on estprod (cost=0.00..9177.91 rows=8080 > width=163) (actual time=396.88..1126.28 rows=6867 loops=1) > Filter: (epr_periodo = '200311'::bpchar) > Total runtime: 5990.34 msec > (9 rows) > > Wow, update from is pretty faster than the first update, can't > understand why. In the second example, PG gathers the target rows in estprod sorts them and then joins using your index on c2ecode. This is clearly a better plan. Note that since the rows from the previous test could still be in memory, you'd expect an improvement anyway, but this plan just looks better. PG obviously can't figure out that it can convert the first example into the second, but I'm guessing there are good reasons for that. Although your example is simple enough, there might be cases where it isn't safe to do so. Presumably this is one reason why the non-standard UPDATE...FROM statement is there. Try the second form with your real query, VACUUM ANALYSE both tables and perhaps try an index on (mes,code) and see what that does for you. If you get any more performance issues, there is a mailing list specifically for them - plenty of knowledgable types there. -- Richard Huxton Archonet Ltd