Re: Slow update statement - Mailing list pgsql-performance
From | Patrick Hatcher |
---|---|
Subject | Re: Slow update statement |
Date | |
Msg-id | 42F6BCD3.4070600@comcast.net Whole thread Raw |
In response to | Re: Slow update statement (John A Meinel <john@arbash-meinel.com>) |
Responses |
Re: Slow update statement
|
List | pgsql-performance |
Sorry went out of town for the weekend. The update did occur, but I have no idea when it finished. Here's the actual query and the explain Update: cdm.bcp_ddw_ck_cus = 12.7 M cdm.cdm_ddw_customer = 12.8M explain update cdm.cdm_ddw_customer set indiv_fkey = b.indiv_fkey from cdm.bcp_ddw_ck_cus b where cdm.cdm_ddw_customer.cus_nbr = b.cus_num; Hash Join (cost=1246688.42..4127248.31 rows=12702676 width=200) Hash Cond: ("outer".cus_num = "inner".cus_nbr) -> Seq Scan on bcp_ddw_ck_cus b (cost=0.00..195690.76 rows=12702676 width=16) -> Hash (cost=874854.34..874854.34 rows=12880834 width=192) -> Seq Scan on cdm_ddw_customer (cost=0.00..874854.34 rows=12880834 width=192) John A Meinel wrote: >Patrick Hatcher wrote: > > >>[Reposted from General section with updated information] >>Pg 7.4.5 >> >>I'm running an update statement on about 12 million records using the >>following query: >> >>Update table_A >>set F1 = b.new_data >>from table_B b >>where b.keyfield = table_A.keyfield >> >>both keyfields are indexed, all other keys in table_A were dropped, yet >>this job has been running over 15 hours. Is >>this normal? >> >> > >Can you do an EXPLAIN UPDATE so that we can have an idea what the >planner is trying to do? > >My personal concern is if it doing something like pulling in all rows >from b, and then one by one updating table_A, but as it is going, it >can't retire any dead rows, because you are still in a transaction. So >you are getting a lot of old rows, which it has to pull in to realize it >was old. > >How many rows are in table_B? > >I can see that possibly doing it in smaller chunks might be faster, as >would inserting into another table. But I would do more of a test and >see what happens. > >John >=:-> > > > >>I stopped the process the first time after 3 hours of running due to >>excessive log rotation and reset the conf file to these settings: >> >> >>wal_buffers = 64 # min 4, 8KB each >> >># - Checkpoints - >> >>checkpoint_segments = 128 # in logfile segments, min 1, 16MB each >>checkpoint_timeout = 1800 # range 30-3600, in seconds >>#checkpoint_warning = 30 # 0 is off, in seconds >>#commit_delay = 0 # range 0-100000, in microseconds >>#commit_siblings = 5 # range 1-1000 >> >> >>Would it just be quicker to run a JOIN statement to a temp file and then >>reinsert? >>TIA Patrick >> >> >>---------------------------(end of broadcast)--------------------------- >>TIP 9: In versions below 8.0, the planner will ignore your desire to >> choose an index scan if your joining column's datatypes do not >> match >> >> >> > > >
pgsql-performance by date: