Thread: Slow update statement

Slow update statement

From
Patrick Hatcher
Date:
[Reposted from General section with updated information]
Pg 7.4.5

I'm running an update statement on about 12 million records using the
following query:

Update table_A
set F1 = b.new_data
from table_B b
where b.keyfield = table_A.keyfield

Both keyfields are indexed and all other keys in table_A were dropped, yet this
job has been running for over 15 hours.  Is this normal?

I stopped the process the first time after 3 hours of running due to excessive
log rotation and reset the conf file to these settings:


wal_buffers = 64                # min 4, 8KB each

# - Checkpoints -

checkpoint_segments = 128       # in logfile segments, min 1, 16MB each
checkpoint_timeout = 1800       # range 30-3600, in seconds
#checkpoint_warning = 30        # 0 is off, in seconds
#commit_delay = 0               # range 0-100000, in microseconds
#commit_siblings = 5            # range 1-1000


Would it just be quicker to run a JOIN statement to a temp file and then reinsert?
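Something along these lines is what I have in mind (new_table_A is just an
illustrative name; the real statement would carry along the rest of table_A's
columns):

-- build the joined result in one pass...
create table new_table_A as
select a.keyfield,
       b.new_data as F1
       -- ...plus the remaining columns of table_A
from table_A a
join table_B b on b.keyfield = a.keyfield;

-- ...then swap it in for table_A and recreate the indexes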

TIA
Patrick


Re: Slow update statement

From
John A Meinel
Date:
Patrick Hatcher wrote:
> [Reposted from General section with updated information]
> Pg 7.4.5
>
> I'm running an update statement on about 12 million records using the
> following query:
>
> Update table_A
> set F1 = b.new_data
> from table_B b
> where b.keyfield = table_A.keyfield
>
> Both keyfields are indexed and all other keys in table_A were dropped, yet
> this job has been running for over 15 hours.  Is this normal?

Can you do an EXPLAIN on the UPDATE so that we can get an idea of what the
planner is trying to do?

My personal concern is that it may be doing something like pulling in all rows
from b, and then updating table_A one row at a time, but as it goes it can't
retire any dead rows, because you are still in a transaction. So it keeps
pulling in old row versions only to discover that they are dead.

How many rows are in table_B?

I can see that possibly doing it in smaller chunks might be faster, as
would inserting into another table. But I would do more of a test and
see what happens.
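By "smaller chunks" I mean something like running the update one key range at a
time and committing in between, so dead rows can be cleaned up as you go (the
range below is made up; adjust it to your key distribution):

update table_A
set F1 = b.new_data
from table_B b
where b.keyfield = table_A.keyfield
  and table_A.keyfield >= 0
  and table_A.keyfield < 1000000;
-- commit, VACUUM table_A now and then, and repeat for the next key range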

John
=:->

>
> I stopped the process the first time after 3 hours of running due to
> excessive log rotation and reset the conf file to these settings:
>
>
> wal_buffers = 64                # min 4, 8KB each
>
> # - Checkpoints -
>
> checkpoint_segments = 128       # in logfile segments, min 1, 16MB each
> checkpoint_timeout = 1800       # range 30-3600, in seconds
> #checkpoint_warning = 30        # 0 is off, in seconds
> #commit_delay = 0               # range 0-100000, in microseconds
> #commit_siblings = 5            # range 1-1000
>
>
> Would it just be quicker to run a JOIN statement to a temp file and then
> reinsert?
> TIA Patrick

Re: Slow update statement

From
Tom Lane
Date:
Patrick Hatcher <pathat@comcast.net> writes:
> I'm running an update statement on about 12 million records using the
> following query:

> Update table_A
> set F1 = b.new_data
> from table_B b
> where b.keyfield = table_A.keyfield

What does EXPLAIN show for this?

Do you have any foreign key references to table_A from elsewhere?
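Something like this would show any incoming foreign keys (using the placeholder
name from your example; substitute the real table name):

select conname, conrelid::regclass as referencing_table
from pg_constraint
where contype = 'f'
  and confrelid = 'table_a'::regclass;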

            regards, tom lane

Re: Slow update statement

From
Patrick Hatcher
Date:
Sorry, I went out of town for the weekend.  The update did occur, but I
have no idea when it finished.

Here's the actual query and the EXPLAIN output:
cdm.bcp_ddw_ck_cus = 12.7 M rows
cdm.cdm_ddw_customer = 12.8 M rows

explain
update cdm.cdm_ddw_customer
set indiv_fkey = b.indiv_fkey
from cdm.bcp_ddw_ck_cus b
where cdm.cdm_ddw_customer.cus_nbr = b.cus_num;


 Hash Join  (cost=1246688.42..4127248.31 rows=12702676 width=200)
   Hash Cond: ("outer".cus_num = "inner".cus_nbr)
   ->  Seq Scan on bcp_ddw_ck_cus b  (cost=0.00..195690.76 rows=12702676
width=16)
   ->  Hash  (cost=874854.34..874854.34 rows=12880834 width=192)
         ->  Seq Scan on cdm_ddw_customer  (cost=0.00..874854.34
rows=12880834 width=192)


John A Meinel wrote:

>Patrick Hatcher wrote:
>
>
>>[Reposted from General section with updated information]
>>Pg 7.4.5
>>
>>I'm running an update statement on about 12 million records using the
>>following query:
>>
>>Update table_A
>>set F1 = b.new_data
>>from table_B b
>>where b.keyfield = table_A.keyfield
>>
>>Both keyfields are indexed and all other keys in table_A were dropped, yet
>>this job has been running for over 15 hours.  Is this normal?
>>
>>
>
>Can you do an EXPLAIN on the UPDATE so that we can get an idea of what the
>planner is trying to do?
>
>My personal concern is that it may be doing something like pulling in all rows
>from b, and then updating table_A one row at a time, but as it goes it can't
>retire any dead rows, because you are still in a transaction. So it keeps
>pulling in old row versions only to discover that they are dead.
>
>How many rows are in table_B?
>
>I can see that possibly doing it in smaller chunks might be faster, as
>would inserting into another table. But I would do more of a test and
>see what happens.
>
>John
>=:->
>
>
>
>>I stopped the process the first time after 3 hours of running due to
>>excessive log rotation and reset the conf file to these settings:
>>
>>
>>wal_buffers = 64                # min 4, 8KB each
>>
>># - Checkpoints -
>>
>>checkpoint_segments = 128       # in logfile segments, min 1, 16MB each
>>checkpoint_timeout = 1800       # range 30-3600, in seconds
>>#checkpoint_warning = 30        # 0 is off, in seconds
>>#commit_delay = 0               # range 0-100000, in microseconds
>>#commit_siblings = 5            # range 1-1000
>>
>>
>>Would it just be quicker to run a JOIN statement to a temp file and then
>>reinsert?
>>TIA Patrick

Re: Slow update statement

From
Patrick Hatcher
Date:
Sorry, I went out of town for the weekend.  The update did occur, but I
have no idea when it finished.

Here's the actual query and the EXPLAIN I ran:
cdm.bcp_ddw_ck_cus = 12.7 M rows
cdm.cdm_ddw_customer = 12.8 M rows

explain
update cdm.cdm_ddw_customer
set indiv_fkey = b.indiv_fkey
from cdm.bcp_ddw_ck_cus b
where cdm.cdm_ddw_customer.cus_nbr = b.cus_num;


Here's the table layout.  It's the first time I noticed this, but there is
both a PK on cus_nbr and a separate index on the same column.  Does it really
need both, and could this be causing the issue?  I thought that if a primary
key was designated, it was automatically indexed:

CREATE TABLE cdm.cdm_ddw_customer
(
  cus_nbr int8 NOT NULL,
  ph_home int8,
  ph_day int8,
  email_adr varchar(255),
  name_prefix varchar(5),
  name_first varchar(20),
  name_middle varchar(20),
  name_last varchar(30),
  name_suffix varchar(5),
  addr1 varchar(40),
  addr2 varchar(40),
  addr3 varchar(40),
  city varchar(25),
  state varchar(7),
  zip varchar(10),
  country varchar(16),
  gender varchar(1),
  lst_dte date,
  add_dte date,
  reg_id int4,
  indiv_fkey int8,
  CONSTRAINT ddwcus_pk PRIMARY KEY (cus_nbr)
)
WITH OIDS;

CREATE INDEX cdm_ddwcust_id_idx
  ON cdm.cdm_ddw_customer
  USING btree
  (cus_nbr);


CREATE TABLE cdm.bcp_ddw_ck_cus
(
  cus_num int8,
  indiv_fkey int8 NOT NULL
)
WITHOUT OIDS;
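If the extra index really is redundant with the primary key, I assume it could
just be dropped, something like:

-- list the indexes on the table (the PK index should already show up here)
select indexname, indexdef
from pg_indexes
where schemaname = 'cdm' and tablename = 'cdm_ddw_customer';

-- then drop the duplicate single-column index
drop index cdm.cdm_ddwcust_id_idx;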

Tom Lane wrote:

>Patrick Hatcher <pathat@comcast.net> writes:
>
>
>>I'm running an update statement on about 12 million records using the
>>following query:
>>
>>
>
>
>
>>Update table_A
>>set F1 = b.new_data
>>from table_B b
>>where b.keyfield = table_A.keyfield
>>
>>
>
>What does EXPLAIN show for this?
>
>Do you have any foreign key references to table_A from elsewhere?
>
>            regards, tom lane
>
>
>

Re: Slow update statement

From
Tom Lane
Date:
Patrick Hatcher <pathat@comcast.net> writes:
>  Hash Join  (cost=1246688.42..4127248.31 rows=12702676 width=200)
>    Hash Cond: ("outer".cus_num = "inner".cus_nbr)
>    ->  Seq Scan on bcp_ddw_ck_cus b  (cost=0.00..195690.76 rows=12702676
> width=16)
>    ->  Hash  (cost=874854.34..874854.34 rows=12880834 width=192)
>          ->  Seq Scan on cdm_ddw_customer  (cost=0.00..874854.34
> rows=12880834 width=192)

Yipes, that's a bit of a large hash table, if the planner's estimates
are on-target.  What do you have work_mem (sort_mem if pre 8.0) set to,
and how does that compare to actual available RAM?  I'm thinking you
might have set work_mem too large and the thing is now swap-thrashing.

            regards, tom lane

Re: Slow update statement

From
Patrick Hatcher
Date:
At the time this was the only process running on the box, so I set
sort_mem = 228000.
It's a 12 GB box.
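(That is, something like the following for the session; sort_mem on 7.4 is
measured in kilobytes, so 228000 works out to roughly 223 MB per sort or hash
step:)

set sort_mem = 228000;   -- ~223 MB per sort/hash operation
show sort_mem;
-- reset sort_mem;       -- back to the postgresql.conf value when done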

Tom Lane wrote:

>Patrick Hatcher <pathat@comcast.net> writes:
>
>
>> Hash Join  (cost=1246688.42..4127248.31 rows=12702676 width=200)
>>   Hash Cond: ("outer".cus_num = "inner".cus_nbr)
>>   ->  Seq Scan on bcp_ddw_ck_cus b  (cost=0.00..195690.76 rows=12702676
>>width=16)
>>   ->  Hash  (cost=874854.34..874854.34 rows=12880834 width=192)
>>         ->  Seq Scan on cdm_ddw_customer  (cost=0.00..874854.34
>>rows=12880834 width=192)
>>
>>
>
>Yipes, that's a bit of a large hash table, if the planner's estimates
>are on-target.  What do you have work_mem (sort_mem if pre 8.0) set to,
>and how does that compare to actual available RAM?  I'm thinking you
>might have set work_mem too large and the thing is now swap-thrashing.
>
>            regards, tom lane
>
>
>

Re: Slow update statement

From
Tom Lane
Date:
Patrick Hatcher <pathat@comcast.net> writes:
> Here's the table layout.  It's the first time I noticed this, but there is
> both a PK on cus_nbr and a separate index on the same column.  Does it really
> need both, and could this be causing the issue?  I thought that if a primary
> key was designated, it was automatically indexed:

The duplicate index is certainly a waste, but it's no more expensive to
maintain than any other index would be; it doesn't seem likely that that
would account for any huge slowdown.

A long-shot theory occurs to me upon noticing that your join keys are
int8: 7.4 had a pretty bad hash function for int8, to wit it took the
low order half of the integer and ignored the high order half.  For
ordinary distributions of key values this made no difference, but I
recall seeing at least one real-world case where the information was
all in the high half of the key, and so the hash join degenerated to a
sequential search because all the entries went into the same hash
bucket.  Were you assigning cus_nbrs nonsequentially by any chance?
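A rough way to check (assuming the keys are non-negative) is to compare the
number of distinct keys with the number of distinct low-order 32-bit halves;
if the second count is much smaller, the 7.4 int8 hash would pile nearly
everything into a handful of buckets:

select count(distinct cus_nbr)                    as distinct_keys,
       count(distinct cus_nbr % 4294967296::int8) as distinct_low_halves
from cdm.cdm_ddw_customer;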

            regards, tom lane