I'm having an interesting (perhaps anomalous) variability in UPDATE
performance on a table in my database, and wanted to see if there was
any interest in looking further before I destroy the evidence and move on.
The table, VOTER, contains 3,090,013 rows and each row is about 120
bytes wide. It's loaded via a batch process in one shot, and the
load is followed by an VACUUM FULL ANALYZE. Its structure is shown
at the bottom of the message.
If I run the statement:
(1): UPDATE voter SET gender = 'U';
on the table in this condition, the query effectively never ends --
I've allowed it to run for 12-14 hours before giving up. The plan
for that statement is:
Seq Scan on voter (cost=0.00..145117.38 rows=3127738 width=120)
However, if I do the following:
(2): CREATE TABLE voter_copy AS SELECT * FROM voter;
(3): UPDATE voter_copy SET gender = 'U';
the query is much faster --
Seq Scan on voter_copy (cost=0.00..96231.35 rows=3090635 width=120)
(actual time=108.056..43203.696 rows=3090013 loops=1)
Total runtime: 117315.731 ms
When (1) is running, the machine is very nearly idle, with no
postmaster taking more than 1 or 2 % of the CPU. When (3) is
running, about 60% CPU utilization occurs.
The same behavior occurs if the table is dumped and reloaded.
My environment is Windows XP SP2 and I'm on Postgresql 8.1.4
installed via the msi installer. Hardware is an Athlon 2000+
1.67ghx, with 1G RAM. The database is hosted on a Seagate Barracuda
7200.10 connected via a FastTrak 4300 without any RAID
configuration. dd shows a write speed of 39 MB/s and read speed of
44 MB/s. The server configuration deviates from the default in these
statements:
fsync = off
shared_buffers = 25000
work_mem = 50000
maintenance_work_mem = 100000
CREATE TABLE voter
(
voter_id int4,
sos_voter_id varchar(20),
household_id int4,
first_name varchar(40),
middle_name varchar(40),
last_name varchar(40),
name_suffix varchar(10),
phone_number varchar(10),
bad_phone_no bool,
registration_date date,
birth_year int4,
gender char(1),
pri_ind char(1),
gen_1992_primary_party char(1),
council_votes int2,
primary_votes int2,
council_primary_votes int2,
special_votes int2,
presidential_votes int2,
votes int2,
absentee_votes int2,
last_voted_date date,
first_voted_date date,
rating char(1),
score float4,
general_votes int2
)
WITHOUT OIDS;