Thread: UPDATE becomes mired / win32

UPDATE becomes mired / win32

From
Steve Peterson
Date:
I'm having an interesting (perhaps anomalous) variability in UPDATE
performance on a table in my database, and wanted to see if there was
any interest in looking further before I destroy the evidence and move on.

The table, VOTER, contains 3,090,013 rows and each row is about 120
bytes wide.  It's loaded via a batch process in one shot, and the
load is followed by an VACUUM FULL ANALYZE.  Its structure is shown
at the bottom of the message.

If I run the statement:

(1):  UPDATE voter SET gender = 'U';

on the table in this condition, the query effectively never ends --
I've allowed it to run for 12-14 hours before giving up.  The plan
for that statement is:

Seq Scan on voter  (cost=0.00..145117.38 rows=3127738 width=120)

However, if I do the following:

(2):  CREATE TABLE voter_copy AS SELECT * FROM voter;
(3):  UPDATE voter_copy SET gender = 'U';

the query is much faster --

Seq Scan on voter_copy  (cost=0.00..96231.35 rows=3090635 width=120)
(actual time=108.056..43203.696 rows=3090013 loops=1)
Total runtime: 117315.731 ms

When (1) is running, the machine is very nearly idle, with no
postmaster taking more than 1 or 2 % of the CPU.  When (3) is
running, about 60% CPU utilization occurs.

The same behavior occurs if the table is dumped and reloaded.

My environment is Windows XP SP2 and I'm on Postgresql 8.1.4
installed via the msi installer.  Hardware is an Athlon 2000+
1.67ghx, with 1G RAM.  The database is hosted on a Seagate Barracuda
7200.10 connected via a FastTrak 4300 without any RAID
configuration.  dd shows a write speed of 39 MB/s and read speed of
44 MB/s.  The server configuration deviates from the default in these
statements:

fsync = off
shared_buffers = 25000
work_mem = 50000
maintenance_work_mem = 100000

CREATE TABLE voter
(
   voter_id int4,
   sos_voter_id varchar(20),
   household_id int4,
   first_name varchar(40),
   middle_name varchar(40),
   last_name varchar(40),
   name_suffix varchar(10),
   phone_number varchar(10),
   bad_phone_no bool,
   registration_date date,
   birth_year int4,
   gender char(1),
   pri_ind char(1),
   gen_1992_primary_party char(1),
   council_votes int2,
   primary_votes int2,
   council_primary_votes int2,
   special_votes int2,
   presidential_votes int2,
   votes int2,
   absentee_votes int2,
   last_voted_date date,
   first_voted_date date,
   rating char(1),
   score float4,
   general_votes int2
)
WITHOUT OIDS;


Re: UPDATE becomes mired / win32

From
Date:
> The table, VOTER, contains 3,090,013 rows and each row is about 120 bytes
> wide.  It's loaded via a batch process in one shot, and the load is
> followed by an VACUUM FULL ANALYZE.  Its structure is shown at the bottom
> of the message.


if the table wasn't empty before and has indices defined, try a "REINDEX
TABLE VOTER" before running the update. i had a similar case where an often
updated table was vacuumed regurarly, but the indices grew and grew and
grew.  in my case the table - even when empty and analyze full'ed was 1.2gb
according to pgadmin due to (outdated) indices. a reindex fixed all my
performance issues.

- thomas



Re: UPDATE becomes mired / win32

From
Tom Lane
Date:
Steve Peterson <stevep-hv@zpfe.com> writes:
> If I run the statement:
> (1):  UPDATE voter SET gender = 'U';
> on the table in this condition, the query effectively never ends --
> I've allowed it to run for 12-14 hours before giving up.
> ...
> When (1) is running, the machine is very nearly idle, with no
> postmaster taking more than 1 or 2 % of the CPU.

Is the disk busy?  If neither CPU nor I/O are saturated, then it's a
good bet that the UPDATE isn't actually running at all, but is waiting
for a lock somewhere.  Have you looked into pg_locks to check for a
conflicting lock?

            regards, tom lane

Re: UPDATE becomes mired / win32

From
Steve Peterson
Date:
Both commands seem to be saturating the disk.  There's nothing else
running in the database, and all of the locks have 't' in the granted
column, which I'm assuming means they're not blocked.

According to the statistics, the original table has 889 mb and
indexes of 911mb, whereas the copy has 1021 mb and no space for indexes.

Steve

At 03:28 PM 10/4/2006, Tom Lane wrote:
>Steve Peterson <stevep-hv@zpfe.com> writes:
> > If I run the statement:
> > (1):  UPDATE voter SET gender = 'U';
> > on the table in this condition, the query effectively never ends --
> > I've allowed it to run for 12-14 hours before giving up.
> > ...
> > When (1) is running, the machine is very nearly idle, with no
> > postmaster taking more than 1 or 2 % of the CPU.
>
>Is the disk busy?  If neither CPU nor I/O are saturated, then it's a
>good bet that the UPDATE isn't actually running at all, but is waiting
>for a lock somewhere.  Have you looked into pg_locks to check for a
>conflicting lock?
>
>                         regards, tom lane



Re: UPDATE becomes mired / win32

From
Steve Peterson
Date:
I'm pretty sure that the table was empty before doing the load, but I
gave this a shot.  It didn't have an impact on the results.

The behavior also persists across a dump/reload of the table into a
new install on a different machine.  IIRC dump/reload rebuilds
indexes from scratch.

Steve

At 01:13 PM 10/4/2006, me@alternize.com wrote:
>>The table, VOTER, contains 3,090,013 rows and each row is about 120
>>bytes wide.  It's loaded via a batch process in one shot, and the
>>load is followed by an VACUUM FULL ANALYZE.  Its structure is shown
>>at the bottom of the message.
>
>
>if the table wasn't empty before and has indices defined, try a
>"REINDEX TABLE VOTER" before running the update. i had a similar
>case where an often updated table was vacuumed regurarly, but the
>indices grew and grew and grew.  in my case the table - even when
>empty and analyze full'ed was 1.2gb according to pgadmin due to
>(outdated) indices. a reindex fixed all my performance issues.
>
>- thomas
>



Re: UPDATE becomes mired / win32

From
Tom Lane
Date:
Steve Peterson <stevep-hv@zpfe.com> writes:
> The behavior also persists across a dump/reload of the table into a
> new install on a different machine.  IIRC dump/reload rebuilds
> indexes from scratch.

Hm.  There must be something you're not telling us that accounts for
the difference between the original table and the copied version ---
foreign keys linking from other tables, perhaps?

            regards, tom lane