Re: atrocious update performance - Mailing list pgsql-performance

From Rosser Schwarz
Subject Re: atrocious update performance
Date
Msg-id 001d01c40b7c$b80c98a0$2500fa0a@CardServices.TCI.com
Whole thread Raw
In response to Re: atrocious update performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: atrocious update performance
List pgsql-performance
while you weren't looking, Tom Lane wrote:

> But with the possible exception of wal_buffers, I can't see
> anything in
> these settings that explains the originally complained-of performance
> problem.  I'm still wondering about foreign key checks.

Many of the configs I posted were fairly wild values, set to gather
data points for further tweaking.  Unfortunately, with this query
there hasn't been time for many passes, and I've too much else on my
plate to try concocting demonstration cases.  The postmaster's been
hupped with more sane values, but I experienced this same issue with
the defaults.

As for foreign keys, three tables refer to account.cust; all of them
refer to account.cust.custid, the pk.  One of those tables has several
hundred thousand rows, many more to come; the others are empty.  Unless
I've woefully misunderstood, the presence or absence of a foreign key
referring to one column should be moot for updates writing another
column, shouldn't it?

To answer your (and others') question, Tom, 7.4.1 on 2.4.20-18.9smp.
Red Hat, I believe.  I was handed the machine, which is also in use
for lightweight production stuff:  intranet webserver, rinky-dink
MySQL doo-dads, &c.  I'm sure that has an impact, usurping the disk
heads and such--maybe even more than I'd expect--but I can't imagine
that'd cause an update to one 4.7M row table, from another 4.7M row
table, both clustered on a join column that maps one-to-one between
them, to take days.  I'm baffled; everything else is perfectly snappy,
given the hardware.  Anything requiring a sequential scan over one of
the big tables is a slog, but that's to be expected and hence all the
indices.

Watching iostat, I've observed a moderately cyclic read-big, write-
big pattern, wavelengths generally out of phase, interspersed with
smaller, almost epicycles--from the machine's other tasks, I'm sure.
top has postmaster's cpu usage rarely breaking 25% over the course
of the query's execution, and spending most of its time much lower;
memory usage hovers somewhere north of 500MB.

In what little time I had to stare at a disturbingly matrix-esque
array of terminals scrolling sundry metrics, I didn't notice a
correlation between cpu usage spikes and peaks in the IO cycle's
waveforms.  For whatever that's worth.

The other tables involved are:

# \d account.acct
                                      Table "account.acct"
   Column   |            Type             |        Modifiers
------------+-----------------------------+---------------------------------
----
 acctid     | bigint                      | not null default
                                          |
nextval('account.acctid_seq'::text)
 custid     | bigint                      |
 acctstatid | integer                     | not null
 acctno     | character varying(50)       |
 bal        | money                       |
 begdt      | timestamp without time zone | not null
 enddt      | timestamp without time zone |
 debtid     | character varying(50)       |
Indexes:
    "acct_pkey" primary key, btree (acctid)
    "ix_acctno" btree (acctno) WHERE (acctno IS NOT NULL)
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid)
         ON UPDATE CASCADE ON DELETE RESTRICT
    "$2" FOREIGN KEY (acctstatid) REFERENCES account.acctstat(acctstatid)
         ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.note
                                      Table "account.note"
  Column   |            Type             |           Modifiers
-----------+-----------------------------+----------------------------------
---
 noteid    | bigint                      | not null default
                                         |
nextval('account.noteid_seq'::text)
 custid    | bigint                      | not null
 note      | text                        | not null
 createddt | timestamp without time zone | not null default now()
Indexes:
    "note_pkey" primary key, btree (noteid)
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid)
         ON UPDATE CASCADE ON DELETE RESTRICT

# \d account.origacct
               Table "account.origacct"
   Column    |            Type             | Modifiers
-------------+-----------------------------+-----------
 custid      | bigint                      |
 lender      | character varying(50)       |
 chgoffdt    | timestamp without time zone |
 opendt      | timestamp without time zone |
 offbureaudt | timestamp without time zone |
 princbal    | money                       |
 intbal      | money                       |
 totbal      | money                       |
 lastpayamt  | money                       |
 lastpaydt   | timestamp without time zone |
 debttype    | integer                     |
 debtid      | character varying(10)       |
 acctno      | character varying(50)       |
Foreign-key constraints:
    "$1" FOREIGN KEY (custid) REFERENCES account.cust(custid)
         ON UPDATE CASCADE ON DELETE RESTRICT

And the table we were joining to get the new values for prodid and
subprodid:

# \d tempprod
            Table "public.tempprod"
  Column   |         Type          | Modifiers
-----------+-----------------------+-----------
 debtid    | character varying(10) | not null
 pool      | character varying(10) | not null
 port      | character varying(10) | not null
 subprodid | bigint                |
 prodid    | bigint                |
Indexes:
    "ix_debtid" btree (debtid)

/rls

--
Rosser Schwarz
Total Card, Inc.


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: atrocious update performance
Next
From: Tom Lane
Date:
Subject: Re: atrocious update performance