Re: atrocious update performance - Mailing list pgsql-performance
From | Aaron Werman |
---|---|
Subject | Re: atrocious update performance |
Date | |
Msg-id | Law10-OE48HbUiBWu8p0004924b@hotmail.com |
In response to | atrocious update performance ("Rosser Schwarz" <rschwarz@totalcardinc.com>) |
Responses | Re: atrocious update performance |
List | pgsql-performance |
Bulk updates are generally dogs (not just in pg), so I avoid them by doing faster selects and inserts instead. You can create a new table using 'create table as' to produce your target results. This is very fast: it avoids the row-by-row work of insert, lets the select optimizer run, and carries no index overhead. Then alter/rename, add the indexes and whatever else hangs off the table (or, if you're lazy, do an insert/select back into the original target table); a sketch of this approach follows the quoted message below. I often see two orders of magnitude of improvement doing this, and there's no need to vacuum afterwards.

/Aaron

----- Original Message -----
From: "Rosser Schwarz" <rschwarz@totalcardinc.com>
To: <pgsql-performance@postgresql.org>
Sent: Monday, March 15, 2004 3:28 PM
Subject: [PERFORM] atrocious update performance

We're in the throes of an MS SQL to PostgreSQL migration; our databases include a number of ~5M row tables. We decided to take this opportunity to clean up and slightly re-normalize our schemas, given what we've learned about the data over its lifetime and such, else we wouldn't be experiencing any of the following (we could instead just dump and `copy from`).

We have a temporary table, public.tempprod, containing 4.7M rows, one for each row in account.cust. account.cust has, among others, two columns, prod and subprod, which we're trying to update from tempprod joined against prod. The update tends to take unnecessarily long -- rather, we've had to finally kill it after it had been running obscenely long.

The table:

# \d account.cust
                          Table "account.cust"
  Column   |            Type             |              Modifiers
-----------+-----------------------------+--------------------------------------
 custid    | bigint                      | not null default
           |                             | nextval('account.custid_seq'::text)
 ownerid   | integer                     | not null
 origid    | text                        | not null
 pname     | text                        |
 fname     | text                        |
 mname     | text                        |
 lname     | text                        |
 suffix    | text                        |
 addr1     | text                        |
 addr2     | text                        |
 addr3     | text                        |
 city      | text                        |
 state     | text                        |
 zip       | text                        |
 zipplus   | text                        |
 homeph    | text                        |
 workph    | text                        |
 otherph   | text                        |
 ssn       | text                        |
 isactive  | boolean                     | default true
 createddt | timestamp without time zone | default now()
 prodid    | bigint                      |
 subprodid | bigint                      |
Indexes:
    "cust_pkey" primary key, btree (custid)
    "ix_addr1" btree (addr1) WHERE (addr1 IS NOT NULL)
    "ix_addr2" btree (addr2) WHERE (addr2 IS NOT NULL)
    "ix_city" btree (city) WHERE (city IS NOT NULL)
    "ix_fname" btree (fname) WHERE (fname IS NOT NULL)
    "ix_homeph" btree (homeph) WHERE (homeph IS NOT NULL)
    "ix_lname" btree (lname) WHERE (lname IS NOT NULL)
    "ix_mname" btree (mname) WHERE (mname IS NOT NULL)
    "ix_origid" btree (origid)
    "ix_ssn" btree (ssn) WHERE (ssn IS NOT NULL)
    "ix_state" btree (state) WHERE (state IS NOT NULL)
    "ix_workph" btree (workph) WHERE (workph IS NOT NULL)
    "ix_zip" btree (zip) WHERE (zip IS NOT NULL)

We're currently running on a dual Xeon 700 (I know, I know; it's what we've got) with 2.5GB RAM and 4x36GB SCSI in hardware RAID 5 (Dell Perc3 something-or-other controller). If we can demonstrate that PostgreSQL will meet our needs, we'll be going production on a dual Opteron, maxed memory, with a 12-disk Fibre Channel array.

The query is:

update account.cust set prodid =
    (select p.prodid from account.prod p
        join public.tempprod t on t.pool = p.origid
        where custid = t.did)

And then, upon its completion, s/prod/subprod/. That shouldn't run overnight, should it, let alone for -days-?

In experimenting with ways of making the updates take less time, we tried adding product and subproduct columns to tempprod, and updating those.
That seemed to work marginally better:

explain analyze update public.tempprod set prodid =
    (select account.prod.prodid::bigint
        from account.prod
        where public.tempprod.pool::text = account.prod.origid::text)

Seq Scan on tempprod  (cost=0.00..9637101.35 rows=4731410 width=56)
                      (actual time=24273.467..16090470.438 rows=4731410 loops=1)
  SubPlan
    ->  Limit  (cost=0.00..2.02 rows=2 width=8)
               (actual time=0.134..0.315 rows=1 loops=4731410)
          ->  Seq Scan on prod  (cost=0.00..2.02 rows=2 width=8)
                                (actual time=0.126..0.305 rows=1 loops=4731410)
                Filter: (($0)::text = (origid)::text)
Total runtime: 2284551.962 ms

But then going from public.tempprod to account.cust again takes days. I just cancelled an update that's been running since last Thursday. Alas, given how long the queries take to run, I can't supply an `explain analyze`. The `explain` seems reasonable enough:

# explain update account.cust set prodid = tempprod.prodid
    where tempprod.did = origid;

Merge Join  (cost=0.00..232764.69 rows=4731410 width=252)
  Merge Cond: (("outer".origid)::text = ("inner".did)::text)
  ->  Index Scan using ix_origid on cust  (cost=0.00..94876.83 rows=4731410 width=244)
  ->  Index Scan using ix_did on tempprod  (cost=0.00..66916.71 rows=4731410 width=18)

The relevant bits from my postgresql.conf (note, we built with a BLCKSZ of 16K):

shared_buffers = 4096
sort_mem = 32768
vacuum_mem = 32768
wal_buffers = 16384
checkpoint_segments = 64
checkpoint_timeout = 1800
checkpoint_warning = 30
commit_delay = 50000
effective_cache_size = 131072

Any advice, suggestions or comments of the "You bleeding idiot, why do you have frob set to x?!" sort are welcome. Unfortunately, if we can't improve this significantly, the powers what be will probably pass on PostgreSQL, even though everything we've done so far--with this marked exception--performs pretty spectacularly, all told.

/rls

--
Rosser Schwarz
Total Card, Inc.
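For concreteness, here is a minimal sketch of the create-table-as approach described at the top of this message, using only the table and column names visible in the quoted \d output and queries. The names cust_new and cust_old are placeholders, the join assumes (as in the final EXPLAIN above) that tempprod.did matches cust.origid, and subprodid is simply carried over; populating it would take an analogous second join. This is an illustration of the technique, not the poster's exact commands.

-- Build the target rows with a select instead of a bulk update.
BEGIN;

CREATE TABLE account.cust_new AS
SELECT c.custid, c.ownerid, c.origid, c.pname, c.fname, c.mname, c.lname,
       c.suffix, c.addr1, c.addr2, c.addr3, c.city, c.state, c.zip,
       c.zipplus, c.homeph, c.workph, c.otherph, c.ssn, c.isactive,
       c.createddt,
       p.prodid,          -- new value, resolved through tempprod
       c.subprodid        -- would be filled the same way with a second join
  FROM account.cust c
  LEFT JOIN public.tempprod t ON t.did = c.origid
  LEFT JOIN account.prod p    ON p.origid = t.pool;

-- Swap the tables, then rebuild what hung off the old one.
ALTER TABLE account.cust RENAME TO cust_old;
ALTER TABLE account.cust_new RENAME TO cust;
DROP TABLE account.cust_old;   -- frees the original index names

ALTER TABLE account.cust ADD PRIMARY KEY (custid);
ALTER TABLE account.cust ALTER COLUMN custid
      SET DEFAULT nextval('account.custid_seq'::text);
CREATE INDEX ix_origid ON account.cust (origid);
CREATE INDEX ix_ssn ON account.cust (ssn) WHERE ssn IS NOT NULL;
-- plus the remaining partial indexes, NOT NULL constraints and defaults
-- listed in the \d output above.

COMMIT;

In practice you would verify the new table before dropping the old one; the point is that the expensive part runs as a single set-oriented select with no per-row update or index maintenance.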
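For readers comparing the plans quoted above: the merge-join EXPLAIN at the end of Rosser's message comes from the two-table form of the update, which references tempprod without listing it. The same statements written with an explicit FROM clause would look roughly like the following; this is a hypothetical rewrite using the names from the quoted message, not something from the original thread.

-- Two-table form: prodid was first copied into tempprod.
UPDATE account.cust
   SET prodid = t.prodid
  FROM public.tempprod t
 WHERE t.did = account.cust.origid;

-- Three-table form: pull prodid straight from account.prod,
-- matching the correlated-subquery version of the update.
UPDATE account.cust
   SET prodid = p.prodid
  FROM public.tempprod t
  JOIN account.prod p ON p.origid = t.pool
 WHERE t.did = account.cust.origid;

The earlier correlated-subquery version executes its subplan once per row (loops=4731410 in the EXPLAIN ANALYZE above), whereas the join form lets the planner pick a single merge or hash join.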