Re: Massive table (500M rows) update nightmare - Mailing list pgsql-performance

From Leo Mannhart
Subject Re: Massive table (500M rows) update nightmare
Date
Msg-id 4B45DE4C.4050606@beecom.ch
Whole thread Raw
In response to Massive table (500M rows) update nightmare  ("Carlo Stonebanks" <stonec.register@sympatico.ca>)
Responses Re: Massive table (500M rows) update nightmare
Re: Massive table (500M rows) update nightmare
Re: Massive table (500M rows) update nightmare
List pgsql-performance
Carlo Stonebanks wrote:
> Our DB has an audit table which is 500M rows and growing. (FYI the
> objects being audited are grouped semantically, not individual field
> values).
>
> Recently we wanted to add a new feature and we altered the table to add
> a new column. We are backfilling this varchar(255) column by writing a
> TCL script to page through the rows (where every update is a UPDATE ...
> WHERE id >= x AND id < x+10 and a commit is performed after every 1000
> updates statement, i.e. every 10000 rows.)
>
> We have 10 columns, six of which are indexed. Rough calculations suggest
> that this will take two to three weeks to complete on an 8-core CPU with
> more than enough memory.
>
> As a ballpark estimate - is this sort of performance for an 500M updates
> what one would expect of PG given the table structure (detailed below)
> or should I dig deeper to look for performance issues?
>
> As always, thanks!
>
> Carlo
>

If it is possible to lock this audit table exclusively (may be during
off peak hours) I would look into
- create new_audit_table as select col1, col2, col3 ... col9,
'new_col_value' from old_audit_table;
- create all indexes
- drop old_audit_table
- rename new_audit_table to old_audit_table

That is probably the fasted method you can do, even if you have to join
the "new_col_value" from an extra helper-table with the correspondig id.
Remeber, databases are born to join.

You could also try to just update the whole table in one go, it is
probably faster than you expect.

just a thought
Leo

pgsql-performance by date:

Previous
From: "A. Kretschmer"
Date:
Subject: Re: Air-traffic benchmark
Next
From: Lefteris
Date:
Subject: Re: Air-traffic benchmark