Re: Running update in chunks? - Mailing list pgsql-general

From Tim Uckun
Subject Re: Running update in chunks?
Date
Msg-id CAGuHJrMPKf-E6wb40F7UCWFxFA9=6mi4e=N8OHqBnZGpqS50qQ@mail.gmail.com
Whole thread Raw
In response to Re: Running update in chunks?  ("Kevin Grittner" <kgrittn@mail.com>)
Responses Re: Running update in chunks?  (Richard Huxton <dev@archonet.com>)
List pgsql-general
Just to close this up and give some guidance to future googlers...

There are no foreign keys. The table is empty when I start. I copy the
data into it from a text file.

Removing indexes didn't help much (made a very slight difference).

running a query CREATE TEMP TABLE tt AS SELECT .... using a massive
join takes about 8 seconds. I presume that's the baseline for the disk
and RAM given my current postgres configuration.  Note that this is
not a satisfactory option for me because I can't do what I want in one
step (the update I specified is one of many).

running a very simple update "UPDATE imports set make_id = null"
takes over 50 seconds so that's the minimum amount of time any update
is going to take.

Running a complex update where I join all the tables together and
update all the fields takes about 106 seconds.

Just running a complex select with the joins takes no time at all.

I tried chunking the updates using chunks of 100 records and 1000
records (where ID between X and Y repeatedly) and it was even slower.

Conclusion.  Updates on postgres are slow (given the default
postgresql.conf).  I presume this is due to MVCC or the WAL or
something and there are probably some things I can do to tweak the
conf file to make them go faster but out of the box running an update
on a table with lots of rows is going to cost you a lot. Removing the
indexes doesn't help that much.

Suggestion for the PG team.  Deliver a more realistic postgres.conf by
default. The default one seems to be aimed at ten year old PCs with
very little RAM and disk space. At least deliver additional conf files
for small, medium, large, huge setups.


pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: What is impact of "varchar_opts"?
Next
From: Edson Richter
Date:
Subject: Re: What is impact of "varchar_opts"?