I'm loving the fact that while I am doing some one-time updates to the
DB, users can still SELECT away to glory. This is a major boon in
comparison to my experience with another major open source database.
However, I am a little frustrated by the amount of time PGSQL takes to
complete tasks. Just to accommodate these tasks, my conf file has the
following:
autovacuum = off
wal_buffers=64
checkpoint_segments=1000
checkpoint_timeout=900
fsync = off
maintenance_work_mem = 128MB
[PS: I will enable fsync after these operations, and decrease the
checkpoint_segments.]
I have dropped all indexes/indices on my table, except for the
primary key. Still, when I run the query:
UPDATE mytable SET mycolumn = lower(mycolumn);
This has, at the time of writing, taken well over 35 minutes!
On a table of a "mere" 6 million rows (quoted from one discussion on
this mailing list).
I am on a 4GB RAM machine with two Intel Dual Core processors. Although
this is not a dedicated db server, another comparable FOSS database
never took these kinds of times to perform its operations.
Suspecting that locking may be the cause of this, I read up on
http://www.postgresql.org/docs/8.2/static/explicit-locking.html and
found nothing specific that would help a person starting out on the DB
to actually do meaningful explicit locking that the UPDATE command
does not already do.
I am now trying something like
UPDATE mytable SET mycolumn = lower(mycolumn)
WHERE id BETWEEN x AND y ;
This is way too laborious and untenable because I want to put the
fsync back on as soon as possible; this is a production database!
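To show what I mean by the ranged approach, here is a minimal sketch of how
the BETWEEN batches could be generated by a script instead of typed by hand.
It only prints the statements; it does not connect to the database. The id
range (1 to 6000000) and the batch size (500000) are assumptions made for
illustration, as is the idea that id is an integer primary key without huge
gaps. The function names are hypothetical.

```python
def batch_ranges(min_id, max_id, batch_size):
    """Yield inclusive (low, high) id ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        # The last batch may be shorter than batch_size.
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batch_statements(min_id, max_id, batch_size):
    """Build one UPDATE per id range, using the table and column named above."""
    template = (
        "UPDATE mytable SET mycolumn = lower(mycolumn) "
        "WHERE id BETWEEN {low} AND {high};"
    )
    return [
        template.format(low=low, high=high)
        for low, high in batch_ranges(min_id, max_id, batch_size)
    ]


if __name__ == "__main__":
    # Assumed values, not measured ones: ids 1..6000000, 500000 rows per batch.
    for statement in batch_statements(1, 6_000_000, 500_000):
        print(statement)
```

The printed statements could then be fed to psql. My understanding is that
with psql's default autocommit each statement commits on its own, so a
failure part-way through would not undo the batches already finished, but I
have not verified that on this setup.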
What else can I do to make this go fast enough to be normal!? Penny
for any thoughts and tips.