Re: Possible causes for database corruption and solutions - Mailing list pgsql-general

From Craig Ringer
Subject Re: Possible causes for database corruption and solutions
Date
Msg-id 4B28474A.3000501@postnewspapers.com.au
Whole thread Raw
In response to Possible causes for database corruption and solutions  (Michael Clark <codingninja@gmail.com>)
Responses Re: Possible causes for database corruption and solutions  (Michael Clark <codingninja@gmail.com>)
List pgsql-general
On 16/12/2009 6:39 AM, Michael Clark wrote:
> Hello all,
>
> Over the past 6 months or so I have posted to the list a couple times
> looking for information regarding recovering databases from corruption.
>   At the time the incidents of corruption among our users was very low,
> but the frequency is starting to increase, most likely due to the
> increased user base that has upgraded to the version of our software
> that uses Postgres.
>
> I have a couple questions I am hoping to get some feedback on.

> Secondly, I ask about an alternative solution to the corruption problem
> because with preliminary testing we have seen a significant degradation
> in performance.

 From changing to fsync_writethrough ? That's a good thing - it suggests
that maybe now the data is actually hitting disk when Pg asks it to.

You can have fast(er), or safe, but not both. Now that your database is
actually doing what it should be and truthfully promising that data has
hit disk when you commit, you may have to adopt some strategies to
reduce the number of very short repetitive transactions you perform.

( Should Pg perhaps detect OS X and switch the default to
fsync_writethrough ? Or are the "test to see if fsync() works on
startup" plans going anywhere? )

> I then restore the dump into a newly created DB (createdb -E UTF8 Test1)
> with the command:
> pg_restore -F c -d Test1 pgbackup

Make sure to do the restore in a single transaction. It will be
*enormously*, *massively* faster. See the "--single-transaction" option
to pg_restore.

If you can't do that, then set a commit_delay so that PostgreSQL can
batch the fsync()s for commmits together. See postgresql.conf and the
documentation for commit_delay and commit_siblings.

> I am hoping there may be an alternative to fsync_writethrough and/or we
> are barking up the wrong tree w.r.t. the cause of the corruptions.

If fsync_writethrough slowed things down that much then you've almost
certainly nailed the cause of the corruptions. Now you just need to tune
your DB, and adapt how you use the DB, so that you're less affected by
the necessary performance hit imposed by safe and reliable use of disk
storage.

--
Craig Ringer

pgsql-general by date:

Previous
From: Phoenix Kiula
Date:
Subject: Re: How to remove non-UTF values from a table?
Next
From: Craig Ringer
Date:
Subject: Re: Possible causes for database corruption and solutions