On Wed, Mar 23, 2011 at 3:33 AM, Merrick <merrick@gmail.com> wrote:
> Hi,
>
> I am looking for some advice on where to troubleshoot after 1 drive in
> a RAID 1 failed.
>
> Thank you.
>
> I am running v 7.41, I am currently importing the data to another
> physical server running 8.4 and will test with that once I can. In the
> meantime here is relevant info:
>
> Backups used to take 25 minutes and now take 110 minutes. Before
> replacing the drive, it became clear the backup was not going to
> finish, since in 120 minutes it had only finished 200 MB of 2.8 GB.
>
> Before replacing the drive:
> -----------------------------------
> We noticed all of the queries were slow, many taking over 100 seconds.
> After we replaced the drive, we noticed some queries still run 40
> seconds or more, and most take 8 seconds or more, where the same query
> used to take only 1 second. We have replaced a drive in this RAID 1
> before and nothing like this happened. The schema was not touched for
> at least 1 week prior to this.
>
> Since replacing the drive I have:
> -------------------------------------------
> Restored from a backup taken a few hours before the queries became
> very slow.
> Reindexed all tables
> Vacuumed all tables
> Analyzed all tables
>
> Here is what I get with iostat:
>
> iostat -k /dev/sda2
> Linux 2.6.26-2-686-bigmem (db1)
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>           19.61    0.00    8.34    1.60    0.00   70.45

probably the replacement drive is bunk, or some esoteric hw problem is
tripping you up. some iostat numbers while you are having the problem
would be more telling. the solution is obvious -- in terms of this
server, it's time to ramble on...
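
To make "iostat numbers while you are having the problem" concrete, here is a minimal sketch. It assumes the sysstat `iostat` on that box supports `-x` (extended per-device stats) and that `/dev/sda2` is the device of interest, as in the command quoted above. The function name `sample_io` and the 5-second / 12-sample defaults are only illustrative. The point is that a bare `iostat` with no interval reports averages since boot, so it says little about the slow period itself.

```shell
# Sketch only: DEVICE, INTERVAL, COUNT and sample_io are illustrative names,
# not anything from the original post.
DEVICE="${DEVICE:-/dev/sda2}"   # device from the quoted iostat command
INTERVAL="${INTERVAL:-5}"       # seconds between reports (assumed value)
COUNT="${COUNT:-12}"            # number of reports, about 1 minute total

sample_io() {
    # -x adds extended per-device columns, -k reports in kilobytes.
    # The first report is the since-boot average; the reports after it
    # reflect what the disk is doing during each interval.
    iostat -x -k "$DEVICE" "$INTERVAL" "$COUNT"
}
```

Run it while a slow query or the backup is in progress, for example `sample_io | tee iostat-during-backup.log`, and post the later reports rather than the first one.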
merlin