Thread: Slow restoration question

From
Eric Lam
Date:
Hello list,

what is the quickest way of dumping a DB and restoring it? I have done a

   "pg_dump -D database | split --line-bytes 1546m part"

Restoration as

  "cat part* | psql database 2> errors 1>/dev/null"

all dumpfiles total about 17Gb. It has been running for 50ish hrs and up
to about the fourth file (5-6 ish Gb) and this is on a raid 5 server.

A while back I did something similar for a table, where I put all the
insert statements in one begin/end/commit block; this slowed down the
restoration process. Will the same problem [slow restoration] occur if
there is no BEGIN and END block? I assume the reason for slow inserts in
this instance is that it allows for rollback; if that is the case, can I
turn this off?
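
For reference, the two restore styles in question can be sketched like
this (a hedged sketch; `part*` matches the split output above, and the
BEGIN/COMMIT wrapping is added by hand since the plain-text dump is just
SQL):

```shell
# One implicit transaction per statement (autocommit):
cat part* | psql database 2> errors 1>/dev/null

# Everything inside a single explicit transaction block:
(echo 'BEGIN;'; cat part*; echo 'COMMIT;') | psql database 2> errors 1>/dev/null
```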

Thanks in advance
Eric Lam

Re: Slow restoration question

From
Tom Lane
Date:
Eric Lam <elam@lisasoft.com> writes:
> what is the quickest way of dumping a DB and restoring it? I have done a

>    "pg_dump -D database | split --line-bytes 1546m part"

Don't use "-D" if you want fast restore ...

            regards, tom lane
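
For anyone unfamiliar with the flag, a sketch of why `-D` restores
slowly (database and table names here are placeholders): `-D`
(`--column-inserts`) makes pg_dump emit one INSERT statement per row,
while the default text dump emits COPY blocks, which the server loads
far more cheaply.

```shell
# Default text dump: table data arrives as COPY blocks, e.g.
#   COPY mytable (id, name) FROM stdin;
#   1	alice
#   \.
pg_dump database > fast.sql

# -D / --column-inserts: one fully spelled-out INSERT per row, each
# parsed (and, under autocommit, committed) individually, e.g.
#   INSERT INTO mytable (id, name) VALUES (1, 'alice');
pg_dump -D database > slow.sql
```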

Re: Slow restoration question

From
Andreas Kretschmer
Date:
Tom Lane <tgl@sss.pgh.pa.us> schrieb:

> Eric Lam <elam@lisasoft.com> writes:
> > what is the quickest way of dumping a DB and restoring it? I have done a
>
> >    "pg_dump -D database | split --line-bytes 1546m part"
>
> Don't use "-D" if you want fast restore ...

hehe, yes ;-)

http://people.planetpostgresql.org/devrim/index.php?/archives/44-d-of-pg_dump.html


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."    (unknow)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Slow restoration question

From
"Jim C. Nasby"
Date:
On Wed, Apr 26, 2006 at 05:14:41PM +0930, Eric Lam wrote:
> all dumpfiles total about 17Gb. It has been running for 50ish hrs and up
> to about the fourth file (5-6 ish Gb) and this is on a raid 5 server.

RAID5 generally doesn't bode too well for performance; that could be
part of the issue.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Slow restoration question

From
Brendan Duddridge
Date:
Everyone here always says that RAID 5 isn't good for Postgres. We
have an Apple Xserve RAID configured with RAID 5. We chose RAID 5
because Apple said their Xserve RAID was "optimized" for RAID 5. Not
sure if we made the right decision though. They give an option for
formatting as RAID 0+1. Is that the same as RAID 10 that everyone
talks about? Or is it the reverse?

Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On May 2, 2006, at 11:16 AM, Jim C. Nasby wrote:

> On Wed, Apr 26, 2006 at 05:14:41PM +0930, Eric Lam wrote:
>> all dumpfiles total about 17Gb. It has been running for 50ish hrs
>> and up
>> to about the fourth file (5-6 ish Gb) and this is on a raid 5 server.
>
> RAID5 generally doesn't bode too well for performance; that could be
> part of the issue.
> --
> Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
> Pervasive Software      http://pervasive.com    work: 512-231-6117
> vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
>



Re: Slow restoration question

From
Mark Lewis
Date:
They are not equivalent.  As I understand it, RAID 0+1 performs about
the same as RAID 10 when everything is working, but degrades much less
nicely in the presence of a single failed drive, and is more likely to
suffer catastrophic data loss if multiple drives fail.

-- Mark

On Tue, 2006-05-02 at 12:40 -0600, Brendan Duddridge wrote:
> Everyone here always says that RAID 5 isn't good for Postgres. We
> have an Apple Xserve RAID configured with RAID 5. We chose RAID 5
> because Apple said their Xserve RAID was "optimized" for RAID 5. Not
> sure if we made the right decision though. They give an option for
> formatting as RAID 0+1. Is that the same as RAID 10 that everyone
> talks about? Or is it the reverse?

Re: Slow restoration question

From
Will Reese
Date:
RAID 10 is better than RAID 0+1.  There is a lot of information on
the net about this, but here is the first one that popped up on
google for me.

http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html

The quick summary is that performance is about the same between the
two, but RAID 10 gives better fault tolerance and rebuild
performance.  I have seen docs for RAID cards that have confused
these two RAID levels.  In addition, some cards claim to support RAID
10, when they actually support RAID 0+1 or even RAID 0+1 with
concatenation (lame, some of the Dell PERCs have this).

RAID 10 with 6 drives would stripe across 3 mirrored pairs.  RAID 0+1
with 6 drives is a mirror of two striped arrays (3 disks each).  RAID
0+1 (with concatenation) using 6 drives is a mirror of two volumes
(kind of like JBOD) each consisting of 3 drives concatenated together
(it's a cheap implementation, and it gives about the same performance
as RAID 1 but with increased storage capacity and less fault
tolerance).  RAID 10 is better than RAID 5 (especially with 6 or fewer
disks) because you don't have the parity performance hit (which
dramatically affects rebuild and write performance) and you get better
fault tolerance (up to 3 disks can fail in a 6-disk RAID 10 and you can
still be online; with RAID 5 you can only lose 1 drive).  All of this
comes at a higher cost (more drives and higher-end cards).
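
To make the 6-drive layouts concrete, here is a hypothetical sketch
using Linux `mdadm` (device names are placeholders; a hardware
controller such as the Xserve RAID does the equivalent internally, not
via mdadm):

```shell
# RAID 10 from 6 drives: a stripe laid across 3 mirrored pairs
mdadm --create /dev/md0 --level=10 --raid-devices=6 \
      /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg

# RAID 0+1 from the same 6 drives: two 3-disk stripes, then mirrored
mdadm --create /dev/md1 --level=0 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
mdadm --create /dev/md2 --level=0 --raid-devices=3 /dev/sde /dev/sdf /dev/sdg
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/md1 /dev/md2
```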

-- Will Reese http://blog.rezra.com


On May 2, 2006, at 1:49 PM, Mark Lewis wrote:

> They are not equivalent.  As I understand it, RAID 0+1 performs about
> the same as RAID 10 when everything is working, but degrades much less
> nicely in the presence of a single failed drive, and is more likely to
> suffer catastrophic data loss if multiple drives fail.


Re: Slow restoration question

From
"Jim C. Nasby"
Date:
BTW, you should be able to check to see what the controller is actually
doing by pulling one of the drives from a running array. If it only
hammers 2 drives during the rebuild, it's RAID10. If it hammers all the
drives, it's 0+1.

As for the Xserve RAID, it is possible to eliminate most (or maybe even
all) of the overhead associated with RAID5, depending on how tricky the
controller wants to be. I believe many large storage appliances actually
use RAID5 internally, but they perform a bunch of 'magic' behind the
scenes to get good performance from it. So, it is possible that the
Xserve RAID performs quite well on RAID5. If you provided the results
from bonnie as well as info about the drives, I suspect someone here
could tell you whether you're getting close to RAID10 performance.

On Tue, May 02, 2006 at 02:34:16PM -0500, Will Reese wrote:
> RAID 10 is better than RAID 0+1.  There is a lot of information on
> the net about this, but here is the first one that popped up on
> google for me.
>
> http://www.pcguide.com/ref/hdd/perf/raid/levels/multLevel01-c.html
>
> The quick summary is that performance is about the same between the
> two, but RAID 10 gives better fault tolerance and rebuild
> performance.  I have seen docs for RAID cards that have confused
> these two RAID levels.  In addition, some cards claim to support RAID
> 10, when they actually support RAID 0+1 or even RAID 0+1 with
> concatenation (lame, some of the Dell PERCs have this).

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Slow restoration question

From
Brendan Duddridge
Date:
Hi Jim,

The output from bonnie on my boot drive is:

File './Bonnie.27964', size: 0
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 2...Seeker 1...Seeker 3...start 'em...done...done...done...
               -------Sequential Output-------- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
             0 36325 98.1 66207 22.9 60663 16.2 50553 99.9 710972 100.0 44659.8 191.3


And the output from the RAID drive is:

File './Bonnie.27978', size: 0
Writing with putc()...done
Rewriting...done
Writing intelligently...done
Reading with getc()...done
Reading intelligently...done
Seeker 1...Seeker 2...Seeker 3...start 'em...done...done...done...
               -------Sequential Output-------- ---Sequential Input-- --Random--
               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
             0 40365 99.4 211625 61.4 212425 57.0 50740 99.9 730515 100.0 45897.9 190.1


Each drive in the RAID 5 is a 400 GB serial ATA drive. I'm not sure of
the manufacturer or the model number, as it was all in a packaged box
when we received it and I didn't check.

Do these numbers seem decent enough for a Postgres database?


Thanks,

____________________________________________________________________
Brendan Duddridge | CTO | 403-277-5591 x24 |  brendan@clickspace.com

ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave. SE
Calgary, AB  T2G 0V9

http://www.clickspace.com

On May 2, 2006, at 3:53 PM, Jim C. Nasby wrote:

> BTW, you should be able to check to see what the controller is actually
> doing by pulling one of the drives from a running array. If it only
> hammers 2 drives during the rebuild, it's RAID10. If it hammers all the
> drives, it's 0+1.
>
> As for the Xserve RAID, it is possible to eliminate most (or maybe even
> all) of the overhead associated with RAID5, depending on how tricky the
> controller wants to be. I believe many large storage appliances actually
> use RAID5 internally, but they perform a bunch of 'magic' behind the
> scenes to get good performance from it. So, it is possible that the
> Xserve RAID performs quite well on RAID5. If you provided the results
> from bonnie as well as info about the drives, I suspect someone here
> could tell you whether you're getting close to RAID10 performance.



Re: Slow restoration question

From
Eric Lam
Date:
Tom Lane wrote:

>Eric Lam <elam@lisasoft.com> writes:
>
>>what is the quickest way of dumping a DB and restoring it? I have done a
>>
>>   "pg_dump -D database | split --line-bytes 1546m part"
>
>Don't use "-D" if you want fast restore ...
>
>            regards, tom lane
Thanks. I read that in the doco; the reason I am using the -D option is
that I was informed by previous people in the company that they never
got a 100% strike rate in database restoration without using the -D or
-d options. If I have enough space on the QA/staging machine I'll give
the no-options dump restoration a try.

Does anyone have estimates of the time differences between -D, -d and
[using no option]?
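
One way to measure that yourself (a hypothetical sketch; it assumes
three dumps of the same database, one per option, with made-up file
names, restored into a scratch database):

```shell
# plain.sql: no option (COPY), inserts.sql: -d, column_inserts.sql: -D
for dump in plain.sql inserts.sql column_inserts.sql; do
    dropdb scratch 2>/dev/null   # ignore failure on the first pass
    createdb scratch
    echo "== $dump =="
    time psql scratch < "$dump" > /dev/null
done
```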

regards
Eric Lam

Re: Slow restoration question

From
Michael Stone
Date:
On Tue, May 02, 2006 at 08:09:52PM -0600, Brendan Duddridge wrote:
>               -------Sequential Output-------- ---Sequential Input--  --Random--
>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block---  --Seeks---
>Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>             0 40365 99.4 211625 61.4 212425 57.0 50740 99.9 730515  100.0 45897.9 190.1
[snip]
>Do these numbers seem decent enough for a Postgres database?

These numbers seem completely bogus, probably because bonnie is using a
file size smaller than memory and is reporting caching effects. (730MB/s
isn't possible for a single external RAID unit with a pair of 2Gb/s
interfaces.) bonnie in general isn't particularly useful on modern
large-ram systems, in my experience.

Mike Stone

Re: Slow restoration question

From
Jeff Trout
Date:
On May 3, 2006, at 8:18 AM, Michael Stone wrote:

> On Tue, May 02, 2006 at 08:09:52PM -0600, Brendan Duddridge wrote:
>>               -------Sequential Output-------- ---Sequential Input-- --Random--
>>               -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
>> Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
>>             0 40365 99.4 211625 61.4 212425 57.0 50740 99.9 730515 100.0 45897.9 190.1
> [snip]
>> Do these numbers seem decent enough for a Postgres database?
>
> These numbers seem completely bogus, probably because bonnie is
> using a file size smaller than memory and is reporting caching
> effects. (730MB/s isn't possible for a single external RAID unit
> with a pair of 2Gb/s interfaces.) bonnie in general isn't
> particularly useful on modern large-ram systems, in my experience.
>

Bonnie++ is able to use very large datasets. It also tries to figure
out the size you want (2x RAM) - the original bonnie is limited to 2GB.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: Slow restoration question

From
Vivek Khera
Date:
On May 3, 2006, at 9:19 AM, Jeff Trout wrote:

> Bonnie++ is able to use very large datasets. It also tries to
> figure out the size you want (2x RAM) - the original bonnie is
> limited to 2GB.

but you have to be careful building bonnie++ since it has bad
assumptions about which systems can do large files... e.g., on FreeBSD
it doesn't try large files unless you patch it appropriately (which
the FreeBSD port does for you).



Re: Slow restoration question

From
Jeff Trout
Date:
On May 3, 2006, at 10:16 AM, Vivek Khera wrote:

>
> On May 3, 2006, at 9:19 AM, Jeff Trout wrote:
>
>> Bonnie++ is able to use very large datasets. It also tries to
>> figure out the size you want (2x RAM) - the original bonnie is
>> limited to 2GB.
>
> but you have to be careful building bonnie++ since it has bad
> assumptions about which systems can do large files... eg, on
> FreeBSD it doesn't try large files unless you patch it
> appropriately (which the freebsd port does for you).
>

On platforms it thinks can't use large files, it uses multiple sets of
2GB files (sort of like our beloved PG).
--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: Slow restoration question

From
Michael Stone
Date:
On Wed, May 03, 2006 at 09:19:52AM -0400, Jeff Trout wrote:
>Bonnie++ is able to use very large datasets. It also tries to figure
>out the size you want (2x RAM) - the original bonnie is limited to 2GB.

Yes, and once you get into large datasets like that the quality of the
data is fairly poor because the program can't really eliminate cache
effects. IOW, it tries but (in my experience) doesn't succeed very well.

Mike Stone

Re: Slow restoration question

From
Scott Marlowe
Date:
On Wed, 2006-05-03 at 10:59, Michael Stone wrote:
> On Wed, May 03, 2006 at 09:19:52AM -0400, Jeff Trout wrote:
> >Bonnie++ is able to use very large datasets. It also tries to figure
> >out the size you want (2x RAM) - the original bonnie is limited to 2GB.
>
> Yes, and once you get into large datasets like that the quality of the
> data is fairly poor because the program can't really eliminate cache
> effects. IOW, it tries but (in my experience) doesn't succeed very well.

I have often used the mem=xxx arguments to lilo when needing to limit
the amount of memory for testing purposes.  Just google for limit memory
and your bootloader to find the options.
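
For example (a sketch; the exact syntax depends on your bootloader):

```shell
# lilo.conf: limit the kernel to 512 MB for the test boot
#   append="mem=512M"
# GRUB (menu.lst): add the same option to the kernel line
#   kernel /vmlinuz root=/dev/sda1 ro mem=512M
```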

Re: Slow restoration question

From
Michael Stone
Date:
On Wed, May 03, 2006 at 11:07:15AM -0500, Scott Marlowe wrote:
>I have often used the mem=xxx arguments to lilo when needing to limit
>the amount of memory for testing purposes.  Just google for limit memory
>and your bootloader to find the options.

Or, just don't worry about it. Even if you get bonnie to reflect real
numbers, so what? In general the goal is to optimize application
performance, not bonnie performance. A simple set of dd's is enough to
give you a rough idea of disk performance; beyond that you really need
to see how your disk performs with your actual workload.
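
A minimal sketch of such a dd check (the sizes here are arbitrary; for
a real test the file should be much larger than RAM, and
`conv=fdatasync` makes dd flush to disk so the write figure isn't pure
cache):

```shell
# Sequential write: 64 MB in 8 kB blocks, synced before dd reports timing
dd if=/dev/zero of=/tmp/dd_test bs=8k count=8192 conv=fdatasync

# Sequential read of the same file
dd if=/tmp/dd_test of=/dev/null bs=8k

rm /tmp/dd_test
```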

Mike Stone

Re: Slow restoration question

From
"Jim C. Nasby"
Date:
On Wed, May 03, 2006 at 01:06:06PM -0400, Michael Stone wrote:
> On Wed, May 03, 2006 at 11:07:15AM -0500, Scott Marlowe wrote:
> >I have often used the mem=xxx arguments to lilo when needing to limit
> >the amount of memory for testing purposes.  Just google for limit memory
> >and your bootloader to find the options.
>
> Or, just don't worry about it. Even if you get bonnie to reflect real
> numbers, so what? In general the goal is to optimize application
> performance, not bonnie performance. A simple set of dd's is enough to
> give you a rough idea of disk performance, beyond that you really need
> to see how your disk is performing with your actual workload.

Well, in this case the question was about random write access, which dd
won't show you.
--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Slow restoration question

From
Michael Stone
Date:
On Wed, May 03, 2006 at 01:08:21PM -0500, Jim C. Nasby wrote:
>Well, in this case the question was about random write access, which dd
>won't show you.

That's the kind of thing you need to measure against your workload.

Mike Stone

Re: Slow restoration question

From
Scott Marlowe
Date:
On Wed, 2006-05-03 at 14:26, Michael Stone wrote:
> On Wed, May 03, 2006 at 01:08:21PM -0500, Jim C. Nasby wrote:
> >Well, in this case the question was about random write access, which dd
> >won't show you.
>
> That's the kind of thing you need to measure against your workload.

Of course, the final benchmark should be your application.

But suppose you're comparing 12 or so RAID controllers over a one-week
period, you don't even have the app fully written yet, and because of
time constraints you'll need the server ready before the app is done.
You don't need perfection, but you need some idea of how the array
performs.  I maintain that both methodologies have their uses.

Note that I'm referring to bonnie++ as was an earlier poster.  It
certainly seems capable of giving you a good idea of how your hardware
will behave under load.

Re: Slow restoration question

From
Michael Stone
Date:
On Wed, May 03, 2006 at 02:40:15PM -0500, Scott Marlowe wrote:
>Note that I'm referring to bonnie++ as was an earlier poster.  It
>certainly seems capable of giving you a good idea of how your hardware
>will behave under load.

IME it gives fairly useless results. YMMV. Definitely the numbers posted
before seem bogus. If you have some way to make those figures useful in
your circumstance, great. Too often I see people taking bonnie numbers
at face value and then being surprised that they don't relate at all to
real-world performance. If your experience differs, fine.

Mike Stone

Re: Slow restoration question

From
Scott Marlowe
Date:
On Wed, 2006-05-03 at 15:53, Michael Stone wrote:
> On Wed, May 03, 2006 at 02:40:15PM -0500, Scott Marlowe wrote:
> >Note that I'm referring to bonnie++ as was an earlier poster.  It
> >certainly seems capable of giving you a good idea of how your hardware
> >will behave under load.
>
> IME it gives fairly useless results. YMMV. Definitely the numbers posted
> before seem bogus. If you have some way to make those figures useful in
> your circumstance, great. Too often I see people taking bonnie numbers
> at face value and then being surprised that they don't relate at all to
> real-world performance. If your experience differs, fine.

I think the real problem is that people use the older bonnie, which can
only work with smaller datasets, on a machine with all its memory
enabled.  That will, for certain, give meaningless numbers.

OTOH, using bonnie++ on a machine artificially limited to 256 to 512 MB
of RAM or so has given me some very useful numbers, especially if you
set the data set size to several gigabytes.

Keep in mind, the numbers listed before were likely generated on a
machine with plenty of memory using the older bonnie, so those numbers
are probably bogus.

If you've not tried bonnie++ on a limited memory machine, you really
should.  It's a quite useful tool for a simple first pass to figure out
which RAID and fs configurations should be tested more thoroughly.

Re: Slow restoration question

From
Michael Stone
Date:
On Wed, May 03, 2006 at 04:30:32PM -0500, Scott Marlowe wrote:
>If you've not tried bonnie++ on a limited memory machine, you really
>should.

Yes, I have. I also patched bonnie to handle large files and other such
nifty things before bonnie++ was forked. Mostly I just didn't get much
value out of all that, because at the end of the day optimizing for
bonnie just doesn't equate to optimizing for real-world workloads.
Again, if it's useful for your workload, great.

Mike Stone