Thread: Re: [GENERAL] Upgrade to dual processor machine?

Re: [GENERAL] Upgrade to dual processor machine?

From: "Henrik Steffen"

hi steve,

why fsync? - what's fsync? never heard of it... google tells
me something about syncing of remote hosts ... so why should I
activate it ?? ... I conclude, it's probably disabled because
I don't know what it is ....

It's a RAID-1 IDE system.

--

Kind regards,

Henrik Steffen
Managing Director

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------

----- Original Message -----
From: "Steve Wolfe" <nw@codon.com>
To: <pgsql-general@postgresql.org>
Sent: Thursday, November 14, 2002 7:46 PM
Subject: Re: [GENERAL] Upgrade to dual processor machine?


> > The cache-field is saying 873548K cached at the moment
> > Is this a "whole bunch of cache" in your opinion? Is it too much?
>
>   Too much cache?  It ain't possible. ; )
>
>   For what it's worth, my DB machine generally uses about 1.25 gigs for
> disk cache, in addition to the 64 megs that are on the RAID card, and
> that's just fine with me.  I allocate 256 megs of shared memory (32768
> buffers), and the machine hums along very nicely.  vmstat shows that
> actual reads to the disk are *extremely* rare, and the writes that come
> from inserts/etc. are nicely buffered.
>
>   Here's how I chose 256 megs for shared buffers:  First, I increased the
> shared buffer amount until I didn't see any more performance benefits.
> Then I doubled it just for fun. ; )
>
>   Again, in your message it seemed like you were doing quite a bit of
> writes - have you disabled fsync, and what sort of disk system do you
> have?
>
> steve
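
The "256 megs (32768 buffers)" figure quoted above works out if each shared buffer is one 8 KB page. That page size is PostgreSQL's default block size, but it is an assumption here, since the thread does not state it. A small sketch of the arithmetic, with hypothetical helper names:

```python
# Assumption: 8 KB per shared buffer (PostgreSQL's default block size).
BLOCK_SIZE_BYTES = 8192


def buffers_to_megabytes(n_buffers, block_size=BLOCK_SIZE_BYTES):
    """Convert a count of shared buffers to megabytes."""
    return n_buffers * block_size / (1024 * 1024)


def megabytes_to_buffers(megabytes, block_size=BLOCK_SIZE_BYTES):
    """Convert a memory budget in megabytes to a whole number of buffers."""
    return megabytes * 1024 * 1024 // block_size


print(buffers_to_megabytes(32768))  # 256.0
print(megabytes_to_buffers(256))    # 32768
```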


Re: [GENERAL] Upgrade to dual processor machine?

From: Doug McNaught

"Henrik Steffen" <steffen@city-map.de> writes:

> hi steve,
>
> why fsync? - what's fsync? never heard of it... google tells
> me something about syncing of remote hosts ... so why should I
> activate it ?? ... I conclude, it's probably disabled because
> I don't know what it is ....

fsync() is a system call that flushes a file's contents from the
buffer cache to disk.  PG uses it to ensure consistency in the WAL
files.  It is enabled by default.  Do NOT disable it unless you know
exactly what you are doing and are prepared to sacrifice some data
integrity for performance.

-Doug
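
To illustrate the system call Doug describes, here is a minimal sketch in Python, whose `os.fsync` wraps the same call. It is not taken from the thread, and the helper name `durable_write` and the file name are made up for the example.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes, then ask the OS to push them to stable storage."""
    with open(path, "wb") as f:
        f.write(data)          # bytes land in Python's userspace buffer
        f.flush()              # hand the bytes to the OS buffer cache
        os.fsync(f.fileno())   # ask the OS to flush this file to disk


# Hypothetical file in a scratch directory.
path = os.path.join(tempfile.mkdtemp(), "demo.dat")
durable_write(path, b"committed row\n")

with open(path, "rb") as f:
    print(f.read())
```

Without the `os.fsync` call the data would still be readable right away, because reads are served from the buffer cache; the difference only shows up if the machine loses power before the OS gets around to writing.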

Re: [GENERAL] Upgrade to dual processor machine?

From: "scott.marlowe"

On Thu, 14 Nov 2002, Henrik Steffen wrote:

>
> hi steve,
>
> why fsync? - what's fsync? never heard of it... google tells
> me something about syncing of remote hosts ... so why should I
> activate it ?? ... I conclude, it's probably disabled because
> I don't know what it is ....
>
> it's a raid-1 ide system

fsync is enabled by default.  fsync flushes disk buffers after every
write.  Turning it off lets the OS flush buffers at its leisure.  Setting
fsync=false will often double the write performance, and since writes are
running faster, there's more bandwidth for the reads as well, so
everything goes faster.

Definitely look at putting your data onto an Ultra160 SCSI 15krpm RAID1
set.  My dual 80 Gig Ultra100 IDEs can get about 30 Megs a second in a
RAID1 for raw reads under bonnie++, while my pair of Ultra80 10krpm 18 gig
SCSI drives can get about 48 Megs a second raw read.

Plus SCSI is usually MUCH faster for writes than IDE.


Re: [GENERAL] Upgrade to dual processor machine?

From: "scott.marlowe"

On 14 Nov 2002, Doug McNaught wrote:

> "Henrik Steffen" <steffen@city-map.de> writes:
>
> > hi steve,
> >
> > why fsync? - what's fsync? never heard of it... google tells
> > me something about syncing of remote hosts ... so why should I
> > activate it ?? ... I conclude, it's probably disabled because
> > I don't know what it is ....
>
> fsync() is a system call that flushes a file's contents from the
> buffer cache to disk.  PG uses it to ensure consistency in the WAL
> files.  It is enabled by default.  Do NOT disable it unless you know
> exactly what you are doing and are prepared to sacrifice some data
> integrity for performance.

I thought the danger with WAL was minimized to the point of not being an
issue anymore.  Tom?


Re: [GENERAL] Upgrade to dual processor machine?

From: Tom Lane

"scott.marlowe" <scott.marlowe@ihs.com> writes:
> On 14 Nov 2002, Doug McNaught wrote:
>> fsync() is a system call that flushes a file's contents from the
>> buffer cache to disk.  PG uses it to ensure consistency in the WAL
>> files.  It is enabled by default.  Do NOT disable it unless you know
>> exactly what you are doing and are prepared to sacrifice some data
>> integrity for performance.

> I thought the danger with WAL was minimized to the point of not being an
> issue anymore.  Tom?

Actually, more the other way 'round: WAL minimizes the cost of using
fsync, since we now only need to fsync the WAL file and not anything
else.  The risk of not using it is still data corruption --- mainly
because without fsync, we can't be certain that WAL writes hit disk
in advance of the corresponding data-page changes.  If you have a crash,
the system will replay the log as far as it can; but if there are
additional unlogged changes in the data files, you might have
inconsistencies.

I'd definitely recommend keeping fsync on in any production
installation.  For development maybe you don't care about data loss...

            regards, tom lane
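
The ordering Tom describes (the log record must be on disk before the corresponding data change) can be sketched with a toy model. This is only an illustration under stated assumptions: the key=value log format, the in-memory dict standing in for data pages, and all function names are invented here and have nothing to do with PostgreSQL's actual WAL format.

```python
import os
import tempfile


def append_durably(path, line):
    """Append one line to the log and fsync before returning."""
    with open(path, "a") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())


def logged_update(log_path, data, key, value):
    # 1. Force the log record to disk first.
    append_durably(log_path, f"{key}={value}")
    # 2. Only then change the "data page" (here just a dict in memory).
    data[key] = value


def replay(log_path):
    """Rebuild state after a crash by replaying the log in order."""
    state = {}
    if os.path.exists(log_path):
        with open(log_path) as f:
            for line in f:
                key, _, value = line.rstrip("\n").partition("=")
                state[key] = value
    return state


log_path = os.path.join(tempfile.mkdtemp(), "toy.wal")
data = {}
logged_update(log_path, data, "balance", "100")
logged_update(log_path, data, "balance", "75")

data = None  # simulate a crash: the in-memory state is lost
print(replay(log_path))  # {'balance': '75'}
```

If the `os.fsync` in `append_durably` were dropped, step 2 could reach disk before step 1 in a real system, which is exactly the "unlogged changes in the data files" case that replay cannot repair.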

Re: [GENERAL] Upgrade to dual processor machine?

From: "Steve Wolfe"

> fsync() is a system call that flushes a file's contents from the
> buffer cache to disk.  PG uses it to ensure consistency in the WAL
> files.  It is enabled by default.  Do NOT disable it unless you know
> exactly what you are doing and are prepared to sacrifice some data
> integrity for performance.

  The only issue of data integrity is in the case of an unclean shutdown,
like a power failure or a crash.  PG and my OS are reliable enough that I
trust them not to crash, and my hardware has proven itself as well.  Of
course, as you point out, if someone doesn't trust their server, they're
taking chances.

  That being said, even on other machines with fsync turned off and
unclean shutdowns (power cycles, etc.), I have yet to run into any problem
with PG's consistency, although I certainly cannot guarantee that would be
the case for anyone else!

steve


Re: [GENERAL] Upgrade to dual processor machine?

From: "Steve Wolfe"

> fsync is enabled by default.  fsync flushes disk buffers after every
> write.  Turning it off lets the OS flush buffers at its leisure.  setting
> fsync=false will often double the write performance and since writes are
> running faster, there's more bandwidth for the reads as well, so
> everything goes faster.

  "Doubling performance" is very conservative; I've seen it give more than
a tenfold increase in performance on large insert/update batches.  Of
course, the exact figure depends on a lot of hardware and OS factors.
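
Since the figure depends on hardware and OS, one rough way to get a feel for it on a given machine is to time small writes with and without an fsync after each one. The sketch below is a raw file-write comparison, not a PostgreSQL benchmark; the record count, file names, and the `timed_writes` helper are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def timed_writes(path, n_records, use_fsync):
    """Write n_records small records; optionally fsync after each one."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        for i in range(n_records):
            f.write(b"%08d\n" % i)      # 9 bytes per record
            if use_fsync:
                f.flush()
                os.fsync(f.fileno())    # force this record to disk now
    return time.perf_counter() - start


workdir = tempfile.mkdtemp()
n = 200
with_sync = timed_writes(os.path.join(workdir, "sync.dat"), n, True)
without_sync = timed_writes(os.path.join(workdir, "nosync.dat"), n, False)
print(f"fsync per record: {with_sync:.4f}s, no fsync: {without_sync:.4f}s")
```

The ratio between the two timings will vary widely between disks, controllers with battery-backed cache, and filesystems, so treat any single run as indicative at best.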

> Definitely look at putting your data onto a Ultra160 SCSI 15krpm RAID1
> set.  My dual 80 Gig Ultra100 IDEs can get about 30 Megs a second in a
> RAID1 for raw reads under bonnie++, while my pair of Ultra80 10krpm 18 gig
> scsis can get about 48 Megs a second raw read.

   If you trust the hardware, disabling fsync and using copious quantities
of cache/buffer can almost eliminate actual disk access.  My DB machine
will quickly blip the lights on the RAID array once a minute or so, but
that's about it.  All of the actual work is happening from RAM.  Of
course, with obscenely large data sets, that becomes difficult to achieve.

steve