Thread: -F option, RAM usage, more...
I would be grateful if someone could help me understand exactly how the -F option changes Postgres' behaviour. I am trying to tweak the speed at which it operates: I understand from the docs that -F ".. prevents fsync()'s from flushing to disk after every transaction.." and that this boosts performance because RAM accesses are far faster than disk accesses. I have also seen some impressive stats regarding the degree of this performance boost.

Some specific questions:

1. How often DOES PG flush to disk - if at all - when the -F option is invoked? Can this be controlled?
2. I have no first hand experience with Oracle, but I understand that one of the keys to its speed is its ability to pull the entire database (or selected tables) into RAM and work them from there. Is this comparable to Postgres' -F option?
3. With -F, does PG pull the database into RAM at startup? Or does it pull data into RAM as it is accessed (eg: the first few queries would be slower, but subsequent queries on same data would be faster...)?
4. Does the -F option speed SELECTs as well as it speeds INSERTs?
5. I have a dedicated Linux 2.2.16 db server with 2GB of RAM. How can I be sure that Postgres is using all the RAM that it can? (-S option? -B option?)
6. How does Vacuuming affect PG if it is running with -F?

Any other information regarding -F would be appreciated.

Thanks in advance.

Mike Biamonte
On Wed, 4 Oct 2000, Mike Biamonte wrote:

> I understand from the docs that -F ".. prevents fsync()'s from
> flushing to disk after every transaction.." and that this boosts
> performance because RAM accesses are far faster than disk accesses. I
> have also seen some impressive stats regarding the degree of this
> performance boost.

Normally, in order to ensure the integrity of the database/datafiles, PostgreSQL calls fsync() after each transaction. This ensures that all disk buffers are flushed to disk, so that any changes made in that transaction are committed to disk.

> 1. How often DOES PG flush to disk - if at all - when the -F option is
> invoked? Can this be controlled?

Once after each transaction.

> 2. I have no first hand experience with Oracle, but I understand that
> one of the keys to its speed is its ability to pull the entire
> database (or selected tables) into RAM and work them from there. Is
> this comparable to Postgres' -F option?

No. As far as I know PostgreSQL can't load an entire table into memory beyond what could/would/might be cached in-memory as part of the OS's disk caching.

> 3. With -F, does PG pull the database into RAM at startup? Or does it
> pull data into RAM as it is accessed. (eg: the first few queries
> would be slower, but subsequent queries on same data would be
> faster...)?

No.

> 4. Does the -F option speed SELECTs as well as it speeds INSERTs?

I'd wager that fsync() is only called after INSERT/UPDATE/DELETE/ALTERs, since these are the only ones that modify the on-disk data, although I could be wrong. So no, it wouldn't speed up SELECTs.

> 5. I have a dedicated Linux 2.2.16 db server with 2GB of RAM. How can
> I be sure that Postgres is using all the RAM that it can? (-S
> option? -B option?)

Adjust your -S and -B options appropriately - not that I have any "recommended values" :(

-- 
Dominic J. Eidson
"Baruk Khazad! Khazad ai-menu!" - Gimli
-------------------------------------------------------------------------------
http://www.the-infinite.org/ http://www.the-infinite.org/~dominic/
> > 1. How often DOES PG flush to disk - if at all - when the -F option is
> > invoked? Can this be controlled?
>
> Once after each transaction.

That's what it does when -F is *not* used, right? -F disables calling fsync() after each transaction, right?

-Mitch
On Wed, 4 Oct 2000, Mitch Vincent wrote:

> That's what it does when -F is *not* used, right? -F disables calling
> fsync() after each transaction, right?..

Yes, sorry - when -F is _not_ used, fsync() _is_ called after each transaction - when -F _is_ used, fsync() is _not_ called. Sorry for the mixup.

-- 
Dominic J. Eidson
On Wed, Oct 04, 2000 at 02:09:47PM -0400, Mike Biamonte wrote:

> I understand from the docs that -F ".. prevents fsync()'s from
> flushing to disk after every transaction.." and that this boosts
> performance because RAM accesses are far faster than disk accesses. I
> have also seen some impressive stats regarding the degree of this
> performance boost.

Correct me if I'm wrong, but I believe that when you specify '-F', it allows the filesystem to buffer I/O operations, performing several operations one after another. This is much faster than with fsync(), where the disk heads have to be moved frequently. Also, allowing the I/O subsystem to buffer some data will speed subsequent accesses of it, until the buffer is flushed.

HTH,

Neil

-- 
Neil Conway <neilconway@home.com>
Get my GnuPG key from: http://klamath.dyndns.org/mykey.asc
Encrypted mail welcomed

Violence is to dictatorship as propaganda is to democracy.
-- Noam Chomsky
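The cost Neil describes can be measured outside Postgres. The following is a minimal sketch, not Postgres code: it treats each small append as one "transaction" and either calls fsync() after it or leaves the data in the kernel's buffers. The function names, file names and record format are made up for illustration, and the timings depend heavily on the disk, filesystem and kernel.

```python
import os
import tempfile
import time


def run_transactions(path, n_transactions, use_fsync):
    """Append one small record per 'transaction'; optionally fsync after each."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for i in range(n_transactions):
            os.write(fd, b"row %06d\n" % i)  # hands the data to the kernel
            if use_fsync:
                os.fsync(fd)  # blocks until the kernel reports the data is on disk
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(n_transactions=200):
    """Return (seconds_with_fsync, seconds_without_fsync, files_identical)."""
    with tempfile.TemporaryDirectory() as tmp:
        synced = os.path.join(tmp, "synced.dat")      # hypothetical file names
        buffered = os.path.join(tmp, "buffered.dat")
        t_sync = run_transactions(synced, n_transactions, use_fsync=True)
        t_buf = run_transactions(buffered, n_transactions, use_fsync=False)
        with open(synced, "rb") as a, open(buffered, "rb") as b:
            same = a.read() == b.read()
    return t_sync, t_buf, same


if __name__ == "__main__":
    t_sync, t_buf, same = compare()
    print("with fsync:    %.4f s" % t_sync)
    print("without fsync: %.4f s" % t_buf)
    print("same contents: %s" % same)
```

Both runs leave identical file contents as long as nothing crashes; the only difference is when the bytes are guaranteed to be on disk, which is the safety question raised later in the thread.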
Hmm, it seems we all know just enough to be dangerous :-)

I have seen many threads on the "to fsync() or not to fsync()" and overwhelmingly people have come out and said that to not fsync() is A Bad Thing(TM). -- If Neil is right then it being bad or not is going to depend very much on the filesystem (I think)...

Now I'm pretty confused (as I'm sure others are) -- can someone that knows beyond a reasonable doubt beat us with a clue stick on this? Are we taking a huge risk if we use -F and disable fsync() or no?

-Mitch

----- Original Message -----
From: "Neil Conway" <nconway@klamath.dyndns.org>
To: <pgsql-general@hub.org>
Sent: Wednesday, October 04, 2000 1:24 PM
Subject: Re: [GENERAL] -F option, RAM usage, more...

On Wed, Oct 04, 2000 at 02:09:47PM -0400, Mike Biamonte wrote:

> I understand from the docs that -F ".. prevents fsync()'s from
> flushing to disk after every transaction.." and that this boosts
> performance because RAM accesses are far faster than disk accesses. I
> have also seen some impressive stats regarding the degree of this
> performance boost.

Correct me if I'm wrong, but I believe that when you specify '-F', it allows the filesystem to buffer I/O operations, performing several operations one after another. This is much faster than with fsync(), where the disk heads have to be moved frequently. Also, allowing the I/O subsystem to buffer some data will speed subsequent accesses of it, until the buffer is flushed.

HTH,

Neil
When I used Postgres on Linux, I found the following happened when the system failed in the middle of transactions:

* ext2 + fsync: file system screwed-up, db OK
* ext2 - fsync: much faster than above, file system screwed-up, db needed to be restored sometimes
* reiserfs + fsync: as fast as ext2 without fsync, file system OK, db OK
* reiserfs - fsync: no noticeable difference in speed from above, file system OK, db had to be restored *every time*

Now I use FreeBSD... Can't comment on the various configurations yet, but to the folks concerned with memory issues, take note of this: FreeBSD manages virtual memory and disk caches much better than Linux. It will even kick idle processes out of memory to make room for disk cache, something I never saw when working with Linux.

----- Original Message -----
From: "Mitch Vincent" <mitch@venux.net>
To: <pgsql-general@hub.org>; "Neil Conway" <nconway@klamath.dyndns.org>
Sent: Wednesday, October 04, 2000 5:40 PM
Subject: Re: [GENERAL] -F option, RAM usage, more...

> Hmm, it seems we all know just enough to be dangerous :-)
>
> I have seen many threads on the "to fsync() or not to fsync()" and
> overwhelmingly people have come out and said that to not fsync() is A Bad
> Thing(TM). -- If Neil is right then it being bad or not is going to depend
> very much on the filesystem (I think)...
>
> Now I'm pretty confused (as I'm sure others are) -- can someone that knows
> beyond a reasonable doubt beat us with a clue stick on this? Are we taking
> a huge risk if we use -F and disable fsync() or no?
>
> -Mitch
"Mitch Vincent" <mitch@venux.net> writes:

> Now I'm pretty confused (as I'm sure others are) -- can someone that knows
> beyond a reasonable doubt beat us with a clue stick on this? Are we taking
> a huge risk if we use -F and disable fsync() or no?

Postgres will write() modified pages out to the kernel at transaction commit, -F or no. The difference is whether it then issues an fsync() to force the kernel to write all the modified pages to disk before it believes the transaction is committed.

With -F (ie, no fsync) the changes are out of the application and into the kernel's disk buffers, but not necessarily physically down on disk, at the time Postgres updates pg_log to show the transaction as committed. If you have a subsequent system crash then it's possible that the pg_log update got written out but only some of the data pages modified by the transaction got written --- in which case you have an inconsistent DB, because the transaction's effects weren't all-or-nothing like they're supposed to be. The behavior would depend on exactly what order the kernel chose to flush dirty buffers out to disk.

If you're not using -F then Postgres fsync()s all the data files it's touched, then writes pg_log, then fsync()s pg_log. If the kernel respects fsync 100% then this should guarantee atomic effects of a transaction: pg_log will not show the transaction as committed unless all the data changes it made are safely down on disk.

If you have a reliable kernel, reliable power (ie a UPS) and aren't too worried about hardware failures then there's no good reason to insist on fsync. A Postgres server crash wouldn't mess up already-committed data, since that data is safely out of the server and into the hands of the kernel. However, if you don't want to trust the kernel (+hardware) to get the data it's accepted down to disk sooner or later, then you'd better be using fsync.

It's interesting that someone reported reiserfs to show different behavior on crash than ext2. I'd have thought this was mainly an issue of what order the kernel chose to flush dirty buffers in, which doesn't seem like it'd depend on the filesystem organization ... but maybe it does.

regards, tom lane
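The ordering Tom describes (write the data pages, fsync the data files, write the commit record, fsync the log) can be sketched in a few lines. This is only an illustration of that sequence under stated assumptions, not Postgres source: the file names, the "commit N" record format and the use_fsync parameter (standing in for running without -F) are all hypothetical.

```python
import os
import tempfile


def commit(data_path, log_path, pages, xid, use_fsync=True):
    """Write data pages, then the commit record, in the order described above."""
    data_fd = os.open(data_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    log_fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        # Step 1: write() the modified pages to the kernel (done with or without -F).
        for page in pages:
            os.write(data_fd, page)
        # Step 2: without -F, force the data file to disk before touching the log.
        if use_fsync:
            os.fsync(data_fd)
        # Step 3: record the transaction as committed.
        os.write(log_fd, b"commit %d\n" % xid)
        # Step 4: without -F, force the commit record to disk as well.
        if use_fsync:
            os.fsync(log_fd)
    finally:
        os.close(data_fd)
        os.close(log_fd)


def demo():
    """Run one commit with fsync and one without; return both files' contents."""
    with tempfile.TemporaryDirectory() as tmp:
        data_path = os.path.join(tmp, "table.dat")  # hypothetical stand-in for a data file
        log_path = os.path.join(tmp, "pg_log.dat")  # hypothetical stand-in for pg_log
        commit(data_path, log_path, [b"page-1\n", b"page-2\n"], xid=1, use_fsync=True)
        commit(data_path, log_path, [b"page-3\n"], xid=2, use_fsync=False)
        with open(data_path, "rb") as d, open(log_path, "rb") as log:
            return d.read(), log.read()


if __name__ == "__main__":
    data, log = demo()
    print(data)
    print(log)
```

With use_fsync=False both write() calls still happen, so a reader sees the same bytes while the system stays up. What is lost is the guarantee that the data pages reach the disk before the commit record does, which is the inconsistency window described above.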