Thread: Crash Recovery
To preface my question, we are still in the process of evaluating postgres to
determine if we want to switch our production environment over.

I'm curious about where I can find documentation about crash recovery in
postgres. In mysql, there is a nice table recovery utility (myisamchk). Is
there something similar in postgres? What do we do if a table or database
becomes corrupted? (I'm aware of backup techniques, but it isn't feasible for
some of our larger tables. We're already running on RAID 5, but can't do much
more.)

Thanks,

-N
Noah Silverman <noah@allresearch.com> writes:
> I'm curious about where I can find documentation about crash recovery in
> postgres. In mysql, there is a nice table recovery utility (myisamchk).
> Is there something similar in postgres?

There are no automated recovery tools for Postgres, because there are no
known failure modes that are systematic enough to allow automatic recovery.
We prefer to fix such bugs rather than patch around them.

There are some last-ditch tools for reconstructing indexes (REINDEX) and for
throwing away the WAL log (pg_resetxlog), but I have not seen any recent
cases where I would have felt that blind invocation of either would be a
good move.

> What do we do if a table or database becomes corrupted? (I'm aware of
> backup techniques, but it isn't feasible for some of our larger tables.

Reconsider that. If your data center burns down tonight, what is your
fallback? Ultimately, you *must* have a backup copy, or you're just not
taking the possibility of failure seriously.

			regards, tom lane
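For reference, a rough sketch of what invoking those two last-ditch tools
looks like. The table name, database name, and paths are placeholders, and,
per the advice above, neither command should be run blindly:

    # Rebuild the indexes on one suspect table (hypothetical names):
    psql mydb -c "REINDEX TABLE mytable;"

    # Throw away the WAL as an absolute last resort. Run it only with the
    # postmaster shut down, and take a filesystem-level copy of $PGDATA
    # first -- this can lose recently committed transactions.
    pg_resetxlog "$PGDATA"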
On Thu, Jan 23, 2003 at 10:32:58PM -0500, Noah Silverman wrote:
> To preface my question, we are still in the process of evaluating postgres
> to determine if we want to switch our production environment over.
>
> I'm curious about where I can find documentation about crash recovery in
> postgres. In mysql, there is a nice table recovery utility (myisamchk).

It recovers automatically. Make sure you run with fsync turned on. That
calls fsync on the WAL at the point of every COMMIT, and COMMIT isn't
finished before the fsync returns. Then, in case of a crash, the WAL just
plays back and fixes up the data area.

> Is there something similar in postgres? What do we do if a table or
> database becomes corrupted? (I'm aware of backup techniques, but it isn't

I have never had a table become corrupted under Postgres. There have been
some recent cases where people's bad hardware caused bad data to make it
into a table. Postgres's error reporting usually saves you there, because
you can go in and stomp on the bad tuple if need be. There are some
utilities to help with this; one of them, from Red Hat, allows you to look
at the binary data in various formats (it's pretty slick). I believe it's
available from sources.redhat.com/rhdb.

> feasible for some of our larger tables. We're already running on RAID 5,
> but can't do much more)

I suspect you can. First, are you using ECC memory in your production
machines? If not, start doing so. Now. It is _the most important_ thing,
aside from RAID, that you can do to protect your data. Almost every problem
of inconsistency I've seen on the lists in the past year and a bit has been
to do with bad hardware -- usually memory or disk controllers. (BTW,
redundant disk controllers, and ones with some intelligence built in so
that they check themselves, are also mighty valuable here. But memory goes
bad way more often.)

Also, I'm not sure just what you mean about backups "not being feasible"
for some of the larger tables, but you need to back up daily. Since pg_dump
takes a consistent snapshot, there's no data inconsistency trouble, and you
can just start the backup and go away. If the resulting files are too
large, use split. And if the problem is space, well, disk is cheap these
days, and so is tape, compared to having to re-get the data you lost.

A

--
Andrew Sullivan <andrew@libertyrms.info>
Liberty RMS, 204-4141 Yonge Street
Toronto, Ontario, Canada  M2P 2A8
+1 416 646 3304 x110
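A minimal sketch of the kind of nightly routine described above. The
database name, backup path, and chunk size are made up; adjust to taste:

    # fsync is on by default -- just make sure postgresql.conf does not
    # set it to false.
    #
    # Nightly consistent dump, compressed and split into manageable pieces:
    pg_dump mydb | gzip | split -b 650m - /backups/mydb-$(date +%Y%m%d).sql.gz.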
On Thu, 2003-01-23 at 21:32, Noah Silverman wrote:
> To preface my question, we are still in the process of evaluating postgres
> to determine if we want to switch our production environment over.
>
> I'm curious about where I can find documentation about crash recovery in
> postgres. In mysql, there is a nice table recovery utility (myisamchk).
> Is there something similar in postgres? What do we do if a table or
> database becomes corrupted? (I'm aware of backup techniques, but it isn't
> feasible for some of our larger tables. We're already running on RAID 5,
> but can't do much more)

Of course it's feasible!! If corporations can back up terabyte-sized
databases, then you can back up your comparatively puny DB. In fact, if
your data is vital to your company, you *must* back it up. Otherwise, poof
goes the company if the computer is destroyed.

Now, it might cost some bucks to buy a tape drive, or a multi-loader if you
have *lots* of data, but it *can* be done...

Btw, what happens if an obscure bug in the RAID controller shows its head
and starts corrupting your data? A table recovery utility wouldn't do
squat then...

--
Ron Johnson, Jr.  mailto:ron.l.johnson@cox.net
Jefferson, LA USA  http://members.cox.net/ron.l.johnson
"Fear the Penguin!!"
On Fri, 2003-01-24 at 07:22, Andrew Sullivan wrote:
> On Thu, Jan 23, 2003 at 10:32:58PM -0500, Noah Silverman wrote:
> > To preface my question, we are still in the process of evaluating
> > postgres to determine if we want to switch our production environment
> > over.
> >
> > I'm curious about where I can find documentation about crash recovery in
> > postgres. In mysql, there is a nice table recovery utility (myisamchk).
>
> It recovers automatically. Make sure you run with fsync turned on.
> That calls fsync on the WAL at the point of every COMMIT, and COMMIT
> isn't finished before the fsync returns. Then, in case of a crash,
> the WAL just plays back and fixes up the data area.

On commercial databases, there's a command to flush the roll-forward logs
to tape at intervals during the day. Thus, if the disk(s) get corrupted,
one can restore the database to new disks, then apply the on-tape
roll-forward logs to the database, and you'd have only lost a few hours of
data, instead of however many hours (or days) it's been since the last
database backup.

Also, flushing them to tape (or a different partition) ensures that they
don't fill up the partition during a particularly intensive batch job.

Are there any FM's that explain how this works in Postgres?

Thanks,
Ron

--
Ron Johnson, Jr.  mailto:ron.l.johnson@cox.net
Jefferson, LA USA  http://members.cox.net/ron.l.johnson
"Fear the Penguin!!"
Speaking about daily backups... We are running into some serious trouble
with our backup policy.

First (and less important), the size of our backups is increasing a lot;
yet the information is not changing, only being added. So, the obvious
question: is there a way to make incremental backups?

And the second (and intriguing) problem: whenever I run pg_dump, my system
*freezes* until pg_dump finishes. When I say "system", I mean the software
that is running and sending data to the PG database. It just freezes, users
are unable to connect for several minutes, and the ones already connected
think the server died, so they end up disconnecting after one or two
minutes of seeing that the server does not respond.

Is this normal? Is there any way to avoid it? (I guess if I have a solution
to the first problem -- i.e., doing incremental backups -- then that would
solve this one, since it would only "freeze" the system for a few seconds,
which wouldn't be that bad...)

Thanks for any comments!

Carlos
On Fri, Jan 24, 2003 at 08:12:19AM -0600, Ron Johnson wrote:
> On commercial databases, there's a command to flush the roll-forward
> logs to tape at intervals during the day.

[. . .]

> Are there any FM's that explain how this works in Postgres?

Not yet, because you can't do it.

There is, I understand, some code currently being included in 7.4 to do
this. So that's when it'll happen. Look for "point in time recovery" or
"PITR" on the -hackers list to see the progress.

A

--
Andrew Sullivan <andrew@libertyrms.info>
Liberty RMS, 204-4141 Yonge Street
Toronto, Ontario, Canada  M2P 2A8
+1 416 646 3304 x110
On Fri, 2003-01-24 at 10:16, Carlos Moreno wrote:
> Speaking about daily backups... We are running into some serious
> trouble with our backup policy.
>
> First (and less important), the size of our backups is increasing
> a lot; yet the information is not changing, only being added. So, the
> obvious question: is there a way to make incremental backups?

Incremental backups are coming. Some folks at Red Hat are working on
finishing a PITR implementation, and with any luck 7.4 will do what you
want.

For the time being you might be able to cheat. If you're not touching the
old data, it should come out in roughly the same order every time. You
might be able to get away with doing a diff between the new backup and an
older one, and simply store that. When restoring, you'll need to patch
together the proper restore file.

> And the second (and intriguing) problem: whenever I run pg_dump,
> my system *freezes* until pg_dump finishes. When I say "system",

No, this isn't normal -- nor do I believe it. The only explanation would be
a hardware or operating system limitation. I.e., with heavy disk usage it
used to be possible to peg the CPU -- making everything else CPU-starved --
but the advent of DMA drives put an end to that.

A pg_dump is not resource-friendly, simply due to the quantity of
information it's dealing with. Are you dumping across a network? Perhaps
the NIC is maxed out.

--
Rod Taylor <rbt@rbt.ca>
PGP Key: http://www.rbt.ca/rbtpub.asc
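A rough sketch of the diff trick described above. File, directory, and
database names are hypothetical, and it only pays off if the dump really
does come out in a stable order from one day to the next:

    # Full dump once, kept as the baseline:
    pg_dump mydb > /backups/base.sql

    # Daily: dump again and store only the difference:
    pg_dump mydb > /tmp/today.sql
    diff /backups/base.sql /tmp/today.sql > /backups/incr-$(date +%Y%m%d).diff

    # To restore, rebuild a full dump from the baseline plus one increment:
    patch -o /tmp/restore.sql /backups/base.sql /backups/incr-20030124.diff
    createdb newdb
    psql -f /tmp/restore.sql newdb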
Carlos Moreno wrote:
<snip>
> And the second (and intriguing) problem: whenever I run pg_dump,
> my system *freezes* until pg_dump finishes. When I say "system",
> I mean the software that is running and sending data to the PG
> database. It just freezes, users are unable to connect for
> several minutes, and the ones already connected think the server
> died, so they end up disconnecting after one or two minutes of
> seeing that the server does not respond.

Is there any chance that you have hardware problems? For example, a couple
of disk areas that are defective and the system is not happy about, or
maybe hard drive controller problems?

With PCs, this sort of thing generally seems to mean hardware problems of
some sort that are being triggered by PostgreSQL having to run through the
entire dataset. Could be caused by I/O load, could be caused by hard drive
errors, etc.?

> Is this normal?

No. Out of curiosity, which operating system are you using? :-(

Regards and best wishes,

Justin Clift

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
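If hardware is a suspect, a couple of quick, non-invasive checks on a Linux
box. The device name is a guess, and smartctl assumes the smartmontools
package is installed and that the controller passes SMART data through:

    # Kernel messages often show failing disks or controller resets:
    dmesg | grep -iE 'error|reset|dma'

    # SMART self-assessment for the drive:
    smartctl -a /dev/hda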
On Fri, Jan 24, 2003 at 10:16:42AM -0500, Carlos Moreno wrote:
> obvious question: is there a way to make incremental backups?

Not really, at the moment. Sorry. It's supposed to be coming soon (see my
other message about PITR).

> my system *freezes* until pg_dump finishes. When I say "system",
> Is this normal? Is there any way to avoid it? (I guess if I

No, it's not normal. I think some additional digging is needed.

One thing that is important is to make sure your pg_dump doesn't cause
swapping on the machine. Causing swapping is easy if you have been too
aggressive in shared-memory allocation for the postmaster, and your OS is
careless about who gets to be a candidate for paging. (Solaris 7.1 without
priority paging was subject to this problem, for instance.)

A

--
Andrew Sullivan <andrew@libertyrms.info>
Liberty RMS, 204-4141 Yonge Street
Toronto, Ontario, Canada  M2P 2A8
+1 416 646 3304 x110
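A quick way to check for that while the dump runs. The shared_buffers grep
is just a sanity check against physical RAM; thresholds are a judgment call:

    # Watch the si/so columns while pg_dump is running; sustained non-zero
    # values mean the machine is paging:
    vmstat 1

    # How much shared memory the postmaster was told to grab:
    grep shared_buffers "$PGDATA/postgresql.conf"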
On Fri, 2003-01-24 at 09:48, Andrew Sullivan wrote:
> On Fri, Jan 24, 2003 at 08:12:19AM -0600, Ron Johnson wrote:
>
> > On commercial databases, there's a command to flush the roll-forward
> > logs to tape at intervals during the day.
>
> [. . .]
>
> > Are there any FM's that explain how this works in Postgres?
>
> Not yet, because you can't do it.
>
> There is, I understand, some code currently being included in 7.4 to
> do this. So that's when it'll happen. Look for "point in time
> recovery" or "PITR" on the -hackers list to see the progress.

Great! That's a big step towards enterprise functionality.

Another big step would be aggregate functions using indexes, but that's
been discussed before...

--
Ron Johnson, Jr.  mailto:ron.l.johnson@cox.net
Jefferson, LA USA  http://members.cox.net/ron.l.johnson
"Fear the Penguin!!"
Carlos Moreno <moreno@mochima.com> writes:
> And the second (and intriguing) problem: whenever I run pg_dump,
> my system *freezes* until pg_dump finishes. When I say "system",
> I mean the software that is running and sending data to the PG
> database.

Other people have responded on the assumption that this is a performance
problem, but you should also consider the possibility that it's bad coding
of your application software. Does your app try to grab exclusive table
locks? If so, it'll sit there waiting for the pg_dump to complete.

pg_dump only takes ACCESS SHARE lock on the tables it's working on, which
is the weakest type of lock and does not conflict with most database
operations ... but it does conflict with ACCESS EXCLUSIVE lock requests.

			regards, tom lane
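To make the conflict concrete, two hypothetical commands (the table name
"games" is invented) that request ACCESS EXCLUSIVE and will therefore sit
waiting until a running pg_dump releases its ACCESS SHARE lock on that
table, stalling anything queued behind them in turn:

    # An explicit lock taken inside a transaction block:
    psql mydb -c "BEGIN; LOCK TABLE games IN ACCESS EXCLUSIVE MODE; COMMIT;"

    # Schema changes take the same lock implicitly:
    psql mydb -c "ALTER TABLE games ADD COLUMN note text;"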
Tom Lane wrote:
> Carlos Moreno <moreno@mochima.com> writes:
>> And the second (and intriguing) problem: whenever I run pg_dump,
>> my system *freezes* until pg_dump finishes. When I say "system",
>> I mean the software that is running and sending data to the PG
>> database.
>
> Other people have responded on the assumption that this is a performance
> problem, but you should also consider the possibility that it's bad
> coding of your application software. Does your app try to grab
> exclusive table locks? If so, it'll sit there waiting for the pg_dump
> to complete.

Thanks, Tom, and thanks to the others that have replied. One quick
question, Tom, before some general comments and replies to the other
messages... Where would I specify any locks the software wants to take? Is
it something you do when connecting to the database, or when executing the
query? (I ask this because, as far as I know, we're not doing any locks;
but this may just be lack of knowledge on my part -- I may be doing it
without being aware of it.) (I guess I'll check the docs, instead of asking
you guys to do my homework! :-))

Assuming that I indeed am not locking any tables, I tend to suspect that it
is a problem of excessive workload; I'd like to doubt the possibility of
defective hardware -- it's a dedicated server hired from a company that I'd
like to believe are serious guys :-) (Rackforce.com, in case someone wants
to break some bad news to me :-O )

The server is a Dual Athlon 1.8GHz, with 1GB of RAM, running Linux 7.3, and
approx. 250MB for shared buffers. I installed PostgreSQL from source
(7.2.3). It's running nothing else (I mean, no Apache, no public FTP or
downloads) -- other than our application, that is.

"vmstat -n 1" reports ZERO swaps (si and so columns) during normal
operation at peak times, and also during pg_dump. (CPU idle time is
typically around 95%, maybe going down to 70 or 80 at peak times, and drops
to approx. 40-60% while pg_dump is running -- would that be high enough
load to make the software slow down to a crawl?)

And no, as I said above, I don't think the software locks any tables -- in
fact, if you ask me, I would say *there is* bad coding in the application,
but precisely because there are no locks, no transactions (I know, shame on
me! That's near the top of the list of most important things to do...), so
that's why I was so reluctant to believe my colleague when he insisted that
the pg_dumps were "freezing" the application... I had to see it with my own
eyes, and on two different occasions, to be convinced :-(

In case this tells you something... The size of the backup files (in plain
ASCII) is around 300MB (the command is "nice pg_dump -c -f file.sql
dbname").

Any further comments will be welcome and highly appreciated. But thank you
all for the replies so far! It gives me a good starting point to do some
digging.

Thanks,

Carlos
Carlos Moreno wrote:
> And no, as I said above, I don't think the software locks
> any tables -- in fact, if you ask me, I would say *there
> is* bad coding in the application, but precisely because
> there are no locks, no transactions (I know, shame on me!
> That's near the top of the list of most important things
> to do...), so that's why I was so reluctant to believe
> my colleague when he insisted that the pg_dumps were
> "freezing" the application... I had to see it with my
> own eyes, and on two different occasions, to be convinced
> :-(
>
> In case this tells you something... The size of the
> backup files (in plain ASCII) is around 300MB (the
> command is "nice pg_dump -c -f file.sql dbname").

One thing you can do to help track this down is to place

    stats_command_string = on

in your postgresql.conf and restart the database (it may be sufficient to
tell the database to reread the config file via "pg_ctl reload").

Then, when the backup is going, run the application. When it "freezes",
connect to the database via psql as the user postgres and do a "select *
from pg_stat_activity". You'll see the list of connected processes and the
current query being executed by each, if any. Do that multiple times and
you should see the progress, if any, the application is making in terms of
database queries.

Hope this helps...

--
Kevin Brown   kevin@sysexperts.com
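Spelled out as a sketch, with a placeholder database name; the current
query only shows up in pg_stat_activity once stats_command_string is on:

    # After adding "stats_command_string = on" to postgresql.conf:
    pg_ctl reload -D "$PGDATA"

    # While the application is "frozen", repeat this a few times and watch
    # what each connected backend is doing:
    psql -U postgres mydb -c "SELECT * FROM pg_stat_activity;"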
Carlos Moreno <moreno@mochima.com> writes:
> Tom Lane wrote:
>> Other people have responded on the assumption that this is a performance
>> problem, but you should also consider the possibility that it's bad
>> coding of your application software. Does your app try to grab
>> exclusive table locks?

> One quick question, Tom, before some general comments and
> replies to the other messages... Where would I specify any
> locks the software wants to take?

If you are not issuing any explicit "LOCK" SQL commands, then you can
disregard my theory.

			regards, tom lane
I said:
> Carlos Moreno <moreno@mochima.com> writes:
>> One quick question, Tom, before some general comments and
>> replies to the other messages... Where would I specify any
>> locks the software wants to take?

> If you are not issuing any explicit "LOCK" SQL commands, then you can
> disregard my theory.

Actually, that's too simple. Are you creating and dropping tables, or
issuing schema-change commands (such as ADD COLUMN or RENAME)? All of those
things take exclusive locks on the tables they modify. Ordinary
SELECT/INSERT/UPDATE/DELETE operations can run in parallel with pg_dump,
but messing with the database structure is another story.

I guess the real question here is whether your app is actually stopped dead
(as it would be if waiting for a lock), or just slowed to a crawl (as a
performance problem could do). I cannot tell if your "frozen" description
is hyperbole or literal truth.

One thing that might help diagnose it is to look at the output of ps auxww
(or ps -ef on SysV-ish platforms) to see what all the backends are
currently doing while the problem exists.

			regards, tom lane
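A concrete version of that check; the user and database names in the
comment are made up:

    # One line per backend, with its current command in the process title:
    ps auxww | grep postgres:

    # A backend blocked on a lock shows "waiting" in its title, roughly:
    #   postgres: carlos mydb [local] UPDATE waiting
    ps auxww | grep waiting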
On Fri, 24 Jan 2003, Carlos Moreno wrote:
> The server is a Dual Athlon 1.8GHz, with 1GB of RAM,
> running Linux 7.3, and approx. 250MB for shared buffers.
> ...
> In case this tells you something... The size of the
> backup files (in plain ASCII) is around 300MB (the
> command is "nice pg_dump -c -f file.sql dbname").

I was going to ask you to check your disk I/O statistics, but that tells me
that disk I/O is probably not the problem. If the ASCII dump file (I assume
by "plain ASCII" you mean uncompressed as well) is only 300 MB, your
database size is likely well under 100 MB. In which case the entire
database ought to be residing in the buffer cache, and you should see
maximum CPU utilisation during the dump, and not too much disk I/O. (This
is, however, assuming that that's the only database on your machine. You
don't have another 250 GB database that gets lots of random access hiding
there, do you? :-))

On a big machine like that, with such a small database, you should be able
to do a dump in a couple of minutes with little noticeable impact on the
performance of clients. I would probably start by carefully tracing what
your clients are doing during backup, and where they're blocking.

cjs
--
Curt Sampson <cjs@cynic.net>  +81 90 7737 2974  http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
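One way to confirm that picture while a dump is running. iostat assumes the
sysstat package is installed, and exact column names vary by version:

    # A CPU-bound dump shows high user/system CPU with little disk traffic:
    iostat -x 1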
On Fri, Jan 24, 2003 at 08:29:41PM -0500, Tom Lane wrote:
> auxww (or ps -ef on SysV-ish platforms) to see what all the backends are

Except Solaris, where ps -ef gives you no information at all. Use
/usr/ucb/ps -auxww.

A

--
Andrew Sullivan <andrew@libertyrms.info>
Liberty RMS, 204-4141 Yonge Street
Toronto, Ontario, Canada  M2P 2A8
+1 416 646 3304 x110
Tom Lane wrote:
> I said:
>> Carlos Moreno <moreno@mochima.com> writes:
>>> One quick question, Tom, before some general comments and
>>> replies to the other messages... Where would I specify any
>>> locks the software wants to take?
>
>> If you are not issuing any explicit "LOCK" SQL commands, then you can
>> disregard my theory.

Well, it was a good thing that you brought it to my attention. Yes, two
minutes after I wrote the message I found the docs that told me it is an
SQL command -- which means that I'm positively sure that I'm not doing any
of those :-) I guess a well-developed piece of software could use some
locks here and there, and the risk of making a mistake and "over-blocking"
things is there...

> Actually, that's too simple. Are you creating and dropping tables,
> or issuing schema-change commands (such as ADD COLUMN or RENAME)?
> All of those things take exclusive locks on the tables they modify.
> Ordinary SELECT/INSERT/UPDATE/DELETE operations can run in parallel with
> pg_dump, but messing with the database structure is another story.

I do that (changing the database schema while the system is running) once
in a while -- but not on a regular basis, and definitely never during the
time a pg_dump is in progress (*that* would have scared me to death ;-)).

> I guess the real question here is whether your app is actually stopped
> dead (as it would be if waiting for a lock), or just slowed to a crawl
> (as a performance problem could do). I cannot tell if your "frozen"
> description is hyperbole or literal truth.

Actually, you got me on that one... From the "practical" point of view, you
could say it's the literal truth (i.e., the system's responsiveness falls
to ZERO). The system is an online multi-player game, where the main things
the database is doing are holding the users' information to process login
authentications, and logging results and the progress of games (to later --
offline -- compute statistics, rankings, etc.). Logging is done on a
separate worker thread, so it shouldn't matter if that stops for a few
minutes (the list of SQLs pending execution would just grow during that
time)...

But the thing is, when I run pg_dump, the games freeze, you are absolutely
unable to connect (the server does not respond, period), and the players
that are in a game, playing, massively abandon games, and you then see
comments in the chat window that the server went down, etc. (i.e., I take
it the server stopped responding to them and they abandoned, thinking that
the connection had dropped or that the server had died).

Now, I guess a more specific answer to your question is important (i.e., is
the above behaviour the result of the system slowing to a crawl, or is it
that the software just hung on a single db.Exec statement in the main loop,
with not a single line of code being executed until the pg_dump finishes?
-- according to the comments so far, I would say this last option is not
possible), and I think I'll get such an answer when running some tests as
suggested by you and the others who replied.

> One thing that might help diagnose it is to look at the output of ps
> auxww (or ps -ef on SysV-ish platforms) to see what all the backends are
> currently doing while the problem exists.

We have done (IIRC) top (the command "top", that is), and yes, the
postmaster process takes a lot of CPU... (not sure of the exact numbers,
but it was at the top).

Anyway, thanks again guys for the valuable comments and ideas!!

Carlos
On Sat, Jan 25, 2003 at 10:22:51AM -0500, Carlos Moreno wrote:
> Well, it was a good thing that you brought it to my attention.
> Yes, two minutes after I wrote the message I found the docs
> that told me it is an SQL command -- which means that I'm
> positively sure that I'm not doing any of those :-) I guess

If you have a lot of foreign keys and are doing long-running UPDATEs (or
other related things), I think you might also see the same thing. You
could spot this with ps -auxww (or friends) by looking for [some db
operation] waiting.

A

--
Andrew Sullivan <andrew@libertyrms.info>
Liberty RMS, 204-4141 Yonge Street
Toronto, Ontario, Canada  M2P 2A8
+1 416 646 3304 x110