Thread: How to determine a database is intact?
On a nightly basis, we shut the database down and do a file system backup.

A short chronology of our database problem:

8/21 - count(*) of user tables succeeded (done once a week to get statistics).

8/23 - A specific search on a specific value (one out of over 2 million) caused postmaster to SEGV. I dropped the index in question and rebuilt it. All appeared ok.

8/28 - count(*) failed - postmaster aborted on a SEGV. I attempted a pg_dumpall. Postmaster aborted with SEGV about 240 MB into the dump (the database is about 50 GB). I reloaded the 8/25 database and attempted a pg_dumpall. Same failure. I reloaded from the 8/21 file system dump. Pg_dumpall worked on this one. I spent the next almost 48 hours dumping and reloading 240 million rows and reprocessing the several million additions since 8/21. I didn't dare use the 8/21 database without reloading because I didn't know if it was good or not.

Question: How can we tell that a database is intact? In the above example, pg_dumpall worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23, or was it already corrupt and got worse? Pg_dumpall tells you nothing about the condition of indexes. Could a corrupt index corrupt data blocks?

I'm looking at doing a pg_dumpall on a weekly basis so that we have a point in time where we know we have a recoverable database. When the database reaches several hundred GB and over a billion rows, this isn't a great solution, and doesn't address the overall database integrity.

Back to the original question... How can I verify the complete integrity of a database - especially a very large one where a reload or full index rebuild could take on the order of days?

Wes

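One cheap, if incomplete, check along these lines is to force a full sequential read of every table by dumping to /dev/null; it exercises every heap tuple but says nothing about indexes. A minimal sketch (the log path is made up), not a guarantee of integrity:

    #!/bin/sh
    # Weekly smoke test: make the postmaster read and format every row.
    # A SEGV or ERROR here points at damaged heap data; indexes are NOT
    # exercised by this test.
    LOG=/var/log/pgcheck.$(date +%Y%m%d).log

    if pg_dumpall > /dev/null 2> "$LOG"; then
        echo "pg_dumpall completed cleanly" >> "$LOG"
    else
        echo "pg_dumpall FAILED - investigate immediately" >> "$LOG"
        exit 1
    fi
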
Hmm. I do a nightly dump of our production database, archive a copy offsite, and verify the quality of the dump by running a little verification script that is little more than a restore. But if it would take you more than a day to do that, I'm not sure.

-tfo

On Sep 2, 2004, at 3:30 PM, Wes wrote:

> Back to the original question... How can I verify the complete integrity of
> a database - especially a very large one where a reload or full index
> rebuild could take on the order of days?
>
> Wes

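Such a verification script can be little more than a restore into a scratch cluster that is thrown away afterwards. A minimal sketch, assuming the dump came from pg_dumpall and that a separate verification cluster is listening on port 5433 (the port and paths are made up):

    #!/bin/sh
    # Replay last night's dump into a throwaway cluster and report errors.
    DUMP=/backups/nightly/pg_dumpall.sql
    ERRLOG=/tmp/restore_check.log

    # Any ERROR lines indicate a bad dump, or data that no longer
    # satisfies its column types or constraints.
    psql -p 5433 -f "$DUMP" template1 2>&1 | grep -i error > "$ERRLOG"

    if [ -s "$ERRLOG" ]; then
        echo "Restore produced errors, see $ERRLOG"
        exit 1
    fi
    echo "Dump restored cleanly"
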
Wes wrote:
> On a nightly basis, we shut the database down and do a file system backup.
>
> A short chronology of our database problem:
[snip]
> Question:
>
> How can we tell that a database is intact? In the above example, pg_dumpall
> worked on the 8/21 database. Did it become corrupt between 8/21 and 8/23,
> or was it already corrupt and got worse? Pg_dumpall tells you nothing about
> the condition of indexes. Could a corrupt index corrupt data blocks?
>
> I'm looking at doing a pg_dumpall on a weekly basis so that we have a point
> in time where we know we have a recoverable database. When the database
> reaches several hundred GB and over a billion rows, this isn't a great
> solution, and doesn't address the overall database integrity.
>
> Back to the original question... How can I verify the complete integrity of
> a database - especially a very large one where a reload or full index
> rebuild could take on the order of days?

You shouldn't have to verify anything. PG's job is to never corrupt your data, and provided your hardware is good it should do so. If you are getting problems almost daily, that would suggest a RAM/disk problem to me (sig 11 usually implies RAM). Can't guarantee it's not PG, but its record of reliability is pretty good.

Steps I'd take:
1. Check your version number against the release notes and see if you should upgrade. You don't mention your version, but it's always worth having the last dot-release (7.2.5, 7.3.7, 7.4.5).
2. Schedule time to run memory/disk tests against your hardware. Finding 48 hours might not be easy, but you need to know where you stand.
3. Set up Slony or some other replication so you can schedule your downtime.

--
Richard Huxton
Archonet Ltd

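For step 1, the running server's exact version is easy to confirm before comparing against the release notes; note that the server you connect to may be older than your local psql binary. A trivial check (the database name is whatever your site uses):

    # Prints e.g. "PostgreSQL 7.4.1 on ..." for the running server.
    psql -c "SELECT version();" template1
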
Wes wrote:
> On 9/3/04 3:11 AM, "Richard Huxton" <dev@archonet.com> wrote:
>
>> You shouldn't have to verify anything. PG's job is to never corrupt your
>> data, and provided your hardware is good it should do so. If you are
>> getting problems almost daily, that would suggest a RAM/disk problem to
>> me (sig 11 usually implies RAM). Can't guarantee it's not PG, but its
>> record of reliability is pretty good.
>
> I believe SEGV typically just indicates it de-referenced a bad pointer (i.e.
> NULL or out of range). The problem is not occurring on a daily basis. The
> database has been in service since December of last year. It's just that
> the symptoms progressed from no apparent symptoms to a clearly corrupt DB.
> My guess is that some minor corruption fed upon itself until the DB couldn't
> even be dumped.

Or even just that block of index was never used.

>> Steps I'd take:
>> 1. Check your version number against the release notes and see if you
>> should upgrade. You don't mention your version, but it's always worth
>> having the last dot-release (7.2.5, 7.3.7, 7.4.5).
>> 2. Schedule time to run memory/disk tests against your hardware. Finding
>> 48 hours might not be easy, but you need to know where you stand.
>> 3. Set up Slony or some other replication so you can schedule your downtime.
>
> I thought I mentioned the level in my original mail - 7.4.1. We are
> planning on running some diagnostics.

Ah - first thing you can do is move to 7.4.5; that won't require a dump/reload. Do read the release notes first, though.

> Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever
> doesn't really matter to the original question. The database can become
> corrupt. How can I tell that a database is fully intact at any given point
> in time? If I reload from a system backup before the known corruption, how
> can I be sure that the original corruption that precipitated the failure is
> not still there and will again rear its ugly head?

Put bluntly, you can't. The only way to verify the database as a whole is to check every single value in it. If actual values get corrupted then you may never even notice (e.g. a text field with a single character corrupted).

However, if you dump and restore then three things can be guaranteed:
1. All values are valid for their type.
2. All indexes are rebuilt.
3. Constraints will be satisfied on all data.

Is that good enough in your case?

--
Richard Huxton
Archonet Ltd

On 9/3/04 3:11 AM, "Richard Huxton" <dev@archonet.com> wrote:

> You shouldn't have to verify anything. PG's job is to never corrupt your
> data, and provided your hardware is good it should do so. If you are
> getting problems almost daily, that would suggest a RAM/disk problem to
> me (sig 11 usually implies RAM). Can't guarantee it's not PG, but its
> record of reliability is pretty good.

I believe SEGV typically just indicates it de-referenced a bad pointer (i.e. NULL or out of range). The problem is not occurring on a daily basis. The database has been in service since December of last year. It's just that the symptoms progressed from no apparent symptoms to a clearly corrupt DB. My guess is that some minor corruption fed upon itself until the DB couldn't even be dumped.

> Steps I'd take:
> 1. Check your version number against the release notes and see if you
> should upgrade. You don't mention your version, but it's always worth
> having the last dot-release (7.2.5, 7.3.7, 7.4.5).
> 2. Schedule time to run memory/disk tests against your hardware. Finding
> 48 hours might not be easy, but you need to know where you stand.
> 3. Set up Slony or some other replication so you can schedule your downtime.

I thought I mentioned the level in my original mail - 7.4.1. We are planning on running some diagnostics.

Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever doesn't really matter to the original question. The database can become corrupt. How can I tell that a database is fully intact at any given point in time? If I reload from a system backup before the known corruption, how can I be sure that the original corruption that precipitated the failure is not still there and will again rear its ugly head?

Wes

On 9/3/04 10:14 AM, "Richard Huxton" <dev@archonet.com> wrote:

> Put bluntly, you can't. The only way to verify the database as a whole
> is to check every single value in it. If actual values get corrupted
> then you may never even notice (e.g. a text field with a single
> character corrupted).
>
> However, if you dump and restore then three things can be guaranteed:
> 1. All values are valid for their type.
> 2. All indexes are rebuilt.
> 3. Constraints will be satisfied on all data.
>
> Is that good enough in your case?

No, a dump/reload isn't feasible. Right now, it takes about 24 hours to do the actual pg_dumpall and reload. When the database peaks in size and records start being aged out, multiply that by 4 (at least). Obviously, this isn't a check you can do weekly.

I was hoping there might be a utility to scan the entire database for consistency (something akin to running ANALYZE FULL). Obviously, that would require accessing every row and every index value.

Wes

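Nothing shipped with 7.4 does that in place, but a rough approximation can be scripted from the system catalogs: force the backend to read every row of every user table and report any relation that errors out. A sketch of the idea (the database name is hypothetical, and it only proves the scans complete, not that the values are correct; index coverage would need a second, table-specific pass with enable_seqscan turned off):

    #!/bin/sh
    # Crude "fsck-lite": touch every heap row of every table in public.
    DB=mydb

    for t in $(psql -At -c "SELECT relname FROM pg_class c, pg_namespace n
                            WHERE c.relnamespace = n.oid
                              AND n.nspname = 'public'
                              AND c.relkind = 'r';" "$DB"); do
        echo "checking $t"
        psql -c "SELECT count(*) FROM \"$t\";" "$DB" > /dev/null \
            || echo "  problem reading $t"
    done
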
On 9/3/2004 10:59 AM, Wes wrote:
> On 9/3/04 3:11 AM, "Richard Huxton" <dev@archonet.com> wrote:
>
>> You shouldn't have to verify anything. PG's job is to never corrupt your
>> data, and provided your hardware is good it should do so. If you are
>> getting problems almost daily, that would suggest a RAM/disk problem to
>> me (sig 11 usually implies RAM). Can't guarantee it's not PG, but its
>> record of reliability is pretty good.
>
> I believe SEGV typically just indicates it de-referenced a bad pointer (i.e.
> NULL or out of range). The problem is not occurring on a daily basis. The
> database has been in service since December of last year. It's just that
> the symptoms progressed from no apparent symptoms to a clearly corrupt DB.
> My guess is that some minor corruption fed upon itself until the DB couldn't
> even be dumped.

Right, that's what a SIGSEGV is. And the usual reason for the bad value in that pointer is bad memory. What do you base your guess of a self-multiplying corruption on? Or is this pure handwaving in self-defense?

>> Steps I'd take:
>> 1. Check your version number against the release notes and see if you
>> should upgrade. You don't mention your version, but it's always worth
>> having the last dot-release (7.2.5, 7.3.7, 7.4.5).
>> 2. Schedule time to run memory/disk tests against your hardware. Finding
>> 48 hours might not be easy, but you need to know where you stand.
>> 3. Set up Slony or some other replication so you can schedule your downtime.
>
> I thought I mentioned the level in my original mail - 7.4.1. We are
> planning on running some diagnostics.

So you are running a release that has had 4 official bugfix releases from the vendor, on hardware that is in an unknown condition? Is the server at least configured with ECC RAM, or is the data not important enough to justify quality hardware?

> Whether there is a bug in PostgreSQL, or there was a memory hit, or whatever
> doesn't really matter to the original question. The database can become
> corrupt. How can I tell that a database is fully intact at any given point
> in time? If I reload from a system backup before the known corruption, how
> can I be sure that the original corruption that precipitated the failure is
> not still there and will again rear its ugly head?

Dump and restore. You don't need to restore onto the same server. Any test system with enough disk space would do.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

On 9/4/04 5:53 AM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
> [snip]

Well, I had a big long response composed to your snide remarks, but decided I'm not getting into a flame war with you on something that is irrelevant to the question I posed.

> Is the server at least configured with ECC RAM, or is the data not
> important enough to justify quality hardware?

As a matter of fact it does have ECC, and no errors have been reported by the system or diagnostics. It's a Dell 6650. No disk errors have been reported either (hardware RAID controller).

As it pertains to the question at hand, I don't care what caused the corruption. It's totally irrelevant to the question. Perfect hardware and perfect software don't exist. There will be hardware failures. There will be software failures. The question was whether or not there is a feasible way of determining at any given point in time that a database is fully intact. We are already now doing weekly pg_dumpall's. Doing a restore of each of those simply isn't reasonable. If there is no such integrity utility to scan the DB in place, then that's the way it is. That's the answer to my question. But quit trying to blame crappy hardware.

Wes

On 9/4/2004 2:26 PM, Wes wrote:
> On 9/4/04 5:53 AM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
>
> intact. We are already now doing weekly pg_dumpall's. Doing a restore of
> each of those simply isn't reasonable. If there is no such integrity

Why isn't doing a restore of those reasonable?

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck@yahoo.com> wrote:

> Why isn't doing a restore of those reasonable?

Because of the size and time required. Right now, it takes at least 24 hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID, etc). That is going to do nothing but increase. Extrapolating linearly the *current* load, it will take at least 4 days to load when the database size peaks. But, based on past experience, the load will increase significantly before then (probably by a factor of 2-4, to be conservative). When the database gets that large, I have to consider that the reload time may not increase linearly with the size.

If we do a pg_dumpall once a week, it will take longer to do a reload than the period between dumps. Just the pg_dump alone could easily take close to a day. It also requires that we have duplicate fully configured hardware for each copy of the database we run just to verify a pg_dumpall - if it takes that long to load, I can't share hardware. Add to that the people time to monitor the systems and the cost of 2x the hardware...

In short, I need a solution that scales to huge databases (hundreds of gigabytes to over a terabyte), not one that works just for small to medium databases. pg_dumpall is hopefully reliable and will presumably give me a snapshot that I know I can restore from if the database becomes hopelessly corrupt. But I can't individually test each one. I was hoping for a utility that would go through and verify that all indexes and data are consistent, and if not, attempt to correct them.

As for your earlier question of cascading errors, consider a file system - a type of database. If you get a file system error and correct it quickly, you usually will lose nothing. If, however, you ignore that error, it is likely to get worse over days or weeks. Other errors will crop up as a result of the bad information in the first one. At some point, the file system corruption may become so bad that it can't be recovered. Format and reload. I have seen this on NTFS, UFS, HFS/HFS+, and even ReiserFS. Journaling greatly reduces, but doesn't eliminate, this problem. There are tools that will scan your file system and guarantee its integrity, or fix the errors (or attempt to fix them) if it finds any. I was looking for something similar for a Postgres database.

Wes

Hi,

On Sat, 2004-09-04 at 22:51, Wes wrote:
> On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
>
>> Why isn't doing a restore of those reasonable?
>
> Because of the size and time required. Right now, it takes at least 24
> hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
> etc). That is going to do nothing but increase. Extrapolating linearly the
> *current* load, it will take at least 4 days to load when the database size
> peaks.
...
> As for your earlier question of cascading errors, consider a file system - a
> type of database. If you get a file system error and correct it quickly,
> you usually will lose nothing. If, however, you ignore that error, it is
> likely to get worse over days or weeks. Other errors will crop up as a
> result of the bad information in the first one. At some point, the file
> system corruption may become so bad that it can't be recovered. Format and
> reload. I have seen this on NTFS, UFS, HFS/HFS+, and even ReiserFS.
> Journaling greatly reduces, but doesn't eliminate, this problem. There are
> tools that will scan your file system and guarantee its integrity, or fix
> the errors (or attempt to fix them) if it finds any. I was looking for
> something similar for a Postgres database.

Well, with such a huge database you probably should consider different backup strategies. A filesystem with snapshot support (XFS?) could help: you can copy a state of the database at any time, so you can back up the database cluster without stopping the postmaster. Also, replication via Slony could be an option.

The best tool to verify the backup is probably the postmaster itself. I really doubt any other program would be smaller and faster :) (Filesystems provide a tool because the actual filesystem code is a kernel module.)

Regards
Tino

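As an illustration of the snapshot idea, a sketch using LVM (any snapshot-capable volume manager or filesystem would do); the volume, mount point, and backup paths are invented, and this conservative variant still stops the postmaster for the few seconds the snapshot takes, since without that the copy is only crash-consistent:

    #!/bin/sh
    # Snapshot-based file system backup (hypothetical volume names).
    pg_ctl stop -D /var/lib/pgsql/data -m fast

    # Creating the snapshot takes seconds, so the outage is short.
    lvcreate -s -L 10G -n pgsnap /dev/vg0/pgdata

    pg_ctl start -D /var/lib/pgsql/data

    # Copy the frozen image at leisure, then drop the snapshot.
    mount /dev/vg0/pgsnap /mnt/pgsnap
    tar czf /backups/pgdata-$(date +%Y%m%d).tar.gz -C /mnt/pgsnap .
    umount /mnt/pgsnap
    lvremove -f /dev/vg0/pgsnap
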
On 9/4/04 5:28 PM, "Tino Wildenhain" <tino@wildenhain.de> wrote:

> Well, with such a huge database you probably should consider
> different backup strategies. A filesystem with snapshot support (XFS?)
> could help: you can copy a state of the database at any time, so you
> can back up the database cluster without stopping the postmaster.
> Also, replication via Slony could be an option.

Yes, we are looking into using file system snapshots. We are currently using primarily file system backups (shut down the DB, back up the file system). The problem we ran into was that we didn't have a specific point in time where we knew with absolute certainty the backed-up database was good - snapshots would not help here.

I ended up starting with a recent backup and working backwards until I found one that wouldn't crash postmaster on a pg_dumpall. Rather than trust that there was no corruption in that version (data blocks might be good, but pg_dumpall doesn't test indexes), I did a pg_dumpall and reload.

> The best tool to verify the backup is probably the postmaster
> itself. I really doubt any other program would be smaller and
> faster :)

Not really... Postmaster won't tell you if a structure is bad until it stumbles on it and crashes (or politely reports an error). Just because postmaster comes up doesn't mean your database is good. As far as I know, there is no "verify database" command option on postmaster - postmaster won't fsck your database.

Wes

> > Why isn't doing a restore of those reasonable?
>
> Because of the size and time required. Right now, it takes at least 24
> hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
> etc). That is going to do nothing but increase.

It seems to me then that you need a better primary solution.
--
Mike Nolan

On 9/5/04 9:41 AM, "Scott Ribe" <scott_ribe@killerbytes.com> wrote:

> There was this:
>
> <http://svana.org/kleptog/pgsql/pgfsck.html>
>
> Never used it, can't comment on it, doesn't appear to have been updated for
> 7.4. But it might be a jumping-off point.

The author of this dropped me a line last night. It looks like the type of thing I was looking for, but doesn't quite work with 7.4. It appears the changes necessary are probably minor, but I haven't had a chance to dig into it yet. It also does not walk the index trees, but that could be added. As you say, it looks like it might be a good starting point.

Wes

> I was looking for
> something similar for a Postgres database.

There was this:

<http://svana.org/kleptog/pgsql/pgfsck.html>

Never used it, can't comment on it, doesn't appear to have been updated for 7.4. But it might be a jumping-off point.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 665-7007 voice

>> Why isn't doing a restore of those reasonable?
>
> Because of the size and time required. Right now, it takes at least 24
> hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,
> etc). That is going to do nothing but increase. Extrapolating linearly
> the

I understand you restore to a test server... Can you configure your test server with no fsync, and since pg will have only one connection, bump sort_mem and the like? Will the times be reasonable then?

Doesn't a successful pg_dumpall prove the database is intact?

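Along those lines, a restore onto a throwaway verification server can be sped up considerably by relaxing settings that only matter for durability or for concurrent users. A sketch of 7.4-style overrides; the scratch data directory, port, and values are illustrative, and fsync = false is only tolerable because the restored copy is disposable:

    # Append overrides to the scratch cluster's postgresql.conf
    # (later entries win, so this is a quick way to experiment).
    cat >> /scratch/pgdata/postgresql.conf <<'EOF'
    fsync = false               # disposable data, skip WAL flushes
    sort_mem = 262144           # 256 MB per sort/index build (7.x, in kB)
    checkpoint_segments = 50    # fewer checkpoints during bulk load
    EOF

    pg_ctl restart -D /scratch/pgdata -m fast
    psql -p 5433 -f /backups/nightly/pg_dumpall.sql template1
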
On 9/4/2004 4:51 PM, Wes wrote:
> On 9/4/04 2:42 PM, "Jan Wieck" <JanWieck@yahoo.com> wrote:
>
>> Why isn't doing a restore of those reasonable?
>
> Because of the size and time required. Right now, it takes at least 24
> hours, with a full hardware configuration (multi-CPU, 8 disk SCSI RAID,

24 hours to do what? The pg_dumpall, the restore, or both? And what media is used to save the dump? Same disk array, tape, separate disk array on a separate controller, or a separate archive server? What is the bottleneck?

Planning the backup solution out of the blue, I would have a separate archive system. This system is connected via gigabit ethernet and has a storage array large enough to hold at least 2 full dumps. It also has the tape drive(s) to write off the dumps from the array for storage in an alternate location. Preferably, I would have a little more time to finish the still-missing features in Slony-I and take the dumps off a replica :-)

Since another purpose is to verify that a dump is consistent, the archive system would have another simple stripe set that can hold the entire database, plus a ramdisk holding the xlog.

PostgreSQL will most probably not have any kind of MyISAMcheck or however that utility is called. IIRC RedHat has something that can check offline for the integrity of a single heap. But I don't know if it is still under development or if they realized that this is a dead end because of the offline limitation.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #

> Doesn't a successful pg_dumpall prove the database is intact?

It is very difficult to prove a negative, that there are no errors. What a successful pg_dumpall says is that pg_dumpall didn't find any errors, not that there were none present.

Also, I thought part of the question was how to determine if the output from the pg_dumpall was itself intact without doing a complete restore.
--
Mike Nolan

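Verifying the dump file itself, as opposed to the database it came from, is the easy part: record a checksum when the dump is taken and re-check it before relying on the file. A minimal sketch with made-up paths:

    #!/bin/sh
    # At backup time: dump, then store a checksum alongside the file.
    DUMP=/backups/pg_dumpall.$(date +%Y%m%d).sql
    pg_dumpall > "$DUMP"
    md5sum "$DUMP" > "$DUMP.md5"

    # Later (after copying to tape or another host): confirm the bytes
    # are unchanged.  This says nothing about the database the dump was
    # taken from, only that the dump file itself is intact.
    md5sum -c "$DUMP.md5" || echo "dump file is damaged or truncated"
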
Thread: So, would it make sense to do something like a CRC on each record? (was How to determine a database is intact?)
Pardon me for this, but it's Monday morning here and I'm in shallow thinking mode.

I was just looking over Wes's questions about checking database integrity, and the responses given, and I'm wondering if keeping a live CRC on each record would make sense? Or is this already done, or maybe an option? (Not that it would in any way solve Wes's company's current problem, ...)

--
Joel <rees@ddcom.co.jp>

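Nothing like that is built in (see the TODO item Bruce mentions below), but an application can approximate a per-record CRC today with an extra column maintained by a trigger, at some cost on every write. A sketch using md5() (available in 7.4) and plpgsql on an invented messages(id, sender, body) table; it assumes plpgsql is already installed in the database:

    psql mydb <<'SQL'
    -- Hypothetical table; adapt the column list to the real schema.
    ALTER TABLE messages ADD COLUMN row_crc text;

    CREATE OR REPLACE FUNCTION messages_crc() RETURNS trigger AS '
    BEGIN
        -- Checksum the payload columns at write time.
        NEW.row_crc := md5(NEW.id::text || ''|'' ||
                           coalesce(NEW.sender, '''') || ''|'' ||
                           coalesce(NEW.body,   ''''));
        RETURN NEW;
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER messages_crc_trig
        BEFORE INSERT OR UPDATE ON messages
        FOR EACH ROW EXECUTE PROCEDURE messages_crc();

    -- Verification pass: any row whose stored checksum no longer matches
    -- the recomputed value was altered outside the trigger path.
    SELECT id FROM messages
     WHERE row_crc <> md5(id::text || '|' ||
                          coalesce(sender, '') || '|' ||
                          coalesce(body,   ''));
    SQL
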
On 9/5/04 9:04 AM, "Jan Wieck" <JanWieck@yahoo.com> wrote:

> 24 hours to do what? The pg_dumpall, the restore or both?

There's more than 250 million rows. If I remember right, it's ballpark 25% data reload, 75% index/foreign constraint rebuild. Pg_dumpall is something like 3 hours or so.

Wes

Wes <wespvp@syntegra.com> writes:
> There's more than 250 million rows. If I remember right, it's ballpark 25%
> data reload, 75% index/foreign constraint rebuild. Pg_dumpall is something
> like 3 hours or so.

FWIW, increasing sort_mem for the reload process would probably help with the index and FK rebuilds. (8.0 rejiggers things so that the memory constraints for these operations are driven off a separate variable, but in 7.* you need to muck about with sort_mem. The appropriate value for one-off operations is a lot higher than what you would want multiple competing backends to be using.)

			regards, tom lane

On 9/8/04 9:11 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> FWIW, increasing sort_mem for the reload process would probably help
> with the index and FK rebuilds. (8.0 rejiggers things so that the
> memory constraints for these operations are driven off a separate
> variable, but in 7.* you need to muck about with sort_mem. The
> appropriate value for one-off operations is a lot higher than what
> you would want multiple competing backends to be using.)

How big are we talking? I've got it set to 65535 right now. Multiply it by 10?

Wes

Wes <wespvp@syntegra.com> writes:
> On 9/8/04 9:11 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> FWIW, increasing sort_mem for the reload process would probably help
>> with the index and FK rebuilds.
>
> How big are we talking? I've got it set to 65535 right now. Multiply it by 10?

How much RAM have you got in the machine? I'd think that you could afford to put, say, a quarter or so of physical RAM into sort_mem, if there is nothing much else going on during the data import. 64MB is not a lot on that scale.

			regards, tom lane

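Since sort_mem can be set per session, it can also be raised only for the backend doing the reload rather than server-wide. A sketch with an illustrative 512 MB value (sort_mem is specified in kilobytes in 7.x; the dump path is made up):

    # Raise sort_mem just for the restoring session via PGOPTIONS;
    # other backends keep the value from postgresql.conf.
    PGOPTIONS="-c sort_mem=524288" \
        psql -f /backups/pg_dumpall.sql template1
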
Wes Palmer <Wesley.R.Palmer@syntegra.com> writes:
> On 9/8/04 5:51 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
>> How much RAM have you got in the machine? I'd think that you could
>> afford to put, say, a quarter or so of physical RAM into sort_mem, if
>> there is nothing much else going on during the data import. 64MB is
>> not a lot on that scale.
>
> The system currently has 2GB. The system is dedicated to this database, so
> for a dedicated load I would think I could set it up to 1GB?

Maybe, but the sort_mem isn't the only thing you want a lot of --- you don't want to be starving the kernel of disk buffers either. Also I'm not prepared to guarantee that the setting will be respected on-the-nose, particularly not for FK rebuild. You might want to keep an eye on the thing with "top" and see what it really does. I'd suggest some experimentation, maybe with settings of 256MB, 512MB, 1G, to see what happens.

			regards, tom lane

>>>>> "TL" == Tom Lane <tgl@sss.pgh.pa.us> writes: TL> Wes <wespvp@syntegra.com> writes: >> There's more than 250 million rows. If I remember right, it's ballpark 25% >> data reload, 75% index/foreign constraint rebuild. Pg_dumpall is something >> like 3 hours or so. TL> FWIW, increasing sort_mem for the reload process would probably help TL> with the index and FK rebuilds. (8.0 rejiggers things so that the I shaved significant time from 7.4.x restores by bumping up the checkpoint_segments to 50. My dumps currently take about 1.5 hours over a 100Mbit ethernet (server is on gig ethernet, dumping client is on 100Mbit). -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
Believe it or not, we haven't gotten many requests for this feature, partly because such corruption is so rare. Also, any checker isn't going to find a change from "Baker" to "Faker" in a text field.

---------------------------------------------------------------------------

Wes wrote:
> On 9/4/04 5:28 PM, "Tino Wildenhain" <tino@wildenhain.de> wrote:
>
>> Well, with such a huge database you probably should consider
>> different backup strategies. A filesystem with snapshot support (XFS?)
>> could help: you can copy a state of the database at any time, so you
>> can back up the database cluster without stopping the postmaster.
>> Also, replication via Slony could be an option.
>
> Yes, we are looking into using file system snapshots. We are currently
> using primarily file system backups (shut down the DB, back up the file
> system). The problem we ran into was that we didn't have a specific point
> in time where we knew with absolute certainty the backed-up database was
> good - snapshots would not help here.
>
> I ended up starting with a recent backup and working backwards until I
> found one that wouldn't crash postmaster on a pg_dumpall. Rather than trust
> that there was no corruption in that version (data blocks might be good, but
> pg_dumpall doesn't test indexes), I did a pg_dumpall and reload.
>
>> The best tool to verify the backup is probably the postmaster
>> itself. I really doubt any other program would be smaller and
>> faster :)
>
> Not really... Postmaster won't tell you if a structure is bad until it
> stumbles on it and crashes (or politely reports an error). Just because
> postmaster comes up doesn't mean your database is good. As far as I know,
> there is no "verify database" command option on postmaster - postmaster
> won't fsck your database.
>
> Wes

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

On 9/9/04 11:07 AM, "Bruce Momjian" <pgman@candle.pha.pa.us> wrote:

> Believe it or not, we haven't gotten many requests for this feature,
> partly because such corruption is so rare. Also, any checker isn't
> going to find a change from "Baker" to "Faker" in a text field.

Yep, unless you added a CRC (and accepted the performance hit) to each record, the best you could do is verify that the database is consistent. That would still be quite valuable, though - all block headers are valid, indexes don't point out into oblivion, etc.

I expect there are only a handful of huge databases running a heavy load - the vast majority are probably no larger than tens (maybe hundreds) of megabytes, or do not experience a heavy update load?

Wes

On 9/9/04 9:27 AM, "Vivek Khera" <khera@kcilink.com> wrote:

> I shaved significant time from 7.4.x restores by bumping up the
> checkpoint_segments to 50. My dumps currently take about 1.5 hours
> over a 100Mbit ethernet (server is on gig ethernet, dumping client is
> on 100Mbit).

Mine are already set to 30, but in a recent re-index I saw warnings in the log, so I need to bump it up some more. I'm going to likely be doing a reload this weekend, so will include this in the run.

Wes

Bruce Momjian <pgman@candle.pha.pa.us> writes:

> Believe it or not, we haven't gotten many requests for this feature,
> partly because such corruption is so rare. Also, any checker isn't
> going to find a change from "Baker" to "Faker" in a text field.

Well, there were people asking for per-block checksums. Given the frequency of people complaining about hardware problems, I'm starting to think they may be a good idea after all.

But checksums don't detect corruption caused by software failure. That would require more of an fsck-like approach. There's nothing wrong with the idea, but it would be a lot of work and nobody else has felt the need for it.

--
greg

>>>>> "W" == Wes <wespvp@syntegra.com> writes: W> I expect there are only a handful of huge databases running a heavy load - W> the vast majority are probably tens no larger than 10's (maybe hundreds) of W> megabytes, or do not experience a heavy update load? Gigabytes: 40+ currently (old data is pruned) Updates: jillions and jillions Inserts: several hundred millions Years: 3+ Corruptions: 0 Downtime: only for planned OS and Pg upgrades (7.1 -> 7.2 -> 7.4) OS: FreeBSD 4.x Hardware: Dell PowerEdge rackmount with SCSI hardware RAID I sleep well at night :-) -- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= Vivek Khera, Ph.D. Khera Communications, Inc. Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806 AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/
Now THAT"S what I like to hear..... I too am on FreeBSD....Dell PowerEdge SCSI Hardware RAID.... I too will now sleep well at night!! Vivek Khera wrote: >>>>>>"W" == Wes <wespvp@syntegra.com> writes: >>>>>> >>>>>> > >W> I expect there are only a handful of huge databases running a heavy load - >W> the vast majority are probably tens no larger than 10's (maybe hundreds) of >W> megabytes, or do not experience a heavy update load? > >Gigabytes: 40+ currently (old data is pruned) >Updates: jillions and jillions >Inserts: several hundred millions >Years: 3+ >Corruptions: 0 >Downtime: only for planned OS and Pg upgrades (7.1 -> 7.2 -> 7.4) >OS: FreeBSD 4.x >Hardware: Dell PowerEdge rackmount with SCSI hardware RAID > >I sleep well at night :-) > > >
On Thu, Sep 09, 2004 at 12:30:31PM -0500, Wes wrote:
> On 9/9/04 11:07 AM, "Bruce Momjian" <pgman@candle.pha.pa.us> wrote:
>
>> Believe it or not, we haven't gotten many requests for this feature,
>> partly because such corruption is so rare. Also, any checker isn't
>> going to find a change from "Baker" to "Faker" in a text field.
>
> Yep, unless you added a CRC (and accepted the performance hit) to each
> record, the best you could do is verify that the database is consistent.
> That would still be quite valuable, though - all block headers are valid,
> indexes don't point out into oblivion, etc.
>
> I expect there are only a handful of huge databases running a heavy load -
> the vast majority are probably no larger than tens (maybe hundreds) of
> megabytes, or do not experience a heavy update load?

I have a CRM-type application installed at a number of customer sites. Database sizes are in the 5 - 50 gigabyte range, and update rates are pretty high - hundreds of thousands a day, certainly, probably over a million a day at some sites. They've been running continuously for several years, on a mix of Solaris and Linux systems ranging from big SMP Sun boxes with nice RAID systems down to dinky little dual-CPU Linux/x86 boxes with a couple of software-mirrored drives.

A lot of the data is of a form where data corruption will be immediately recognised by the application (it has to be one of a few pre-defined strings, or integers that have to be in a given range, that sort of thing). I don't believe that any customer has had database-related downtime apart from the scheduled, painful 7.2->7.4 upgrade. I've never seen a problem due to data corruption, despite a number of kernel panics, power failures and other such facts of life.

And that's with no manual database maintenance at all. Just an application-specific maintenance daemon that does cleanup, vacuuming and analysis.

Cheers,
  Steve

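That style of application-level validation can also be pushed into the schema itself, so corrupted or nonsensical values are rejected outright and are re-verified on every dump/restore. A small sketch with an invented table; the column rules are examples only:

    psql mydb <<'SQL'
    -- Hypothetical table: status must be one of a few known strings,
    -- priority must fall in a sane range.
    CREATE TABLE tickets (
        id       integer PRIMARY KEY,
        status   text    NOT NULL
                 CHECK (status IN ('open', 'closed', 'pending')),
        priority integer NOT NULL
                 CHECK (priority BETWEEN 1 AND 5)
    );

    -- An ad-hoc audit of an existing, unconstrained table would look
    -- for rows violating the same rules, e.g.:
    --   SELECT id FROM old_tickets
    --    WHERE status NOT IN ('open', 'closed', 'pending')
    --       OR priority NOT BETWEEN 1 AND 5;
    SQL
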
On 9/9/04 3:36 PM, "Vivek Khera" <khera@kcilink.com> wrote:

> Gigabytes: 40+ currently (old data is pruned)
> Updates: jillions and jillions
> Inserts: several hundred millions
> Years: 3+
> Corruptions: 0
> Downtime: only for planned OS and Pg upgrades (7.1 -> 7.2 -> 7.4)
> OS: FreeBSD 4.x
> Hardware: Dell PowerEdge rackmount with SCSI hardware RAID
>
> I sleep well at night :-)

That is close to describing ours, both size- and hardware-wise, although it will be growing significantly larger than that. I slept well until about a week ago... The thing had been running for 9 months without a hit until then. I did say "most", not "all"...

It's good to see specifics on DBs similar in size to ours.

Wes

Joel wrote:
> Pardon me for this, but it's Monday morning here and I'm in shallow
> thinking mode.
>
> I was just looking over Wes's questions about checking database
> integrity, and the responses given, and I'm wondering if keeping a live
> CRC on each record would make sense? Or is this already done, or maybe
> an option? (Not that it would in any way solve Wes's company's current
> problem, ...)

TODO has:

	* Add optional CRC checksum to heap and index pages

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

On 9/8/04 5:51 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> How much RAM have you got in the machine? I'd think that you could
> afford to put, say, a quarter or so of physical RAM into sort_mem, if
> there is nothing much else going on during the data import. 64MB is
> not a lot on that scale.

The system currently has 2GB. The system is dedicated to this database, so for a dedicated load I would think I could set it up to 1GB?

Wes
