Re: Hard limit on WAL space used (because PANIC sucks) - Mailing list pgsql-hackers
From | MauMau |
---|---|
Subject | Re: Hard limit on WAL space used (because PANIC sucks) |
Date | |
Msg-id | 879A3981DBFB4B2D8B3A92C1B5E5CED6@maumau Whole thread Raw |
In response to | Re: Hard limit on WAL space used (because PANIC sucks) (Daniel Farina <daniel@heroku.com>) |
Responses |
Re: Hard limit on WAL space used (because PANIC sucks)
Re: Hard limit on WAL space used (because PANIC sucks) |
List | pgsql-hackers |
From: "Daniel Farina" <daniel@heroku.com> > On Fri, Jun 7, 2013 at 12:14 PM, Josh Berkus <josh@agliodbs.com> wrote: >> Right now, what we're telling users is "You can have continuous backup >> with Postgres, but you'd better hire and expensive consultant to set it >> up for you, or use this external tool of dubious provenance which >> there's no packages for, or you might accidentally cause your database >> to shut down in the middle of the night." >> >> At which point most sensible users say "no thanks, I'll use something >> else". > Inverted and just as well supported: "if you want to not accidentally > lose data, you better hire an expensive consultant to check your > systems for all sorts of default 'safety = off' features." This > being but the hypothetical first one. > > Furthermore, I see no reason why high quality external archiving > software cannot exist. Maybe some even exists already, and no doubt > they can be improved and the contract with Postgres enriched to that > purpose. > > Finally, it's not that hard to teach any archiver how to no-op at > user-peril, or perhaps Postgres can learn a way to do this expressly > to standardize the procedure a bit to ease publicly shared recipes, > perhaps. Yes, I feel designing reliable archiving, even for the simplest case - copy WAL to disk, is very difficult. I know there are following three problems if you just follow the PostgreSQL manual. Average users won't notice them. I guess even professional DBAs migrating from other DBMSs won't, either. 1. If the machine or postgres crashes while archive_command is copying a WAL file, later archive recovery fails. This is because cp leaves a file of less than 16MB in archive area, and postgres refuses to start when it finds such a small archive WAL file. The solution, which IIRC Tomas san told me here, is to do like "cp %p /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f". 2. archive_command dumps core when you run pg_ctl stop -mi. This is because postmaster sends SIGQUIT to all its descendants. The core files accumulate in the data directory, which will be backed up with the database. Of course those core files are garbage. archive_command script needs to catch SIGQUIT and exit. 3. You cannot know the reason of archive_command failure (e.g. archive area full) if you don't use PostgreSQL's server logging. This is because archive_command failure is not logged in syslog/eventlog. I hope PostgreSQL will provide a reliable archiving facility that is ready to use. Regards MauMau
pgsql-hackers by date: