Thread: URGENT: Whole DB down ("no space left on device")
I am getting this message when I start the DB:

    psql: FATAL: could not access status of transaction 0
    DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
    No space left on device.

What is this about and how do I solve this? A "df -h" on my system shows this:

    Filesystem    Type    Size  Used Avail Use% Mounted on
    /dev/sda5     ext3    9.9G  2.5G  6.9G  27% /
    /dev/sda1     ext3     99M   17M   78M  18% /boot
    none          tmpfs   2.0G     0  2.0G   0% /dev/shm
    /dev/sda7     ext3    197G   17G  171G   9% /home
    /dev/sda8     ext3   1012M   34M  927M   4% /tmp
    /dev/sda3     ext3    9.9G  4.4G  5.0G  47% /usr
    /dev/sda2     ext3    9.9G  9.5G     0 100% /var
    /tmp          none   1012M   34M  927M   4% /var/tmp

Please help!
Phoenix Kiula wrote:
> I am getting this message when I start the DB:
>
>     psql: FATAL: could not access status of transaction 0
>     DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
>     No space left on device.
>
> What is this about and how do I solve this? A "df -h" on my system shows this:
>
>     /dev/sda2     ext3    9.9G  9.5G     0 100% /var

Well, the error message is pretty clear: assuming you don't keep your
database in any non-standard location, your /var partition is indeed full.

--
Tommy Gildseth
Looks like you're out of disk space on:

    /dev/sda2     ext3    9.9G  9.5G     0 100% /var

Is this where your database resides?

Phoenix Kiula wrote:
> I am getting this message when I start the DB:
>
>     psql: FATAL: could not access status of transaction 0
>     DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
>     No space left on device.
>
> What is this about and how do I solve this? A "df -h" on my system shows this:
>
>     Filesystem    Type    Size  Used Avail Use% Mounted on
>     /dev/sda5     ext3    9.9G  2.5G  6.9G  27% /
>     /dev/sda1     ext3     99M   17M   78M  18% /boot
>     none          tmpfs   2.0G     0  2.0G   0% /dev/shm
>     /dev/sda7     ext3    197G   17G  171G   9% /home
>     /dev/sda8     ext3   1012M   34M  927M   4% /tmp
>     /dev/sda3     ext3    9.9G  4.4G  5.0G  47% /usr
>     /dev/sda2     ext3    9.9G  9.5G     0 100% /var
>     /tmp          none   1012M   34M  927M   4% /var/tmp
>
> Please help!
On 8/31/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> I am getting this message when I start the DB:
>
>     psql: FATAL: could not access status of transaction 0
>     DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
>     No space left on device.
>
> What is this about and how do I solve this? A "df -h" on my system shows this:

You should take the database down if it is not already, and immediately
take a filesystem backup of the database and move it to a secure
location.

You may temporarily make some space by symlinking database folders,
such as pg_xlog or folders inside the database proper, to partitions
that have space (/home). After having freed up at least a few hundred
MB of space, start the database and make sure it comes up properly. If
it does, take a proper backup and investigate a long-term solution to
the storage problem... buy a drive :-)

merlin
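To make the symlink trick concrete, a minimal sketch, assuming a
RedHat-style data directory under /var/lib/pgsql/data and spare room on
/home (adjust paths, init script and ownership to your layout):

    # stop the server first; never relocate files under a running postmaster
    /etc/init.d/postgresql stop

    # move the WAL directory to the roomy partition and symlink it back
    mkdir -p /home/pgtemp
    mv /var/lib/pgsql/data/pg_xlog /home/pgtemp/pg_xlog
    ln -s /home/pgtemp/pg_xlog /var/lib/pgsql/data/pg_xlog
    chown -h postgres:postgres /var/lib/pgsql/data/pg_xlog

    # bring the server back up and verify it starts cleanly
    /etc/init.d/postgresql start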
Phoenix Kiula wrote:
> I am getting this message when I start the DB:
>
>     psql: FATAL: could not access status of transaction 0
>     DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
>     No space left on device.
>
> What is this about and how do I solve this? A "df -h" on my system shows this:
>
>     Filesystem    Type    Size  Used Avail Use% Mounted on
>     ...
>     /dev/sda2     ext3    9.9G  9.5G     0 100% /var

This is the problem. Free up some space under /var, or move either the
whole partition or PostgreSQL's data directory to a new disk. The data
directory lives under /var/lib/postgresql (mainstream) or
/var/lib/pgsql (RedHat speciality).

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/
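To see what is actually eating /var before moving anything, plain du is
usually enough; a small sketch (the data directory path here is the
RedHat default, substitute your own):

    # largest consumers directly under /var, in KB, biggest last
    du -sk /var/* 2>/dev/null | sort -n

    # drill into the data directory if it turns out to be the culprit
    du -sk /var/lib/pgsql/data/* 2>/dev/null | sort -n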
On 8/31/07, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> Phoenix Kiula wrote:
> > I am getting this message when I start the DB:
> >
> >     psql: FATAL: could not access status of transaction 0
> >     DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
> >     No space left on device.
> >
> > What is this about and how do I solve this? A "df -h" on my system shows this:
> >
> >     Filesystem    Type    Size  Used Avail Use% Mounted on
> >     ...
> >     /dev/sda2     ext3    9.9G  9.5G     0 100% /var

In addition to what others have already said, when things calm down you
should consider implementing some sort of monitoring system that is
configured to start screaming before you run into problems like this.
At my place of work, we've set up Nagios to monitor the space left on
various partitions and email us when a partition gets above 90% full.

- eggyknap
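For illustration, the kind of Nagios check this describes, trimmed to
the essentials (the host name is hypothetical, and the thresholds are
whatever suits you):

    # commands.cfg: warn below 10% free on /var, go critical below 5%
    define command {
        command_name  check_var_disk
        command_line  $USER1$/check_disk -w 10% -c 5% -p /var
    }

    # services.cfg: attach the check to the database host
    define service {
        use                  generic-service
        host_name            dbserver
        service_description  /var disk space
        check_command        check_var_disk
    }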
On 31/08/2007, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> Phoenix Kiula wrote:
> > I am getting this message when I start the DB:
> >
> >     psql: FATAL: could not access status of transaction 0
> >     DETAIL: Could not write to file "pg_subtrans/01F8" at offset 221184:
> >     No space left on device.
> >
> >     /dev/sda2     ext3    9.9G  9.5G     0 100% /var
>
> This is the problem. Free up some space under /var, or move either
> the whole partition or PostgreSQL's data directory to a new disk.
> The data directory lives under /var/lib/postgresql (mainstream) or
> /var/lib/pgsql (RedHat speciality).

Thanks everyone. Yes, /var was full because of the backups that're
going there.

Database is back working.

It was my backup script. It is set to save a daily backup to the /var/
folder, which is not clever. I'll change it to write to the "backup"
folder, which is a mounted one.

On that note, is it recommended to store the data of the database on a
different hard disk than the one the database is running on? How can I
change the data folder for a live database?

Many thanks!
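For the record, a minimal sketch of a nightly dump aimed at a dedicated
backup mount; the /backup path and database name are assumptions,
substitute your own:

    # /etc/cron.d/pg-backup: nightly logical dump at 02:30, compressed and dated
    # (note that % must be escaped in crontab entries)
    30 2 * * * postgres pg_dump mydb | gzip > /backup/mydb-$(date +\%F).sql.gz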
On 31/08/2007, Josh Tolley <eggyknap@gmail.com> wrote:
> In addition to what others have already said, when things calm down
> you should consider implementing some sort of monitoring system that
> is configured to start screaming before you run into problems like
> this. At my place of work, we've set up Nagios to monitor the space
> left on various partitions and email us when a partition gets above
> 90% full.

Wow, Nagios seems like a superb tool. Thanks for the recommendation!
On Aug 31, 2007, at 8:35 AM, Phoenix Kiula wrote:
> Thanks everyone. Yes, /var was full because of the backups that're
> going there.
>
> Database is back working.
>
> It was my backup script. It is set to save a daily backup to the /var/
> folder, which is not clever. I'll change it to write to the "backup"
> folder, which is a mounted one.
>
> On that note, is it recommended to store the data of the database on a
> different hard disk than the one the database is running on? How can I
> change the data folder for a live database?
>
> Many thanks!

The data directory is where the database is "running". If you're
referring to where the postgres binaries are, it doesn't matter, as
they are loaded into memory when the server starts.

As far as moving the data directory goes, you can't move it for a
running database. All of the options to move a server's data directory
involve, at some point, shutting down the db. Alternatively, if you're
running out of space on the disk currently holding the data, you can
add another drive and put a new tablespace on it.

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
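A minimal sketch of the tablespace route; the mount point and table
names are made up for illustration, and the directory must already
exist and be owned by the postgres user:

    -- run as a superuser; /mnt/disk2/pgdata lives on the new drive
    CREATE TABLESPACE bigdisk LOCATION '/mnt/disk2/pgdata';

    -- put new objects there explicitly...
    CREATE TABLE clicks (id bigint, hit_at timestamptz) TABLESPACE bigdisk;

    -- ...or migrate an existing (hypothetical) table there; this holds
    -- an exclusive lock on the table while its files are copied
    ALTER TABLE old_big_table SET TABLESPACE bigdisk;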
On Fri, 2007-08-31 at 22:34 +0800, Phoenix Kiula wrote:
> On 31/08/2007, Josh Tolley <eggyknap@gmail.com> wrote:
> > In addition to what others have already said, when things calm down
> > you should consider implementing some sort of monitoring system that
> > is configured to start screaming before you run into problems like
> > this. At my place of work, we've set up Nagios to monitor the space
> > left on various partitions and email us when a partition gets above
> > 90% full.
>
> Wow, Nagios seems like a superb tool. Thanks for the recommendation!

You might also consider OpenNMS.

Regards,
	Jeff Davis
On 31/08/2007, Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2007-08-31 at 22:34 +0800, Phoenix Kiula wrote:
> > Wow, Nagios seems like a superb tool. Thanks for the recommendation!
>
> You might also consider OpenNMS.

I spent about 3 hours trying to get it running and said: I'm at eval
stage, and nagios/centreon is installed and working... (even if not as
theoretically nice). There are lots of very promising systems out there
(hyperic, zenoss, etc.), but if it ain't an apt-get or yum away, then
why not just go with what *is* there? Surely it must be being used by
more people; if not, why aren't the others in the repos?

Random ramblings!

Cheers
Anton

--
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D4D465452snlbxq' | dc
This will help you for 99.9% of your problems ...
Hi guys,

I've got a bunch of PostgreSQL servers connected to external storage,
where a single server needs to be serving as a write-only (WO) database
dealing with INSERTs only, and a bunch of other guys need to obtain a
copy of that data for read-only (RO) serving, without taking resources
on the WO server.

The idea is to have, say, 2 raw devices which would be used as 2 WAL
segments (round-robin). RO servers will go after the one that's not in
use at a given time with something like the xlogdump utility and
produce INSERT statements to be then executed locally. After that
import is done, a command will be issued to the WO server to switch to
the other segment so that the cycle can repeat.

The objective of that replication model is to ensure that SELECT
queries won't ever affect the performance of the WO server, which may
experience uneven loads.

Is that possible with the 8.2.4 or 8.3 capabilities, or maybe with
minor modifications?

Thanks!

Best regards,
Alex Vinogradovs
Alex Vinogradovs <AVinogradovs@clearpathnet.com> writes:
> The idea is to have, say, 2 raw devices which would be used as 2 WAL
> segments (round-robin). RO servers will go after the one that's not in
> use at a given time with something like the xlogdump utility and
> produce INSERT statements to be then executed locally. After that
> import is done, a command will be issued to the WO server to switch to
> the other segment so that the cycle can repeat.

Why would you insist on these being raw devices? Do you enjoy writing
filesystems from scratch?

			regards, tom lane
WAL segments already have their structure. A filesystem would be an
overhead, plus I mentioned access to the same storage from multiple
hosts: no filesystem mounting, synchronization and other problems.

I figured PG folks aren't interested in adding enterprise-level storage
functionality (movable tablespaces, raw devices for tablespaces, etc.),
thus I foresee the model described as the only way to achieve somewhat
decent performance in a stressed environment.

On Fri, 2007-08-31 at 19:21 -0400, Tom Lane wrote:
> Why would you insist on these being raw devices? Do you enjoy writing
> filesystems from scratch?
>
> 			regards, tom lane
Alex Vinogradovs <AVinogradovs@Clearpathnet.com> writes:
> WAL segments already have their structure. A filesystem would be an
> overhead,

Just because you'd like that to be true doesn't make it true. We have
to manage a variable number of active segments; track whether a given
segment is waiting for future use, active, waiting to be archived, etc;
manage status signaling to the archiver process; and so on. Now I'll
freely admit that using a filesystem is only one of the ways that those
problems could be attacked, but that's how they've been attacked in
Postgres. If you want to not have that functionality present then
you'd need to rewrite all that code and provide some other
infrastructure for it to use.

			regards, tom lane
But would it be a problem to have only 1 active segment at all times?
My inspiration pretty much comes from Oracle, where redo logs are
pre-configured and can be switched by a command issued to the instance.

> Just because you'd like that to be true doesn't make it true. We have
> to manage a variable number of active segments; track whether a given
> segment is waiting for future use, active, waiting to be archived, etc;
> manage status signaling to the archiver process; and so on. Now I'll
> freely admit that using a filesystem is only one of the ways that those
> problems could be attacked, but that's how they've been attacked in
> Postgres. If you want to not have that functionality present then
> you'd need to rewrite all that code and provide some other
> infrastructure for it to use.
>
> 			regards, tom lane
Alex Vinogradovs wrote:
> WAL segments already have their structure. A filesystem would be an
> overhead,

In this case you can choose a filesystem with lower overhead. For
example, with WAL you don't need a journalling filesystem at all, so
using ext2 is not a bad idea. For Pg data files, you need journalling
of metadata only, not of data; the latter is provided by WAL. So you
can mount the data filesystem with the option data=writeback.

--
Alvaro Herrera                          Valdivia, Chile
ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"All rings of power are equal,
But some rings of power are more equal than others."
                              (George Orwell's The Lord of the Rings)
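In /etc/fstab terms that might look like the sketch below; the device
names and mount points are assumptions for illustration (the data
filesystem must be listed first so the pg_xlog mount lands inside it):

    # data files on ext3, journalling metadata only (data is covered by WAL)
    /dev/sdc1  /var/lib/pgsql/data          ext3  noatime,data=writeback  0 0

    # WAL on its own spindle, plain ext2 (no journal; WAL is its own log)
    /dev/sdb1  /var/lib/pgsql/data/pg_xlog  ext2  noatime                 0 0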
Probably you missed that part... In my setup, I need at least 2 boxes
going after those files, while a 3rd box keeps on writing to them. I
can't mount ext2 even in R/O mode while it's being written to by
another guy, and I can't unmount it before mounting it exclusively on
any of them either, since PG will be writing to that location. The only
way is to do the WAL shipping, which probably wouldn't be that bad
since the copying would be done via DMA, but it still isn't as good as
it could be, since that would utilize the same spindles...

On Fri, 2007-08-31 at 20:23 -0400, Alvaro Herrera wrote:
> In this case you can choose a filesystem with lower overhead. For
> example, with WAL you don't need a journalling filesystem at all, so
> using ext2 is not a bad idea. For Pg data files, you need journalling
> of metadata only, not of data; the latter is provided by WAL. So you
> can mount the data filesystem with the option data=writeback.
Alex Vinogradovs wrote:
> Hi guys,
>
> I've got a bunch of PostgreSQL servers connected to external storage,
> where a single server needs to be serving as a write-only (WO) database
> dealing with INSERTs only, and a bunch of other guys need to obtain a
> copy of that data for read-only (RO) serving, without taking resources
> on the WO server.

You can't do that with PostgreSQL without replication, unless you are
willing to have outages on your RO servers to apply the logs.

Further, you are considering the wrong logs. It is not the live WAL
segments, but the archived logs, that you need.

Sincerely,

Joshua D. Drake

> The idea is to have, say, 2 raw devices which would be used as 2 WAL
> segments (round-robin). RO servers will go after the one that's not in
> use at a given time with something like the xlogdump utility and
> produce INSERT statements to be then executed locally. After that
> import is done, a command will be issued to the WO server to switch to
> the other segment so that the cycle can repeat.
>
> The objective of that replication model is to ensure that SELECT
> queries won't ever affect the performance of the WO server, which may
> experience uneven loads.
>
> Is that possible with the 8.2.4 or 8.3 capabilities, or maybe with
> minor modifications?

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997   http://www.commandprompt.com/
UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
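For reference, archiving in 8.x is driven by archive_command in
postgresql.conf; a minimal sketch, with the shared archive path being
an assumption:

    # postgresql.conf on the writer: copy each completed WAL segment to a
    # shared archive that the readers can pull from (refusing to overwrite)
    archive_command = 'test ! -f /mnt/shared_archive/%f && cp %p /mnt/shared_archive/%f'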
Oh well, I guess I will just use some trigger to invoke a C function
and store the statements in a raw device with some proprietary format,
while the actual inserts don't take place at all. In case anyone has
more ideas, please let me know.

On Fri, 2007-08-31 at 17:45 -0700, Joshua D. Drake wrote:
> You can't do that with PostgreSQL without replication, unless you are
> willing to have outages on your RO servers to apply the logs.
>
> Further, you are considering the wrong logs. It is not the live WAL
> segments, but the archived logs, that you need.
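The capture-instead-of-insert idea could be prototyped in PL/pgSQL
before dropping to C; a hypothetical sketch, with made-up table and
column names, where a queue table stands in for the raw device:

    CREATE TABLE hits (id bigint);
    CREATE TABLE insert_queue (stmt text);

    -- record the would-be statement, then suppress the actual insert
    CREATE OR REPLACE FUNCTION capture_insert() RETURNS trigger AS $$
    BEGIN
        INSERT INTO insert_queue
            VALUES ('INSERT INTO hits VALUES (' || NEW.id || ');');
        RETURN NULL;  -- returning NULL from a BEFORE trigger skips the row
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER hits_capture BEFORE INSERT ON hits
        FOR EACH ROW EXECUTE PROCEDURE capture_insert();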
Alex Vinogradovs wrote:
> Probably you missed that part... In my setup, I need at least 2 boxes
> going after those files, while a 3rd box keeps on writing to them.

Oh, I see.

What I've seen described is to put a PITR slave on a filesystem with
snapshotting ability, like ZFS on Solaris.

You can then have two copies of the PITR logs. One gets a postmaster
running in "warm standby" mode, i.e. recovering logs in a loop. The
other one, in a sort of jail (I don't know the Solaris terminology for
this), stops the recovery and enters normal mode. You can query it all
you like at that point.

Periodically you stop the server in normal mode, resync the snapshot
(which basically resets the "modified" block list in the filesystem),
take a new snapshot, create the jail and stop the recovery mode again.
So you have a fresher postmaster for queries.

It's not as good as having a true hot standby, for sure. But it seems
it's good enough while we wait.

--
Alvaro Herrera                          Valdivia, Chile
ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Those who use electric razors are infidels destined to burn in hell
while we drink from rivers of beer, download free vids and mingle with
naked well shaved babes."
             (http://slashdot.org/comments.pl?sid=44793&cid=4647152)
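The "recovering logs in a loop" piece is just a recovery.conf whose
restore_command waits for the next archived segment to appear; a
sketch, reusing the assumed archive path from above (pg_standby ships
as a contrib module in 8.3; before that, a hand-rolled wait-and-copy
script does the same job):

    # recovery.conf on the warm-standby postmaster
    restore_command = 'pg_standby /mnt/shared_archive %f %p'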
Yeah, that's the trick... I need high availability with high
performance and nearly real-time synchronization ;-) Also, I've got
FreeBSD here... ZFS will be out with the 7.0 release, plus UFS2 has
snapshotting capability too. But the whole method isn't good enough
anyway.

> Oh, I see.
>
> What I've seen described is to put a PITR slave on a filesystem with
> snapshotting ability, like ZFS on Solaris.
>
> You can then have two copies of the PITR logs. One gets a postmaster
> running in "warm standby" mode, i.e. recovering logs in a loop. The
> other one, in a sort of jail (I don't know the Solaris terminology for
> this), stops the recovery and enters normal mode. You can query it all
> you like at that point.
>
> Periodically you stop the server in normal mode, resync the snapshot
> (which basically resets the "modified" block list in the filesystem),
> take a new snapshot, create the jail and stop the recovery mode again.
> So you have a fresher postmaster for queries.
>
> It's not as good as having a true hot standby, for sure. But it seems
> it's good enough while we wait.