Thread: URGENT: Whole DB down ("no space left on device")

URGENT: Whole DB down ("no space left on device")

From
"Phoenix Kiula"
Date:
I am getting this message when I start the DB:


psql: FATAL:  could not access status of transaction 0
DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
No space left on device.


What is this about and how do I solve this? A "df -h" on my system shows this:


Filesystem    Type    Size  Used Avail Use% Mounted on
/dev/sda5     ext3    9.9G  2.5G  6.9G  27% /
/dev/sda1     ext3     99M   17M   78M  18% /boot
none         tmpfs    2.0G     0  2.0G   0% /dev/shm
/dev/sda7     ext3    197G   17G  171G   9% /home
/dev/sda8     ext3   1012M   34M  927M   4% /tmp
/dev/sda3     ext3    9.9G  4.4G  5.0G  47% /usr
/dev/sda2     ext3    9.9G  9.5G     0 100% /var
/tmp          none   1012M   34M  927M   4% /var/tmp


Please help!

Re: URGENT: Whole DB down ("no space left on device")

From
Tommy Gildseth
Date:
Phoenix Kiula wrote:
> I am getting this message when I start the DB:
>
>
> psql: FATAL:  could not access status of transaction 0
> DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
> No space left on device.
>
>
> What is this about and how do I solve this? A "df -h" on my system shows this:
>
>

> /dev/sda2     ext3    9.9G  9.5G     0 100% /var
>

Well, the error message is pretty clear: assuming you don't keep
your database in any non-standard location, your /var partition is
indeed full.
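To see what is actually eating the partition, something along these lines (run as root; a generic sketch, not specific to PostgreSQL) lists the largest top-level directories under /var first:

```shell
# Show the largest top-level directories under /var, biggest first
du -sk /var/* 2>/dev/null | sort -rn | head
```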

--
Tommy Gildseth


Re: URGENT: Whole DB down ("no space left on device")

From
Bruce McAlister
Date:
Looks like you're out of disk space on:

/dev/sda2     ext3    9.9G  9.5G     0 100% /var


Is this where your database resides?



Phoenix Kiula wrote:
> I am getting this message when I start the DB:
>
>
> psql: FATAL:  could not access status of transaction 0
> DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
> No space left on device.
>
>
> What is this about and how do I solve this? A "df -h" on my system shows this:
>
>
> Filesystem    Type    Size  Used Avail Use% Mounted on
> /dev/sda5     ext3    9.9G  2.5G  6.9G  27% /
> /dev/sda1     ext3     99M   17M   78M  18% /boot
> none         tmpfs    2.0G     0  2.0G   0% /dev/shm
> /dev/sda7     ext3    197G   17G  171G   9% /home
> /dev/sda8     ext3   1012M   34M  927M   4% /tmp
> /dev/sda3     ext3    9.9G  4.4G  5.0G  47% /usr
> /dev/sda2     ext3    9.9G  9.5G     0 100% /var
> /tmp          none   1012M   34M  927M   4% /var/tmp
>
>
> Please help!
>

Re: URGENT: Whole DB down ("no space left on device")

From
"Merlin Moncure"
Date:
On 8/31/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> I am getting this message when I start the DB:
>
>
> psql: FATAL:  could not access status of transaction 0
> DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
> No space left on device.
>
>
> What is this about and how do I solve this? A "df -h" on my system shows this:

You should take the database down if it is not already, and immediately
take a file-system backup of the database and move it to a secure
location.  You can temporarily free some space by symlinking database
folders, such as pg_xlog or folders inside the database proper, to a
partition that has space (/home).

After freeing up at least a few hundred MB, start the database and make
sure it comes up properly.  If it does, take a proper backup and
investigate a long-term solution to the storage problem...buy a
drive :-)
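A sketch of the symlink trick, as a reusable helper; the pg_xlog and /home/pgsql paths in the comment are examples (Red Hat layout), and the server must be stopped before moving anything:

```shell
#!/bin/sh
# relocate_dir SRC DESTDIR: move SRC into DESTDIR and leave a symlink
# behind at the old location. On a real system, with the server stopped:
#   relocate_dir /var/lib/pgsql/data/pg_xlog /home/pgsql
relocate_dir() {
    src=$1; destdir=$2
    mkdir -p "$destdir"
    mv "$src" "$destdir/$(basename "$src")"
    ln -s "$destdir/$(basename "$src")" "$src"
}
```

The same pattern works for other large subdirectories inside the data directory, as long as the server is down while you move them.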

merlin

Re: URGENT: Whole DB down ("no space left on device")

From
Zoltan Boszormenyi
Date:
Phoenix Kiula wrote:
> I am getting this message when I start the DB:
>
>
> psql: FATAL:  could not access status of transaction 0
> DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
> No space left on device.
>
>
> What is this about and how do I solve this? A "df -h" on my system shows this:
>
>
> Filesystem    Type    Size  Used Avail Use% Mounted on
> ...
> /dev/sda2     ext3    9.9G  9.5G     0 100% /var
>

This is the problem. Free up some space under /var or move either
the whole partition or PostgreSQL's data directory to a new disk.
The data directory lives under /var/lib/postgresql (mainstream) or
/var/lib/pgsql (RedHat speciality).

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/



Re: URGENT: Whole DB down ("no space left on device")

From
"Josh Tolley"
Date:
On 8/31/07, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> Phoenix Kiula wrote:
> > I am getting this message when I start the DB:
> >
> >
> > psql: FATAL:  could not access status of transaction 0
> > DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
> > No space left on device.
> >
> >
> > What is this about and how do I solve this? A "df -h" on my system shows this:
> >
> >
> > Filesystem    Type    Size  Used Avail Use% Mounted on
> > ...
> > /dev/sda2     ext3    9.9G  9.5G     0 100% /var
> >

In addition to what others have already said, when things calm down
you should consider implementing some sort of monitoring system that
is configured to start screaming before you run into problems like
this. At my place of work, we've set up Nagios to monitor the space
left on various partitions, and email us when a partition gets above
90% full.
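Even before a full Nagios deployment, a minimal cron-able check in this spirit works; the 90% threshold, the POSIX `df -P` output format, and the mail address in the comment are all assumptions:

```shell
#!/bin/sh
# Read `df -P` output on stdin and print filesystems fuller than $1 percent.
over_pct() {
    awk -v t="$1" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 > t) print $6, $5 }'
}
# Typical cron usage (mail address is a placeholder):
#   df -P | over_pct 90 | mail -s "disk space warning" admin@example.com
df -P | over_pct 90
```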

- eggyknap

Re: URGENT: Whole DB down ("no space left on device")

From
"Phoenix Kiula"
Date:
On 31/08/2007, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> Phoenix Kiula wrote:
> > I am getting this message when I start the DB:
> >
> >
> > psql: FATAL:  could not access status of transaction 0
> > DETAIL:  Could not write to file "pg_subtrans/01F8" at offset 221184:
> > No space left on device.
> >
> >
> > What is this about and how do I solve this? A "df -h" on my system shows this:
> >
> >
> > Filesystem    Type    Size  Used Avail Use% Mounted on
> > ...
> > /dev/sda2     ext3    9.9G  9.5G     0 100% /var
> >
>
> This is the problem. Free up some space under /var or move either
> the whole partition or PostgreSQL's data directory to a new disk.
> The data directory lives under /var/lib/postgresql (mainstream) or
> /var/lib/pgsql (RedHat speciality).
>



Thanks everyone. Yes, /var was full because of the backups going there.

The database is back up and working.

It was my backup script: it is set to save a daily backup under /var,
which is not clever. I'll change it to write to the "backup" folder,
which is a separate mount.

On that note, is it recommended to store the data of the database on a
different hard disk than the one on which the database is running? How
can I change the data folder for a live database?

Many thanks!

Re: URGENT: Whole DB down ("no space left on device")

From
"Phoenix Kiula"
Date:
On 31/08/2007, Josh Tolley <eggyknap@gmail.com> wrote:
> On 8/31/07, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> > Phoenix Kiula wrote:

> In addition to what others have already said, when things calm down
> you should consider implementing some sort of monitoring system that
> is configured to start screaming before you run into problems like
> this. At my place of work, we've set up Nagios to monitor the space
> left on various partitions, and email us when a partition gets above
> 90% full.



Wow, Nagios seems like a superb tool. Thanks for the recommendation!

Re: URGENT: Whole DB down ("no space left on device")

From
Erik Jones
Date:
On Aug 31, 2007, at 8:35 AM, Phoenix Kiula wrote:

> Thanks everyone. Yes, /var was full because of the backups that're
> going there.
>
> Database is back working.
>
> It was my backup script. It is set to save a daily backup to the /var/
> folder, which is not clever. I'll change it to be in the "backup"
> folder which is a mounted one.
>
> On that note, is it recommended to store the data of the database on a
> different hard disk than the one on which the database is running? How
> can I change the data folder for a live database?
>
> Many thanks!

The data directory is where the database is "running".  If you're
referring to where the postgres binaries are, it doesn't matter as
they are loaded into memory when the server starts.  As far as moving
the data directory goes, you can't move it for a running database.
All of the options to move a server's data directory involve, at some
point, shutting down the db.  Alternatively, if you're running out of
space on the disk currently holding the data, you can add another drive
and create a new tablespace on it.
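A sketch of the tablespace route Erik mentions (available since 8.0); the mount point, tablespace name, and table here are hypothetical, and this needs a running server:

```shell
# /mnt/newdisk is a hypothetical mount point on the new drive; the
# tablespace directory must be empty and owned by the postgres user.
mkdir -p /mnt/newdisk/pgdata
chown postgres /mnt/newdisk/pgdata
psql -U postgres -c "CREATE TABLESPACE bigdisk LOCATION '/mnt/newdisk/pgdata'"
# Move a (hypothetical) large table onto it without a dump/restore:
psql -U postgres -c "ALTER TABLE mytable SET TABLESPACE bigdisk"
```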

Erik Jones

Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com



Re: URGENT: Whole DB down ("no space left on device")

From
Jeff Davis
Date:
On Fri, 2007-08-31 at 22:34 +0800, Phoenix Kiula wrote:
> On 31/08/2007, Josh Tolley <eggyknap@gmail.com> wrote:
> > On 8/31/07, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> > > Phoenix Kiula wrote:
>
> > In addition to what others have already said, when things calm down
> > you should consider implementing some sort of monitoring system that
> > is configured to start screaming before you run into problems like
> > this. At my place of work, we've set up Nagios to monitor the space
> > left on various partitions, and email us when a partition gets above
> > 90% full.
>
>
>
> Wow, Nagios seems like a superb tool. Thanks for the recommendation!
>

You might also consider OpenNMS.

Regards,
    Jeff Davis


Re: URGENT: Whole DB down ("no space left on device")

From
"Anton Melser"
Date:
On 31/08/2007, Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2007-08-31 at 22:34 +0800, Phoenix Kiula wrote:
> > On 31/08/2007, Josh Tolley <eggyknap@gmail.com> wrote:
> > > On 8/31/07, Zoltan Boszormenyi <zb@cybertec.at> wrote:
> > > > Phoenix Kiula wrote:
> >
> > > In addition to what others have already said, when things calm down
> > > you should consider implementing some sort of monitoring system that
> > > is configured to start screaming before you run into problems like
> > > this. At my place of work, we've set up Nagios to monitor the space
> > > left on various partitions, and email us when a partition gets above
> > > 90% full.
> >
> >
> >
> > Wow, Nagios seems like a superb tool. Thanks for the recommendation!
> >
>
> You might also consider OpenNMS.

I spent about 3 hours trying to get it running and then thought: I'm at
the evaluation stage, and Nagios/Centreon is already installed and
working (even if not as theoretically nice). There are lots of very
promising systems out there (Hyperic, Zenoss, etc.), but if it isn't an
apt-get or yum away, why not just go with what *is* there? Surely the
packaged one must be used by more people; if not, why aren't the others
in the repos? Random ramblings!
Cheers
Anton


--
echo '16i[q]sa[ln0=aln100%Pln100/snlbx]sbA0D4D465452snlbxq' | dc
This will help you for 99.9% of your problems ...

WAL to RAW devices ?

From
Alex Vinogradovs
Date:
Hi guys,


I've got a bunch of PostgreSQL servers connected to external storage,
where a single server serves as a write-only (WO) database handling
INSERTs only, and a bunch of other servers need to obtain a copy of
that data for read-only (RO) serving, without taking resources on the
WO server.
 The idea is to have, say, 2 raw devices used as 2 WAL segments
(round-robin). The RO servers would read whichever one is not in use
at a given time with something like the xlogdump utility and produce
INSERT statements to be executed locally. After an import is done, a
command would be issued to the WO server to switch to the other segment
so that the cycle can repeat.
 The objective of this replication model is to ensure that SELECT
queries never affect the performance of the WO server, which may
experience uneven loads.

Is that possible with the 8.2.4 or 8.3 capabilities, or maybe with
minor modifications?

Thanks!

Best regards,
Alex Vinogradovs


Re: WAL to RAW devices ?

From
Tom Lane
Date:
Alex Vinogradovs <AVinogradovs@clearpathnet.com> writes:
>  The idea is to have say 2 raw devices which would be used as 2 WAL
> segments (round-robin). RO servers will go after the one that's not used
> at a given time with something like xlogdump utility and produce INSERT
> statements to be then executed locally. After that import is done, a
> command will be issued to the WO server to switch to the other segment
> so that the cycle can repeat.

Why would you insist on these being raw devices?  Do you enjoy writing
filesystems from scratch?

            regards, tom lane

Re: WAL to RAW devices ?

From
Alex Vinogradovs
Date:
WAL segments already have their structure; a filesystem would be
overhead. Plus, I mentioned access to the same storage from
multiple hosts -- no filesystem mounting, synchronization, or
other problems.

I figured PG folks aren't interested in adding enterprise-level storage
functionality (movable tablespaces, raw devices for tablespaces, etc),
thus I foresee the model described as the only way to achieve somewhat
decent performance in a stressed environment.


On Fri, 2007-08-31 at 19:21 -0400, Tom Lane wrote:
> Alex Vinogradovs <AVinogradovs@clearpathnet.com> writes:
> >  The idea is to have say 2 raw devices which would be used as 2 WAL
> > segments (round-robin). RO servers will go after the one that's not used
> > at a given time with something like xlogdump utility and produce INSERT
> > statements to be then executed locally. After that import is done, a
> > command will be issued to the WO server to switch to the other segment
> > so that the cycle can repeat.
>
> Why would you insist on these being raw devices?  Do you enjoy writing
> filesystems from scratch?
>
>             regards, tom lane

Re: WAL to RAW devices ?

From
Tom Lane
Date:
Alex Vinogradovs <AVinogradovs@Clearpathnet.com> writes:
> WAL segments already have their structure. Filesystem would be an
> overhead,

Just because you'd like that to be true doesn't make it true.  We have
to manage a variable number of active segments; track whether a given
segment is waiting for future use, active, waiting to be archived, etc;
manage status signaling to the archiver process; and so on.  Now I'll
freely admit that using a filesystem is only one of the ways that those
problems could be attacked, but that's how they've been attacked in
Postgres.  If you want to not have that functionality present then
you'd need to rewrite all that code and provide some other
infrastructure for it to use.

            regards, tom lane

Re: WAL to RAW devices ?

From
Alex Vinogradovs
Date:
But would it be a problem to have only 1 active segment at all times?
My inspiration pretty much comes from Oracle, where redo logs are
pre-configured and can be switched by a command issued to the instance.


> Just because you'd like that to be true doesn't make it true.  We have
> to manage a variable number of active segments; track whether a given
> segment is waiting for future use, active, waiting to be archived, etc;
> manage status signaling to the archiver process; and so on.  Now I'll
> freely admit that using a filesystem is only one of the ways that those
> problems could be attacked, but that's how they've been attacked in
> Postgres.  If you want to not have that functionality present then
> you'd need to rewrite all that code and provide some other
> infrastructure for it to use.
>
>             regards, tom lane

Re: WAL to RAW devices ?

From
Alvaro Herrera
Date:
Alex Vinogradovs wrote:
> WAL segments already have their structure. Filesystem would be an
> overhead,

In this case you can choose a filesystem with lower overhead.  For
example with WAL you don't need a journalling filesystem at all, so
using ext2 is not a bad idea.  For Pg data files, you need journalling
of metadata only, not of data; the latter is provided by WAL.  So you
can mount the data filesystem with the option data=writeback.
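In practical terms this is just a mount-option choice; hypothetical /etc/fstab entries (device names and mount points are illustrative) might look like:

```shell
# WAL gets ext2: it needs no filesystem journal, being a journal itself
/dev/sdb1  /var/lib/pgsql/data/pg_xlog  ext2  defaults        0 2
# Data files get ext3 with metadata-only journalling; WAL protects the data
/dev/sdc1  /var/lib/pgsql/data          ext3  data=writeback  0 2
```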

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"All rings of power are equal,
But some rings of power are more equal than others."
                                 (George Orwell's The Lord of the Rings)

Re: WAL to RAW devices ?

From
Alex Vinogradovs
Date:
Probably you missed that part... In my setup, I need at least
2 boxes reading those files while a 3rd box keeps writing to
them. I can't mount ext2 even read-only while it's being
written to by another host, and I can't unmount it from the
writer first either, since PG will be writing to that location.
The only way is to do WAL shipping, which probably wouldn't be
that bad since the copying would be done via DMA, but it still
isn't as good as it could be, since it would utilize the same
spindles...


On Fri, 2007-08-31 at 20:23 -0400, Alvaro Herrera wrote:
> Alex Vinogradovs wrote:
> > WAL segments already have their structure. Filesystem would be an
> > overhead,
>
> In this case you can choose a filesystem with lower overhead.  For
> example with WAL you don't need a journalling filesystem at all, so
> using ext2 is not a bad idea.  For Pg data files, you need journalling
> of metadata only, not of data; the latter is provided by WAL.  So you
> can mount the data filesystem with the option data=writeback.
>

Re: WAL to RAW devices ?

From
"Joshua D. Drake"
Date:

Alex Vinogradovs wrote:
> Hi guys,
>
>
> I've got a bunch of PostgreSQL servers connected to external storage,
>  where a single server needs to be serving as WO database dealing with
> INSERTs only, and bunch of other guys need to obtain a copy of that
> data for RO serving, without taking resources on WO server.

You can't do that with PostgreSQL without replication. Unless you are
willing to have outages with your RO servers to apply the logs.

Further, you are considering the wrong logs. It is not the WAL logs but
the archive logs that you need.
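For reference, the continuous-archiving setup this refers to is configured in postgresql.conf; the archive directory here is an example path:

```shell
# postgresql.conf -- have the server copy each completed WAL segment
# to an archive directory (%p = segment path, %f = segment file name)
archive_command = 'cp %p /mnt/archive/%f'
```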

Sincerely,

Joshua D. Drake


>  The idea is to have say 2 raw devices which would be used as 2 WAL
> segments (round-robin). RO servers will go after the one that's not used
> at a given time with something like xlogdump utility and produce INSERT
> statements to be then executed locally. After that import is done, a
> command will be issued to the WO server to switch to the other segment
> so that the cycle can repeat.
>  The objective of that replication model is to ensure that SELECT
> queries won't ever affect the performance of the WO server,
> which may experience uneven loads.
>
> Is that possible with the 8.2.4 or 8.3 capabilities, or maybe with
> minor modifications ?
>
> Thanks!
>
> Best regards,
> Alex Vinogradovs
>
>


--

      === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
PostgreSQL solutions since 1997  http://www.commandprompt.com/
            UNIQUE NOT NULL
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


Re: WAL to RAW devices ?

From
Alex Vinogradovs
Date:
Oh well, I guess I will just use some trigger to invoke a C
function and store the statements in a raw device with some
proprietary format, while the actual inserts don't take place
at all.

In case anyone has more ideas, please let me know.


On Fri, 2007-08-31 at 17:45 -0700, Joshua D. Drake wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Alex Vinogradovs wrote:
> > Hi guys,
> >
> >
> > I've got a bunch of PostgreSQL servers connected to external storage,
> >  where a single server needs to be serving as WO database dealing with
> > INSERTs only, and bunch of other guys need to obtain a copy of that
> > data for RO serving, without taking resources on WO server.
>
> You can't do that with PostgreSQL without replication. Unless you are
> willing to have outages with your RO servers to apply the logs.
>
> Further you are considering the wrong logs. It is not the WAL logs, but
> the archive logs that you need.
>
> Sincerely,
>
> Joshua D. Drake
>


Re: WAL to RAW devices ?

From
Alvaro Herrera
Date:
Alex Vinogradovs wrote:
> Probably you missed that part... In my setup, I need at least
> 2 boxes going after those files, while 3rd box keeps on writing
> to them... I can't mount ext2 even in R/O mode while it's being
> written to by another guy. I can't unmount it before mounting
> exclusively on any of them either, since PG will be writing to
> that location. The only way is to do the WAL shipping, which
> probably wouldn't be that bad since the copying would be done
> via DMA, but still isn't as good as it could be since that would
> utilize the same spindles...

Oh, I see.

What I've seen described is to put a PITR slave on a filesystem with
snapshotting ability, like ZFS on Solaris.

You can then have two copies of the PITR logs.  One gets a postmaster
running in "warm standby" mode, i.e. recovering logs in a loop.  The
other one, in a sort of jail (I don't know the Solaris terminology for
this) stops the recovery and enters normal mode.  You can query it all
you like at that point.

Periodically you stop the server in normal mode, resync the snapshot
(which basically resets the "modified" block list in the filesystem),
take a new snapshot, create the jail and stop the recovery mode again.
So you have a fresher postmaster for queries.

It's not as good as having a true hot standby, for sure.  But it seems
it's good enough while we wait.
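The standby's recovery loop in that scheme is driven by recovery.conf; pg_standby ships with 8.3's contrib (earlier setups used an equivalent wait-and-copy script), and the archive path is an example:

```shell
# recovery.conf on the standby -- fetch each archived segment, waiting
# in a loop until the next one appears in the archive directory
restore_command = 'pg_standby /mnt/archive %f %p'
```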

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"Those who use electric razors are infidels destined to burn in hell while
we drink from rivers of beer, download free vids and mingle with naked
well shaved babes." (http://slashdot.org/comments.pl?sid=44793&cid=4647152)

Re: WAL to RAW devices ?

From
Alex Vinogradovs
Date:
Yeah, that's the trick... I need high availability with
high performance and nearly real-time synchronization ;-)
Also, I've got FreeBSD here... ZFS will be out with the 7.0
release, and UFS2 has snapshotting capability too. But
the whole method isn't good enough anyway.


> Oh, I see.
>
> What I've seen described is to put a PITR slave on a filesystem with
> snapshotting ability, like ZFS on Solaris.
>
> You can then have two copies of the PITR logs.  One gets a postmaster
> running in "warm standby" mode, i.e. recovering logs in a loop.  The
> other one, in a sort of jail (I don't know the Solaris terminology for
> this) stops the recovery and enters normal mode.  You can query it all
> you like at that point.
>
> Periodically you stop the server in normal mode, resync the snapshot
> (which basically resets the "modified" block list in the filesystem),
> take a new snapshot, create the jail and stop the recovery mode again.
> So you have a fresher postmaster for queries.
>
> It's not as good as having a true hot standby, for sure.  But it seems
> it's good enough while we wait.
>
