Re: When adding millions of rows at once, getting out of disk space errors - Mailing list pgsql-general

From Sam Mason
Subject Re: When adding millions of rows at once, getting out of disk space errors
Date
Msg-id 20090219175535.GW32672@frubble.xen.chris-lamb.co.uk
Whole thread Raw
In response to Re: When adding millions of rows at once, getting out of disk space errors  (Mike Christensen <imaudi@comcast.net>)
List pgsql-general
On Thu, Feb 19, 2009 at 02:58:02AM -0800, Mike Christensen wrote:
> I would assume if the account the service is running under has limited
> disk space, it won't really matter what OS you're running under.
> Postgres will throw an "out of disk space" error.

Similarly to Scott, every time I've come across this PG has recovered
quite happily on its own.

> The problem for me is I was in the middle of a transaction which
> inserted about 50,000 rows into a table, and that transaction was
> pending in the transaction log.  For some reason, the service didn't
> quit property and when it started back up, it began a repair process.
> When it got to the pending transactions (I think about 16 megs worth of
> data) it just hung with no useful errors or anything outputted to the
> log or stderr.  I think this needs to be fixed!

By default PG will always create 16MB WAL files, if you're modifying a
lot of data then multiple log files will be involved.  I've done silly
things like inserting the cross product of two large tables and wondered
why things are taking so long it fails after a while with an out of disk
space error, but PG never misbehaved when I did this.  This would be
inserting several tens of GB of data, i.e. 100's of millions of rows.

> First off, when Postgres starts and sees that your database was not
> closed properly, it should tell you there's pending transactions and ask
> if you want to dump them or try to process them (or maybe save them for
> later).

Doing much of that would break things quite impressively.  Applications
are, or should be, designed to assume that once a transaction has
commited then it's comitted.  If the database can decide to rollback a
commited transaction because you run out of space doing something else
this is bad.  Similarly, a transaction can't commit half way through,
which is almost what you're saying you'd like to happen.  This would
lead to all sorts of inconsistency.

> If you process them, there should be clear status and you
> should know what's going on.  It's very possible the service would have
> /eventually/ started up for me had I waited long enough (I tried 5-6
> hrs, with no logs, output, or change in memory consumption; thus I
> assumed it was dead)..

PG, at least under Linux, is very verbose about things when they look
suspicious.  They must be appearing somewhere under Windows as well,
maybe others can suggest where to look.

> Also, if there are errors processing the transaction log, rather than
> just error out and exit, it should at least tell you to run
> pg_resetxlog.exe.  Another idea is if I just delete everything in the
> pg_xlog directory, it should recover and boot up without any errors
> rather than complaining about missing checkpoint files.

And potentially corrupt the entire database?  PG's behavior is designed
to be safely pessimistic and to keep your data safe.  E.g. if a disk is
on the way out and starts corrupting the log then you want to contain
the error rather than spreading it further.

--
  Sam  http://samason.me.uk/

pgsql-general by date:

Previous
From: "Rob Richardson"
Date:
Subject: Search for text in any function
Next
From: c k
Date:
Subject: Re: UPDATE