Thread: Weird disk/table space consumption problem

Weird disk/table space consumption problem

From
Dirk Riehle
Date:
Hi,

I had some weird disk space consumption problem. I do
(non-mission-critical) data crunching using large data sets.

For one analysis, I created a table of about 15 columns, and then ran an
insert with a subselect on the table, where the select was drawing from
three other tables, merging over ids. One of the other tables has about
60M rows.

The insert ran for about 18h before running out of disk space. Before
the query, there had been about 1TB free on the disk.

The disk itself is a soft raid 5 array under Ubuntu. (That's what I have
for large data sets...)

After the insert query failed, the disk space was not made available
again; PostgreSQL did not free it up (or would not free it up quickly).
I rebooted soon thereafter.

During boot, fsck (must have been fsck) gave me tons of freeing up inode
messages.

So, I wonder what went wrong? Any explanation? Soft raid no good for
PostgreSQL?

Thanks,
Dirk

--
Phone: +1 (650) 215 3459
Blog: http://dirkriehle.com
http://twitter.com/dirkriehle

Re: Weird disk/table space consumption problem

From
Scott Marlowe
Date:
On Sat, Jul 11, 2009 at 4:41 PM, Dirk Riehle<dirk@riehle.org> wrote:
> Hi,
>
> I had some weird disk space consumption problem. I do (non-mission-critical)
> data crunching using large data sets.
>
> For one analysis, I created a table of about 15 columns, and then ran an
> insert with a subselect on the table, where the select was drawing from
> three other tables, merging over ids. One of the other tables has about 60M
> rows.
>
> The insert ran for about 18h before running out of disk space. Before the
> query, there had been about 1TB free on the disk.
>
> The disk itself is a soft raid 5 array under Ubuntu. (That's what I have for
> large data sets...)
>
> After the insert query failed, the disk space was not made available again;
> PostgreSQL did not free it up (or would not free it up quickly). I rebooted
> soon thereafter.
>
> During boot, fsck (must have been fsck) gave me tons of freeing up inode
> messages.
>
> So, I wonder what went wrong? Any explanation? Soft raid no good for
> PostgreSQL?

I've no such problems caused by modern linux software RAID (md).  It's
no surprise pgsql didn't free up the space, as it was all likely dead
tuples at that point.  Are you sure you didn't have a cartesian
product that created a larger set than you anticipated?  I'd post the
query here to see if anyone has any suggestions on that.

As for the freeing inode messages, it sounds like something is
causeing file system corruption, and my first suspect is always memory
/ hardware issues.  Have you run memtest86 on that machine to make
sure it's got good memory etc.?

Re: Weird disk/table space consumption problem

From
Tom Lane
Date:
Dirk Riehle <dirk@riehle.org> writes:
> For one analysis, I created a table of about 15 columns, and then ran an
> insert with a subselect on the table, where the select was drawing from
> three other tables, merging over ids. One of the other tables has about
> 60M rows.

> The insert ran for about 18h before running out of disk space. Before
> the query, there had been about 1TB free on the disk.

> After the insert query failed, the disk space was not made available
> again; PostgreSQL did not free it up (or would not free it up quickly).

What that part sounds like is you mistyped the insert such that it
was inserting a huge number of rows.  It's not too hard to do if
you get the join condition wrong --- what you meant to be a sane
join can easily turn into a Cartesian product.  In theory vacuum
could reclaim the space eventually, but it'd take awhile.

> I rebooted soon thereafter.

> During boot, fsck (must have been fsck) gave me tons of freeing up inode
> messages.

And this part is a filesystem bug; it cannot possibly be Postgres'
fault that the filesystem got corrupted.

            regards, tom lane

Re: Weird disk/table space consumption problem

From
Dirk Riehle
Date:

Scott Marlowe wrote:
>> So, I wonder what went wrong? Any explanation? Soft raid no good for
>> PostgreSQL?
>
> I've no such problems caused by modern linux software RAID (md).  It's
> no surprise pgsql didn't free up the space, as it was all likely dead
> tuples at that point.  Are you sure you didn't have a cartesian
> product that created a larger set than you anticipated?  I'd post the
> query here to see if anyone has any suggestions on that.

I'm glad to hear my soft raid md0 is no problem.

> As for the freeing inode messages, it sounds like something is
> causeing file system corruption, and my first suspect is always memory
> / hardware issues.  Have you run memtest86 on that machine to make
> sure it's got good memory etc.?

I do have some weird every few days error where the soft raid blocks for
a couple of seconds and I get this kernel log output:

Jul  7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY }
Jul  7 19:58:55 server kernel: [40336.000244] ata1.00: cmd
61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out
Jul  7 19:58:55 server kernel: [40336.000245]          res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

This an experimental machine, an old(er) Dell PowerEdge 800. I haven't
had the time to look into this. Any thoughts?

Thanks!
Dirk

--
Phone: +1 (650) 215 3459
Blog: http://dirkriehle.com
http://twitter.com/dirkriehle

Re: Weird disk/table space consumption problem

From
Dirk Riehle
Date:
Tom Lane wrote:
> Dirk Riehle <dirk@riehle.org> writes:
>> For one analysis, I created a table of about 15 columns, and then ran an
>> insert with a subselect on the table, where the select was drawing from
>> three other tables, merging over ids. One of the other tables has about
>> 60M rows.
>
>> The insert ran for about 18h before running out of disk space. Before
>> the query, there had been about 1TB free on the disk.
>
>> After the insert query failed, the disk space was not made available
>> again; PostgreSQL did not free it up (or would not free it up quickly).
>
> What that part sounds like is you mistyped the insert such that it
> was inserting a huge number of rows.  It's not too hard to do if
> you get the join condition wrong --- what you meant to be a sane
> join can easily turn into a Cartesian product.  In theory vacuum
> could reclaim the space eventually, but it'd take awhile.

Hmm... here the insert:

insert into t select ... from commits c, diffs d, sloc_metrics sm, where
d.commit_id = c.id and sm.diff_id = d.id;

sm has the 46M rows, d has 60M rows. I don't know enough about database
engines to understand why it would not be able to incrementally do the
matching but rather might run into a cartesian product?

In any case, I pared down the tables and it ran through quickly without
further problems. So maybe I had a typo in there.

>> I rebooted soon thereafter.
>
>> During boot, fsck (must have been fsck) gave me tons of freeing up inode
>> messages.
>
> And this part is a filesystem bug; it cannot possibly be Postgres'
> fault that the filesystem got corrupted.

One would think so? But the reboot should have stopped PostgreSQL properly.

Thanks for the help!

Dirk

--
Phone: +1 (650) 215 3459
Blog: http://dirkriehle.com
http://twitter.com/dirkriehle

Re: Weird disk/table space consumption problem

From
Scott Marlowe
Date:
On Sat, Jul 11, 2009 at 7:19 PM, Dirk Riehle<dirk@riehle.org> wrote:
> Tom Lane wrote:
>>
>> Dirk Riehle <dirk@riehle.org> writes:

>>> During boot, fsck (must have been fsck) gave me tons of freeing up inode
>>> messages.
>>
>> And this part is a filesystem bug; it cannot possibly be Postgres'
>> fault that the filesystem got corrupted.
>
> One would think so? But the reboot should have stopped PostgreSQL properly.

Doesn't matter whether or not pgsql was shut down properly, it
shouldn't result in file system corruption.  Now if the file wasn't
properly unmounted before power down, that could cause file system
corruption.  With a journaling file system the file system should come
right back up corruption free.

Are you running an older distro or something maybe?

Re: Weird disk/table space consumption problem

From
Scott Marlowe
Date:
On Sat, Jul 11, 2009 at 7:19 PM, Dirk Riehle<dirk@riehle.org> wrote:
>
> I do have some weird every few days error where the soft raid blocks for a
> couple of seconds and I get this kernel log output:
>
> Jul  7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY }
> Jul  7 19:58:55 server kernel: [40336.000244] ata1.00: cmd
> 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out
> Jul  7 19:58:55 server kernel: [40336.000245]          res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>
> This an experimental machine, an old(er) Dell PowerEdge 800. I haven't had
> the time to look into this. Any thoughts?

Sounds like either bad hardware or a buggy driver.  Sounds like the
drive / driver is resetting spontaneously.

Re: Weird disk/table space consumption problem

From
Craig Ringer
Date:
On Sat, 2009-07-11 at 18:19 -0700, Dirk Riehle wrote:

> I do have some weird every few days error where the soft raid blocks for
> a couple of seconds and I get this kernel log output:
>
> Jul  7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY }
> Jul  7 19:58:55 server kernel: [40336.000244] ata1.00: cmd
> 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out
> Jul  7 19:58:55 server kernel: [40336.000245]          res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

Have you used smartctl (from the smartmontools package - on
Debian/Ubuntu at least) to examine the drive?

In particular, you should ask the drive to do a self-test and media
scan. This will not prevent take it out of the RAID or prevent it from
servicing normal operations, though it may slow it down a bit. Run:

smartctl -d ata -t long /dev/sda

then "sleep" however long it says the test will take, eg "sleep 2h".

When the sleep command exits, run:

smartctl -d ata -a /dev/sda

to see general info on the drive, its error logs, and its test logs. If
you see errors logged on the drive, if the test shows as failed, if you
see a non-zero "reallocated sector" count, or if "pending sector" is
non-zero, then it's time to replace the drive.

--
Craig Ringer