Thread: Weird disk/table space consumption problem
Hi, I had some weird disk space consumption problem. I do (non-mission-critical) data crunching using large data sets. For one analysis, I created a table of about 15 columns, and then ran an insert with a subselect on the table, where the select was drawing from three other tables, merging over ids. One of the other tables has about 60M rows. The insert ran for about 18h before running out of disk space. Before the query, there had been about 1TB free on the disk. The disk itself is a soft raid 5 array under Ubuntu. (That's what I have for large data sets...) After the insert query failed, the disk space was not made available again; PostgreSQL did not free it up (or would not free it up quickly). I rebooted soon thereafter. During boot, fsck (must have been fsck) gave me tons of freeing up inode messages. So, I wonder what went wrong? Any explanation? Soft raid no good for PostgreSQL? Thanks, Dirk -- Phone: +1 (650) 215 3459 Blog: http://dirkriehle.com http://twitter.com/dirkriehle
On Sat, Jul 11, 2009 at 4:41 PM, Dirk Riehle<dirk@riehle.org> wrote: > Hi, > > I had some weird disk space consumption problem. I do (non-mission-critical) > data crunching using large data sets. > > For one analysis, I created a table of about 15 columns, and then ran an > insert with a subselect on the table, where the select was drawing from > three other tables, merging over ids. One of the other tables has about 60M > rows. > > The insert ran for about 18h before running out of disk space. Before the > query, there had been about 1TB free on the disk. > > The disk itself is a soft raid 5 array under Ubuntu. (That's what I have for > large data sets...) > > After the insert query failed, the disk space was not made available again; > PostgreSQL did not free it up (or would not free it up quickly). I rebooted > soon thereafter. > > During boot, fsck (must have been fsck) gave me tons of freeing up inode > messages. > > So, I wonder what went wrong? Any explanation? Soft raid no good for > PostgreSQL? I've no such problems caused by modern linux software RAID (md). It's no surprise pgsql didn't free up the space, as it was all likely dead tuples at that point. Are you sure you didn't have a cartesian product that created a larger set than you anticipated? I'd post the query here to see if anyone has any suggestions on that. As for the freeing inode messages, it sounds like something is causeing file system corruption, and my first suspect is always memory / hardware issues. Have you run memtest86 on that machine to make sure it's got good memory etc.?
Dirk Riehle <dirk@riehle.org> writes: > For one analysis, I created a table of about 15 columns, and then ran an > insert with a subselect on the table, where the select was drawing from > three other tables, merging over ids. One of the other tables has about > 60M rows. > The insert ran for about 18h before running out of disk space. Before > the query, there had been about 1TB free on the disk. > After the insert query failed, the disk space was not made available > again; PostgreSQL did not free it up (or would not free it up quickly). What that part sounds like is you mistyped the insert such that it was inserting a huge number of rows. It's not too hard to do if you get the join condition wrong --- what you meant to be a sane join can easily turn into a Cartesian product. In theory vacuum could reclaim the space eventually, but it'd take awhile. > I rebooted soon thereafter. > During boot, fsck (must have been fsck) gave me tons of freeing up inode > messages. And this part is a filesystem bug; it cannot possibly be Postgres' fault that the filesystem got corrupted. regards, tom lane
Scott Marlowe wrote: >> So, I wonder what went wrong? Any explanation? Soft raid no good for >> PostgreSQL? > > I've no such problems caused by modern linux software RAID (md). It's > no surprise pgsql didn't free up the space, as it was all likely dead > tuples at that point. Are you sure you didn't have a cartesian > product that created a larger set than you anticipated? I'd post the > query here to see if anyone has any suggestions on that. I'm glad to hear my soft raid md0 is no problem. > As for the freeing inode messages, it sounds like something is > causeing file system corruption, and my first suspect is always memory > / hardware issues. Have you run memtest86 on that machine to make > sure it's got good memory etc.? I do have some weird every few days error where the soft raid blocks for a couple of seconds and I get this kernel log output: Jul 7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY } Jul 7 19:58:55 server kernel: [40336.000244] ata1.00: cmd 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out Jul 7 19:58:55 server kernel: [40336.000245] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) This an experimental machine, an old(er) Dell PowerEdge 800. I haven't had the time to look into this. Any thoughts? Thanks! Dirk -- Phone: +1 (650) 215 3459 Blog: http://dirkriehle.com http://twitter.com/dirkriehle
Tom Lane wrote: > Dirk Riehle <dirk@riehle.org> writes: >> For one analysis, I created a table of about 15 columns, and then ran an >> insert with a subselect on the table, where the select was drawing from >> three other tables, merging over ids. One of the other tables has about >> 60M rows. > >> The insert ran for about 18h before running out of disk space. Before >> the query, there had been about 1TB free on the disk. > >> After the insert query failed, the disk space was not made available >> again; PostgreSQL did not free it up (or would not free it up quickly). > > What that part sounds like is you mistyped the insert such that it > was inserting a huge number of rows. It's not too hard to do if > you get the join condition wrong --- what you meant to be a sane > join can easily turn into a Cartesian product. In theory vacuum > could reclaim the space eventually, but it'd take awhile. Hmm... here the insert: insert into t select ... from commits c, diffs d, sloc_metrics sm, where d.commit_id = c.id and sm.diff_id = d.id; sm has the 46M rows, d has 60M rows. I don't know enough about database engines to understand why it would not be able to incrementally do the matching but rather might run into a cartesian product? In any case, I pared down the tables and it ran through quickly without further problems. So maybe I had a typo in there. >> I rebooted soon thereafter. > >> During boot, fsck (must have been fsck) gave me tons of freeing up inode >> messages. > > And this part is a filesystem bug; it cannot possibly be Postgres' > fault that the filesystem got corrupted. One would think so? But the reboot should have stopped PostgreSQL properly. Thanks for the help! Dirk -- Phone: +1 (650) 215 3459 Blog: http://dirkriehle.com http://twitter.com/dirkriehle
On Sat, Jul 11, 2009 at 7:19 PM, Dirk Riehle<dirk@riehle.org> wrote: > Tom Lane wrote: >> >> Dirk Riehle <dirk@riehle.org> writes: >>> During boot, fsck (must have been fsck) gave me tons of freeing up inode >>> messages. >> >> And this part is a filesystem bug; it cannot possibly be Postgres' >> fault that the filesystem got corrupted. > > One would think so? But the reboot should have stopped PostgreSQL properly. Doesn't matter whether or not pgsql was shut down properly, it shouldn't result in file system corruption. Now if the file wasn't properly unmounted before power down, that could cause file system corruption. With a journaling file system the file system should come right back up corruption free. Are you running an older distro or something maybe?
On Sat, Jul 11, 2009 at 7:19 PM, Dirk Riehle<dirk@riehle.org> wrote: > > I do have some weird every few days error where the soft raid blocks for a > couple of seconds and I get this kernel log output: > > Jul 7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY } > Jul 7 19:58:55 server kernel: [40336.000244] ata1.00: cmd > 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out > Jul 7 19:58:55 server kernel: [40336.000245] res > 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) > > This an experimental machine, an old(er) Dell PowerEdge 800. I haven't had > the time to look into this. Any thoughts? Sounds like either bad hardware or a buggy driver. Sounds like the drive / driver is resetting spontaneously.
On Sat, 2009-07-11 at 18:19 -0700, Dirk Riehle wrote: > I do have some weird every few days error where the soft raid blocks for > a couple of seconds and I get this kernel log output: > > Jul 7 19:58:55 server kernel: [40336.000239] ata1.00: status: { DRDY } > Jul 7 19:58:55 server kernel: [40336.000244] ata1.00: cmd > 61/08:a0:a7:44:21/00:00:00:00:00/40 tag 20 ncq 4096 out > Jul 7 19:58:55 server kernel: [40336.000245] res > 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Have you used smartctl (from the smartmontools package - on Debian/Ubuntu at least) to examine the drive? In particular, you should ask the drive to do a self-test and media scan. This will not prevent take it out of the RAID or prevent it from servicing normal operations, though it may slow it down a bit. Run: smartctl -d ata -t long /dev/sda then "sleep" however long it says the test will take, eg "sleep 2h". When the sleep command exits, run: smartctl -d ata -a /dev/sda to see general info on the drive, its error logs, and its test logs. If you see errors logged on the drive, if the test shows as failed, if you see a non-zero "reallocated sector" count, or if "pending sector" is non-zero, then it's time to replace the drive. -- Craig Ringer