Thread: When adding millions of rows at once, getting out of disk space errors
Hi all -

I'm doing some perf testing and need huge amounts of data, so I have a program that is adding data to a few tables ranging from 500,000 to 15M rows. The program is just a simple C# program that blasts data into the DB, but after about 3M rows or so I get an error:

ERROR: could not extend relation 1663/41130/41177: No space left on device
HINT: Check free disk space.

If I do a full VACUUM on the table being inserted into, the error goes away, but it comes back very quickly. Obviously, I wouldn't want this happening in a production environment. I've noticed some auto-vacuum settings as well (I just checked the box and left all the defaults) but that doesn't seem to help too much.

What's the recommended setup in a production environment for tables where tons of data will be inserted? It seems to me there's some sort of "max table size" before you have to allocate more space on the disk, but I can't seem to find where these settings are, or how to allow millions of rows to be inserted into a table without having to vacuum every few million rows.

Mike
On Wed, Feb 18, 2009 at 1:39 PM, Mike Christensen <imaudi@comcast.net> wrote:
> Hi all -
> ERROR: could not extend relation 1663/41130/41177: No space left on device
> HINT: Check free disk space.

You're running out of disk space.
From: Mike Christensen
I have well over 50 gigs free on that drive.. I doubt it.
Scott Marlowe wrote:
> You're running out of disk space.
Mike Christensen <imaudi@comcast.net> writes:
> I'm doing some perf testing and need huge amounts of data. So I have a
> program that is adding data to a few tables ranging from 500,000 to 15M
> rows. The program is just a simple C# program that blasts data into the
> DB, but after about 3M rows or so I get an error:
> ERROR: could not extend relation 1663/41130/41177: No space left on device
> HINT: Check free disk space.

Not to put too fine a point on it, but maybe you need to buy a bigger disk.

			regards, tom lane
On Wednesday 18 February 2009, Mike Christensen <imaudi@comcast.net> wrote:
> ERROR: could not extend relation 1663/41130/41177: No space left on
> device
> HINT: Check free disk space.
>
> It seems to me there's some sort of "max table size" before you have to
> allocate more space on the disk, however I can't seem to find where
> these settings are and how to allow millions of rows to be inserted into
> a table without having to vacuum every few million rows.

The error indicates that your file system is full. It's not a PostgreSQL problem. Hence the "hint".

-- 
A hybrid Escalade is missing the point much in the same way that having a
diet soda with your extra large pepperoni pizza is missing the point.
On Wed, 2009-02-18 at 12:55 -0800, Mike Christensen wrote:
> I have well over 50 gigs free on that drive.. I doubt it.

Are you sure the pg data directory is on the drive you think it is? Are you doing a lot of deletes, or are you merely inserting? Are you doing any sorting and therefore running out of temp space in your tmp partition? [Supposition; I've run into something like this before, but not specifically with Postgres.]

-Mark
On Wed, Feb 18, 2009 at 1:55 PM, Mike Christensen <imaudi@comcast.net> wrote:
> I have well over 50 gigs free on that drive.. I doubt it.

Quotas? Something's making the OS think the drive is full.
In response to Mike Christensen <imaudi@comcast.net>:
> I have well over 50 gigs free on that drive.. I doubt it.

I'm not aware of that error having false-positives associated with it.

Common confusion on this point can come from having quotas enabled. Or you may genuinely be running out of space: by the time you check free space (after the error has occurred), the space has already been reclaimed from the failed transaction, so it looks like you have plenty of free space, even though it disappears while the process is running.

Also, double-check that Postgres' data files are on the same partition as where you've got the 50G free (I've made that mistake more than once).

If none of those help, provide more details. Based on the detail level you've provided, you've run out of disk space.

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/
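[Editor's note: two quick checks from psql can settle the "which drive is it actually on, and how big has it grown" questions. This is a sketch using stock functions available in the 8.x series; the table name is a made-up example.]

    -- where the cluster actually lives; compare against the drive with 50G free
    SHOW data_directory;

    -- on-disk size of the current database and of one table (placeholder name)
    SELECT pg_size_pretty(pg_database_size(current_database()));
    SELECT pg_size_pretty(pg_total_relation_size('my_big_table'));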
On Wed, Feb 18, 2009 at 12:39:50PM -0800, Mike Christensen wrote:
> I'm doing some perf testing and need huge amounts of data. So I have a
> program that is adding data to a few tables ranging from 500,000 to 15M
> rows.

I assume you're repeatedly inserting data and then deleting it? If so, PG won't get much of a chance to clean up after you. Because of the way it handles transactions, all of the old data will be left in the table until the table is vacuumed and the appropriate tuples/rows are marked as deleted.

> The program is just a simple C# program that blasts data into the DB,

Just out of interest, do you know about the COPY command? Things will go much faster than with a large number of INSERT statements.

> but after about 3M rows or so I get an error:
>
> ERROR: could not extend relation 1663/41130/41177: No space left on device
> HINT: Check free disk space.
>
> If I do a full VACUUM on the table being inserted into, the error goes
> away but it comes back very quickly. Obviously, I wouldn't want this
> happening in a production environment.

VACUUM FULLs should very rarely be done; routine maintenance would be to do plain VACUUMs or let the auto-vacuum daemon handle things. This will mark the space as available and subsequent operations will reuse it.

> What's the recommended setup in a production environment for tables
> where tons of data will be inserted?

If you're repeatedly inserting and deleting data then you'll probably want to intersperse some VACUUMs in there.

> It seems to me there's some sort of "max table size" before you have to
> allocate more space on the disk, however I can't seem to find where
> these settings are and how to allow millions of rows to be inserted into
> a table without having to vacuum every few million rows.

There's no maximum table size you get control over; 15 million rows on its own isn't considered particularly big, but you need to start being careful at that stage. If you've got a particularly "wide" table (i.e. lots of attributes/columns) this is obviously going to take more space, and you may consider normalizing the data out into separate tables. Once your row count gets to 10 or 100 times what you're dealing with, you'd probably need to start thinking about partitioning the tables, and how to do that would depend on your usage patterns.

-- 
  Sam  http://samason.me.uk/
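[Editor's note: since the loader is a C# program, the COPY suggestion above deserves a concrete illustration. A minimal sketch using Npgsql's binary COPY support follows; note that this API comes from much newer Npgsql releases than were current in 2009, and the connection string, table, and column names are made-up placeholders.]

    using Npgsql;
    using NpgsqlTypes;

    class BulkLoad
    {
        static void Main()
        {
            // placeholder connection string; adjust host/user/database as needed
            using (var conn = new NpgsqlConnection("Host=localhost;Username=postgres;Database=perftest"))
            {
                conn.Open();

                // one COPY stream instead of millions of individual INSERT statements
                using (var writer = conn.BeginBinaryImport(
                    "COPY test_data (id, payload) FROM STDIN (FORMAT BINARY)"))
                {
                    for (var i = 0; i < 3000000; i++)
                    {
                        writer.StartRow();
                        writer.Write(i, NpgsqlDbType.Integer);
                        writer.Write("row " + i, NpgsqlDbType.Text);
                    }
                    writer.Complete();   // commits the COPY stream
                }
            }
        }
    }

[COPY writes the same tuples, so it doesn't change the disk-space math, but it removes per-statement overhead and makes it easy to load in chunks with a plain VACUUM between them.]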
From: Mike Christensen
Ooo good call, the account is part of the "Users" group which has a quota:
The users will have the following disk quota:
Disk space limited to 1024 MB
Warning sent at 900 MB
Which is the exact size of the database..
However, anyone have a clue on how to change this? This is Windows Server 2003 SBS, I'm not an expert <g>
Bill Moran wrote:
> I'm not aware of that error having false-positives associated with it.
> Common confusion on this point could result from having quotas enabled ...
> Also, 2x check that Postgres' data files are on the same partition as
> where you've got the 50G free (I've made that mistake more than once)
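[Editor's note: for anyone hitting the same symptom, NTFS per-user quotas can be inspected and changed from the command line with fsutil, though on SBS the limit may be reapplied by the server management console or Group Policy, in which case it has to be changed there instead. A rough sketch; the drive letter, byte values, and account name are placeholders.]

    REM show quota settings and per-user usage on the volume holding the data directory
    fsutil quota query D:

    REM raise the warning threshold and hard limit (values in bytes) for one account
    fsutil quota modify D: 4294967296 8589934592 MYDOMAIN\postgres

    REM or stop enforcing quotas on that volume altogether
    fsutil quota disable D: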
Mike Christensen wrote:
> Ooo good call, the account is part of the "Users" group which has a quota:
>
> The users will have the following disk quota:
> Disk space limited to 1024 MB
> Warning sent at 900 MB
>
> Which is the exact size of the database..
>
> However, anyone have a clue on how to change this? This is Windows
> Server 2003 SBS, I'm not an expert <g>

I'm not particularly familiar with SBS, but in general, the postgres service account is not normally a part of -any- group on Windows.
From: Mike Christensen
I just changed the account manually and did not set a disk quota.. However, now I have bigger problems since the service will not start up anymore. I tried re-booting twice. First I get:

2009-02-18 21:24:25 GMT FATAL: lock file "postmaster.pid" already exists
2009-02-18 21:24:25 GMT HINT: Is another postmaster (PID 1888) running in data directory "C:/Program Files/PostgreSQL/8.3/data"?

After I delete the file, I try to start the service again and get:

2009-02-18 13:27:18 PST FATAL: could not create any TCP/IP sockets

Any ideas?

John R Pierce wrote:
> Mike Christensen wrote:
>> Ooo good call, the account is part of the "Users" group which has a quota:
>> ...
> I'm not particularly familiar with SBS, but in general, the postgres
> service account is not normally a part of -any- group on Windows.
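[Editor's note: two things are worth checking before (or instead of) deleting postmaster.pid: whether the PID named in the HINT is really still running, and whether something is still holding the port, which would explain the "could not create any TCP/IP sockets" failure. A sketch with standard Windows commands, assuming the default port 5432.]

    REM is the postmaster PID from the HINT still alive?
    tasklist /FI "PID eq 1888"

    REM is anything still listening on the PostgreSQL port?
    netstat -ano | findstr :5432

    REM only if a stale postgres.exe is confirmed, stop it before restarting the service
    taskkill /PID 1888 /F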
From: Grzegorz Jaśkiewicz
I bet it is on Windows (judging by the HTML in that email), but if it isn't: open a console and issue:

watch -n 0.5 df -h

and run that insert again ;)
From: Mike Christensen
Actually I'm writing emails on my Mac <g>

However, the Postgres service is running on my Windows 2003 machine..

The disk space issue turned out to be a disk quota, which was easy to solve. Unfortunately, the fact it crashed Postgres with a massive transaction log left the server in a state where it wouldn't boot anymore. I was eventually able to fix it by resetting the transaction log manually. I'm hoping future versions of Postgres will handle this scenario a lot better..

Mike

Grzegorz Jaśkiewicz wrote:
> I bet it is on windows (judging by html in that email), but if isn't:
> open a console and issue:
> watch -n 0.5 df -h
> and run that insert again ;)
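[Editor's note: "resetting the transaction log manually" here means pg_resetxlog, which ships with the server. It discards any WAL that hasn't been replayed, so it is a last resort: stop the service and take a file-level copy of the data directory first. A sketch using the data directory from the earlier log messages.]

    REM back up the whole data directory before touching anything
    xcopy "C:\Program Files\PostgreSQL\8.3\data" "C:\pgdata-backup" /E /I

    REM dry run: show what pg_resetxlog would do without changing anything
    pg_resetxlog -n "C:\Program Files\PostgreSQL\8.3\data"

    REM force the reset (any un-replayed transactions are lost)
    pg_resetxlog -f "C:\Program Files\PostgreSQL\8.3\data"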
On Thu, Feb 19, 2009 at 3:11 AM, Mike Christensen <imaudi@comcast.net> wrote:
> Actually I'm writing emails on my Mac <g>
>
> However, the Postgres service is running on my Windows 2003 machine..
>
> The disk space issue turned out to be a disk quota which was easy to solve.
> Unfortunately, the fact it crashed Postgres with a massive transaction
> log left the server in a state where it wouldn't boot anymore. I was
> eventually able to fix it by resetting the transaction log manually. I'm
> hoping future versions of Postgres will handle this scenario a lot better..

They're certainly supposed to. I've had no such problems running out of space on Linux in the past. I wonder if it's a Windows thing.
From: Mike Christensen
I would assume if the account the service is running under has limited disk space, it won't really matter what OS you're running under. Postgres will throw an "out of disk space" error.
The problem for me is I was in the middle of a transaction which inserted about 50,000 rows into a table, and that transaction was pending in the transaction log. For some reason, the service didn't quit properly, and when it started back up, it began a repair process. When it got to the pending transactions (I think about 16 megs worth of data) it just hung, with no useful errors or anything outputted to the log or stderr. I think this needs to be fixed!
First off, when Postgres starts and sees that your database was not closed properly, it should tell you there's pending transactions and ask if you want to dump them or try to process them (or maybe save them for later). If you process them, there should be clear status and you should know what's going on. It's very possible the service would have /eventually/ started up for me had I waited long enough (I tried 5-6 hrs, with no logs, output, or change in memory consumption; thus I assumed it was dead)..
Also, if there are errors processing the transaction log, rather than just error out and exit, it should at least tell you to run pg_resetxlog.exe. Another idea is if I just delete everything in the pg_xlog directory, it should recover and boot up without any errors rather than complaining about missing checkpoint files.
No big problem, the good news is Postgres never once actually corrupted my data and I didn't lose a single byte (well, except for the pending transactions that I didn't care about). More good news is I learned a lot and even got familiar with some of the source code and debugging, and it's made me finally get around to writing an automatic backup script that runs every midnight.
Mike
Scott Marlowe wrote:
The problem for me is I was in the middle of a transaction which inserted about 50,000 rows into a table, and that transaction was pending in the transaction log. For some reason, the service didn't quit property and when it started back up, it began a repair process. When it got to the pending transactions (I think about 16 megs worth of data) it just hung with no useful errors or anything outputted to the log or stderr. I think this needs to be fixed!
First off, when Postgres starts and sees that your database was not closed properly, it should tell you there's pending transactions and ask if you want to dump them or try to process them (or maybe save them for later). If you process them, there should be clear status and you should know what's going on. It's very possible the service would have /eventually/ started up for me had I waited long enough (I tried 5-6 hrs, with no logs, output, or change in memory consumption; thus I assumed it was dead)..
Also, if there are errors processing the transaction log, rather than just error out and exit, it should at least tell you to run pg_resetxlog.exe. Another idea is if I just delete everything in the pg_xlog directory, it should recover and boot up without any errors rather than complaining about missing checkpoint files.
No big problem, the good news is Postgres never once actually corrupted my data and I didn't lose a single byte (well, except for the pending transactions that I didn't care about). More good news is I learned a lot and even got familiar with some of the source code and debugging, and it's made me finally get around to writing an automatic backup script that runs every midnight.
Mike
Scott Marlowe wrote:
On Thu, Feb 19, 2009 at 3:11 AM, Mike Christensen <imaudi@comcast.net> wrote:Actually I'm writing emails on my Mac <g> However, the Postgres service is running on my Windows 2003 machine.. The disk space issue turned out to be a disk quota which was easy to solve.Unfortunately, the fact it crashed Postgres and with a massive transaction log left the server in a state where it wouldn't boot anymore. I was eventually able to fix it by resetting the transaction log manually. I'm hoping future versions of Postgres will handle this scenario a lot better..They're certainly supposed to. I've had no such problems running out of space on linux in the past. I wonder if it's a windows thing.
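[Editor's note: on the nightly backup script mentioned above, the simplest version on Windows is a pg_dump batch file registered with the Task Scheduler. A sketch with placeholder paths and database name; it assumes the password is handled via pgpass.conf or trust authentication, since a scheduled task can't answer a prompt.]

    REM C:\scripts\pg_backup.bat -- dump the database in custom format
    @echo off
    set PGBIN=C:\Program Files\PostgreSQL\8.3\bin
    "%PGBIN%\pg_dump" -U postgres -Fc -f "C:\backups\perftest_nightly.dump" perftest

    REM register the script to run every night at midnight
    schtasks /create /tn "pg nightly backup" /tr "C:\scripts\pg_backup.bat" /sc daily /st 00:00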
Mike Christensen wrote:
> I have well over 50 gigs free on that drive.. I doubt it.

Out of inodes?

> Scott Marlowe wrote:
>> On Wed, Feb 18, 2009 at 1:39 PM, Mike Christensen <imaudi@comcast.net> wrote:
>>> Hi all -
>>> ERROR: could not extend relation 1663/41130/41177: No space left on device
>>> HINT: Check free disk space.
>> You're running out of disk space.

-- 
Until later, Geoffrey

Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety.
 - Benjamin Franklin
Mike Christensen wrote:
> First off, when Postgres starts and sees that your database was not
> closed properly, it should tell you there's pending transactions and
> ask if you want to dump them or try to process them (or maybe save
> them for later). If you process them, there should be clear status
> and you should know what's going on.

How does a system service ask you anything? A service normally starts when the system boots, before anyone logs onto the console.
On Thu, Feb 19, 2009 at 02:58:02AM -0800, Mike Christensen wrote:
> I would assume if the account the service is running under has limited
> disk space, it won't really matter what OS you're running under.
> Postgres will throw an "out of disk space" error.

Similarly to Scott, every time I've come across this, PG has recovered quite happily on its own.

> The problem for me is I was in the middle of a transaction which
> inserted about 50,000 rows into a table, and that transaction was
> pending in the transaction log. For some reason, the service didn't
> quit properly and when it started back up, it began a repair process.
> When it got to the pending transactions (I think about 16 megs worth of
> data) it just hung with no useful errors or anything outputted to the
> log or stderr. I think this needs to be fixed!

By default PG will always create 16MB WAL files, and if you're modifying a lot of data then multiple log files will be involved. I've done silly things like inserting the cross product of two large tables and wondered why things were taking so long, until it failed with an out of disk space error, but PG never misbehaved when I did this. This would be inserting several tens of GB of data, i.e. hundreds of millions of rows.

> First off, when Postgres starts and sees that your database was not
> closed properly, it should tell you there's pending transactions and ask
> if you want to dump them or try to process them (or maybe save them for
> later).

Doing much of that would break things quite impressively. Applications are, or should be, designed to assume that once a transaction has committed then it's committed. If the database could decide to roll back a committed transaction because you ran out of space doing something else, that would be bad. Similarly, a transaction can't commit half way through, which is almost what you're saying you'd like to happen. This would lead to all sorts of inconsistency.

> If you process them, there should be clear status and you
> should know what's going on. It's very possible the service would have
> /eventually/ started up for me had I waited long enough (I tried 5-6
> hrs, with no logs, output, or change in memory consumption; thus I
> assumed it was dead)..

PG, at least under Linux, is very verbose about things when they look suspicious. The messages must be appearing somewhere under Windows as well; maybe others can suggest where to look.

> Also, if there are errors processing the transaction log, rather than
> just error out and exit, it should at least tell you to run
> pg_resetxlog.exe. Another idea is if I just delete everything in the
> pg_xlog directory, it should recover and boot up without any errors
> rather than complaining about missing checkpoint files.

And potentially corrupt the entire database? PG's behavior is designed to be safely pessimistic and to keep your data safe. E.g. if a disk is on the way out and starts corrupting the log, then you want to contain the error rather than spreading it further.

-- 
  Sam  http://samason.me.uk/
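[Editor's note: a footnote on those 16MB WAL segments. In the 8.x series the steady-state size of pg_xlog is governed mostly by checkpoint_segments, so on a space-constrained volume it is worth knowing what it is set to. A hedged sketch of the relevant postgresql.conf lines; the values are illustrative, not recommendations.]

    # each WAL segment is 16MB; pg_xlog normally holds on the order of
    # 2-3 * checkpoint_segments segments, plus one
    checkpoint_segments = 16              # ~256MB of WAL between forced checkpoints
    checkpoint_timeout = 5min             # checkpoints also fire on a timer
    #checkpoint_completion_target = 0.5   # 8.3+: spreads checkpoint I/O over the interval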
From: Mike Christensen
Yea sorry good point.. It's probably at least safe to say the process should not just hang though, and there should be more info in the log as to what it's doing..

John R Pierce wrote:
> Mike Christensen wrote:
>> First off, when Postgres starts and sees that your database was not
>> closed properly, it should tell you there's pending transactions and
>> ask if you want to dump them or try to process them (or maybe save
>> them for later). If you process them, there should be clear status
>> and you should know what's going on.
>
> How does a system service ask you anything? A service normally
> starts when the system boots, before anyone logs onto the console.