Thread: PG_XLOG 27028 files running out of space
My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what happened; this has been running flawlessly for years. I do have archiving turned on and run an archive command every 10 minutes.
I'm not sure how to go about cleaning this up. I got the DB back up, but I've only got 6 GB free on this drive, and it's going to blow up if I can't relieve some of the stress from this directory, which is over 220 GB.
What are my options?
Thanks
Postgres 9.1.6
slon 2.1.2
Tory
2013/2/14 Tory M Blue <tmblue@gmail.com>:
> My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what happened; this has been running flawlessly for years. I do have archiving turned on and run an archive command every 10 minutes.
>
> I'm not sure how to go about cleaning this up. I got the DB back up, but I've only got 6 GB free on this drive, and it's going to blow up if I can't relieve some of the stress from this directory, which is over 220 GB.
>
> What are my options?
>
> Thanks
>
> Postgres 9.1.6
> slon 2.1.2
I can't give any advice right now, but I'd suggest posting more details of your
setup, including as much of your postgresql.conf file as possible (especially
the checkpoint_* and archive_* settings) and also the output of pg_controldata.
Ian Barwick
On Thu, Feb 14, 2013 at 3:01 AM, Ian Lawrence Barwick <barwick@gmail.com> wrote:
Thanks Ian
I figured it out and figured out a way around it for now.
My archive destination had its ownership changed, so the archive command could not write to the directory. I didn't catch this until it was too late: 225 GB and 27,000 files later.
I found a few writeups on how to clear this up by using the command true as the archive command (it always succeeds, so the server treats each segment as archived), which quickly and easily deleted a bunch of WAL files from the pg_xlog directory. That worked, and now that I know what the cause was, I should be able to restore my pg_archive PITR configs and be good to go.
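For anyone reading along, the failure described above (an archive destination that the server can no longer write to) can be caught early with a small check. This is only a minimal sketch in Python, not something from the thread: the function name is hypothetical, the path you pass in is whatever directory your archive_command writes to, and it has to run as the same OS user as the database server for the result to mean anything.

```python
import os


def archive_destination_problem(path):
    """Return a short description of the problem, or None if the directory looks usable.

    Hypothetical helper: `path` is assumed to be the directory that
    archive_command copies WAL segments into.
    """
    if not os.path.isdir(path):
        return "not a directory: " + path
    # Creating a file in a directory needs both write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not writable by the current user: " + path
    return None
```

Run from cron or a monitoring agent, a non-None result would have flagged the ownership change long before pg_xlog filled the disk.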
This is definitely one of those bullets I would rather not have taken, but the damage appears to be minimal (thank you, Postgres).
Thanks again
Tory
On 14.02.2013 12:49, Tory M Blue wrote:
> My postgres db ran out of space. I have 27028 files in the pg_xlog
> directory. I'm unclear what happened; this has been running flawlessly for
> years. I do have archiving turned on and run an archive command every 10
> minutes.
>
> I'm not sure how to go about cleaning this up. I got the DB back up, but
> I've only got 6 GB free on this drive, and it's going to blow up if I can't
> relieve some of the stress from this directory, which is over 220 GB.
>
> What are my options?

You'll need to delete some of the oldest xlog files to release disk space. But first you need to make sure you don't delete any files that are still needed, and find out what got you into this situation in the first place.

You say that you "run an archive command every 10 minutes". What do you mean by that? archive_command specified in postgresql.conf is executed automatically by the system, so you don't need to and should not run that manually. After archive_command has run successfully, and the system doesn't need the WAL file for recovery anymore (i.e. after the next checkpoint), the system will delete the archived file to release disk space. Clearly that hasn't been working in your system for some reason. If archive_command doesn't succeed, i.e. it returns a non-zero return code, the system will keep retrying forever until it succeeds, without deleting the file. Have you checked the logs for any archive_command errors?

To get out of the immediate trouble, run "pg_controldata", and make note of this line:

    Latest checkpoint's REDO WAL file: 000000010000000000000001

Anything older than that file is not needed for recovery. You can delete those, if you have them safely archived.

- Heikki
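The rule in the message above (anything older than the REDO WAL file is not needed for recovery) can be sketched in Python as follows. This is an illustration under stated assumptions, not a tool from the thread: both function names are hypothetical, the first one expects the text printed by pg_controldata and looks for the exact line quoted above, and the second assumes the standard 24-character hexadecimal segment names and deliberately ignores segments on other timelines. It only lists candidates; it deletes nothing, and whether each file is safely archived still has to be checked separately.

```python
import re

# WAL segment names are 24 uppercase hex digits; this skips .history and
# .backup files and the archive_status directory.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def redo_wal_file(controldata_output):
    """Extract the segment name from the "Latest checkpoint's REDO WAL file" line."""
    for line in controldata_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "Latest checkpoint's REDO WAL file":
            return value.strip()
    raise ValueError("REDO WAL file line not found")


def older_segments(names, redo_file):
    """Return segment names on the same timeline that sort before redo_file.

    Equal-length uppercase hex strings compare in numeric order, so a plain
    string comparison is enough here.
    """
    timeline = redo_file[:8]
    return sorted(
        n for n in names
        if WAL_NAME.match(n) and n[:8] == timeline and n < redo_file
    )
```

In practice `names` would come from listing the pg_xlog directory, and the result is the set of files that may be removed once you have confirmed they exist in the archive.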
Tory M Blue wrote:
> My postgres db ran out of space. I have 27028 files in the pg_xlog directory. I'm unclear what
> happened this has been running flawless for years. I do have archiving turned on and run an archive
> command every 10 minutes.
>
> I'm not sure how to go about cleaning this up, I got the DB back up, but I've only got 6gb free on
> this drive and it's going to blow up, if I can't relieve some of the stress from this directory over
> 220gb.

> Postgres 9.1.6
> slon 2.1.2

Are there any messages in the log file?
Are you sure that archiving works, i.e. do WAL files show up in your archive location?

The most likely explanation for what you observe is that archive_command returns a non-zero result (fails).
That would lead to a message in the log.

Yours,
Laurenz Albe
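Besides reading the log, one way to see whether archiving is keeping up is to count the segments still waiting to be archived. PostgreSQL keeps a marker file named `<segment>.ready` in the archive_status subdirectory of pg_xlog for each segment that has not yet been archived, and renames it to `<segment>.done` once archive_command succeeds. The Python below is a rough sketch, not part of the thread: the function name is hypothetical, and the directory you pass in is assumed to be that archive_status directory.

```python
import os


def pending_archive_segments(archive_status_dir):
    """Return the names of WAL segments that still have a .ready marker."""
    suffix = ".ready"
    return sorted(
        name[: -len(suffix)]
        for name in os.listdir(archive_status_dir)
        if name.endswith(suffix)
    )
```

A list that keeps growing between checks suggests that archive_command is failing, which is the situation described in this thread.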
On Thu, Feb 14, 2013 at 3:08 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
Thanks Heikki,
Yes, I misspoke about the archive command, sorry. That was a timeout, and in my haste and disorientation I misread it and misspoke. So I'm clear on that.
I'm also past my issue now that I've discovered the problem, but pg_controldata is something I could have used initially in my panic, so I've added that command to my toolbox. I appreciate the response!
Thanks
Tory