Thread: no space left on device
I'm running Postgres 7.1.3, and just started having a problem where my dynamic site is going down regularly (every other day). The DB is read-only, with no writes happening to it. I have no idea why this is happening, and my search of the FAQs and mailing list doesn't bring up anything. I've attached the error from the log file at the end of this message.

Here's the disk usage from within the DB dir:

[postgres - DB]$ du -k .
1716    ./base/1
1716    ./base/16555
5192    ./base/56048
8628    ./base
116     ./global
32812   ./pg_xlog
11380   ./pg_clog
53192   .

Note that the pg_xlog dir is huge! Here's its contents:

[postgres - DB/pg_xlog]$ ls -al
total 32816
drwx------ 2 postgres admin     4096 Mar 29  2003 .
drwx------ 6 postgres admin     4096 Jan  9 15:04 ..
-rwx------ 1 postgres admin 16777216 Jan  9 15:09 0000000000000001
-rwx------ 1 postgres admin 16777216 Mar 29  2003 0000000000000002

What are these files, and what can I do to resolve this issue?

Thx,

Zeb

--
DEBUG: statistics collector process (pid 2523) exited with exit code 1
PGSTAT: Error closing temp stats file
PGSTAT: /usr/local/G101/App/DB/./global/pgstat.tmp.7823: No space left on device
PGSTAT: AbDEBUG: statistics collector process (pid 2979) exited with exit code 1
FATAL 2: write of clog file 43, offset 188416 failed: No space left on device
DEBUG: server process (pid 3741) exited with exit code 2
DEBUG: terminating any other active server processes
NOTICE: Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
NOTICE: Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
NOTICE: Message from PostgreSQL backend:
        The Postmaster has informed me that some other backend
        died abnormally and possibly corrupted shared memory.
        I have rolled back the current transaction and am
        going to terminate your database system connection and exit.
        Please reconnect to the database system and repeat your query.
DEBUG: all server processes terminated; reinitializing shared memory and semaphores
DEBUG: database system was interrupted at 2004-01-09 05:22:52 EST
DEBUG: checkpoint record is at 0/138CFD4
DEBUG: redo record is at 0/138CFD4; undo record is at 0/0; shutdown FALSE
DEBUG: next transaction id: 45811837; next oid: 65205
DEBUG: database system was not properly shut down; automatic recovery in progress
DEBUG: redo starts at 0/138D014
FATAL 2: write of clog file 43, offset 188416 failed: No space left on device
DEBUG: startup process (pid 3785) exited with exit code 2
DEBUG: aborting startup due to startup process failure
>Note that the pg_xlog dir is huge! Here's its contents:
>
>-rwx------ 1 postgres admin 16777216 Jan  9 15:09 0000000000000001
>-rwx------ 1 postgres admin 16777216 Mar 29  2003 0000000000000002
>
>What are these files, and what can I do to resolve this issue?

They are checkpoint files. You need them. Have you run a vacuum recently?

Sincerely,

Joshua D. Drake

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
On Friday 09 January 2004 20:31, Aurangzeb M. Agha wrote:

> Here's an output of the disk usage from within the DB dir
>
> [postgres - DB]$ du -k .
> 1716    ./base/1
> 1716    ./base/16555
> 5192    ./base/56048
> 8628    ./base
> 116     ./global
> 32812   ./pg_xlog
> 11380   ./pg_clog
> 53192   .

OK, and what does "df -m" show? That will display disk sizes and free space remaining. Your error is that you have run out of disk space.

> Note that the pg_xlog dir is huge! Here's its contents:

Well - it's 32MB (2 x 16MB as you show below).

> -rwx------ 1 postgres admin 16777216 Jan  9 15:09 0000000000000001
> -rwx------ 1 postgres admin 16777216 Mar 29  2003 0000000000000002
>
> What are these files, and what can I do to resolve this issue?

They're transaction logs (see the section on WAL). You can probably reduce them from their default size of 16MB, I'm guessing by changing some constant in the source and re-compiling.

--
Richard Huxton
Archonet Ltd
I've not run a vacuum in quite some time, and that's because I've only been doing reads from this DB. I was under the impression that I should run vacuum when tables are heavily modified:

http://www.postgresql.org/docs/aw_pgsql_book/node110.html

I guess I must have been mistaken?

I'm looking through the docs now, but am having trouble finding this: how can I vacuum the entire DB at once?

Thx,

Zeb

On Fri, 9 Jan 2004, Joshua D. Drake wrote:

:Have you run a vacuum recently?
Aurangzeb M. Agha wrote:

>I've not run a vacuum in quite some time, and that's because I've only
>been doing reads from this DB. I was under the impression that I should
>run vacuum when tables are heavily modified:

That would be accurate. Did you recently add a second database?

Sincerely,

Joshua D. Drake
Here's the output of "df -m":

[postgres - DB]$ df -m .
Filesystem 1M-blocks  Used Available Use% Mounted on
-              63328 55308      4803  93% /

Thx for the info.

Rgs,

Aurangzeb

On Fri, 9 Jan 2004, Richard Huxton wrote:

:OK, and what does "df -m" show? That will display disk sizes and free space
:remaining. Your error is that you have run out of disk space.

--
Aurangzeb M. Agha     | Email : ama@mltp.com
                      | Home  : +1 413 586.4863
                      | Pager : +1 413 785.7568
                      |       : 4137857568@myairmail.com
73 Bridge St. #15     | Mobile: <coming soon>
Northampton, MA 01060 | e-Fax : +1 978 246.0770
USA                   | PGP id: <coming soon>
No, I've not added any new DBs. In fact, what's puzzling is that this DB has been running without issue (except for one server restart) for the last nine months. Now, all of a sudden, with no DB changes, additions, etc., I'm getting this problem.

Do you suggest that I still run a vacuumdb?

Rgs,

Zeb

On Fri, 9 Jan 2004, Joshua D. Drake wrote:

:That would be accurate. Did you recently add a second database?
Hello,

Wait... from the df you provided you have space left on the device:

[postgres - DB]$ df -m .
Filesystem 1M-blocks  Used Available Use% Mounted on
-              63328 55308      4803  93% /

Perhaps you are out of inodes?

Sincerely,

Joshua D. Drake

Aurangzeb M. Agha wrote:

>No, I've not added any new DB's. In fact, what's puzzling is that this DB
>has been running without issue (except for one server restart) for the
>last nine months. Now, all of a sudden, with no DB changes, additions,
>etc... I'm getting this problem.
>
>Do you suggest that I still run a vacuumdb?
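The inode question raised above can be checked with the -i switch of df, which reports inode counts instead of block counts. What follows is a minimal sketch, assuming a df that accepts -i (GNU coreutils and the BSDs do). The function name inode_report is made up for illustration and is not part of any tool mentioned in the thread.

```shell
#!/bin/sh
# Hypothetical helper: show block usage and inode usage for the filesystem
# that holds a given directory (defaults to the current directory).
inode_report() {
    dir=${1:-.}
    df -k "$dir"   # block usage, in 1K units
    df -i "$dir"   # inode usage; a free-inode count of 0 blocks new files
}

# Example (run from inside the data directory):
# inode_report .
```

When the free-inode count is 0, creating a file fails with "No space left on device" even though the block counts still show free megabytes.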
On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:

> Here's an output of the disk usage from within the DB dir
>
> [postgres - DB]$ du -k .
> 1716    ./base/1
> 1716    ./base/16555
> 5192    ./base/56048
> 8628    ./base
> 116     ./global
> 32812   ./pg_xlog
> 11380   ./pg_clog
> 53192   .
>
> Note that the pg_xlog dir is huge! Here's its contents:

That's normal. 32 meg isn't really that big. How big of a partition do you have this database on? Your best bet is to put it on a bigger partition. The pg_xlog directory is gonna be at least 16 megs for most installations.

Do you have any transactions sitting at idle keeping postgresql from recycling the xlogs?

Normally when you run out of space it's a lack of vacuuming, but here it just sounds like either the partition is too small or the postgres user is living under a quota on that partition.
:That's normal. 32 meg isn't really that big. How big of a partition do
:you have this database on? Your best bet is to put it on a bigger
:partition. the pg_xlog directory is gonna be at least 16 megs for most
:installations.
:
:Do you have any transactions sitting at idle keeping postgresql from
:recycling the xlogs?
:
:Normally when you run out of space it's a lack of vacuuming, but here it
:just sounds like either the partition is too small or the postgres user is
:living under a quota on that partition.

Scott -- I'm at 93% disk usage:

[postgres - DB]$ df -m .
Filesystem 1M-blocks  Used Available Use% Mounted on
-              63328 55308      4803  93% /

I don't know about transactions sitting idle. Like I mentioned, this DB is read-only, and there are no writes taking place. Would I still need to worry about transactions? How can I check to see if there are any?

Re vacuuming, I haven't run vacuum for the same reason as above. This is a read-only DB, and I didn't think a vacuum was necessary if there are no writes happening to the DB.

My concern is why this problem is happening now (on a read-only DB). The DB has had nothing written to it, no new DBs have been created, and the disk usage has stayed constant. I'm stumped as to why this problem has started all of a sudden.

Rgs,

Zeb
On Fri, Jan 09, 2004 at 01:11:51PM -0800, Aurangzeb M. Agha wrote:

> I've not run a vacuum in quite some time, and that's because I've only
> been doing reads from this DB. I was under the impression that I should
> run vacuum when tables are heavily modified:
>
> http://www.postgresql.org/docs/aw_pgsql_book/node110.html

Another reason to vacuum is that the pg_clog files are deleted if they are no longer needed. So if you had vacuumed, there would be fewer files there (I'm not sure if this was the case on 7.1 though).

Maybe in the meantime you could move one of the pg_xlog files to another filesystem and make a symlink at the original location. That should give you some breathing room. Vacuum right after that.

Also keep in mind that deleted files that are kept open by running processes do not release the occupied space ... see if you have some process with an open file on that filesystem which is no longer present (some clever usage of fuser and ps should give you that info).

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"We are who we choose to be", sang the goldfinch
when the sun is high (Sandman)
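The two suggestions above (move a file and leave a symlink behind, then look for deleted files that are still held open) can be sketched as shell functions. This is only an illustration under stated assumptions: the function names are hypothetical, the destination must be an absolute path on a filesystem with free space, the second function relies on the Linux /proc layout, and the postmaster should be stopped before any pg_xlog file is moved.

```shell
#!/bin/sh
# Hypothetical helper: move one file to another directory and leave a
# symlink at the old location. destdir must be an absolute path,
# otherwise the symlink will not resolve.
relocate_with_symlink() {
    src=$1
    destdir=$2
    mkdir -p "$destdir" || return 1
    mv "$src" "$destdir/" || return 1
    ln -s "$destdir/$(basename "$src")" "$src"
}

# Hypothetical helper, Linux only: print file descriptors that still point
# at unlinked files. Their space is not released until the owning process
# closes them or exits.
list_deleted_open_files() {
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}

# Example (paths are placeholders, server stopped first):
# relocate_with_symlink /path/to/DB/pg_xlog/0000000000000002 /mnt/spare/pg_xlog
# list_deleted_open_files
```

Each line printed by list_deleted_open_files starts with /proc/PID/fd/N, so the process ID to inspect with ps is the second path component. Run it as root to see other users' processes.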
> Wait... from the df you provided you have space left on the device:
>
> [postgres - DB]$ df -m .
> Filesystem 1M-blocks  Used Available Use% Mounted on
> -              63328 55308      4803  93% /
>
> Perhaps you are out of inodes?

Remember that df shows the *total* space left on the device, especially when run as root. Some percentage is reserved for root, however (AFAIR pretty much exactly 7% in my experience), i.e. user postgres can't use it but rather sees a device that's out of space. That may be the case here.

Karsten

--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
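The figures in the quoted df output allow a quick check of the reserved-space theory. With GNU df, Used is total minus free blocks and Available is the free space usable by a non-root user, so total minus Used minus Available gives the root-only reserve. A small sketch with the numbers from this thread follows; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers: derive the root-reserved space from one df row.
# Arguments: total, used, available (all in the same unit, here MB).
reserved_mb() {
    echo $(( $1 - $2 - $3 ))
}

# Same reserve as a whole-number percentage of the total (rounded down).
reserved_pct() {
    echo $(( ($1 - $2 - $3) * 100 / $1 ))
}

reserved_mb 63328 55308 4803    # prints 3217
reserved_pct 63328 55308 4803   # prints 5
```

That works out to 3217 MB, about 5% of the device. Assuming an ext2/ext3-style filesystem, this suggests the 4803 MB in the Available column is already net of the reserve, so the reserve alone may not explain the error.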
On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:

> Scott -- I'm at 93% disk usage:
>
> [postgres - DB]$ df -m .
> Filesystem 1M-blocks  Used Available Use% Mounted on
> -              63328 55308      4803  93% /

Do you have root access to it? If so, set the reserved space for root to be 0%, and then try vacuuming. Vacuuming requires some free space, and since you're pretty much out, it isn't gonna be able to complete.

> I don't know about transactions sitting idle--like I mentioned this DB is
> read-only, and there's no writes taking place. Would I still need to
> worry about transactions? How can I check to see if there are any?

If you stop and restart it, all transactions that are holding will be disconnected, so that would clear that up. But it looks to me like you just have it on too small of a partition. On a modern multi-gigabyte hard drive, Postgresql's usage of tens of megs for transaction logs is no big deal, but on a smaller partition like yours it can cause problems.

> Re vacuuming, I haven't run vacuum for the same reason as above. This is
> only a read-only DB, and I didn't think a vacuum was necessary if there's
> no writes happening to the DB.

Well, if the database has been emptied and refilled it would use the space, so it might be something like that. Or that it was right on the edge of being out of space and some single alter user kinda thing drove it over the edge. Hard to say.

It looks like your individual databases are pretty small, so I doubt there's lots of lost space in them. Can you get a larger partition to move the data directory to on that box? I'd recommend having about twice the max size of your database as a minimum, which would be 120 to 150 megs for you.
"Joshua D. Drake" <jd@commandprompt.com> writes: > Perhaps you are out of inodes? Either that or he's hitting a per-user quota limit, which is perhaps more likely. regards, tom lane
On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:

> Here's the output of "df -m":
>
> [postgres - DB]$ df -m .
> Filesystem           1M-blocks      Used Available Use% Mounted on
> -                        63328     55308      4803  93% /
>

But your du, below, of the postgres data directory shows 53MB in use.
That's orders of magnitude smaller than the 55GB the above appears to be
saying is used on that filesystem.

Start again with du -sk /* and follow the biggest numbers.

Ideas:

- do you have logfiles that processes, such as postmaster, are writing to
  and that have grown huge?

- did you have the above, and did you delete such files without restarting
  the process that was writing to them? (The space is not released until
  the process closes the file.)

- /var/(mail|tmp|whatever) is huge due to huge amounts of email received
  and not deleted

- /home/whatever is huge due to logfiles, downloads (inc. application
  caches), datafiles, software builds, ...

--
Nigel Andrews

> Thx for the info.
>
> Rgs,
>
> Aurangzeb
>
> On Fri, 9 Jan 2004, Richard Huxton wrote:
>
> :On Friday 09 January 2004 20:31, Aurangzeb M. Agha wrote:
> :> I'm running Postgres 7.1.3, and just started having a problem where my
> :> dynamic site is going down (read-only DB, with no writes happening to the
> :> DB) regularly (every other day). I have no idea why this is happening,
> :> and my search of the FAQ's and mail list don't bring up anything. I've
> :> attached the error from the log file, at the end of this message.
> :>
> :> Here's an output of the disk usage from within the DB dir
> :>
> :> [postgres - DB]$ du -k .
> :> 1716    ./base/1
> :> 1716    ./base/16555
> :> 5192    ./base/56048
> :> 8628    ./base
> :> 116     ./global
> :> 32812   ./pg_xlog
> :> 11380   ./pg_clog
> :> 53192   .
> :
> :OK, and what does "df -m" show? That will display disk sizes and free space
> :remaining. Your error is that you have run out of disk space.
> :
> :> Note that the pg_xlog dir is huge! Here's its contents:
> :
> :Well - it's 32MB (2 x 16MB as you show below).
> :
> :> -rwx------ 1 postgres admin 16777216 Jan  9 15:09 0000000000000001
> :> -rwx------ 1 postgres admin 16777216 Mar 29  2003 0000000000000002
> :>
> :> What are these files, and what can I do to resolve this issue?
> :
> :They're transaction logs (see the section on WAL). You can probably reduce
> :them from their default size of 16MB, I'm guessing by changing some constant
> :in the source and re-compiling.

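The "du -sk /* and follow the biggest numbers" advice above can be sketched in Python for hosts where sorting du output is awkward. This is an illustration only, not something posted in the thread: the function names dir_size_bytes and biggest_children are made up here, and the code sums apparent file sizes (st_size) rather than allocated blocks, so its figures can differ from du. It also cannot see files that were deleted while a process still holds them open.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of the files under path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def biggest_children(path, limit=10):
    """Return (size_in_bytes, child_path) pairs for the entries directly under path, largest first."""
    sizes = []
    with os.scandir(path) as entries:
        for entry in entries:
            try:
                if entry.is_dir(follow_symlinks=False):
                    sizes.append((dir_size_bytes(entry.path), entry.path))
                else:
                    sizes.append((entry.stat(follow_symlinks=False).st_size, entry.path))
            except OSError:
                pass  # entry vanished or is unreadable; skip it
    return sorted(sizes, reverse=True)[:limit]
```

Calling biggest_children("/") and then repeating the call on the largest directory it reports mirrors the follow-the-biggest-numbers loop.
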
On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:

>
> Scott -- I'm at 93% disk usage:
>
> [postgres - DB]$ df -m .
> Filesystem           1M-blocks      Used Available Use% Mounted on
> -                        63328     55308      4803  93% /
>

BTW, don't do -m with df; it confuses us old folk, as you probably noticed
from the responses. :)

--
Nigel

Right! Thus my quandary.

Re inodes, how can I check this? But why would this be? Is Postgres
sucking up inodes just sitting there as a read-only DB?

AMA

On Fri, 9 Jan 2004, Joshua D. Drake wrote:

:Hello,
:
: Wait... from the df you provided you have space left on the device:
:
:[postgres - DB]$ df -m .
:Filesystem           1M-blocks      Used Available Use% Mounted on
:-                        63328     55308      4803  93% /
:
:Perhaps you are out of inodes?
:
:Sincerely,
:
:Joshua D. Drake

--
Aurangzeb M. Agha

"scott.marlowe" <scott.marlowe@ihs.com> writes: >> [postgres - DB]$ df -m . >> Filesystem 1M-blocks Used Available Use% Mounted on >> - 63328 55308 4803 93% / > Do you have root access to it? if so, set the reserved space for root to > be 0%, and then try vacuuming. Vacuuming requires some free space, and > since you're pretty much out, it isn't gonna be able to complete. Look again --- it's showing free space in MB not KB. He's got 4.8GB free. (Although that might be free-from-root's-point-of-view, rather than what an unprivileged user can use ...) regards, tom lane
On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:

> DEBUG: statistics collector process (pid 2523) exited with exit code 1
> PGSTAT: Error closing temp stats file
> PGSTAT: /usr/local/G101/App/DB/./global/pgstat.tmp.7823: No space left on
> device

To me it does not sound strange that the database is growing when the stat
collector updates the tables with statistics. And since there are updates,
it would have been good to have vacuumed every once in a while.

I don't know the internals of pg as well as some of the other people who
have answered, but as far as I know the stat collector is not special in
any way but is updating the stat tables.

--
/Dennis Björklund

"Aurangzeb M. Agha" <ama-list@mltp.com> writes:
> Re inodes, how can I check this?

"df -i" should help.

> But why would this be? Is Postgres
> sucking up inodes just sitting there as a read-only DB?

I think you have missed the point here. Postgres is using 0.1 percent
of your disk; whatever is eating disk space or inodes is somewhere in
the 92.9% of the disk that you have not told us about. You are focusing
on killing the messenger instead of finding the true source of the
problem.

You should also check into the per-user-quota possibility.

regards, tom lane

On Fri, 9 Jan 2004, Tom Lane wrote:

> "scott.marlowe" <scott.marlowe@ihs.com> writes:
> >> [postgres - DB]$ df -m .
> >> Filesystem           1M-blocks      Used Available Use% Mounted on
> >> -                        63328     55308      4803  93% /
>
> > Do you have root access to it? if so, set the reserved space for root to
> > be 0%, and then try vacuuming. Vacuuming requires some free space, and
> > since you're pretty much out, it isn't gonna be able to complete.
>
> Look again --- it's showing free space in MB not KB. He's got 4.8GB
> free. (Although that might be free-from-root's-point-of-view, rather
> than what an unprivileged user can use ...)

Good catch. I'm so used to using raw df output...

Yeah, it looks to me like root's reserved space is getting him, but since
it's the root partition, it's possible it's out of inodes as well.

Aurangzeb, try running df -i to see how many inodes you have left...

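The distinction raised above, free space as root sees it versus what an unprivileged user can actually allocate, is exposed directly by statvfs. A minimal sketch, assuming a Unix host and Python 3; the function name free_space_mb and the dictionary keys are illustrative. Per-user quotas are not reflected in either figure, so the quota theory needs a separate check.

```python
import os


def free_space_mb(path="."):
    """Report space on the filesystem holding path, in whole MB."""
    st = os.statvfs(path)
    mb = 1024 * 1024
    return {
        # Size of the filesystem as reported by statvfs.
        "total_mb": st.f_blocks * st.f_frsize // mb,
        # Free blocks, including any reserved for the superuser.
        "free_for_root_mb": st.f_bfree * st.f_frsize // mb,
        # Free blocks available to unprivileged processes.
        "free_for_users_mb": st.f_bavail * st.f_frsize // mb,
    }
```

A large gap between free_for_root_mb and free_for_users_mb points at the root reserve; if both look healthy and writes still fail, a quota or something outside the filesystem-wide figures is the more likely cause.
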
On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:

> Right! Thus my quandary.
>
> Re inodes, how can I check this? But why would this be? Is Postgres
> sucking up inodes just sitting there as a read-only DB?

If you are out of inodes, I seriously doubt it is Postgresql's fault. As
you seem to be running everything on the root partition here, it's more
likely that some process other than postgresql is using all the inodes.
Basically, when you make a lot of small files you can run out of inodes.
Since postgresql tends to make a few rather large files, it's usually not
a concern.

df -i shows inode usage.

On Linux, you can change the % reserved for root to 1% with tune2fs:

tune2fs -m 1

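For scripted monitoring, the figures that df -i prints can also be read through statvfs. A minimal sketch, assuming a Unix host and Python 3; the function name inode_usage and the dictionary keys are illustrative. Some filesystems report zero total inodes, in which case no percentage is computed.

```python
import os


def inode_usage(path="."):
    """Return inode counts for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # free inodes
    used = total - free
    # Filesystems without a fixed inode table may report total == 0.
    used_percent = None if total == 0 else 100.0 * used / total
    return {"inodes": total, "used": used, "free": free, "used_percent": used_percent}
```

A used_percent close to 100 would support the out-of-inodes theory even when plenty of blocks are free.
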
I would suspect some *other* service is using the 4G for transient storage
every now and again, and it just so happens that Pg is getting tripped up.

What else does this machine run?

regards

Mark

Nigel J. Andrews wrote:

>On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:
>
>>Here's the output of "df -m":
>>
>>[postgres - DB]$ df -m .
>>Filesystem           1M-blocks      Used Available Use% Mounted on
>>-                        63328     55308      4803  93% /
>>
>
>But your du, below, of the postgres data directory shows 53MB in use. That's
>orders of magnitude smaller than the 55GB the above appears to be saying is
>used on that filesystem.

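The mismatch quoted above can be put into numbers using only figures posted in the thread: the du -k total for the data directory and the df -m line. The variable names are illustrative, and the snippet is a quick arithmetic check rather than a measurement of the live system.

```python
# Figures copied from the thread.
PG_DATA_KB = 53192      # "du -k ." total for the Postgres data directory
DISK_TOTAL_MB = 63328   # df -m "1M-blocks" column
DISK_USED_MB = 55308    # df -m "Used" column

pg_data_mb = PG_DATA_KB / 1024

# Share of the whole filesystem, and of the used space, taken by Postgres.
pg_share_of_disk = 100.0 * pg_data_mb / DISK_TOTAL_MB
pg_share_of_used = 100.0 * pg_data_mb / DISK_USED_MB

print(f"data directory: {pg_data_mb:.1f} MB")
print(f"share of filesystem: {pg_share_of_disk:.2f}%")
print(f"share of used space: {pg_share_of_used:.2f}%")
```

Both shares round to 0.1 percent, which matches the figure Tom Lane gives elsewhere in the thread and leaves essentially all of the used space to be explained by something other than the database.
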
On Fri, 9 Jan 2004, scott.marlowe wrote:

> On Fri, 9 Jan 2004, Aurangzeb M. Agha wrote:
>
> > Right! Thus my quandary.
> >
> > Re inodes, how can I check this? But why would this be? Is Postgres
> > sucking up inodes just sitting there as a read-only DB?
>
> If you are out of inodes, I seriously doubt it is Postgresql's fault. As
> you seem to be running everything on the root partition here, it's more
> likely that some process other than postgresql is using all the inodes.
> Basically, when you make a lot of small files you can run out of inodes.

And a common culprit is whatever is being used for usenet
caching/serving...or ordinary mail which is just accumulating in /var/mail
(or wherever).

> Since postgresql tends to make a few rather large files, it's usually not
> a concern.
>
> df -i shows inode usage.
>
> On Linux, you can change the % reserved for root to 1% with tune2fs:
>
> tune2fs -m 1

--
Nigel J. Andrews

On Sat, 10 Jan 2004, Nigel J. Andrews wrote:

> And a common culprit is whatever is being used for usenet caching/serving...or
> ordinary mail which is just accumulating in /var/mail (or wherever).

Sheesh. Did I really put ordinary mailbox mail in the uses-up-inodes
category? I should be taken out and whi....errrr...on the other hand better
not, might be too exciting for some and spark off a whole new xxx web site.

Nigel Andrews

On Sat, Jan 10, 2004 at 00:38:43 +0000,
  "Nigel J. Andrews" <nandrews@investsystems.co.uk> wrote:
>
> On Sat, 10 Jan 2004, Nigel J. Andrews wrote:
> > And a common culprit is whatever is being used for usenet caching/serving...or
> > ordinary mail which is just accumulating in /var/mail (or wherever).
>
> Sheesh. Did I really put ordinary mailbox mail in the uses-up-inodes category?
> I should be taken out and whi....errrr...on the other hand better not, might be too
> exciting for some and spark off a whole new xxx web site.

While mbox mailboxes only take one inode per mailbox, maildir mailboxes
take one inode per message. So if you are using maildir you could
potentially use a significant number of inodes for email.

nandrews@investsystems.co.uk ("Nigel J. Andrews") writes:
> On Sat, 10 Jan 2004, Nigel J. Andrews wrote:
>> And a common culprit is whatever is being used for usenet caching/serving...or
>> ordinary mail which is just accumulating in /var/mail (or wherever).
>
> Sheesh. Did I really put ordinary mailbox mail in the uses-up-inodes category?
> I should be taken out and whi....errrr...on the other hand better not, might be too
> exciting for some and spark off a whole new xxx web site.

Mail accumulating in "mbox" spools shouldn't chew up inodes too badly,
but if you're using "Maildir" to spool mail, whether incoming or
outgoing, it sure can...
--
let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

On Fri, 9 Jan 2004, Tom Lane wrote:

:"scott.marlowe" <scott.marlowe@ihs.com> writes:
:>> [postgres - DB]$ df -m .
:>> Filesystem           1M-blocks      Used Available Use% Mounted on
:>> -                        63328     55308      4803  93% /
:
:> Do you have root access to it? if so, set the reserved space for root to
:> be 0%, and then try vacuuming. Vacuuming requires some free space, and
:> since you're pretty much out, it isn't gonna be able to complete.
:
:Look again --- it's showing free space in MB not KB. He's got 4.8GB
:free. (Although that might be free-from-root's-point-of-view, rather
:than what an unprivileged user can use ...)

Tom -- You're right here. This account is running on a virtual server, so
the 4.8GB free is not for this user.

Re i-nodes:

[admin - temp]$ df -i .
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
-                    8241152 1819166 6421986   23% /

However, I did just get word from the ISP that they had some sort of log
rotation error which was keeping logs from being deleted off the machine,
taking up a lot of space (for this user account).

So the 93% is apparently not a good representation of the disk usage, as
it's not for this specific user account.

Rgs,

Zeb

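As a cross-check, the two df outputs posted in this thread are internally consistent. The sketch below uses only those figures; reading the unaccounted-for space as a root reserve is an assumption (it happens to sit near the common 5% default on ext2-style filesystems), and the rounding rule assumes df rounds percentages up, as GNU df does to my knowledge. Variable names are illustrative.

```python
import math

# df -m figures from the thread, in MB.
total_mb, used_mb, avail_mb = 63328, 55308, 4803
# df -i figures from the thread.
inodes, iused, ifree = 8241152, 1819166, 6421986

# Space that is neither used nor available to unprivileged users.
# Assumption: this is the root-reserved portion of the filesystem.
unaccounted_mb = total_mb - used_mb - avail_mb
unaccounted_pct = 100.0 * unaccounted_mb / total_mb

# Percentages rounded up, which reproduces the Use% and IUse% columns.
use_pct = math.ceil(100.0 * used_mb / (used_mb + avail_mb))
iuse_pct = math.ceil(100.0 * iused / inodes)

print(unaccounted_mb, round(unaccounted_pct, 2), use_pct, iuse_pct)
```

The figures describe the whole filesystem, not one account on a shared virtual server, which is consistent with the conclusion above that 93% says little about this user's own allowance; at 23% inode use, filesystem-wide inode exhaustion also looks unlikely.
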