Re: Database Disappeared - Mailing list pgsql-general

From Roberts, Adam
Subject Re: Database Disappeared
Date
Msg-id 1098229321.3160.4.camel@gunther
Whole thread Raw
In response to Database Disappeared  ("Roberts, Adam" <Adam.Roberts@echostar.com>)
Responses Re: Database Disappeared  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Tom,
Thanks so much for your reply.  In fact we were not vacuuming this
particular db at all! While we have other postgres db's on this same
host, this was a "recently" added db and it was never added to the
individual db list to be vacuumed.  We have since modified the cronjob
to perform a "vacuumdb -a -z" to avoid such an omission in the future.

However, I would like to make doubly sure that the behavior we saw is
explainable by a trans id wraparound failure and that we don't have
something else lurking.  We originally discovered the problem last
Friday because the nightly pg_dumpall did not export any rows from any
of the tables associated with this instance. From our failure in May, we
learned to monitor this backup for problems.  Also, when connecting to
the instance, (either by the OpenACS application or via psql) selecting,
add/del/modifying data from tables in the db all acted completely
normal.  But, if we attempted to execute a "\d" or "\df" from psql
connected either as postgres or as the db owner (nsadmin), we would get
a "No relations found." error.  So, in order to preserve the data as
much as possible, we actually performed a "copy to..." stmt for each
individual table based on our known list of tables so that we could
re-import the data back into a completely rebuilt instance (we dropped
and recreated the instance in order to restore all functionality).  The
behavior remained consistent even after bouncing the postmaster and
aolserver.

So, my main question is, is it reasonable to say that a trans id
wraparound failure could create a situation in which you could
use/manipulate user data tables if you refer to the data tables directly
but if you tried to use a util (such as pgdump) or an internal psql
query like \d or \df it would appear that you have no data tables?

Thanks again for your help - we are reviewing all routine maint tasks
thoroughly for our environment.


On Mon, 2004-10-18 at 16:23, Tom Lane wrote:
> "Roberts, Adam" <Adam.Roberts@echostar.com> writes:
> > The following is a link to the last post we had with this same
problem:
> > http://openacs.org/forums/message-view?message_id=185837
>
> Given the comment about 500M pg_clog, I would venture that you aren't
> doing adequate vacuuming and have therefore run into a transaction ID
> wraparound failure.  500Mb clog corresponds to 2 billion transactions
> which is right about where things would blow up if you'd not done the
> necessary database-wide vacuums; and the fact that clog hasn't gotten
> truncated is pretty strong evidence that you didn't.
>
> Please read the administrator's guide:
> http://www.postgresql.org/docs/7.4/static/maintenance.html
> (adjust for your PG version as needed)
>
>                       regards, tom lane
>

pgsql-general by date:

Previous
From: pw
Date:
Subject: Re: How do I recover from>> pg_xlog/0000000000000000 (log
Next
From: Alvaro Herrera
Date:
Subject: Re: How do I recover from>> pg_xlog/0000000000000000 (log