Thread: select failure

select failure

From
"Ian Cass"
Date:
Ideas where I should look for the answer to this problem & how I should fix
it?

observer=# select host, current_route, count(current_route)
observer-# from messages_200205
observer-# group by host, current_route;
FATAL 2:  open of /var/lib/postgres/data/pg_clog/0534 failed: No such file
or directory
server closed the connection unexpectedly       This probably means the server terminated abnormally       before or
whileprocessing the request.
 
The connection to the server was lost. Attempting reset: NOTICE:  Message
from PostgreSQL backend:       The Postmaster has informed me that some other backend       died abnormally and
possiblycorrupted shared memory.       I have rolled back the current transaction and am       going to terminate your
databasesystem connection and exit.       Please reconnect to the database system and repeat your query.
 
Failed.

--
Ian Cass



Re: select failure

From
Gunther Schadow
Date:
Ian Cass wrote:

> Ideas where I should look for the answer to this problem & how I should fix
> it?
> 
> observer=# select host, current_route, count(current_route)
> observer-# from messages_200205
> observer-# group by host, current_route;
> FATAL 2:  open of /var/lib/postgres/data/pg_clog/0534 failed: No such file
> or directory


OUCH! that looks quite bad. It's a postgresql internal error.
What version of PostgreSQL do you run? Is this some beta version?
Or do you have someone deleting files randomly?

regards,
-Gunther



-- 
Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
Medical Information Scientist      Regenstrief Institute for Health Care
Adjunct Assistant Professor        Indiana University School of Medicine
tel:1(317)630-7960                         http://aurora.regenstrief.org




Re: select failure

From
"Ian Cass"
Date:
Debian apt-get installed 7.2.1. No I don't have ppl deleting files.

What do I need to do to fix it?

Worst case, I can probably remove the table and rebuild it. Although it
contains about 27million rows and will take 2 or 3 days to rebuild.

Thing is, how did it happen and is it likely to happen again?

--
Ian Cass

----- Original Message -----
From: "Gunther Schadow" <gunther@aurora.regenstrief.org>
To: "Ian Cass" <ian.cass@mblox.com>
Cc: <pgsql-sql@postgresql.org>
Sent: Sunday, June 09, 2002 7:10 PM
Subject: Re: [SQL] select failure


> Ian Cass wrote:
>
> > Ideas where I should look for the answer to this problem & how I should
fix
> > it?
> >
> > observer=# select host, current_route, count(current_route)
> > observer-# from messages_200205
> > observer-# group by host, current_route;
> > FATAL 2:  open of /var/lib/postgres/data/pg_clog/0534 failed: No such
file
> > or directory
>
>
> OUCH! that looks quite bad. It's a postgresql internal error.
> What version of PostgreSQL do you run? Is this some beta version?
> Or do you have someone deleting files randomly?
>
> regards,
> -Gunther
>
>
>
> --
> Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
> Medical Information Scientist      Regenstrief Institute for Health Care
> Adjunct Assistant Professor        Indiana University School of Medicine
> tel:1(317)630-7960                         http://aurora.regenstrief.org
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>



Re: select failure

From
Tom Lane
Date:
"Ian Cass" <ian.cass@mblox.com> writes:
> FATAL 2:  open of /var/lib/postgres/data/pg_clog/0534 failed: No such file
> or directory

We have seen a couple reports like this one, eg
http://archives.postgresql.org/pgsql-general/2002-03/msg00179.php
http://archives.postgresql.org/pgsql-novice/2002-06/msg00027.php

It could be that all of you have been bitten by hardware problems
(viz, dropped bits in xmin/xmax fields), or there might be some
software bug that's leading to this failure.  So far, no one's
been very cooperative about digging into a failure case to try to
determine exactly what happened...
        regards, tom lane


Re: select failure

From
Gunther Schadow
Date:
Ian Cass wrote:

> Debian apt-get installed 7.2.1. No I don't have ppl deleting files.
> 
> What do I need to do to fix it?
> 
> Worst case, I can probably remove the table and rebuild it. Although it
> contains about 27million rows and will take 2 or 3 days to rebuild.
> 
> Thing is, how did it happen and is it likely to happen again?


I don't know. But something is really in very bad shape. I suggest
you backtrack one minor version to 7.2.

Rebuilding the entire thing is what I would do. However, I don't
really undestand what that pg_clog subdirectory is. In 2.7 that
I use I recall that all data is in the data/base directory just
as it was since time began. If I ever had a file vanish from there
I would definitely completely rebuild the database (from backups
if any or from scratch.)

Rebuilding 27million rows shouldn't take 2 or 3 days. Are you
sure you do a COPY ... FROM instead of individual INSERTs? Also
you may consider turning off indexes and deferring foreign key
triggers. However, I don't need to do this to load a similar
sized database overnight. YMMV, depends on machine speed too,
of course.

Good luck, and beware, I'm not really the pgsql geek who really
understands your error. I just use common sense, so, don't take
my word for gold w/r/t the recommendation to backtrack version and
rebuild.

regards,
-Gunther


> 
> --
> Ian Cass
> 
> ----- Original Message -----
> From: "Gunther Schadow" <gunther@aurora.regenstrief.org>
> To: "Ian Cass" <ian.cass@mblox.com>
> Cc: <pgsql-sql@postgresql.org>
> Sent: Sunday, June 09, 2002 7:10 PM
> Subject: Re: [SQL] select failure
> 
> 
> 
>>Ian Cass wrote:
>>
>>
>>>Ideas where I should look for the answer to this problem & how I should
>>>
> fix
> 
>>>it?
>>>
>>>observer=# select host, current_route, count(current_route)
>>>observer-# from messages_200205
>>>observer-# group by host, current_route;
>>>FATAL 2:  open of /var/lib/postgres/data/pg_clog/0534 failed: No such
>>>
> file
> 
>>>or directory
>>>
>>
>>OUCH! that looks quite bad. It's a postgresql internal error.
>>What version of PostgreSQL do you run? Is this some beta version?
>>Or do you have someone deleting files randomly?
>>
>>regards,
>>-Gunther
>>
>>
>>
>>--
>>Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
>>Medical Information Scientist      Regenstrief Institute for Health Care
>>Adjunct Assistant Professor        Indiana University School of Medicine
>>tel:1(317)630-7960                         http://aurora.regenstrief.org
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 2: you can get off all lists at once with the unregister command
>>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>>
>>


-- 
Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
Medical Information Scientist      Regenstrief Institute for Health Care
Adjunct Assistant Professor        Indiana University School of Medicine
tel:1(317)630-7960                         http://aurora.regenstrief.org




Re: select failure

From
"Ian Cass"
Date:
> Rebuilding 27million rows shouldn't take 2 or 3 days. Are you
> sure you do a COPY ... FROM instead of individual INSERTs?

I need to refeed the data from application log files. They may or may not
have duplicate rows. The only way I can filter these it to feed them all in
parallel using Perl & DBI into a unique index. That's what takes the time.

Do you suggest I rebuild the entire database or just the affected table? I
have approx 400million rows in the entire database.

--
Ian Cass


>Also
> you may consider turning off indexes and deferring foreign key
> triggers. However, I don't need to do this to load a similar
> sized database overnight. YMMV, depends on machine speed too,
> of course.
>
> Good luck, and beware, I'm not really the pgsql geek who really
> understands your error. I just use common sense, so, don't take
> my word for gold w/r/t the recommendation to backtrack version and
> rebuild.
>
> regards,
> -Gunther
>
>
> >
> > --
> > Ian Cass
> >
> > ----- Original Message -----
> > From: "Gunther Schadow" <gunther@aurora.regenstrief.org>
> > To: "Ian Cass" <ian.cass@mblox.com>
> > Cc: <pgsql-sql@postgresql.org>
> > Sent: Sunday, June 09, 2002 7:10 PM
> > Subject: Re: [SQL] select failure
> >
> >
> >
> >>Ian Cass wrote:
> >>
> >>
> >>>Ideas where I should look for the answer to this problem & how I should
> >>>
> > fix
> >
> >>>it?
> >>>
> >>>observer=# select host, current_route, count(current_route)
> >>>observer-# from messages_200205
> >>>observer-# group by host, current_route;
> >>>FATAL 2:  open of /var/lib/postgres/data/pg_clog/0534 failed: No such
> >>>
> > file
> >
> >>>or directory
> >>>
> >>
> >>OUCH! that looks quite bad. It's a postgresql internal error.
> >>What version of PostgreSQL do you run? Is this some beta version?
> >>Or do you have someone deleting files randomly?
> >>
> >>regards,
> >>-Gunther
> >>
> >>
> >>
> >>--
> >>Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
> >>Medical Information Scientist      Regenstrief Institute for Health Care
> >>Adjunct Assistant Professor        Indiana University School of Medicine
> >>tel:1(317)630-7960                         http://aurora.regenstrief.org
> >>
> >>
> >>
> >>---------------------------(end of broadcast)---------------------------
> >>TIP 2: you can get off all lists at once with the unregister command
> >>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> >>
> >>
>
>
> --
> Gunther Schadow, M.D., Ph.D.                    gschadow@regenstrief.org
> Medical Information Scientist      Regenstrief Institute for Health Care
> Adjunct Assistant Professor        Indiana University School of Medicine
> tel:1(317)630-7960                         http://aurora.regenstrief.org
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>