Thread: Loosing files after backend crash

Loosing files after backend crash

From
pgsql-bugs@postgresql.org
Date:
Konstantin Solodovnikov (ks@tcnet.ru) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
Loosing files after backend crash

Long Description
After a backend crash the file for my sequence cannot be found.

I use PostgreSQL 7.1RC1 compiled by GCC 2.95.2 on FreeBSD 4.2-STABLE and that's what I encountered.

During execution of some sql code backend crashed (the reason for it's crashing is not the actual topic of this
report).
I successfully re-run psql without restarting the postmaster.
And tried to delete the sequence, which i created in the previous session.
That's what pgsql told me:
---8<---
Netflow_Test=# drop SEQUENCE seq_i___data_buffer ;
ERROR:  cannot open seq_i___data_buffer: No such file or directory
---8<---
I examined the system tables (ps_class) and found the tuple, describing my sequence.
The "relfilenode" attribute of the tuple contained the value "326210". I tried to find the file with that name in the
databasedirectory - it really was not there. 
Thus, the backend crash has led not only to impossibility to complete my task, but also to loosing some data. The
sequencewas actually created in an already committed transaction, not in the one, that crashed. 
Another problem is that I cannot fix the situation (drop that sequence) by standard PGSQL means.
I had to create an empty file with that name ("326210") for "DROP SEQUENCE" command to successfully complete.

As i guess, the situation is not reprodusable. Backend crashed many times before w/o such problems.

A little additional question/proposition: can there be any tool, which could find and resolve the errors in the system
tables?

Best Regards,
Konstantin Solodovnikov.

Sample Code


No file was uploaded with this report

Re: Loosing files after backend crash

From
Tom Lane
Date:
pgsql-bugs@postgresql.org writes:
> During execution of some sql code backend crashed (the reason for it's
> crashing is not the actual topic of this report).
> And tried to delete the sequence, which i created in the previous session.
> That's what pgsql told me:
> ---8<---
> Netflow_Test=# drop SEQUENCE seq_i___data_buffer ;
> ERROR:  cannot open seq_i___data_buffer: No such file or directory

Was that a system-wide crash, or just a backend crash?  It seems
to me that your kernel must have lost that file for you; Postgres
wouldn't have gone out and deleted it.

> Another problem is that I cannot fix the situation (drop that
> sequence) by standard PGSQL means.  I had to create an empty file with
> that name ("326210") for "DROP SEQUENCE" command to successfully
> complete.

This used to work pre-7.1, and it seems to me it should still work;
relcache.c is being inappropriately paranoid about finding the file.
I have corrected that part of the problem.

            regards, tom lane

RE: Loosing files after backend crash

From
Солодовников Константин
Date:

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Tuesday, April 03, 2001 3:35 AM
> To: ks@tcnet.ru; pgsql-bugs@postgresql.org
> Subject: Re: [BUGS] Loosing files after backend crash
>
>
> pgsql-bugs@postgresql.org writes:
> > During execution of some sql code backend crashed (the
> reason for it's
> > crashing is not the actual topic of this report).
> > And tried to delete the sequence, which i created in the
> previous session.
> > That's what pgsql told me:
> > ---8<---
> > Netflow_Test=# drop SEQUENCE seq_i___data_buffer ;
> > ERROR:  cannot open seq_i___data_buffer: No such file or directory
>
> Was that a system-wide crash, or just a backend crash?
It was just a backend crash.

> It seems to me that your kernel must have lost that file for you; Postgres
> wouldn't have gone out and deleted it.
How could kernel _loose_ the file, which was already created?
I would agree with the possibility of loosing some data in the file during
the backend crash.
But loosing the whole file ...
The sequence was previously created in an already committed transaction.
Correct me if I am wrong, but after I said "commit" the file would be on my
disk.
If so, how can backend's crash "erase" the file from disk?

BTW, I have some more facts to report.
Just about half an hour ago the backend crashed one more time in the same
situation.
Now it has not only lost the file for my sequence (the same very sequence),
but it also has TWO tuples in the pg_class table with the name of my
sequence.
All other attribites of the tuples also hold identical values.
And PGSQL doesn't consider it to be a sequence.
When I say 'DROP SEQUENCE ...' it says that it is not a sequence and advises
me to use 'DROP TABLE'.
When I say 'DROP TABLE ...' it says something like 'cannot drop system table
pg_temp.XXXXX.X' (I changed the actial digits into 'X'-es. Not sure about
exact words.)
The "relkind" attribute of the corresponding two tuples in the pg_class
table hold letter 'S' (The uppercase 's').

Regards,
Konstantin Solodovnikov.


Re: Loosing files after backend crash

From
Tom Lane
Date:
Солодовников Константин <ks@tcnet.ru> writes:
>> It seems to me that your kernel must have lost that file for you; Postgres
>> wouldn't have gone out and deleted it.
> How could kernel _loose_ the file, which was already created?
> I would agree with the possibility of loosing some data in the file during
> the backend crash.
> But loosing the whole file ...
> The sequence was previously created in an already committed transaction.
> Correct me if I am wrong, but after I said "commit" the file would be on my
> disk.

Sooner than that: a create call is issued to the kernel as soon as you
say CREATE SEQUENCE.

> If so, how can backend's crash "erase" the file from disk?

I'd like to know that too.

> BTW, I have some more facts to report.
> Just about half an hour ago the backend crashed one more time in the same
> situation.

Backtrace from core file, please?

> Now it has not only lost the file for my sequence (the same very sequence),
> but it also has TWO tuples in the pg_class table with the name of my
> sequence.
> All other attribites of the tuples also hold identical values.
> And PGSQL doesn't consider it to be a sequence.

Could we see the exact contents of pg_class for the sequence?  (both
tuples including OIDs)

> When I say 'DROP SEQUENCE ...' it says that it is not a sequence and advises
> me to use 'DROP TABLE'.
> When I say 'DROP TABLE ...' it says something like 'cannot drop system table
> pg_temp.XXXXX.X' (I changed the actial digits into 'X'-es. Not sure about
> exact words.)

Hm, that sounds like some sort of conflict with a temp table.  Is it
possible that you have been using a temp table name that matches the
sequence name?  Are there any pg_class entries whose names begin with
pg_temp, and if so could we see details on those too?
        regards, tom lane


RE: Loosing files after backend crash

From
"Mikheev, Vadim"
Date:
> The "relfilenode" attribute of the tuple contained the value
> "326210". I tried to find the file with that name in the
> database directory - it really was not there.

Did you look in lost+found?

> Thus, the backend crash has led not only to impossibility to
> complete my task, but also to loosing some data. The sequence
> was actually created in an already committed transaction, not
> in the one, that crashed.

How much time passed after sequence creation till crash?
Did you use -F (fsync = NO)?

Vadim