Thread: Transaction, Rollback and Database Corruption question,

Transaction, Rollback and Database Corruption question,

From
Joe Kislo
Date:
I've never used PGSQL in a production environment, so forgive me if I'm
missing some of the common nuances assosciated with PGSQL.  I have
burrowed through the dox, but havn't found anything which goes into the
detail I need.

I was trying to import a database into PGSQL, I have a file which
contains a long series of create table and insert clauses.  I was using
the command:
isql DB_NAME < FILENAME

At the top of the import file there is a BEGIN;

and at the bottom there is a commit.  This should make this entire
import a single transaction.

During this process I accidentially hit control-C, canceling the whole
shabang.  The transaction obviously was never commited, and since the
connection to the database was terminated, a rollback should have
occoured.

I tried to restart, but errors started flying every which way.
Investigating using psql yielded illogical results:


TT_fcgi=# create table answer (i int);
ERROR:  cannot create answer
ERROR:  cannot create answer
TT_fcgi=# \d
No relations found.
TT_fcgi=# drop table answer;
ERROR:  Relation 'answer' does not exist
ERROR:  Relation 'answer' does not exist
TT_fcgi=# create table answer (i int);
ERROR:  cannot create answer
ERROR:  cannot create answer

PG gives no usuable error message as to why the table cannot be created,
and I'm simply nolonger able to create a table named "answer".

I terminated the PGserver, and restarted it.  Obviously that would
-force- a rollback of any inprogress transactions.  Still, same results,
my database is hosed.

From my experiences with Interbase this should simply never happen.

Here's the obvious questions stemming from this problem:

Why was my transaction never rolled back?
How do you determine what the outstanding transactions are?
How do you remotely roll back pending transactions?
How do you determine who is currently connected to a database?

 What happens in this case of two transactions:

    T1: begin
    T2: begin
    T1: Create table x with schema y
    T2: create table x with schema z
    T1: insert a into x
    T2: insert b into x
    T2: rollback
    T1: commit

Does the existance of a table in one transaction context cross over to
other transactions?
How do I query my PG server and determine the version?  The PGSQL client
version reports 7.0.2; but in the future I would like to query the
server and not some client application assosciated with PG.

FYI, deleting the answer file in the db directory seemed to alleviate my
problem. But obviously that is highly highly highly unacceptable.

Thanks,

-Joe

Re: Transaction, Rollback and Database Corruption question,

From
Tom Lane
Date:
Joe Kislo <kislo@athenium.com> writes:
> At the top of the import file there is a BEGIN;
> and at the bottom there is a commit.  This should make this entire
> import a single transaction.

> During this process I accidentially hit control-C, canceling the whole
> shabang.  The transaction obviously was never commited, and since the
> connection to the database was terminated, a rollback should have
> occoured.

Hm.  There are some longstanding problems in 7.0.* and before if you try
to roll back a DROP TABLE (finally fixed for 7.1, but that doesn't help
you now).  I'm not sure if that explains your results, however.  What
is the structure of DROP and CREATE commands in your script?

> TT_fcgi=# create table answer (i int);
> ERROR:  cannot create answer
> ERROR:  cannot create answer
> TT_fcgi=# \d
> No relations found.
> TT_fcgi=# drop table answer;
> ERROR:  Relation 'answer' does not exist
> ERROR:  Relation 'answer' does not exist

This looks a lot like there is a physical file
$PGDATA/base/TT_fcgi/answer, but no pg_class entry for "answer".
I'm not sure how you could've gotten to that state based on what
you said above, but you should be able to restore the database to
usability by reaching in and deleting the orphaned file.

> PG gives no usuable error message as to why the table cannot be created,

7.0.3 fixes the lack of kernel error messages when reporting failure
to create/delete a physical file.

>  What happens in this case of two transactions:

>     T1: begin
>     T2: begin
>     T1: Create table x with schema y
>     T2: create table x with schema z

T2 will wait to see if T1 commits, and if so fail with a duplicate-key
error...

> How do I query my PG server and determine the version?

select version();

            regards, tom lane

Re: Transaction, Rollback and Database Corruption question,

From
Tom Lane
Date:
Joe Kislo <kislo@athenium.com> writes:
>>>> FATAL 1:  my bits moved right off the end of the world!
>>>> Recreate index pg_attribute_relid_attnam_index.
>>
>> We've heard sporadic reports of this, but I've never been able to
>> reproduce it myself.  If you can submit a reproducible test script,
>> it'd be a big help.

>     Oh.. it's repeatable :)  I will upgrade to 7.0.3 tomorrow at work, and
> see if it still happens.. I had been assuming it had something to do
> with the same problem described at the top (so it may be fixed by
> 7.0.3).

Probably it's not fixed.  My suspicion is that there's some corner-case
bug lurking in the btree index routines.  With a reproducible test case,
we could find and squash it... but I've been over that code with a
flinty eye, and I haven't found it yet.

            regards, tom lane

Re: Transaction, Rollback and Database Corruption question,

From
Tom Lane
Date:
Joe Kislo <kislo@athenium.com> writes:
>     Well, I never issue a drop table.  I issue a CREATE TABLE, I issue some
> inserts.. then I TERMINATE the database connection.  Implicit rollback
> should uncreate the table and leave everything consistent.

Oh, wait, you're running 7.0.2 aren't you?  Update to 7.0.3, it'll get
better.  7.0.2 forgets to run the transaction-abort cleanup routine if
the client disconnects mid-transaction.  The database itself is fine,
but the action of deleting the no-longer-referenced table files doesn't
get done.

> If I terminate a large set of create
> tables and inserts in the middle.. and I try again, I get this error
> spewing from the server:

> FATAL 1:  my bits moved right off the end of the world!
>     Recreate index pg_attribute_relid_attnam_index.

We've heard sporadic reports of this, but I've never been able to
reproduce it myself.  If you can submit a reproducible test script,
it'd be a big help.

> Oh, one more thing.  How do I, using SQL, determine if a table exists?
> EG, in Mysql it's something like:
> show tables like 'MyTable';

> In Interbase you select it out of the RDBS$RELATIONS table...

In PG you look for its entry in pg_class.  What, you wanted a standard ;-) ?

            regards, tom lane