Re: Why copy_relation_data only use wal when WAL archiving is enabled - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Why copy_relation_data only use wal when WAL archiving is enabled
Msg-id 7757.1192640333@sss.pgh.pa.us
In response to Re: Why copy_relation_data only use wal when WAL archiving is enabled  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: Why copy_relation_data only use wal when WAL archiving is enabled
List pgsql-hackers
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> I don't think you still quite understand what's happening. GetNewOid()
> is not interesting here, look at GetNewRelFileNode() instead. And
> neither are snapshots or MVCC visibility rules.

Simon has a legitimate objection; not that there's no bug, but that the
probability of getting bitten is exceedingly small.  The test script you
showed cheats six-ways-from-Sunday to cause an OID collision that would
never happen in practice.  The only case where it would really happen
is if a table that has existed for a long time (~ 2^32 OID creations)
gets dropped and then you're unlucky enough to recycle that exact OID
before the next checkpoint --- and then crash before the checkpoint.

I think we should think about ways to fix this, but I don't feel a need
to try to backpatch a solution.

I tend to agree that truncating the file, and extending the fsync
request mechanism to actually delete it after the next checkpoint,
is the most reasonable route to a fix.
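
To make the idea concrete, here is a toy sketch in Python; it is not PostgreSQL code. It assumes that a dropped relation's file is truncated at once but unlinked only at the next checkpoint, so that its name stays reserved in between. The names PendingUnlinks, drop_relation, name_is_free, checkpoint and demo are hypothetical, and name_is_free is only a stand-in for whatever existence check the relfilenode allocator performs.

```python
import os
import tempfile


class PendingUnlinks:
    """Toy model of deferring file deletion until the next checkpoint."""

    def __init__(self):
        # Paths that have been truncated but not yet unlinked.
        self.pending = []

    def drop_relation(self, path):
        # Release the disk space right away, but keep the empty file so
        # that its name cannot be handed out again yet.
        with open(path, "r+b") as f:
            f.truncate(0)
        self.pending.append(path)

    def name_is_free(self, path):
        # Hypothetical stand-in for the allocator's "is this name in use?" check.
        return not os.path.exists(path)

    def checkpoint(self):
        # Once the checkpoint is done, the queued files can really go away.
        for path in self.pending:
            if os.path.exists(path):
                os.unlink(path)
        self.pending.clear()


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "16384")  # made-up relfilenode name
        with open(path, "wb") as f:
            f.write(b"x" * 8192)

        queue = PendingUnlinks()
        queue.drop_relation(path)
        size_after_drop = os.path.getsize(path)
        free_before_checkpoint = queue.name_is_free(path)

        queue.checkpoint()
        free_after_checkpoint = queue.name_is_free(path)
        return size_after_drop, free_before_checkpoint, free_after_checkpoint
```

In this sketch the file shrinks to zero bytes as soon as it is dropped, yet its name only becomes reusable after checkpoint() has run, which is the window the collision scenario depends on.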

I think the objection about leaking files on crash is wrong. We'd
have the replay of the deletion to fix things up --- it could probably
delete the file immediately, and if not could certainly put it back
on the fsync request queue.
        regards, tom lane

