Heikki Linnakangas <heikki@enterprisedb.com> writes:
> I don't think you still quite understand what's happening. GetNewOid()
> is not interesting here, look at GetNewRelFileNode() instead. And
> neither are snapshots or MVCC visibility rules.

Simon has a legitimate objection; not that there's no bug, but that the
probability of getting bitten is exceedingly small. The test script you
showed cheats six-ways-from-Sunday to cause an OID collision that would
never happen in practice. The only case where it would really happen
is if a table that has existed for a long time (~ 2^32 OID creations)
gets dropped and then you're unlucky enough to recycle that exact OID
before the next checkpoint --- and then crash before the checkpoint.

I think we should think about ways to fix this, but I don't feel a need
to try to backpatch a solution.

I tend to agree that truncating the file, and extending the fsync
request mechanism to actually delete it after the next checkpoint,
is the most reasonable route to a fix.

I think the objection about leaking files on crash is wrong. We'd
have the replay of the deletion to fix things up --- it could probably
delete the file immediately, and if not could certainly put it back
on the fsync request queue.
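
To make the truncate-now, unlink-after-checkpoint idea concrete, here is
a rough in-memory sketch. It is not backend code: RelFile,
drop_relation(), relfilenode_is_free() and checkpoint() are hypothetical
names invented for illustration, and it assumes the relfilenode
allocator refuses any name for which a file still exists.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 16

/* Hypothetical stand-in for one relation file on disk. */
typedef struct
{
    unsigned    relfilenode;    /* file name, i.e. the recycled number */
    size_t      length;         /* bytes currently in the file */
    bool        exists;         /* is the directory entry still there? */
} RelFile;

/* Assumed extension of the fsync request queue: files to unlink later. */
static RelFile *pending_unlinks[MAX_PENDING];
static int      n_pending = 0;

/*
 * On drop: give the space back by truncating, but leave the empty file
 * in place so its name cannot be reused yet.  Queue the real unlink.
 */
bool
drop_relation(RelFile *f)
{
    if (n_pending >= MAX_PENDING)
        return false;           /* queue full; caller must cope */
    f->length = 0;
    pending_unlinks[n_pending++] = f;
    return true;
}

/* Assumption: a new relfilenode is usable only if no such file exists. */
bool
relfilenode_is_free(const RelFile *files, int nfiles, unsigned relfilenode)
{
    for (int i = 0; i < nfiles; i++)
    {
        if (files[i].exists && files[i].relfilenode == relfilenode)
            return false;
    }
    return true;
}

/* At checkpoint: the old file can no longer matter to replay; unlink. */
int
checkpoint(void)
{
    int         removed = n_pending;

    for (int i = 0; i < n_pending; i++)
        pending_unlinks[i]->exists = false;
    n_pending = 0;
    return removed;
}
```

In this model the dropped file's number stays unavailable until
checkpoint() has run, which is the window the collision needs. Replay of
the deletion after a crash could presumably go through the same
drop_relation() path, or unlink at once.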
regards, tom lane