Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Florian suggested a scheme where the xid and epoch are embedded in the
>> filename, but that's unnecessarily complex. We could just make
>> relfilenode a 64-bit integer. 2^64 should be enough for everyone.
>
> Doesn't fix the problem unless DB and TS OIDs become int64 too;

Remember what the original scenario was:

1. DROP TABLE foo;
2. BEGIN; CREATE TABLE bar; COPY bar FROM ...; COMMIT -- bar gets the same
   relfilenode as "foo", by chance
3. <crash>
4. WAL replay of "DROP TABLE foo" deletes the data inserted at step 2.
   Because the table was created in the same transaction, we skipped
   WAL-logging the inserts, and the data is lost.

Even if we can get a collision in DB or tablespace OIDs, we can't get a
collision at step 2 if we don't reuse relfilenodes.
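
To make the four steps concrete, here is a toy model in C. It is only a sketch, not PostgreSQL code: ToyDisk, toy_create, toy_replay_drop and toy_rows are made-up names, the "disk" is a small array keyed by relfilenode, and replay is reduced to the single DROP record.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FILES 8

/* Toy "disk": one slot per relation file, keyed by relfilenode. */
typedef struct ToyFile
{
    bool     exists;
    unsigned relfilenode;
    int      nrows;          /* rows loaded WITHOUT being WAL-logged */
} ToyFile;

typedef struct ToyDisk
{
    ToyFile files[TOY_MAX_FILES];
} ToyDisk;

/* Find the live file with this relfilenode, or NULL if there is none. */
ToyFile *
toy_lookup(ToyDisk *d, unsigned relfilenode)
{
    for (size_t i = 0; i < TOY_MAX_FILES; i++)
        if (d->files[i].exists && d->files[i].relfilenode == relfilenode)
            return &d->files[i];
    return NULL;
}

/* CREATE TABLE + COPY in one transaction: the file gets its rows directly. */
bool
toy_create(ToyDisk *d, unsigned relfilenode, int nrows)
{
    for (size_t i = 0; i < TOY_MAX_FILES; i++)
    {
        if (!d->files[i].exists)
        {
            d->files[i].exists = true;
            d->files[i].relfilenode = relfilenode;
            d->files[i].nrows = nrows;
            return true;
        }
    }
    return false;            /* toy disk is full */
}

/*
 * Execute or replay a DROP record. The record only names a relfilenode,
 * so it removes whatever file currently carries that number.
 */
void
toy_replay_drop(ToyDisk *d, unsigned relfilenode)
{
    ToyFile *f = toy_lookup(d, relfilenode);

    if (f != NULL)
        f->exists = false;
}

/* Row count of the file, or -1 if the file does not exist. */
int
toy_rows(ToyDisk *d, unsigned relfilenode)
{
    ToyFile *f = toy_lookup(d, relfilenode);

    return (f != NULL) ? f->nrows : -1;
}
```

If "bar" is created with the relfilenode that "foo" just gave up, replaying the DROP record after the crash removes bar's file. If "bar" gets a number that was never used before, the same replay finds nothing to delete.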
> in fact, given that we generate relfilenodes off the OID counter,
> it's difficult to see how you do this without making *all* OIDs
> 64-bit.
By generating them off a new 64-bit counter instead of the OID counter.
Or by extending the OID counter to 64 bits, but only using the lower 32
bits for OIDs.
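
A rough sketch of the second option, assuming a working uint64_t. WideCounter, next_relfilenode and next_oid are hypothetical names, not existing backend functions, and the sketch ignores details such as skipping reserved OIDs after wraparound.

```c
#include <stdint.h>

typedef uint32_t Oid;        /* OIDs stay 32 bits wide */

/* Hypothetical shared counter: 64 bits, never reset. */
typedef struct WideCounter
{
    uint64_t next;
} WideCounter;

/* Relfilenodes take the full 64-bit value, so they do not repeat in practice. */
uint64_t
next_relfilenode(WideCounter *c)
{
    return c->next++;
}

/* OIDs see only the lower 32 bits, so they wrap around as they do today. */
Oid
next_oid(WideCounter *c)
{
    return (Oid) (c->next++ & UINT64_C(0xFFFFFFFF));
}
```

Both allocators draw from the same counter here. Only the 32-bit view wraps, so OID collisions stay possible, but a relfilenode is never handed out twice.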
> Plus you're assuming that the machine has working 64-bit ints.
> There's a large difference in my mind between saying "bigint
> doesn't work right if you don't have working int64" and "we don't
> guarantee the safety of your data if you don't have int64".
> I'm not prepared to rip out the non-collision code until such
> time as we irrevocably refuse to build on machines without int64.
Hmm. We could make it a struct of two int4s if we have to, couldn't we?
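
Something along these lines, as a sketch only. RelFileNode64 and its helper functions are made-up names, and I'm assuming uint32_t from <stdint.h> is available even where a 64-bit integer type is not.

```c
#include <stdint.h>

/* Hypothetical 64-bit relfilenode built from two 32-bit halves. */
typedef struct RelFileNode64
{
    uint32_t hi;             /* upper 32 bits */
    uint32_t lo;             /* lower 32 bits */
} RelFileNode64;

/* Advance to the next value, carrying into the upper half on wraparound. */
void
relfilenode64_advance(RelFileNode64 *n)
{
    n->lo++;
    if (n->lo == 0)          /* lower half wrapped: carry */
        n->hi++;
}

/* Two values are equal only if both halves match. */
int
relfilenode64_equal(RelFileNode64 a, RelFileNode64 b)
{
    return a.hi == b.hi && a.lo == b.lo;
}
```

Increment and comparison are the only operations the counter needs, and both are easy to do with 32-bit arithmetic.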

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com