Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table - Mailing list pgsql-hackers

From Christopher Browne
Subject Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table
Msg-id CAFNqd5UsHVXY8Q=WbrsFFu+pV6F1yZk+rdHd=oOu2m7giOp55w@mail.gmail.com
In response to Re: UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
Last year, I built a PL/pgSQL generator of "version 1-ish" UUIDs, which combined timestamps with local information to emulate the timestamp + MAC address layout of UUID version 1.

Note that there are several versions of UUIDs:

1.  Combines MAC address, timestamp, and a clock sequence (typically seeded with a random #)
2.  DCE Security (replaces some bits with the user's UID/GID and others with the POSIX domain); I don't think this one is much used...
3.  Name-based, MD5 hash
4.  Purely random
5.  Name-based, SHA-1 hash

There are merits to each.  The "tough one" is #1, as that requires pulling data that can't generally be accessed portably.

I figured out (and could probably donate some code) how to construct the bits of #1 using inputs of *my* choice: I made up my own MAC address surrogate, transformed PostgreSQL timestamp values into the UUID timestamp, and threw in my own bit of randomness. That produced well-formed UUIDs with nice enough characteristics.

It wouldn't be "out there" to do a somewhat PostgreSQL-flavoured version of this that wouldn't actually use MAC addresses, but rather, would use data we have:

a) Having a sequence feeding some local uniqueness would fit with the "clock seq" bits (e.g. - the octets in RFC 4122 entitled clock-seq-and-reserved and clock-seq-low)
b) NOW() provides data for time-low, time-mid, time-high-and-version
c) We'd need 6 hex octets for "node"; I seem to recall there being something established by initdb that might be usable.

The only piece that's directly troublesome, for UUID Type 1, is the "node" value.  I'll observe that it isn't unusual for UUID implementations to generate random values for that.
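
As an illustration only, here is a minimal sketch of how inputs (a) through (c) could be packed into the RFC 4122 fields. It is written in Python rather than PL/pgSQL so that it can be checked against the standard `uuid` module, and it is not the generator mentioned earlier in this message. The names `random_node` and `v1ish_uuid` are hypothetical, and the caller is assumed to supply the clock sequence (for example, from a database sequence) and a timezone-aware timestamp.

```python
import secrets
import uuid
from datetime import datetime, timezone

# Number of 100-ns intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01).
GREGORIAN_OFFSET_100NS = 0x01B21DD213814000

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def random_node() -> int:
    """Return a random 48-bit node value (hypothetical helper).

    The multicast bit (lowest bit of the first octet) is set so the value
    cannot collide with a real MAC address, as RFC 4122 recommends.
    """
    return secrets.randbits(48) | (1 << 40)


def v1ish_uuid(now: datetime, clock_seq: int, node: int) -> uuid.UUID:
    """Pack a timestamp, clock sequence and node into a version 1 layout.

    `now` must be timezone-aware; `clock_seq` is truncated to 14 bits.
    """
    delta = now - UNIX_EPOCH
    # Integer arithmetic avoids float rounding in the 100-ns tick count.
    ticks = ((delta.days * 86400 + delta.seconds) * 10_000_000
             + delta.microseconds * 10
             + GREGORIAN_OFFSET_100NS)

    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1

    clock_seq &= 0x3FFF
    clock_seq_low = clock_seq & 0xFF
    clock_seq_hi_variant = (clock_seq >> 8) | 0x80       # RFC 4122 variant

    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Example call: the clock sequence would come from a sequence in practice.
example = v1ish_uuid(datetime.now(timezone.utc), clock_seq=1, node=random_node())
```

Choosing the node once and storing it, rather than drawing a fresh random value on every call, keeps the value stable for one installation, which is what the MAC address provides in a conventional version 1 UUID.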

Note that for the other UUID versions, there's NO non-portable data needed.
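
To make that concrete, the short sketch below generates versions 3, 4 and 5 using only Python's standard `uuid` module, with no MAC address or other host-specific input. The name "postgresql.org" and the choice of the DNS namespace are arbitrary examples, not something proposed in this thread.

```python
import uuid

# Name-based UUIDs are a hash of a namespace UUID plus a name,
# so the same inputs always give the same result.
namespace = uuid.NAMESPACE_DNS
name = "postgresql.org"  # arbitrary example name

v3 = uuid.uuid3(namespace, name)  # MD5-based
v5 = uuid.uuid5(namespace, name)  # SHA-1-based

# Version 4 needs nothing but a source of random bits.
v4 = uuid.uuid4()

for u in (v3, v4, v5):
    print(u.version, u)
```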

It seems to me that a "UUIDserial" type, which combined:
  a) A sequence, to be the 'clock';
  b) Possibly another sequence to store local node ID, which might get seeded from DB internals
would provide a "PostgreSQL-flavoured" version of UUID Type 1.
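
One possible reading of that idea, sketched in Python for illustration: a counter stands in for the PostgreSQL sequence and is written directly into the 60-bit timestamp field, and the node ID is fixed when the generator is created. The class name `UUIDSerial`, its method, and the node value are all hypothetical; no such type exists in PostgreSQL, and how the node would be seeded from DB internals is left open here.

```python
import itertools
import uuid


class UUIDSerial:
    """Hypothetical generator: a sequence plays the role of the clock."""

    def __init__(self, node: int, start: int = 1) -> None:
        self._node = node & 0xFFFFFFFFFFFF      # 48-bit node ID
        self._clock = itertools.count(start)    # stand-in for a DB sequence

    def next_value(self) -> uuid.UUID:
        # The counter fills the 60-bit timestamp field.
        tick = next(self._clock) & ((1 << 60) - 1)
        fields = (
            tick & 0xFFFFFFFF,        # time-low
            (tick >> 32) & 0xFFFF,    # time-mid
            (tick >> 48) & 0x0FFF,    # time-high (version bits added below)
            0,                        # clock-seq-and-reserved
            0,                        # clock-seq-low
            self._node,
        )
        # version=1 makes the constructor set the version and variant bits.
        return uuid.UUID(fields=fields, version=1)


gen = UUIDSerial(node=0x010000000001)  # made-up node with multicast bit set
first = gen.next_value()
second = gen.next_value()
```

Values from one generator are unique as long as the sequence never repeats; uniqueness across databases depends entirely on each one getting a distinct node ID.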
