Re: C based plugins, clocks, locks, and configuration variables - Mailing list pgsql-hackers

From Clifford Hammerschmidt
Subject Re: C based plugins, clocks, locks, and configuration variables
Msg-id CANvN6gx=7cosvFJsfurV_br0EkYOUDE4na4sSZPZwOBRQuu8ew@mail.gmail.com
In response to Re: C based plugins, clocks, locks, and configuration variables  (Clifford Hammerschmidt <tanglebones@gmail.com>)
Responses Re: C based plugins, clocks, locks, and configuration variables  (Craig Ringer <craig.ringer@2ndquadrant.com>)
List pgsql-hackers
Looking closer at the bit math, I got it wrong. It should be 64 bits of time, 6 bits of UUID version, 8 bits of node, 8 bits of seq, and the rest random, which leaves 42 bits of random (128 - 64 - 6 - 8 - 8 = 42). I'll find the code in a bit.
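
For illustration only, here is a minimal C sketch of how that layout could be packed into two 64-bit words. The type and function names (tuid128, tuid_pack) are hypothetical, and packing the fields contiguously in the order listed is an assumption; the actual code may place the version bits elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit id held as two 64-bit words (names are illustrative). */
typedef struct
{
    uint64_t hi; /* 64 bits of time */
    uint64_t lo; /* version(6) | node(8) | seq(8) | random(42) */
} tuid128;

#define TUID_RANDOM_BITS   42
#define TUID_SEQ_SHIFT     TUID_RANDOM_BITS        /* 42 */
#define TUID_NODE_SHIFT    (TUID_SEQ_SHIFT + 8)    /* 50 */
#define TUID_VERSION_SHIFT (TUID_NODE_SHIFT + 8)   /* 58 */
#define TUID_RANDOM_MASK   ((UINT64_C(1) << TUID_RANDOM_BITS) - 1)

/*
 * Pack the fields in the order given in the message. The field order and
 * contiguous packing are assumptions, not the author's confirmed layout.
 */
static tuid128
tuid_pack(uint64_t time, uint8_t version, uint8_t node, uint8_t seq,
          uint64_t random_bits)
{
    tuid128 id;

    assert(version < 64); /* the version field is only 6 bits wide */

    id.hi = time;
    id.lo = ((uint64_t) version << TUID_VERSION_SHIFT)
        | ((uint64_t) node << TUID_NODE_SHIFT)
        | ((uint64_t) seq << TUID_SEQ_SHIFT)
        | (random_bits & TUID_RANDOM_MASK); /* keep only the low 42 bits */
    return id;
}
```

The shifts add up to the full low word: 6 + 8 + 8 + 42 = 64 bits, so together with the 64-bit time the id is exactly 128 bits.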

-- 
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 9:42 AM, Clifford Hammerschmidt <tanglebones@gmail.com> wrote:
Hi Jim,

The values are still globally unique. The odds of a collision are very low. Two instances with the same node_id generating in the same millisecond (in their local view of time) have a 1 in 2^34 chance of collision. node_id only repeats every 256 machines in a cluster (assuming it's configured correctly), and the probability of the same millisecond being used on both machines is also low (it depends on generation rate and machine speed). The only real concern is clock replays (i.e. something sets the clock backwards, such as an admin or a badly implemented time sync system), which do happen in rare instances. That is why seq is there: it extends that space and reduces the chance of a collision within that millisecond. (Time replays are a real problem with id systems like snowflake.)
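
One way the seq field could guard against clock replays is sketched below. This is an assumption about the mechanism, not the code from this thread: the state struct and tuid_next_seq() are hypothetical names, and the rule "bump seq whenever the clock is observed to step backwards" is just one plausible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-backend generator state (illustrative names). */
typedef struct
{
    uint64_t last_ms; /* most recent clock reading, in milliseconds */
    uint8_t  seq;     /* 8-bit sequence embedded in every id */
} tuid_clock_state;

/*
 * Return the seq value to embed for an id generated at now_ms.
 * Assumed policy: when the clock steps backwards, bump seq so that ids
 * produced in the replayed interval differ from earlier ones in the seq
 * field. The 8-bit counter wraps modulo 256.
 */
static uint8_t
tuid_next_seq(tuid_clock_state *st, uint64_t now_ms)
{
    assert(st != NULL);

    if (now_ms < st->last_ms)
        st->seq = (uint8_t) (st->seq + 1); /* clock replay detected */

    st->last_ms = now_ms;
    return st->seq;
}
```

With this policy, seq stays constant while time moves forward and changes once per backward step, so a replayed millisecond would need both the same seq (after 256 replays) and the same random bits to collide.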

Also, the point of the timestamp isn't uniqueness; what I want is the generally monotonically ascending aspect. This causes inserts to append to the index (much faster than random inserts in large indexes because of cache locality), and causes data generated around the same time to occupy nearby nodes in the index (again a cache benefit, as related data tends to be generated bunched up in time).
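
To get that ascending order from a byte-wise comparison (as far as I know, PostgreSQL's uuid type compares its 16 bytes with memcmp), the timestamp has to be stored most-significant byte first. The sketch below shows the idea; tuid_to_bytes() and put_be64() are hypothetical helpers, and the split into a 64-bit time word followed by a 64-bit "rest" word is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write a 64-bit value into out[0..7], most-significant byte first. */
static void
put_be64(unsigned char *out, uint64_t v)
{
    for (int i = 7; i >= 0; i--)
    {
        out[i] = (unsigned char) (v & 0xFF);
        v >>= 8;
    }
}

/*
 * Serialize a hypothetical id: time in bytes 0..7, everything else
 * (version, node, seq, random) in bytes 8..15. Because the time comes
 * first and is big-endian, memcmp() order follows time order.
 */
static void
tuid_to_bytes(unsigned char out[16], uint64_t time, uint64_t rest)
{
    assert(out != NULL);
    put_be64(out, time);
    put_be64(out + 8, rest);
}
```

Two ids generated a millisecond apart then compare in time order regardless of their random bits, which is what lets new rows land at the right-hand edge of a btree index.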

Thanks,
-Cliff. 

-- 
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 6:27 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 11/3/16 7:14 PM, Craig Ringer wrote:
1) getting microseconds (or nanoseconds) from UTC epoch in a plugin

GetCurrentIntegerTimestamp()
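
A caveat worth sketching: to my knowledge, GetCurrentIntegerTimestamp() returns microseconds since the PostgreSQL epoch (2000-01-01 00:00:00 UTC), not the Unix epoch, so a plugin that wants Unix-epoch time has to add the offset between the two. The helpers below are hypothetical names written as plain C so they can be compiled outside the server; inside an extension the input would come from the call above.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds from 1970-01-01 to 2000-01-01 UTC (10957 days * 86400). */
#define UNIX_TO_PG_EPOCH_SECS INT64_C(946684800)
#define MICROS_PER_SEC        INT64_C(1000000)

/*
 * Convert microseconds since the PostgreSQL epoch (2000-01-01 UTC) to
 * microseconds since the Unix epoch. Hypothetical helper, not a server API.
 */
static int64_t
pg_usecs_to_unix_usecs(int64_t pg_usecs)
{
    const int64_t offset = UNIX_TO_PG_EPOCH_SECS * MICROS_PER_SEC;

    assert(pg_usecs <= INT64_MAX - offset); /* guard against overflow */
    return pg_usecs + offset;
}

/* Same conversion, truncated to milliseconds (for post-1970 values). */
static int64_t
pg_usecs_to_unix_msecs(int64_t pg_usecs)
{
    return pg_usecs_to_unix_usecs(pg_usecs) / 1000;
}
```

A timestamp of 0 on the PostgreSQL scale therefore maps to 946684800 seconds on the Unix scale.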

Since you're serializing generation anyway you might want to just forgo the timestamp completely. It's not like the values you're generating are globally unique anymore, or hard to guess.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461

