Re: C based plugins, clocks, locks, and configuration variables - Mailing list pgsql-hackers

From Clifford Hammerschmidt
Subject Re: C based plugins, clocks, locks, and configuration variables
Date
Msg-id CANvN6gzMoD3M3k3fX1fbaG8KJb+caLk1yZOkqRzvumyYO1hz5A@mail.gmail.com
In response to Re: C based plugins, clocks, locks, and configuration variables  (Jim Nasby <Jim.Nasby@BlueTreble.com>)
Responses Re: C based plugins, clocks, locks, and configuration variables  (Clifford Hammerschmidt <tanglebones@gmail.com>)
List pgsql-hackers
Hi Jim,

The values are still globally unique; the odds of a collision are very low. Two instances with the same node_id generating in the same millisecond (in their local view of time) have a 1 in 2^34 chance of collision. A node_id only repeats every 256 machines in a cluster (assuming you're configured correctly), and the probability of both machines generating in the same millisecond is also low (it depends on generation rate and machine speed). The only real concern is clock replays (i.e. something sets the clock backwards, such as an admin or a badly implemented time sync system). That does happen in rare instances, and it is why seq is there: it extends the space and reduces the chance of a collision within a replayed millisecond. (Time replays are a real problem with id systems like snowflake.)

Also, the point of the timestamp isn't uniqueness; what I want is the generally monotonically ascending aspect. This causes inserts to append to the index (much faster than random inserts in large indexes because of cache locality), and causes data generated around the same time to occupy nearby nodes in the index (again, cache benefits, as related data tends to be generated bunched up in time).
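
To make the layout argument concrete, here is a minimal sketch in C. The actual bit layout is not spelled out in this message, so everything here is an assumption for illustration: a 128-bit id split into two 64-bit halves, with the millisecond timestamp in the high half and node_id (8 bits, hence the 256 machines), seq (22 bits, an arbitrary choice), and a random field (34 bits, to match the 2^34 figure) in the low half. The names sketch_id, sketch_make_id, and sketch_cmp are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 128-bit id; field widths are assumptions, not the real layout. */
typedef struct {
    uint64_t hi; /* milliseconds since the epoch, as seen by the local clock */
    uint64_t lo; /* node_id (8 bits) | seq (22 bits) | random (34 bits) */
} sketch_id;

#define SKETCH_SEQ_BITS  22
#define SKETCH_RAND_BITS 34

/* Pack the fields. The caller supplies the clock reading, the sequence
 * counter, and the random bits, so the function itself is deterministic. */
sketch_id sketch_make_id(uint64_t ms, uint8_t node_id, uint32_t seq, uint64_t rnd)
{
    sketch_id id;

    assert(seq < (UINT32_C(1) << SKETCH_SEQ_BITS));

    id.hi = ms;
    id.lo = ((uint64_t) node_id << (SKETCH_SEQ_BITS + SKETCH_RAND_BITS))
          | ((uint64_t) seq << SKETCH_RAND_BITS)
          | (rnd & ((UINT64_C(1) << SKETCH_RAND_BITS) - 1));
    return id;
}

/* Order ids by timestamp first, then by the low half. */
int sketch_cmp(sketch_id a, sketch_id b)
{
    if (a.hi != b.hi)
        return a.hi < b.hi ? -1 : 1;
    if (a.lo != b.lo)
        return a.lo < b.lo ? -1 : 1;
    return 0;
}
```

With the timestamp in the most significant position, an id generated in a later millisecond compares greater regardless of node_id, seq, or the random bits, which is what makes new rows land at the right-hand edge of the index. Two ids from the same node in the same (possibly replayed) millisecond still differ as long as seq or the random bits differ.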

Thanks,
-Cliff. 

-- 
Clifford Hammerschmidt, P.Eng.

On Tue, Nov 8, 2016 at 6:27 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 11/3/16 7:14 PM, Craig Ringer wrote:
1) getting microseconds (or nanoseconds) from UTC epoch in a plugin

GetCurrentIntegerTimestamp()
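
One detail worth noting for the "from UTC epoch" part of the question: PostgreSQL timestamps count from 2000-01-01 rather than from the Unix epoch of 1970-01-01, so a value obtained this way needs an offset before it can be treated as Unix-epoch microseconds. A minimal sketch of that conversion follows. It does not call into the server; the helper name sketch_pg_to_unix_usecs and the macro are hypothetical, and it assumes the input is an int64 microsecond count on the PostgreSQL epoch.

```c
#include <stdint.h>

/* Microseconds between 1970-01-01 and 2000-01-01 (10957 days). */
#define SKETCH_PG_EPOCH_OFFSET_USECS INT64_C(946684800000000)

/* Convert microseconds since 2000-01-01 into microseconds since 1970-01-01. */
int64_t sketch_pg_to_unix_usecs(int64_t pg_usecs)
{
    return pg_usecs + SKETCH_PG_EPOCH_OFFSET_USECS;
}
```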

Since you're serializing generation anyway you might want to just forgo the timestamp completely. It's not like the values you're generating are globally unique anymore, or hard to guess.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
