Re: UUID v7 - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: UUID v7 |
Date | |
Msg-id | CAD21AoAVU-VnuqvdafqUCHCAo7Msb6_2k7nSzd_fijMbbtKrCg@mail.gmail.com Whole thread Raw |
In response to | Re: UUID v7 ("Daniel Verite" <daniel@manitou-mail.org>) |
Responses |
Re: UUID v7
|
List | pgsql-hackers |
On Sun, Feb 9, 2025 at 9:07 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > I've took into account note from Sergey that "offset" is better name for uuidv7() argument than "shift". > > > On 5 Feb 2025, at 03:02, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > >> > >> I was thinking about incorporating test like this. > >> > >>>> With this patch we can generate correct UUIDs in a very distant future. > >>>> postgres=# select x, uuid_extract_timestamp(uuidv7((x::text || ' year'::text)::interval)), > >>>> (x::text || ' year'::text)::interval > >>>> from generate_series(1,9000,1000) x; > >>>> x | uuid_extract_timestamp | interval > >>>> ------+-----------------------------+------------ > >>>> 1 | 2026-01-31 12:00:53.084+05 | 1 year > >>>> 1001 | 3026-01-31 12:00:53.084+05 | 1001 years > >>>> 2001 | 4026-01-31 12:00:53.084+05 | 2001 years > >>>> 3001 | 5026-01-31 12:00:53.084+05 | 3001 years > >>>> 4001 | 6026-01-31 12:00:53.084+05 | 4001 years > >>>> 5001 | 7026-01-31 12:00:53.085+05 | 5001 years > >>>> 6001 | 8026-01-31 12:00:53.085+05 | 6001 years > >>>> 7001 | 9026-01-31 12:00:53.085+05 | 7001 years > >>>> 8001 | 10026-01-31 12:00:53.085+05 | 8001 years > >>>> (9 rows) > >> > > > > Something like following queries might be workable for example? > > > > create table test (c serial, d uuid, t timestamptz generated always as > > (uuid_extract_timestamp(d)) stored); > > insert into test (d) select uuidv7((n || 'years')::interval) from > > generate_series(1, 2000) n; > > select count(*) from (select t - lag(t) over (order by c) as diff from > > test) where diff > '10 year' ; > > Yeah, makes sense. I reduced tolerance to 366+1 day. Must be stable if we've done all the time offset business right. > > > Here are some review comments: > > > > #define NS_PER_S INT64CONST(1000000000) > > #define NS_PER_MS INT64CONST(1000000) > > +#define US_PER_MS INT64CONST(1000) > > #define NS_PER_US INT64CONST(1000) > > > > I think it's clear if we put US_PER_MS below NS_PER_US. > > OK. > > > --- > > * > > - * ns is a number of nanoseconds since start of the UNIX epoch. This value is > > + * unix_ts_ms is a number of milliseconds since start of the UNIX epoch, > > + * ns_in_ms is a number of nanoseconds within millisecond. These values are > > * used for time-dependent bits of UUID. > > > > I think we can mention that the RFC describes that stored unix > > timestamp as an unsigned integer. > > Done. Feel free to adjust my wordings, I've no sense of idiomatic English. > > > > > --- > > static pg_uuid_t * > > -generate_uuidv7(int64 ns) > > +generate_uuidv7(uint64 unix_ts_ms, uint32 ns_in_ms) > > > > How about renaming ns_in_ms with sub_ms? > > OK. > > > > > --- > > + /* 64 bits is enough for real time, but not for a time range of UUID */ > > > > I could not understand the point of this comment. It seems to say that > > 64-bits is not enough for a time range of UUID, but doesn't the time > > range of UUIDv7 use only 48 bits? It seems to need more comments. > > I've tried to say that acquiring current time as an int64 ns since UNIX epoch is still viable for the code (until year2262). > > > > --- > > - ns = (ts + (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * > > SECS_PER_DAY * USECS_PER_SEC) > > - * NS_PER_US + ns % NS_PER_US; > > + us = (ts + (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) * > > SECS_PER_DAY * USECS_PER_SEC); > > > > /* Generate an UUIDv7 */ > > - uuid = generate_uuidv7(ns); > > + uuid = generate_uuidv7(us / US_PER_MS, (us % US_PER_MS) * > > NS_PER_US + ns % NS_PER_US); > > > > Need to update comments in uuidv7_internval() such as: > > > > /* > > * Shift the current timestamp by the given interval. To calculate time > > * shift correctly, we convert the UNIX epoch to TimestampTz and use > > * timestamptz_pl_interval(). Since this calculation is done with > > * microsecond precision, we carry nanoseconds from original ns value to > > * shifted ns value. > > */ > > > > and > > > > /* > > * Convert a TimestampTz value back to an UNIX epoch and back nanoseconds. > > */ > > I've tried. I'm not very satisfied with comments, but could not come up with easier description. > Thank you for updating the patch. I had missed to track this patch. I've updated the patch from your v4 patch. In this version, I excluded the argument name change (from 'shift' to 'offset') as it's not related to the bug fix and simplified the regression test case. Please review it. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Attachment
pgsql-hackers by date: