Re: UUID v7 - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: UUID v7
Date
Msg-id CAD21AoB+8DsmjW7m9AdC-w-y_aiAx=EyZqnyVVR2H6MrX39H9A@mail.gmail.com
Whole thread Raw
In response to Re: UUID v7  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
List pgsql-hackers
On Tue, Nov 19, 2024 at 7:54 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 20 Nov 2024, at 00:06, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Nov 19, 2024 at 9:45 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >>
> >>
> >>
> >>> On 19 Nov 2024, at 14:31, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >>>
> >>> Done.
> >>
> >> Here's v33 intact + one more patch to add 2 bits of entropy on MacOS (to compensate lack of nanoseconds).
> >> What do you think?
> >>
> >
> > Thank you for updating the patch!
> >
> > I've reviewed the v33 patch and made some changes mostly for cosmetic
> > things. Please review it to see if we accept these changes.
>
> Your changes look good to me. I particularly like sortability test.
> I see that you removed implementation of clock_gettime() for Windows. Well, this makes sense.
>
>
> >
> > I have one question about the additional patch:
> >
> > +#if defined(__darwin__)
> > + /*
> > + * On MacOS real time is truncted to microseconds. Thus, 2 least
> > + * significant bits of increased_clock_precision are neither random
> > + * (CSPRNG), nor time-dependent (in a sense - truly random). These 2 bits
> > + * are dependent on other time-specific bits, thus they do not contribute
> > + * to uniqueness. To make these bit random we mix in two bits from CSPRNG.
> > + */
> > + uuid->data[7] = uuid->data[7] ^ (uuid->data[8] >> 6);
> > +#endif
> >
> > I thought that the whole 12 bits in "rand_a" is actually
> > time-dependent since we store 1/4096 fraction of sub-milliseconds. Am
> > I missing something?
>
> We have 12 bits in increaesd_clock_precission but only 1000 possible values of these bits. 2 least significant bits
aredefined by other 10 bits. 
> These bits are not equal to 0, they are changing.
> True, these bits are time-dependent in a sense that these bits are be computed from a full timestamp. I wanted to
expressthe fact that timestamp cannot be altered in a way so only these 2 bits are changed. 

Understood the idea. But does replacing the least significant 2 bits
with random 2 bits really not affect monotonicity? The ensured minimal
timestamp step is 245, ((NS_PER_MS / (1 << 12)) + 1), meaning that if
two UUIDs are generated within a microsecond on macOS, the two
timestamps differ by 245 ns. After calculating the increased clock
precision with these two timestamps, they differ only by 1, which
seems to be problematic to me.

Suppose the two timestamps are:

ns1: 1732142033754429000 (Nov 20, 2024 10:33:53.754429000)
ns2: 1732142033754429245 (Nov 20, 2024 10:33:53.754429245)

Their sub-milliseconds are calculated (by multiplying by 4096) to:

subms1: 1757 (0b011011011101)
subms2: 1758 (0b011011011110)

If we replace the least significant bits '01' of subms1 with random
bits '11' and replace '10' of subms2 with '00', we cannot guarantee
the monotonicity.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Accessing other session's temp table
Next
From: Thomas Munro
Date:
Subject: Re: On non-Windows, hard depend on uselocale(3)