Re: UUID v7 - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: UUID v7
Date
Msg-id CAD21AoBVxi5hZJnoyN-PKhd5UxCFtooRiDpsTDs3Eg=rROgsBA@mail.gmail.com
In response to Re: UUID v7  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: UUID v7
List pgsql-hackers
On Wed, Nov 6, 2024 at 10:14 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 5 Nov 2024, at 23:56, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > <v30-0001-Implement-UUID-v7.patch>
>
> Some more thoughts on this patch version:
>
> 0. Comment mentioning nanoseconds, while we do not need to carry anything
> /* Convert TimestampTz back and carry nanoseconds. */
>
> 1. There's unnecessary &3 in
> uuid->data[7] = uuid->data[7] | ((uuid->data[8] >> 6) & 3);
>
> 2. Currently we store 0..999 microseconds in 10 bits, so values 1000..1023
> are unused. We could use them for overflow. That would slightly increase
> non-overflowing capacity when generating more than million UUIDs per second
> on one backend. However, given current performance of our CSPRNG I do not
> think this feature worth code complexity.
>

While using only 10 bits for microseconds keeps the implementation
simple, I'm not sure 10 bits are enough to generate UUIDs at
microsecond granularity without losing monotonicity. Since the 10-bit
microsecond value is used as is in the rand_a space, at most 1000
UUIDs can be generated per millisecond without losing monotonicity.
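
To make the limit concrete, here is a minimal sketch of the ordering key that results when the microsecond offset is stored as is. The function names and the flat (milliseconds, offset) layout are assumptions for illustration only; they are not taken from the patch, which places these bits around the version and variant fields.

```c
#include <assert.h>
#include <stdint.h>

#define SUBMS_BITS      10      /* width of the sub-millisecond field */
#define USECS_PER_MSEC  1000    /* only offsets 0..999 are ever stored */

/*
 * Hypothetical helper: combine a Unix millisecond timestamp with the
 * microsecond offset inside that millisecond into one ordering key.
 */
static uint64_t
order_key(uint64_t unix_ms, uint32_t usec_in_ms)
{
    assert(usec_in_ms < USECS_PER_MSEC);
    return (unix_ms << SUBMS_BITS) | usec_in_ms;
}

/*
 * Number of distinct keys inside one millisecond. The field can encode
 * 1024 values, but only 1000 of them are reachable.
 */
static uint32_t
keys_per_ms(void)
{
    return USECS_PER_MSEC;
}
```

Two UUIDs generated within the same microsecond get the same key, so their relative order is left to the random bits that follow.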

For example, in my environment, it took 1808 milliseconds to generate
1 million UUIDs. This is about 553 UUIDs generated per millisecond. As
UUID generation performance improves, I think 10 bits will not be
enough.

=# select count(uuidv7()) from generate_series(1, 1_000_000);
  count
---------
 1000000
(1 row)

Time: 1808.734 ms
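
The following sketch recomputes the per-millisecond rate from the timing above and compares it with the 1000 values that a microsecond offset stored as is can take. The helper names are hypothetical and exist only for this calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Values per millisecond when the offset 0..999 is stored as is. */
#define SLOTS_PER_MS 1000.0

/* Average number of UUIDs generated per millisecond. */
static double
uuids_per_ms(uint64_t n_uuids, double elapsed_ms)
{
    assert(elapsed_ms > 0.0);
    return (double) n_uuids / elapsed_ms;
}

/* Fraction of the per-millisecond slots consumed at that rate. */
static double
slot_utilization(uint64_t n_uuids, double elapsed_ms)
{
    return uuids_per_ms(n_uuids, elapsed_ms) / SLOTS_PER_MS;
}
```

Calling slot_utilization(1000000, 1808.734) shows that this run already uses more than half of the available slots, so less than a 2x speedup in generation would exhaust them.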

I found a similar comment from Sergey Prokhorenko[1]. He also mentioned:

> 4) Microsecond timestamp fraction subtracts 10 bits from random data, which
> increases the risk of collision. In the counter, almost all bits are
> initialized with a random number, which reduces the risk of collision.

I feel that it's better to switch to Method 1 or 2 with a counter
space of 12 bits or more.
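
For illustration, a 12-bit counter could look roughly like the sketch below. It assumes that "Method 1" refers to the fixed bit-length dedicated counter described in RFC 9562, Section 6.2. V7State, next_key, and the caller-supplied seed are hypothetical names, and bumping the stored timestamp on overflow is just one possible policy, not something taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define COUNTER_BITS 12
#define COUNTER_MAX  ((1u << COUNTER_BITS) - 1)

/* Hypothetical per-backend generator state. */
typedef struct
{
    uint64_t last_ms;   /* timestamp used for the previous UUID */
    uint32_t counter;   /* 12-bit counter stored in rand_a */
} V7State;

/*
 * Return a strictly increasing (timestamp, counter) key. "seed" stands
 * in for fresh random bits supplied by the caller.
 */
static uint64_t
next_key(V7State *st, uint64_t now_ms, uint32_t seed)
{
    if (now_ms > st->last_ms)
    {
        /* New millisecond: reseed, keeping the top bit clear as headroom. */
        st->last_ms = now_ms;
        st->counter = seed & (COUNTER_MAX >> 1);
    }
    else if (st->counter < COUNTER_MAX)
        st->counter++;      /* same (or earlier) clock reading: count up */
    else
    {
        /* Counter exhausted: borrow the next millisecond and reseed. */
        st->last_ms++;
        st->counter = seed & (COUNTER_MAX >> 1);
    }

    assert(st->counter <= COUNTER_MAX);
    return (st->last_ms << COUNTER_BITS) | st->counter;
}
```

With a random seed that keeps the top bit clear, each millisecond has room for at least 2048 further increments before the overflow path is taken, and the remaining 11 counter bits still start out random.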

Regards,

[1] https://www.postgresql.org/message-id/305478845.5279532.1712440778735%40mail.yahoo.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


