On Wed, Nov 6, 2024 at 10:14 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>
>
>
> > On 5 Nov 2024, at 23:56, Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > <v30-0001-Implement-UUID-v7.patch>
>
> Some more thoughts on this patch version:
>
> 0. Comment mentioning nanoseconds, while we do not need to carry anything
> /* Convert TimestampTz back and carry nanoseconds. */
>
> 1. There's unnecessary &3 in
> uuid->data[7] = uuid->data[7] | ((uuid->data[8] >> 6) & 3);
>
> 2. Currently we store 0..999 microseconds in 10 bits, so values 1000..1023 are unused. We could use them for
overflow.That would slightly increase non-overflowing capacity when generating more than million UUIDs per second on
onebackend. However, given current performance of our CSPRNG I do not think this feature worth code complexity.
>
While using only 10 bits microseconds makes the implementation simple,
I'm not sure if 10 bits is enough to generate UUIDs at microsecond
granularity without losing monotonicity. Since 10-bit microseconds are
used as is in rand_a space, 1000 UUIDs can be generated per
millisecond without losing monotonicity.
For example, in my environment, it took 1808 milliseconds to generate
1 million UUIDs. This is about 533 UUIDs generated per millisecond. As
UUID generation performance improves, I think 10 bits will not be
enough.
=# select count(uuidv7()) from generate_series(1, 1_000_000);
count
---------
1000000
(1 row)
Time: 1808.734 ms
I found a similar comment from Sergey Prokhorenko[1]. He also mentioned:
> 4) Microsecond timestamp fraction subtracts 10 bits from random data, which increases the risk of collision. In the
counter,almost all bits are initialized with a random number, which reduces the risk of collision.
I feel that it's better to switch to Method 1 or 2 with 12 bits or
larger counter space.
Regards,
[1] https://www.postgresql.org/message-id/305478845.5279532.1712440778735%40mail.yahoo.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com