Re: UUID v7 - Mailing list pgsql-hackers

From Andrey M. Borodin
Subject Re: UUID v7
Msg-id 80D39661-63D3-4A3C-9C5F-1E17BB1FF3AF@yandex-team.ru
In response to Re: UUID v7  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: UUID v7
List pgsql-hackers

> On 31 Oct 2024, at 23:04, Stepan Neretin <sndcppg@gmail.com> wrote:
>
>
> Firstly, I'd like to discuss the increased_clock_precision variable, which
> currently divides the timestamp into milliseconds and nanoseconds. However,
> this approach only approximates the extra bits for sub-millisecond
> precision, leading to imprecise timestamps in high-frequency UUID
> generation.
No, the timestamp is taken in nanoseconds, and we keep a precision of 1/4096 of a millisecond. If you observe precision
loss anywhere, let me know.

>
> To address this issue, we could consider using a more accurate method for
> calculating the timestamp. For instance, we could utilize a higher
> resolution clock or implement a more precise algorithm to ensure accurate
> timestamps.

That's what we do.

>
> Additionally, it would be beneficial to add validation checks for the
> interval argument. These checks could verify that the input interval is
> within reasonable bounds and that the calculated timestamp is accurate.
> Examples of checks could include verifying if the interval is too small,
> too large, or exceeds the maximum possible number of milliseconds and
> nanoseconds in a timestamp.

timestamptz_pl_interval() is already doing this.

> What do you think about these suggestions? Let me know your thoughts!

Thanks a lot for reviewing the patch!


> On 1 Nov 2024, at 10:33, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Oct 31, 2024 at 9:53 PM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
>>
>>
>>
>>> On 1 Nov 2024, at 03:00, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>
>>> Therefore, if the
>>> system clock moves backward due to NTP, we cannot guarantee
>>> monotonicity and sortability. Is that right?
>>
>> Not exactly. Monotonicity is ensured for a given backend. We make sure that the timestamp is advanced by at least
>> ~250 ns on each UUID generation. 60 bits of time are unique and ascending for a given backend.
>>
>
> Thank you for your explanation. I now understand this code guarantees
> the monotonicity:
>
> +/* minimum amount of ns that guarantees step of increased_clock_precision */
> +#define SUB_MILLISECOND_STEP (1000000/4096 + 1)
> +       ns = get_real_time_ns();
> +       if (previous_ns + SUB_MILLISECOND_STEP >= ns)
> +               ns = previous_ns + SUB_MILLISECOND_STEP;
> +       previous_ns = ns;
>
>
> I think that one of the most important parts in UUIDv7 implementation
> is which method (1, 2, or 3 described in RFC 9562) we use to guarantee
> the monotonicity. The current patch employs method 3 with the
> assumption that 12 bits of sub-millisecond information is available on
> most of the systems we support. However, as far as I tested, on MacOS,
> values returned by  clock_gettime(CLOCK_REALTIME) are only microsecond
> precision, meaning that we could waste some randomness. Has this point
> been considered?
>

There was a thread "What is a typical precision of gettimeofday()?" [0].
There we found out that the routines in instr_time.h are precise enough. On my machine (MacBook Air M3) I do not observe
significant differences between CLOCK_MONOTONIC_RAW and CLOCK_REALTIME in pg_test_timing results.

CLOCK_MONOTONIC_RAW
x4mmm@x4mmm-osx bin % ./pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 15.30 ns
Histogram of timing durations:
  < us   % of total      count
     1     98.47856  193113929
     2      1.52039    2981452
     4      0.00025        485
     8      0.00062       1211
    16      0.00012        237
    32      0.00004         79
    64      0.00002         30
   128      0.00000          8
   256      0.00000          5
   512      0.00000          3
  1024      0.00000          1
  2048      0.00000          2

CLOCK_REALTIME
x4mmm@x4mmm-osx bin % ./pg_test_timing
Testing timing overhead for 3 seconds.
Per loop time including overhead: 15.04 ns
Histogram of timing durations:
  < us   % of total      count
     1     98.49709  196477842
     2      1.50268    2997479
     4      0.00007        130
     8      0.00012        238
    16      0.00005         91
    32      0.00000          4
    64      0.00000          1

Thanks!


[0] https://www.postgresql.org/message-id/flat/212C2E24-32CF-400E-982E-A446AB21E8CC%40yandex-team.ru#c89fa36829b2003147b6ce72170b5342




