
Re: UUID v7 - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: UUID v7
Date	November 15 07:44:19
Msg-id	CAD21AoCHpg6a2fLhCRRv5n1eaPH39+Z+z6cS0PR_9C2JmjrHZQ@mail.gmail.com
In response to	Re: UUID v7 (Masahiko Sawada <sawada.mshk@gmail.com>)
List	pgsql-hackers

On Mon, Nov 11, 2024 at 12:20 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Nov 9, 2024 at 9:07 AM Sergey Prokhorenko
> <sergeyprokhorenko@yahoo.com.au> wrote:
> >
> > On Saturday 9 November 2024 at 01:00:15 am GMT+3, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > > the microsecond part is working also as a counter in a sense. It seems fine to me but I'm slightly concerned that
> > > there is no guidance of such implementation in RFC 9562.
> >
> > In fact, there is guidance of similar implementation in RFC 9562:
> > https://datatracker.ietf.org/doc/html/rfc9562#name-monotonicity-and-counters
> > "Counter Rollover Handling:"
> > "Alternatively, implementations MAY increment the timestamp ahead of the actual time and reinitialize the counter."
> >
>
> Indeed, thank you.
>
> > But in the near future, this may not be enough for the highest-performance systems.
>
> Yeah, I'm concerned about this. That time might gradually come. That
> being said, as long as rand_a part works also as a counter, it's fine.
> Also, 12 bits does not differ much as Andrey Borodin mentioned. I
> think in the first version it's better to start with a simple
> implementation rather than over-engineering it.
>
> Regarding the implementation, the v30 patch uses only microseconds
> precision time even on platforms where nanoseconds precision is
> available such as Linux. I think it's better to store the value of
> (sub-milliseconds * 4096) into 12-bits of rand_a space instead of
> directly storing microseconds into 10 bits space.
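
To make the quoted proposal concrete, here is a minimal sketch of that mapping. It is not code from any version of the patch: the helper name sub_ms_to_rand_a() is hypothetical, and plain <stdint.h> types stand in for PostgreSQL's own integer types.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_MS 1000000ULL

/*
 * Hypothetical helper (not in the patch): scale the sub-millisecond part
 * of a nanosecond timestamp into the 12 bits available in rand_a.
 */
static inline uint16_t
sub_ms_to_rand_a(uint64_t ns)
{
    uint64_t sub_ms_ns = ns % NS_PER_MS;              /* 0 .. 999999 */
    uint64_t frac = (sub_ms_ns * 4096) / NS_PER_MS;   /* 0 .. 4095 */

    assert(frac < 4096);
    return (uint16_t) frac;
}
```

With this scaling, half a millisecond (500000 ns) maps to 2048, so all 12 bits are used, whereas storing microseconds directly only needs 10 bits.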

IIUC v29 patch implements UUIDv7 generation in this way. So I've
reviewed v29 patch and here are some review comments:

---
     * Set magic numbers for a "version 4" (pseudorandom) UUID, see
-    * http://tools.ietf.org/html/rfc4122#section-4.4
+    * http://tools.ietf.org/html/rfc9562#section-4.4
     */

The new RFC doesn't have section 4.4.

---
+ * All UUID bytes are filled with strong random numbers except version and
+ * variant 0b10 bits.

I'm concerned that "version and variant 0b10 bits" is not very clear
to readers. I think we can just mention "... except version and
variant bits".

---
+
+#ifndef WIN32
+#include <time.h>
+
+static uint64 get_real_time_ns()
+{
+       struct timespec tmp;
+
+       clock_gettime(CLOCK_REALTIME, &tmp);
+       return tmp.tv_sec * 1000000000L + tmp.tv_nsec;
+}
+#else /* WIN32 */
+
+#include "c.h"
+#include <sysinfoapi.h>
+#include <sys/time.h>
+
+/* FILETIME of Jan 1 1970 00:00:00, the PostgreSQL epoch */
+static const unsigned __int64 epoch = UINT64CONST(116444736000000000);
+
+/*
+ * FILETIME represents the number of 100-nanosecond intervals since
+ * January 1, 1601 (UTC).
+ */
+#define FILETIME_UNITS_TO_NS UINT64CONST(100)
+
+
+/*
+ * timezone information is stored outside the kernel so tzp isn't used anymore.
+ *
+ * Note: this function is not for Win32 high precision timing purposes. See
+ * elapsed_time().
+ */
+static uint64
+get_real_time_ns()
+{
+       FILETIME        file_time;
+       ULARGE_INTEGER ularge;
+
+       GetSystemTimePreciseAsFileTime(&file_time);
+       ularge.LowPart = file_time.dwLowDateTime;
+       ularge.HighPart = file_time.dwHighDateTime;
+
+       return (ularge.QuadPart - epoch) * FILETIME_UNITS_TO_NS;
+}
+#endif

I think that it's better to implement these functions in instr_time.h
or another file.

---
+/* minimum amount of ns that guarantees step of increased_clock_precision */
+#define SUB_MILLISECOND_STEP (1000000/4096 + 1)

I think we can rewrite it to:

#define NS_PER_MS INT64CONST(1000000)
#define SUB_MILLISECOND_STEP ((NS_PER_MS / (1 << 12)) + 1)

This improves readability.

Also, I think "#define NS_PER_US INT64CONST(1000)" can also be used in
many places.
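
As a sanity check on the "+ 1", the following sketch verifies by brute force that a step of 245 ns always advances the 12-bit sub-millisecond value, while 244 ns does not. The helper names are hypothetical, and INT64_C from <stdint.h> is used here as a stand-in for INT64CONST so that the snippet compiles outside the PostgreSQL tree.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_MS INT64_C(1000000)
#define SUB_MILLISECOND_STEP ((NS_PER_MS / (1 << 12)) + 1)

/* Hypothetical helper: 12-bit fraction for a nanosecond offset within 1 ms */
static inline int64_t
sub_ms_fraction(int64_t ns_in_ms)
{
    assert(ns_in_ms >= 0 && ns_in_ms < NS_PER_MS);
    return (ns_in_ms * (1 << 12)) / NS_PER_MS;
}

/* Returns 1 if advancing by "step" ns always increases the 12-bit fraction */
static int
step_always_advances(int64_t step)
{
    for (int64_t ns = 0; ns + step < NS_PER_MS; ns++)
    {
        if (sub_ms_fraction(ns + step) <= sub_ms_fraction(ns))
            return 0;
    }
    return 1;
}
```

The arithmetic behind it: 245 * 4096 = 1003520, which exceeds 1000000, so the scaled value grows by at least one; 244 * 4096 = 999424 falls just short.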

---
+       /* set version field, top four bits are 0, 1, 1, 1 */
+       uuid->data[6] = (uuid->data[6] & 0x0f) | 0x70;
+       /* set variant field, top two bits are 1, 0 */
+       uuid->data[8] = (uuid->data[8] & 0x3f) | 0x80;

I think we can make an inline function to set both variant and version
so we can use it for generating UUIDv4 and UUIDv7.
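
Something along these lines is what I have in mind. This is only a sketch: the function name uuid_set_version() is hypothetical, and it takes a raw byte buffer instead of a pg_uuid_t so that it stands alone.

```c
#include <assert.h>

/*
 * Hypothetical helper: set the version and variant fields of a 16-byte
 * UUID buffer. Usable for both UUIDv4 (version = 4) and UUIDv7 (version = 7).
 */
static inline void
uuid_set_version(unsigned char *data, unsigned char version)
{
    assert(version <= 0x0f);

    /* set version field, top four bits of byte 6 */
    data[6] = (unsigned char) ((data[6] & 0x0f) | (version << 4));
    /* set variant field, top two bits of byte 8 are 1, 0 */
    data[8] = (data[8] & 0x3f) | 0x80;
}
```

Passing 7 reproduces the 0x70 mask from the hunk above, and passing 4 gives the 0x40 mask needed for UUIDv4.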

---
+               tms = uuid->data[5];
+               tms += ((uint64) uuid->data[4]) << 8;
+               tms += ((uint64) uuid->data[3]) << 16;
+               tms += ((uint64) uuid->data[2]) << 24;
+               tms += ((uint64) uuid->data[1]) << 32;
+               tms += ((uint64) uuid->data[0]) << 40;

How about rewriting these to the following for consistency with the UUIDv1 code?

        tms = uuid->data[5]
        + ((uint64) uuid->data[4] << 8)
        + ((uint64) uuid->data[3] << 16)
        + ((uint64) uuid->data[2] << 24)
        + ((uint64) uuid->data[1] << 32)
        + ((uint64) uuid->data[0] << 40);

---
Thinking about the function structures more, I think we can refactor
generate_uuidv7(), uuidv7() and uuidv7_interval():

- create a function, get_clock_timestamp_ns(), that provides a
nanosecond-precision timestamp
    - the returned timestamp is guaranteed to be greater than the
previous returned value.
    - this function can be inlined.
- create a function, generate_uuidv7(), that takes a
nanosecond-precision timestamp as a function argument and generates a
UUIDv7 based on it.
    - this function can be inlined too.
- uuidv7() gets the timestamp from get_clock_timestamp_ns() and passes
it to generate_uuidv7().
- uuidv7_interval() gets the timestamp from get_clock_timestamp_ns(),
adjusts it based on the given interval, and passes it to generate_uuidv7().
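
A rough standalone sketch of that structure follows. It rests on several assumptions: it is POSIX-only (clock_gettime), rand() stands in for pg_strong_random(), the functions take a raw 16-byte buffer and a plain nanosecond shift rather than pg_uuid_t and Interval, and the monotonicity rule (step forward by SUB_MILLISECOND_STEP) is my reading of the patch, not a copy of it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define NS_PER_S  INT64_C(1000000000)
#define NS_PER_MS INT64_C(1000000)
#define SUB_MILLISECOND_STEP ((NS_PER_MS / (1 << 12)) + 1)

/* Nanosecond timestamp, always greater than the previously returned value */
static inline int64_t
get_clock_timestamp_ns(void)
{
    static int64_t previous_ns = 0;
    struct timespec ts;
    int64_t     ns;

    clock_gettime(CLOCK_REALTIME, &ts);
    ns = (int64_t) ts.tv_sec * NS_PER_S + ts.tv_nsec;

    /* If the clock did not advance enough, step past the previous value */
    if (ns < previous_ns + SUB_MILLISECOND_STEP)
        ns = previous_ns + SUB_MILLISECOND_STEP;
    previous_ns = ns;

    return ns;
}

/* Build a UUIDv7 in uuid[0..15] from a nanosecond-precision timestamp */
static inline void
generate_uuidv7(int64_t ns, unsigned char *uuid)
{
    uint64_t    unix_ts_ms;
    uint16_t    sub_ms;

    assert(ns >= 0);
    unix_ts_ms = (uint64_t) (ns / NS_PER_MS);
    sub_ms = (uint16_t) (((ns % NS_PER_MS) * (1 << 12)) / NS_PER_MS);

    /* 48-bit big-endian Unix timestamp in milliseconds */
    uuid[0] = (unsigned char) (unix_ts_ms >> 40);
    uuid[1] = (unsigned char) (unix_ts_ms >> 32);
    uuid[2] = (unsigned char) (unix_ts_ms >> 24);
    uuid[3] = (unsigned char) (unix_ts_ms >> 16);
    uuid[4] = (unsigned char) (unix_ts_ms >> 8);
    uuid[5] = (unsigned char) unix_ts_ms;

    /* version 7 in the top four bits, then 12 bits of sub-millisecond time */
    uuid[6] = (unsigned char) (0x70 | (sub_ms >> 8));
    uuid[7] = (unsigned char) (sub_ms & 0xff);

    /* remaining bytes are random; rand() is only a placeholder here */
    for (int i = 8; i < 16; i++)
        uuid[i] = (unsigned char) (rand() & 0xff);

    /* variant field, top two bits are 1, 0 */
    uuid[8] = (uuid[8] & 0x3f) | 0x80;
}

static void
uuidv7(unsigned char *uuid)
{
    generate_uuidv7(get_clock_timestamp_ns(), uuid);
}

/* shift_ns is a simplified stand-in for the interval argument */
static void
uuidv7_interval(int64_t shift_ns, unsigned char *uuid)
{
    generate_uuidv7(get_clock_timestamp_ns() + shift_ns, uuid);
}
```

With this split, the monotonicity logic lives in one place and generate_uuidv7() becomes a pure function of its timestamp argument (apart from the random bytes), which also makes it easy to test with a fixed timestamp.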

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
