Thread: UUID or auto-increment

UUID or auto-increment

From

Ashkar Dev

Date:

10 August 2020, 15:53:23

Hi,

For a web application, should I use a UUID or an auto-increment key?

1- If two users insert a row at the same time, does it work?

2- Does the database give the same ID to both users, or does it execute one of them first? (I mean, can an ID conflict happen?)

Thanks.

Re: UUID or auto-increment

From

Ravi Krishna

Date:

10 August 2020, 16:38:10

Both can handle concurrent writes. Auto-increment is nothing but serial or sequence columns, and they hand out unique values to concurrent requests. A sequence value, once handed out, is not reused even if the transaction rolls back, which is why you may sometimes see gaps.
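
As a rough illustration of the point above, here is a toy Python model of a sequence. It is not how PostgreSQL implements sequences; the names Sequence and insert_row are made up for this sketch. It only shows that concurrent callers never receive the same value, and that values consumed by "rolled back" inserts leave gaps.

```python
import threading


class Sequence:
    """Toy stand-in for a database sequence (hypothetical, for illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def nextval(self):
        # The lock guarantees each caller gets a distinct value.
        with self._lock:
            self._value += 1
            return self._value


seq = Sequence()
committed = []
committed_lock = threading.Lock()


def insert_row(should_commit):
    row_id = seq.nextval()  # the value is consumed even if we "roll back"
    if should_commit:
        with committed_lock:
            committed.append(row_id)


# 100 concurrent "inserts"; every fifth one is rolled back.
threads = [threading.Thread(target=insert_row, args=(i % 5 != 0,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

gaps = set(range(1, 101)) - set(committed)
print(len(committed), len(set(committed)), len(gaps))
```

Every committed id is distinct, and the ids taken by the rolled-back inserts show up as gaps.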

UUID is not only unique, but is also unique across space. You can have two different databases generate UUIDs at the same time and they will still be unique. That helps if you are consolidating different databases into one big data mart, because the rows can all go into the same table without conflict. With a sequence or serial column that would be a problem.
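
A small Python sketch of the consolidation scenario, under the assumption that each database is just a dict keyed by its primary key (the helper names are invented for the example):

```python
import uuid


def make_rows_serial(n):
    # Each database starts its own counter at 1.
    return {i: f"row {i}" for i in range(1, n + 1)}


def make_rows_uuid(n):
    # Each database generates random (version 4) UUIDs independently.
    return {uuid.uuid4(): f"row {i}" for i in range(1, n + 1)}


db_a_serial, db_b_serial = make_rows_serial(1000), make_rows_serial(1000)
db_a_uuid, db_b_uuid = make_rows_uuid(1000), make_rows_uuid(1000)

# Keys present in both databases would conflict when merged into one table.
serial_conflicts = db_a_serial.keys() & db_b_serial.keys()
uuid_conflicts = db_a_uuid.keys() & db_b_uuid.keys()

print(len(serial_conflicts))  # 1000: every serial key collides
print(len(uuid_conflicts))    # 0 in practice: a collision is astronomically unlikely
```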

Finally, UUIDs result in write amplification in the WAL. Keep that in mind if your app does a lot of writes.

Re: UUID or auto-increment

From

Michael Lewis

Date:

10 August 2020, 16:49:53

UUIDs are also typically random and not correlated with time, so on a very large table where you primarily access recent data, index lookups will pull random pages into memory instead of primarily the end of the index.
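
The locality effect can be sketched with a crude Python simulation. It assumes an index is simply the sorted list of keys cut into fixed-size leaf pages; PAGE_SIZE, TOTAL_ROWS and RECENT are arbitrary illustrative figures, and a real B-tree behaves differently in detail.

```python
import random
import uuid

PAGE_SIZE = 100       # keys per leaf page (illustrative figure)
TOTAL_ROWS = 100_000
RECENT = 1_000        # the most recently inserted rows


def pages_touched_by_recent(keys):
    """keys are in insertion order; count the leaf pages holding the last RECENT keys."""
    position = {k: i for i, k in enumerate(sorted(keys))}  # slot in the sorted index
    return len({position[k] // PAGE_SIZE for k in keys[-RECENT:]})


rng = random.Random(42)
serial_keys = list(range(1, TOTAL_ROWS + 1))
uuid_keys = [uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(TOTAL_ROWS)]

serial_pages = pages_touched_by_recent(serial_keys)
uuid_pages = pages_touched_by_recent(uuid_keys)

print(serial_pages)  # 10: the recent rows sit together at the end of the index
print(uuid_pages)    # several hundred: the recent rows are scattered over the index
```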

Re: UUID or auto-increment

From

Ron

Date:

10 August 2020, 16:51:50

On 8/10/20 11:38 AM, Ravi Krishna wrote:
[snip]

> Finally, UUIDs result in write amplification in the WAL. Keep that in mind if your app does a lot of writes.

Because UUID is 32 bytes, while SERIAL is 4 bytes?

--
Angular momentum makes the world go 'round.

Re: UUID or auto-increment

From

Stephen Frost

Date:

10 August 2020, 16:53:32

Greetings,

* Ron (ronljohnsonjr@gmail.com) wrote:
> On 8/10/20 11:38 AM, Ravi Krishna wrote:
> >Finally, UUIDs result in write amplification in the WAL.  Keep that in mind
> >if your app does a lot of writes.
>
> Because UUID is 32 bytes, while SERIAL is 4 bytes?

and because it's random and so will touch a lot more pages when you're
using it...

Avoid UUIDs if you can- map them to something more sensible internally
if you have to deal with them.

Thanks,

Stephen

Re: UUID or auto-increment

From

Adrian Klaver

Date:

10 August 2020, 16:58:51

On 8/10/20 9:51 AM, Ron wrote:
> On 8/10/20 11:38 AM, Ravi Krishna wrote:
> [snip]
>> Finally, UUIDs result in write amplification in the WAL.  Keep that in
>> mind if your app does a lot of writes.
> 
> Because UUID is 32 bytes, while SERIAL is 4 bytes?

You mean 32 hex digits for 128 bits (16 bytes)?

https://www.postgresql.org/docs/12/datatype-uuid.html

And there is BIGSERIAL which is 8 bytes.
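
For reference, the sizes under discussion can be checked from Python. This is only a sketch using the standard uuid and struct modules; the struct formats stand in for 4-byte and 8-byte integers and say nothing about PostgreSQL's on-disk row layout.

```python
import struct
import uuid

u = uuid.uuid4()
uuid_bytes = len(u.bytes)          # binary width of a UUID
uuid_hex_digits = len(u.hex)       # hexadecimal digits in its text form
uuid_text_chars = len(str(u))      # text form including the four hyphens
int4_bytes = struct.calcsize("<i")  # same width as a 4-byte integer (SERIAL)
int8_bytes = struct.calcsize("<q")  # same width as an 8-byte integer (BIGSERIAL)

print(uuid_bytes, uuid_hex_digits, uuid_text_chars)  # 16 32 36
print(int4_bytes, int8_bytes)                        # 4 8
```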


-- 
Adrian Klaver
adrian.klaver@aklaver.com

Re: UUID or auto-increment

From

Israel Brewster

Date:

10 August 2020, 17:10:00

On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Ron (ronljohnsonjr@gmail.com) wrote:
>> On 8/10/20 11:38 AM, Ravi Krishna wrote:
>>> Finally, UUIDs result in write amplification in the WAL. Keep that in mind
>>> if your app does a lot of writes.
>>
>> Because UUID is 32 bytes, while SERIAL is 4 bytes?
>
> and because it's random and so will touch a lot more pages when you're
> using it...

I would point out, however, that using a V1 UUID rather than a V4 can help with this, as it is sequential, not random (based on MAC address and timestamp + random). There is a trade-off, of course: with V1, if two writes occur on the same computer at the exact same millisecond, there is a very, very small chance of generating conflicting UUIDs (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/). As there is still a random component, however, this seems quite unlikely.

> Avoid UUIDs if you can- map them to something more sensible internally
> if you have to deal with them.
>
> Thanks,
>
> Stephen

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320

Work: 907-474-5172
cell: 907-328-9145

Re: UUID or auto-increment

From

Stephen Frost

Date:

10 August 2020, 17:16:29

Greetings,

* Israel Brewster (ijbrewster@alaska.edu) wrote:
> > On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Ron (ronljohnsonjr@gmail.com) wrote:
> >> On 8/10/20 11:38 AM, Ravi Krishna wrote:
> >>> Finally, UUIDs result in write amplification in the WAL.  Keep that in mind
> >>> if your app does a lot of writes.
> >>
> >> Because UUID is 32 bytes, while SERIAL is 4 bytes?
> >
> > and because it's random and so will touch a lot more pages when you're
> > using it...
>
> I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random
> (based on MAC address and timestamp + random). There is a trade off, of course, as with V1 if two writes occur on the
> same computer at the exact same millisecond, there is a very very small chance of generating conflicting UUID’s (see
> https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/). As there is still a random component, however, this
> seems quite unlikely.

Sure, that helps, but it's still not great, and they're still much, much
larger than you'd ever need for an identifier inside of a given system,
so best to map it to something reasonable and avoid them as much as
possible.
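
One way to read "map it to something reasonable" is to keep the UUID only at the system boundary and use a compact integer internally. The Python sketch below is merely an illustration of that idea; ExternalIdMap is a hypothetical name, and in a database this would more likely be a lookup table with an integer key and a unique UUID column.

```python
import itertools
import uuid


class ExternalIdMap:
    """Hypothetical helper: assigns a small internal integer to each external UUID."""

    def __init__(self):
        self._next = itertools.count(1)
        self._by_uuid = {}

    def internal_id(self, external):
        # Assign a new integer the first time a UUID is seen, then reuse it.
        if external not in self._by_uuid:
            self._by_uuid[external] = next(self._next)
        return self._by_uuid[external]


id_map = ExternalIdMap()
first, second = uuid.uuid4(), uuid.uuid4()
first_id = id_map.internal_id(first)
second_id = id_map.internal_id(second)
first_id_again = id_map.internal_id(first)

print(first_id, second_id, first_id_again)  # 1 2 1
```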

Thanks,

Stephen

Re: UUID or auto-increment

From

Adam Brusselback

Date:

10 August 2020, 17:38:02

> I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random)

I wanted to make this point: using sequential UUIDs helped me reduce write amplification quite a bit with my application. I didn't use V1; instead I used https://pgxn.org/dist/sequential_uuids/

Reduces the pain caused by UUIDs a ton IMO.
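
To show the general idea of a "sequential" UUID, here is a minimal Python sketch. It is not the algorithm used by the sequential_uuids extension; the layout (a 48-bit millisecond timestamp followed by 80 random bits) and the function name time_prefixed_uuid are assumptions made for illustration.

```python
import os
import uuid


def time_prefixed_uuid(unix_ms):
    """Hypothetical layout: 48-bit millisecond timestamp first, then 80 random bits."""
    prefix = (unix_ms & ((1 << 48) - 1)) << 80
    return uuid.UUID(int=prefix | int.from_bytes(os.urandom(10), "big"))


# One id per millisecond, starting at an arbitrary moment in August 2020.
ids = [time_prefixed_uuid(1_597_080_000_000 + i) for i in range(1000)]

print(ids == sorted(ids))  # True: generation order matches sort order
```

Because the most significant bits grow with time, new values land at the end of an index rather than at random positions.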

-Adam

Re: UUID or auto-increment

From

"Peter J. Holzer"

Date:

10 August 2020, 20:06:10

On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
> I would point out, however, that using a V1 UUID rather than a V4 can
> help with this as it is sequential, not random (based on MAC address
> and timestamp + random).

If I read the specs correctly, a V1 UUID will roll over every 429
seconds. I think that as far as index locality is concerned, this is
essentially random for most applications.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Re: UUID or auto-increment

From

Israel Brewster

Date:

10 August 2020, 20:44:49

> On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
>> I would point out, however, that using a V1 UUID rather than a V4 can
>> help with this as it is sequential, not random (based on MAC address
>> and timestamp + random).
>
> If I read the specs correctly, a V1 UUID will roll over every 429
> seconds. I think that as far as index locality is concerned, this is
> essentially random for most applications.

According to Wikipedia, the time value in a V1 UUID is a 60-bit number, and will roll over “around 3400AD”, depending
on the algorithm used, or 5236AD if the software treats the timestamp as unsigned. This timestamp is extended by a 13 or
14-bit “uniqifying” clock sequence to handle cases of overlap, and then the 48-bit MAC address (constant, so no rollover
there) is appended. So perhaps that 13 or 14 bit “uniqifying” sequence will roll over every 429 seconds, however the
timestamp *as a whole* won’t roll over for quite a while yet, thereby guaranteeing that the UUIDs will be sequential,
not random (since, last I checked, time was sequential).
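
The figures quoted here can be reproduced with a little arithmetic. The sketch below assumes the V1 timestamp counts 100-nanosecond intervals since 15 October 1582 (as RFC 4122 specifies) and approximates the calendar with an average Gregorian year; the names are invented for the example.

```python
TICK_SECONDS = 1e-7                   # one timestamp tick is 100 nanoseconds
EPOCH_YEAR = 1582 + 287 / 365         # 15 October 1582 as a fractional year
SECONDS_PER_YEAR = 365.2425 * 86400   # average Gregorian year


def rollover_year(bits):
    """Approximate calendar year in which a timestamp of the given width wraps."""
    return EPOCH_YEAR + (2 ** bits) * TICK_SECONDS / SECONDS_PER_YEAR


unsigned_year = int(rollover_year(60))   # all 60 bits used
signed_year = int(rollover_year(59))     # top bit treated as a sign bit
low32_seconds = round(2 ** 32 * TICK_SECONDS, 1)  # span covered by the low 32 timestamp bits

print(unsigned_year)   # 5236
print(signed_year)     # 3409
print(low32_seconds)   # 429.5
```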

Re: UUID or auto-increment

From

Rob Sargent

Date:

10 August 2020, 20:56:37


On 8/10/20 10:53 AM, Stephen Frost wrote:
> Greetings,
> 
> * Ron (ronljohnsonjr@gmail.com) wrote:
>> On 8/10/20 11:38 AM, Ravi Krishna wrote:
>>> Finally, UUIDs result in write amplification in the WAL.  Keep that in mind
>>> if your app does a lot of writes.
>>
>> Because UUID is 32 bytes, while SERIAL is 4 bytes?
> 
> and because it's random and so will touch a lot more pages when you're
> using it...
> 
> Avoid UUIDs if you can- map them to something more sensible internally
> if you have to deal with them.
> 
> Thanks,
> 
> Stephen
> 
I suspect the increased storage cost is more related to the size of the 
record than to the ratio of the data types.

What says two consecutively saved records ought to be stored on the same
page, or will likely be sought with the same search criterion?  Serial
ids put a (loose) time order on the data, which may be completely
artificial.

Re: UUID or auto-increment

From

John W Higgins

Date:

10 August 2020, 21:30:48

On Mon, Aug 10, 2020 at 1:45 PM Israel Brewster <ijbrewster@alaska.edu> wrote:

> On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
>> I would point out, however, that using a V1 UUID rather than a V4 can
>> help with this as it is sequential, not random (based on MAC address
>> and timestamp + random).
>
> If I read the specs correctly, a V1 UUID will roll over every 429
> seconds. I think that as far as index locality is concerned, this is
> essentially random for most applications.

> According to Wikipedia, the time value in a V1 UUID is a 60-bit number, and will roll over “around 3400AD”, depending on the algorithm used, or 5236AD if the software treats the timestamp as unsigned. This timestamp is extended by a 13 or 14-bit “uniqifying” clock sequence to handle cases of overlap, and then the 48-bit MAC address (constant, so no rollover there) is appended. So perhaps that 13 or 14 bit “uniqifying” sequence will roll over every 429 seconds, however the timestamp *as a whole* won’t roll over for quite a while yet, thereby guaranteeing that the UUIDs will be sequential, not random (since, last I checked, time was sequential).

Except the time portion of a V1 UUID is not written high to low but rather low, then middle, then high, which means that the time portion is not expressed in a sequential format, and the left 8 characters of a V1 UUID "roll over" every 429 seconds or so.

For example, a V1 UUID right around now looks like

7db3f2ba-db4f-11ea-87d0-0242ac130003

Less than a second later

7db534cc-db4f-11ea-87d0-0242ac130003

So that looks sequential, but in roughly 429 seconds it will look like

7db3f2ba-db50-11ea-87d0-0242ac130003

More importantly, in another roughly 300 seconds it would be something like

308450ba-db51-11ea-87d0-0242ac130003

Note the move from db4f to db50 and db51 in the second group, while the left 8 characters "roll over".

That's not quite sequential in terms of indexing.
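
The field order can be checked with Python's uuid module. The sketch below assembles version-1 UUIDs by hand from a tick count, following the RFC 4122 layout (time_low, time_mid, time_hi_and_version, clock sequence, node). The helper v1_from_ticks and its default clock sequence and node are assumptions chosen to match the example values in this thread. In this sketch it is the second group (time_mid) that advances when the low 32 bits wrap.

```python
import uuid


def v1_from_ticks(ticks, clock_seq=0x07D0, node=0x0242AC130003):
    """Hypothetical helper: build a version-1 UUID from a 60-bit count of 100 ns ticks."""
    time_low = ticks & 0xFFFFFFFF                          # written first
    time_mid = (ticks >> 32) & 0xFFFF                      # written second
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000    # written third, with version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


t0 = 0x1EADB4F7DB3F2BA                  # tick count taken from the example UUID
now = v1_from_ticks(t0)
wrapped = v1_from_ticks(t0 + 2 ** 32)   # about 429.5 seconds later
later = v1_from_ticks(t0 + 2 ** 32 + 3_000_000_000)  # a further 300 seconds on

print(now)      # 7db3f2ba-db4f-11ea-87d0-0242ac130003
print(wrapped)  # 7db3f2ba-db50-11ea-87d0-0242ac130003
print(later)    # 308450ba-db51-11ea-87d0-0242ac130003
print(later.time > now.time, later < now)  # True True: newer in time, yet sorts earlier
```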

John