Thread: UUID or auto-increment
Hi,
For a web application, should I use a UUID or an auto-increment column?
1- If two users insert a row at the same time, does it work?
2- Does the database give the same ID to both users, or does it execute one of them first? (I mean, can an ID conflict happen?)
Thanks.
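Both can handle concurrent writes. Auto-increment is nothing but a serial or sequence column, and a sequence hands out a distinct value to each concurrent request. A value that has been handed out is not reused even if the insert that took it rolls back, which is why you may sometimes have gaps.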
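A minimal sketch of that behaviour, in Python rather than in the database itself. ToySequence, concurrent_ids and insert_with_rollback are hypothetical names made up for this illustration. The class is only a locked counter standing in for a sequence, under the assumption that a value, once handed out, is never taken back.

```python
import threading


class ToySequence:
    """Toy stand-in for a database sequence: a locked counter that never goes back."""

    def __init__(self, start=1):
        self._lock = threading.Lock()
        self._next = start

    def nextval(self):
        # The lock guarantees that two callers can never receive the same value.
        with self._lock:
            value = self._next
            self._next += 1
            return value


def concurrent_ids(seq, workers=8, per_worker=250):
    """Draw IDs from several threads at once and return all of them."""
    results = []
    results_lock = threading.Lock()

    def worker():
        mine = [seq.nextval() for _ in range(per_worker)]
        with results_lock:
            results.extend(mine)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def insert_with_rollback(seq, rows, fail_on):
    """Every row draws an ID, but rows listed in fail_on 'roll back' and are not stored."""
    stored = []
    for row in rows:
        new_id = seq.nextval()  # the value is consumed even if the row is discarded
        if row not in fail_on:
            stored.append((new_id, row))
    return stored


ids = concurrent_ids(ToySequence())
print("drawn:", len(ids), "distinct:", len(set(ids)))
print(insert_with_rollback(ToySequence(), ["a", "b", "c"], fail_on={"b"}))
```

In this toy model the concurrent draws never collide, and the discarded row leaves a hole between the stored IDs.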
UUID is not only unique within one database, it is also unique across space: two different databases can generate UUIDs at the same time and they will still be unique. That helps if you are consolidating different databases into one big data mart, because the rows can all go into the same table without conflict. With a sequence or serial column that would be a problem.
Finally, UUID results in write amplification in the WAL. Keep that in mind if your app does a lot of writes.
UUIDs are also random and typically not correlated with time. On a very large table where you primarily access recent data, index lookups will therefore pull random pages into memory instead of mostly pages at the end of the index.
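The locality argument can be illustrated with a rough model in Python. This is only a sketch and not how PostgreSQL lays out a B-tree. It assumes the index is a sorted list cut into fixed-size "pages", and PAGE_SIZE, leaf_pages_touched and seeded_uuid4s are names invented for the example. It counts how many distinct pages 100 new keys land on when the keys are sequential integers versus random (V4-style) UUIDs.

```python
import bisect
import random
import uuid

PAGE_SIZE = 100  # keys per simulated leaf page; a made-up figure for illustration


def leaf_pages_touched(existing_sorted, new_keys, page_size=PAGE_SIZE):
    """Count the distinct simulated leaf pages that the new keys would land on."""
    pages = set()
    for key in new_keys:
        position = bisect.bisect_left(existing_sorted, key)
        pages.add(position // page_size)
    return len(pages)


def seeded_uuid4s(count, seed):
    """Random version-4 UUIDs from a seeded generator, so runs are repeatable."""
    rng = random.Random(seed)
    return [uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(count)]


# Sequential keys: every new key is larger than all existing ones.
serial_existing = list(range(1, 100_001))
serial_new = list(range(100_001, 100_101))
serial_pages = leaf_pages_touched(serial_existing, serial_new)

# Random UUIDs: new keys fall anywhere in the existing key range.
uuid_existing = sorted(seeded_uuid4s(100_000, seed=1))
uuid_new = seeded_uuid4s(100, seed=2)
uuid_pages = leaf_pages_touched(uuid_existing, uuid_new)

print("serial keys touch", serial_pages, "page(s)")
print("uuid4 keys touch", uuid_pages, "page(s)")
```

Under these assumptions the sequential keys all land on the last page, while the random keys are spread over most of the index.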
On 8/10/20 11:38 AM, Ravi Krishna wrote:
[snip]
> Finally UUID results in write amplification in wal logs. Keep that in mind if your app does lot of writes.
Because UUID is 32 bytes, while SERIAL is 4 bytes?
--
Angular momentum makes the world go 'round.
Greetings,
* Ron (ronljohnsonjr@gmail.com) wrote:
> On 8/10/20 11:38 AM, Ravi Krishna wrote:
> > Finally UUID results in write amplification in wal logs. Keep that in mind
> > if your app does lot of writes.
>
> Because UUID is 32 bytes, while SERIAL is 4 bytes?
and because it's random and so will touch a lot more pages when you're using it...
Avoid UUIDs if you can; map them to something more sensible internally if you have to deal with them.
Thanks,
Stephen
On 8/10/20 9:51 AM, Ron wrote:
> On 8/10/20 11:38 AM, Ravi Krishna wrote:
> [snip]
>> Finally UUID results in write amplification in wal logs. Keep that in
>> mind if your app does lot of writes.
>
> Because UUID is 32 bytes, while SERIAL is 4 bytes?
You mean 32 digits for 128 bits?:
https://www.postgresql.org/docs/12/datatype-uuid.html
And there is BIGSERIAL which is 8 bytes.
--
Adrian Klaver
adrian.klaver@aklaver.com
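The digits-versus-bytes distinction is easy to check with Python's standard uuid module. This is just an illustration of the sizes, not a measurement of PostgreSQL storage. The struct format codes stand in for 4-byte and 8-byte integers, and the sample UUID is one quoted later in this thread.

```python
import struct
import uuid

u = uuid.UUID("7db3f2ba-db4f-11ea-87d0-0242ac130003")

hex_digits = len(u.hex)        # text form without the hyphens
raw_bytes = len(u.bytes)       # binary form: 128 bits
int4_bytes = struct.calcsize("<i")  # size of a 4-byte integer, as SERIAL uses
int8_bytes = struct.calcsize("<q")  # size of an 8-byte integer, as BIGSERIAL uses

print("hex digits:", hex_digits)
print("bytes:", raw_bytes)
print("4-byte int:", int4_bytes, "8-byte int:", int8_bytes)
```

So the binary UUID is 4 times the size of a 4-byte integer key and twice the size of an 8-byte one, rather than 8 times.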
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145
On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
> * Ron (ronljohnsonjr@gmail.com) wrote:
>> On 8/10/20 11:38 AM, Ravi Krishna wrote:
>>> Finally UUID results in write amplification in wal logs. Keep that in mind
>>> if your app does lot of writes.
>> Because UUID is 32 bytes, while SERIAL is 4 bytes?
> and because it's random and so will touch a lot more pages when you're
> using it...
I would point out, however, that using a V1 UUID rather than a V4 can help with this, as it is sequential, not random (based on MAC address and timestamp + random). There is a trade-off, of course: with V1, if two writes occur on the same computer at the exact same millisecond, there is a very, very small chance of generating conflicting UUIDs (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/). As there is still a random component, however, this seems quite unlikely.
> Avoid UUIDs if you can- map them to something more sensible internally
> if you have to deal with them.
> Thanks,
> Stephen
Greetings,
* Israel Brewster (ijbrewster@alaska.edu) wrote:
> > On Aug 10, 2020, at 8:53 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > * Ron (ronljohnsonjr@gmail.com) wrote:
> >> On 8/10/20 11:38 AM, Ravi Krishna wrote:
> >>> Finally UUID results in write amplification in wal logs. Keep that in mind
> >>> if your app does lot of writes.
> >>
> >> Because UUID is 32 bytes, while SERIAL is 4 bytes?
> >
> > and because it's random and so will touch a lot more pages when you're
> > using it...
>
> I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random). There is a trade off, of course, as with V1 if two writes occur on the same computer at the exact same millisecond, there is a very very small chance of generating conflicting UUIDs (see https://www.sohamkamani.com/blog/2016/10/05/uuid1-vs-uuid4/). As there is still a random component, however, this seems quite unlikely.
Sure, that helps, but it's still not great, and they're still much, much larger than you'd ever need for an identifier inside of a given system, so best to map it to something reasonable and avoid them as much as possible.
Thanks,
Stephen
> I would point out, however, that using a V1 UUID rather than a V4 can help with this as it is sequential, not random (based on MAC address and timestamp + random)
I wanted to make this point: using sequential UUIDs helped me reduce write amplification quite a bit in my application. I didn't use V1; instead I used https://pgxn.org/dist/sequential_uuids/
Reduces the pain caused by UUIDs a ton IMO.
-Adam
On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
> I would point out, however, that using a V1 UUID rather than a V4 can
> help with this as it is sequential, not random (based on MAC address
> and timestamp + random).
If I read the specs correctly, a V1 UUID will roll over every 429 seconds. I think that as far as index locality is concerned, this is essentially random for most applications.
hp
--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
> On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>
> On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
>> I would point out, however, that using a V1 UUID rather than a V4 can
>> help with this as it is sequential, not random (based on MAC address
>> and timestamp + random).
>
> If I read the specs correctly, a V1 UUID will roll over every 429
> seconds. I think that as far as index locality is concerned, this is
> essentially random for most applications.
According to wikipedia, the time value in a V1 UUID is a 60-bit number, and will roll over "around 3400AD", depending on the algorithm used, or 5236AD if the software treats the timestamp as unsigned. This timestamp is extended by a 13 or 14-bit "uniqifying" clock sequence to handle cases of overlap, and then the 48bit MAC address (constant, so no rollover there) is appended. So perhaps that 13 or 14 bit "uniqifying" sequence will roll over every 429 seconds, however the timestamp *as a whole* won't roll over for quite a while yet, thereby guaranteeing that the UUIDs will be sequential, not random (since, last I checked, time was sequential).
---
Israel Brewster
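Both figures in this exchange can be reproduced with a little arithmetic. The sketch below assumes the RFC 4122 definition of the V1 timestamp: a count of 100-nanosecond intervals since 15 October 1582. The helper names and the fractional-year approximation of the epoch are invented for the example. The 429 seconds falls out when only the lowest 32 bits of the timestamp are considered, and the far-future years when all 59 or 60 bits are.

```python
TICKS_PER_SECOND = 10_000_000          # one tick = 100 nanoseconds
SECONDS_PER_YEAR = 365.2425 * 86_400   # mean Gregorian year
EPOCH_YEAR = 1582 + 287 / 365          # 15 October 1582 as an approximate fractional year


def wrap_seconds(bits):
    """Seconds until a counter of the given width, counting 100 ns ticks, wraps."""
    return 2 ** bits / TICKS_PER_SECOND


def wrap_year(bits):
    """Approximate calendar year in which such a counter, started at the epoch, wraps."""
    return EPOCH_YEAR + wrap_seconds(bits) / SECONDS_PER_YEAR


time_low_wrap = wrap_seconds(32)   # the lowest 32 bits of the timestamp
unsigned_wrap_year = wrap_year(60)  # full 60-bit timestamp, treated as unsigned
signed_wrap_year = wrap_year(59)    # same field if the top bit is treated as a sign

print(f"low 32 bits wrap every {time_low_wrap:.1f} seconds")
print(f"60-bit timestamp wraps around year {unsigned_wrap_year:.0f}")
print(f"59-bit timestamp wraps around year {signed_wrap_year:.0f}")
```

Which of the two numbers matters for index locality depends on which part of the timestamp comes first in the stored value.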
On 8/10/20 10:53 AM, Stephen Frost wrote:
> Greetings,
>
> * Ron (ronljohnsonjr@gmail.com) wrote:
>> On 8/10/20 11:38 AM, Ravi Krishna wrote:
>>> Finally UUID results in write amplification in wal logs. Keep that in mind
>>> if your app does lot of writes.
>>
>> Because UUID is 32 bytes, while SERIAL is 4 bytes?
>
> and because it's random and so will touch a lot more pages when you're
> using it...
>
> Avoid UUIDs if you can- map them to something more sensible internally
> if you have to deal with them.
>
> Thanks,
>
> Stephen
>
I suspect the increased storage cost is more related to the size of the record than to the ratio of the data types. What says that two consecutively saved records ought to be stored on the same page, or will likely be sought with the same search criterion? Serial ids put a time order (loosely) on the data, which may be completely artificial.
On Mon, Aug 10, 2020 at 1:45 PM Israel Brewster <ijbrewster@alaska.edu> wrote:
> > On Aug 10, 2020, at 12:06 PM, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
> >
> > On 2020-08-10 09:10:00 -0800, Israel Brewster wrote:
> >> I would point out, however, that using a V1 UUID rather than a V4 can
> >> help with this as it is sequential, not random (based on MAC address
> >> and timestamp + random).
> >
> > If I read the specs correctly, a V1 UUID will roll over every 429
> > seconds. I think that as far as index locality is concerned, this is
> > essentially random for most applications.
>
> According to wikipedia, the time value in a V1 UUID is a 60-bit number, and will roll over "around 3400AD", depending on the algorithm used, or 5236AD if the software treats the timestamp as unsigned. This timestamp is extended by a 13 or 14-bit "uniqifying" clock sequence to handle cases of overlap, and then the 48bit MAC address (constant, so no rollover there) is appended. So perhaps that 13 or 14 bit "uniqifying" sequence will roll over every 429 seconds, however the timestamp *as a whole* won't roll over for quite a while yet, thereby guaranteeing that the UUIDs will be sequential, not random (since, last I checked, time was sequential).
Except the time portion of a V1 UUID is not written high to low but rather low then middle then high which means that the time portion is not expressed in a sequential format and the left 8 chars of a V1 UUID "rollover" every 429 seconds or so.
For example a V1 UUID right around now looks like
7db3f2ba-db4f-11ea-87d0-0242ac130003
Less than a second later
7db534cc-db4f-11ea-87d0-0242ac130003
So that looks sequential, but in roughly 429 seconds the left 8 chars will have wrapped all the way around and it will look like
7db3f2ba-db50-11ea-87d0-0242ac130003
More importantly, roughly another 300 seconds after that it would be something like
308450ba-db51-11ea-87d0-0242ac130003
Note the move from db4f to db50 and db51 in the second group while the left 8 chars (4 bytes) "roll over".
That's not quite sequential in terms of indexing.
John
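The field order described above can be reproduced with Python's standard uuid module. The sketch assumes the RFC 4122 layout: the low 32 bits of the timestamp come first, then the middle 16, then the high 12 together with the version. The helper v1_from_timestamp is a made-up name. Under that layout, an overflow of the first group carries into the second group, while the clock sequence and node stay unchanged as long as the clock is not set backwards.

```python
import uuid


def v1_from_timestamp(ts, clock_seq, node):
    """Pack a 60-bit timestamp (100 ns ticks) into the RFC 4122 version-1 field order."""
    time_low = ts & 0xFFFFFFFF                          # lowest 32 bits come first
    time_mid = (ts >> 32) & 0xFFFF                      # next 16 bits
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000    # top 12 bits plus version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Start from the UUID quoted in the message above and reuse its clock sequence and node.
base = uuid.UUID("7db3f2ba-db4f-11ea-87d0-0242ac130003")
rebuilt = v1_from_timestamp(base.time, base.clock_seq, base.node)

# One full wrap of the low 32 bits later (about 429.5 seconds).
one_wrap_later = v1_from_timestamp(base.time + 2 ** 32, base.clock_seq, base.node)

# A further 300 seconds (3,000,000,000 ticks) after that.
later_still = v1_from_timestamp(base.time + 2 ** 32 + 3_000_000_000,
                                base.clock_seq, base.node)

for u in (base, one_wrap_later, later_still):
    print(u, "timestamp ticks:", u.time)

# Python compares UUIDs by their 128-bit value, which here stands in for index order.
print("generated last, but sorts before base:", later_still < base)
```

The last UUID is the newest by timestamp, yet it sorts before the first one, which is the locality problem in miniature.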