Re: UUID v1 optimizations... - Mailing list pgsql-performance
From | Ancoron Luciferis |
---|---|
Subject | Re: UUID v1 optimizations... |
Date | |
Msg-id | c27373f9-d07c-774a-3e53-1202d1ab1345@googlemail.com Whole thread Raw |
In response to | Re: UUID v1 optimizations... (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-performance |
On 25/05/2019 23:54, Tom Lane wrote: > Ancoron Luciferis <ancoron.luciferis@googlemail.com> writes: >> On 25/05/2019 16:57, Tom Lane wrote: >>> (4) it in fact *wouldn't* do anything useful, because we'd still have >>> to sort UUIDs in the same order as today, meaning that btree index behavior >>> would remain the same as before. Plus UUID comparison would get a lot >>> more complicated and slower than it is now. > >> I get your first sentence, but not your second. I know that when >> changing the internal byte order we'd have to completed re-compute >> everything on-disk (from table to index data), but why would the sorting >> in the index have to be the same? > > Because we aren't going to change the existing sort order of UUIDs. > We have no idea what applications might be dependent on that. > > As Vitalii correctly pointed out, your beef is not with the physical > storage of UUIDs anyway: you just wish they'd sort differently, since > that is what determines the behavior of a btree index. But we aren't > going to change the sort ordering because that's an even bigger > compatibility break than changing the physical storage; it'd affect > application-visible semantics. > > What you might want to think about is creating a function that maps > UUIDs into an ordering that makes sense to you, and then creating > a unique index over that function instead of the raw UUIDs. That > would give the results you want without having to negotiate with the > rest of the world about whether it's okay to change the semantics > of type uuid. > > regards, tom lane > I understand. Point taken, I really didn't think about someone could depend on an index order of a (pretty random) UUID. The whole point of me starting this discussion was about performance in multiple areas, but INSERT performance was really becoming an issue for us apart from the index bloat, which was way higher than just the 30% at several occasions (including out-of-disk-space in the early days), apart from the fact that the index was regularly dismissed due to it not being in memory. In that sense, just creating additional indexes with functions doesn't really solve the core issues that we had. Not to mention the performance of VACUUM, among other things. So, even we currently "solved" a lot of these issues at the application level, we now have UUID's that look like v1 UUID's but in fact will not be readable (in the representation as returned by PostgreSQL) by any other application that doesn't know our specific implementation. This forces us to hack other tools written in other languages that would otherwise understand and handle regular v1 UUID's as well. I should add that the tests I have made where all running on dedicated SSD's, one for the table data, one for the indexes and one for the WAL. If those where running against the same disks the difference would probably be much higher during writes. I'll think about creating an extension to provide a custom data type instead. So nobody would be at risk and anyone would decide explicitly for it with all consequences. Thank you for your time and precious input. :) Cheers, Ancoron
pgsql-performance by date: