Thread: Sequential UUID Generation
Hi,
We have migrated from Oracle to Postgres, there because of the replication requirements we used UUID columns.
I did a POC(in postgres) with sequential UUID against Non sequential which has shown lot of different in space utilization and index size. Sql server has "newsequentialid" which generates sequential UUID. I need similar functionality here.
I want to create a function which generates a sequential UUIDs, Any suggestions or support would be much appreciated.
Thanks,
Uday
I don't think PostgreSQL has anything like that at the moment. It would not be difficult to tweak the UUID generator to generate sequential (or monotonic) values, the tricky part seems to be durability requirements. One idea would be to simply store the value in (shared) memory, but that would mean losing the state on restart/crash, so there would need to be some sort of protection against generating duplicate values (say, using postmaster timestamp as the first 64 bits of the UUID). Another idea is to piggy-back this on bigint sequence somehow - split the 128bit range into 64+64, use the sequence value for the first 64b and pick the other half by random. That would guarantee both uniqueness, monotonicity and durability. And it would also be fairly random, making it difficult to guess UUIDs. regards On 10/29/2018 04:06 PM, Uday Bhaskar V wrote: > Hi, > > We have migrated from Oracle to Postgres, there because of the > replication requirements we used UUID columns. > I did a POC(in postgres) with sequential UUID against Non sequential > which has shown lot of different in space utilization and index size. > Sql server has "newsequentialid" which generates sequential UUID. I > need similar functionality here. > I want to create a function which generates a sequential UUIDs, Any > suggestions or support would be much appreciated. > > Thanks, > Uday -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Thanks Tomas! I will try.
Regards,
Uday
On Tue, Oct 30, 2018 at 6:43 PM Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
I don't think PostgreSQL has anything like that at the moment. It would
not be difficult to tweak the UUID generator to generate sequential (or
monotonic) values, the tricky part seems to be durability requirements.
One idea would be to simply store the value in (shared) memory, but that
would mean losing the state on restart/crash, so there would need to be
some sort of protection against generating duplicate values (say, using
postmaster timestamp as the first 64 bits of the UUID).
Another idea is to piggy-back this on bigint sequence somehow - split
the 128bit range into 64+64, use the sequence value for the first 64b
and pick the other half by random. That would guarantee both uniqueness,
monotonicity and durability. And it would also be fairly random, making
it difficult to guess UUIDs.
regards
On 10/29/2018 04:06 PM, Uday Bhaskar V wrote:
> Hi,
>
> We have migrated from Oracle to Postgres, there because of the
> replication requirements we used UUID columns.
> I did a POC(in postgres) with sequential UUID against Non sequential
> which has shown lot of different in space utilization and index size.
> Sql server has "newsequentialid" which generates sequential UUID. I
> need similar functionality here.
> I want to create a function which generates a sequential UUIDs, Any
> suggestions or support would be much appreciated.
>
> Thanks,
> Uday
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
I would be very interested in a extension which generated sequential uuids. My entire db is key'd with uuids, and I have measured some index bloat related specifically to random uuid generation.
Thanks for bringing this up.
I came across a project for time based UUID ("tuid") a little while back: https://github.com/tanglebones/pg_tuid
I haven't used in production but skimmed through the code a bit out of technical curiosity. It handles some of the expected edge cases for backwards clock drift and concurrent generation.
The repo includes a PG extension and sample app code for generating tuids in a couple languages as well as a pure-SQL one (though that one uses random() rather than get_random_bytes() so I'd consider it more of an proof of concept).
I tried below function as which can be used as default to column. But every time we need to created 2 sequences, 1st one takes care of the first 8 bytes and 2nd takes care of the 2nd part of the UUID. I have not tested index and space utilization. I have to examine this. This might not be completely unique in the nature. but still trying for the best.
CREATE OR REPLACE FUNCTION public.fnu_generate_sequential_uuid(
sequence1 text,
sequence2 text)
RETURNS uuid
LANGUAGE 'plpgsql'
COST 100.0
VOLATILE NOT LEAKPROOF
AS $function$
DECLARE
sequenceUUIDPart1 text;
randomUUIDPart2 text;
counter integer:=0;
significantByte integer:=0;
startIndex integer:=0;
endIndex integer:=0;
BEGIN
-- Get random UUID
randomUUIDPart2 := replace(( uuid_generate_v4 () :: text),'-','');
-- verify first sequence reached max count.
IF to_hex(currval(sequence1)) :: text = '7fffffffffffffff' THEN
startIndex:=0;
endIndex:=7;
-- convert sequence into 32 bit string
sequenceUUIDPart1 = rpad(to_hex(nextval(sequence2))::text, 32, '0');
ELSE
startIndex:=8;
endIndex:=15;
-- convert sequence into 32 bit string
sequenceUUIDPart1 = rpad(to_hex(nextval(sequence1))::text, 32, '0');
END IF;
RAISE NOTICE 'current Guid: %', sequenceUUIDPart1;
-- loop through the 8th byte to 16th byte, till first sequence max .
-- loop through the 0 to 7 the byte till second sequence end.
FOR counter IN startIndex..endIndex LOOP
select get_byte(decode(sequenceUUIDPart1::text,'hex'), counter) into significantByte;
-- fill last 8 bytes with the generated random UUID values.
sequenceUUIDPart1 := encode(set_byte(decode(sequenceUUIDPart1 ::text,'hex') :: bytea ,counter, significantByte ) :: bytea, 'hex') :: text;
RAISE NOTICE 'current Guid: %', sequenceUUIDPart1;
END LOOP;
return sequenceUUIDPart1 :: UUID;
EXCEPTION
WHEN OTHERS
THEN
RAISE EXCEPTION 'An error was encountered in create_engagement_data_get_aud_area_ent_list - % -ERROR- %', sqlstate, sqlerrm;
END
$function$;
On Wed, Oct 31, 2018 at 1:51 AM Sehrope Sarkuni <sehrope@jackdb.com> wrote:
I came across a project for time based UUID ("tuid") a little while back: https://github.com/tanglebones/pg_tuidI haven't used in production but skimmed through the code a bit out of technical curiosity. It handles some of the expected edge cases for backwards clock drift and concurrent generation.The repo includes a PG extension and sample app code for generating tuids in a couple languages as well as a pure-SQL one (though that one uses random() rather than get_random_bytes() so I'd consider it more of an proof of concept).
On 11/19/18 2:08 PM, Uday Bhaskar V wrote: > I tried below function as which can be used as default to column. But > every time we need to created 2 sequences, 1st one takes care of the > first 8 bytes and 2nd takes care of the 2nd part of the UUID. I have not > tested index and space utilization. I have to examine this. This might > not be completely unique in the nature. but still trying for the best. > I got a bit bored a couple of days ago, so I wrote an extension generating UUIDs based on either a sequence or timestamp. See https://blog.2ndquadrant.com/sequential-uuid-generators/ for an explanation and some simple test results. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services