Re: UUIDs & Clustered Indexes - Mailing list pgsql-general

From Mike Sofen
Subject Re: UUIDs & Clustered Indexes
Date
Msg-id 02de01d202dd$3b7d2190$b27764b0$@runbox.com
Whole thread Raw
In response to Re: UUIDs & Clustered Indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general

From: Tom Lane  Sent: Tuesday, August 30, 2016 7:16 AM

Luke Gordon <gordysc@gmail.com> writes:

> However, according to a message on this mailing list, Postgres doesn't

> have clustered indexes:

> "But Postgres doesn't _have_ clustered indexes, so that article

> doesn't apply at all. The other authors appear to have missed this important point."

> https://www.postgresql.org/message-id/56798352.7060902%40uchicago.edu

 

> But, doing a quick check, it appears Postgres does indeed have a

> mechanism for a clustered index:

> https://www.postgresql.org/docs/9.5/static/sql-cluster.html

 

CLUSTER just does a one-time sort to put the table into index order.

There is no mechanism that would cause subsequent insertions of new keys to respect that ordering, so it's pretty much irrelevant to the argument about whether new UUID keys need to be generated in some ordered fashion.

 

Do you actually *need* UUID keys, and if so why?  A plain old bigint column is smaller, cheaper to index, and the natural mechanism for generating it (ie a sequence) will tend to preserve ordering for free.

 

                                             regards, tom lane

 

I agree with Tom for a “normal” application – I would always use bigints (bigserial) as a PK column.  The app I currently working on is a high security web app for which the app coders require guids for all identifiers flowing around the system.  So in this scenario, I’m using BOTH bigserials as the PK and uuids as AKs in the core tables.  I reference the bigints for all joins and (have to) use the uuids for the filters.  It’s been working ok so far, lookup performance on a table with a few million rows, using the uuid (indexed) is instantaneous.  I’ll soon have a 100 million+ rows loaded into a single table and know a bit more.

 

The uuids are also design insurance for me in case I need to shard, since I’ll need/want that uniqueness across servers.

 

Mike Sofen

pgsql-general by date:

Previous
From: Francisco Olarte
Date:
Subject: Re: UUIDs & Clustered Indexes
Next
From: Nicolas Grilly
Date:
Subject: Clustered index to preserve data locality in a multitenant application?