Thread: Primary keys and composite unique keys(basic question)

Primary keys and composite unique keys(basic question)

From
Mohan Radhakrishnan
Date:
Hello,
             We have UUIDs in our tables which are primary keys. But in some cases
we also identify a composite unique key apart from the primary key.

My assumption is that there should be a unique key index created by us using the composite key. And when we fetch using this composite key instead of the primary key we have a performance boost due to the index.

Is this right ? Are there more details I should read to understand this better ? Please point. If I should use the query planner to look at the statistics I will.
.
Thanks,
Mohan

Re: Primary keys and composite unique keys(basic question)

From
Tom Lane
Date:
Mohan Radhakrishnan <radhakrishnan.mohan@gmail.com> writes:
>              We have UUIDs in our tables which are primary keys. But in
> some cases
> we also identify a composite unique key apart from the primary key.

> My assumption is that there should be a unique key index created by us
> using the composite key. And when we fetch using this composite key instead
> of the primary key we have a performance boost due to the index.

You haven't provided a lot of detail, but use-a-UUID-as-a-primary-key
is often an antipattern.  The UUIDs are quasi-random, meaning there's
no locality of reference in the primary key index, resulting in
inefficiency in searches and insertions.  If the composite key you
mention has some actual relationship to your application's usage
patterns, it could be winning as a result of better locality of
access to that index.

            regards, tom lane



Re: Primary keys and composite unique keys(basic question)

From
Mohan Radhakrishnan
Date:
I will cover the UUIDs first. They are indispensable to us.
1. The data is distributed over regions So we need the row to be unique.
2. This distributed data is sent to services as events. That is the application architecture.

But we don't search using UUIDs always. Only when data from another distributed service
is received we need them and in such cases we have to join using them.

But for local data we can identify another composite unique key. Does PostgreSql
create a unique index for us ? What about a FK that references this composite
unique key ? Does it create a FK index ?


Thank you.

On Wed, Mar 31, 2021 at 7:42 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mohan Radhakrishnan <radhakrishnan.mohan@gmail.com> writes:
>              We have UUIDs in our tables which are primary keys. But in
> some cases
> we also identify a composite unique key apart from the primary key.

> My assumption is that there should be a unique key index created by us
> using the composite key. And when we fetch using this composite key instead
> of the primary key we have a performance boost due to the index.

You haven't provided a lot of detail, but use-a-UUID-as-a-primary-key
is often an antipattern.  The UUIDs are quasi-random, meaning there's
no locality of reference in the primary key index, resulting in
inefficiency in searches and insertions.  If the composite key you
mention has some actual relationship to your application's usage
patterns, it could be winning as a result of better locality of
access to that index.

                        regards, tom lane

Re: Primary keys and composite unique keys(basic question)

From
Michael Lewis
Date:
Etiquette on these lists is to reply in line or below the relevant portion, not top-post with full quoting like default gmail behavior.

On Wed, Mar 31, 2021 at 9:18 AM Mohan Radhakrishnan <radhakrishnan.mohan@gmail.com> wrote:
But we don't search using UUIDs always. Only when data from another distributed service
is received we need them and in such cases we have to join using them.

I haven't used them so I don't recall exactly, but I believe there is a type of UUID generation which has some leading correlation to time which would help with reducing the random I/O issue that Tom Lane mentioned. A quick search of the archive may lead you to that, or someone else may chime in with the name I expect.
 
But for local data we can identify another composite unique key. Does PostgreSql
create a unique index for us ? What about a FK that references this composite
unique key ? Does it create a FK index ?

It is up to you to create whichever fkeys and indexes you require.

Re: Primary keys and composite unique keys(basic question)

From
Merlin Moncure
Date:
On Wed, Mar 31, 2021 at 3:36 AM Mohan Radhakrishnan
<radhakrishnan.mohan@gmail.com> wrote:
>
> Hello,
>              We have UUIDs in our tables which are primary keys. But in some cases
> we also identify a composite unique key apart from the primary key.
>
> My assumption is that there should be a unique key index created by us using the composite key. And when we fetch
usingthis composite key instead of the primary key we have a performance boost due to the index.
 
>
> Is this right ? Are there more details I should read to understand this better ? Please point. If I should use the
queryplanner to look at the statistics I will.
 

This is one of the great debates in computer science and it is not
settled.  There are various tradeoffs around using a composite key
derived from the data (aka natural key) vs generated identifiers. It's
a complex topic with many facets: performance, organization,
validation, and correctness are all relevant considerations.  I would
never use UUIDS for keys though.

merlin



Re: Primary keys and composite unique keys(basic question)

From
Rob Sargent
Date:
On 4/1/21 8:28 PM, Merlin Moncure wrote:
>
> This is one of the great debates in computer science and it is not
> settled.  There are various tradeoffs around using a composite key
> derived from the data (aka natural key) vs generated identifiers. It's
> a complex topic with many facets: performance, organization,
> validation, and correctness are all relevant considerations.  I would
> never use UUIDS for keys though.
>
> merlin
>
>
And, pray tell, for what exactly would you use universally unique 
identifiers.



Re: Primary keys and composite unique keys(basic question)

From
Hemil Ruparel
Date:
I used uuid4 for customer ids because i needed to interface with payment providers. Is that wrong? All other places except transaction ids, i have used serial ints

On Fri 2 Apr, 2021, 8:56 AM Rob Sargent, <robjsargent@gmail.com> wrote:
On 4/1/21 8:28 PM, Merlin Moncure wrote:
>
> This is one of the great debates in computer science and it is not
> settled.  There are various tradeoffs around using a composite key
> derived from the data (aka natural key) vs generated identifiers. It's
> a complex topic with many facets: performance, organization,
> validation, and correctness are all relevant considerations.  I would
> never use UUIDS for keys though.
>
> merlin
>
>
And, pray tell, for what exactly would you use universally unique
identifiers.


Re: Primary keys and composite unique keys(basic question)

From
Bertrand Mamasam
Date:
On Fri, Apr 2, 2021 at 8:57 AM Hemil Ruparel <hemilruparel2002@gmail.com> wrote:
I used uuid4 for customer ids because i needed to interface with payment providers. Is that wrong? All other places except transaction ids, i have used serial ints

On Fri 2 Apr, 2021, 8:56 AM Rob Sargent, <robjsargent@gmail.com> wrote:
On 4/1/21 8:28 PM, Merlin Moncure wrote:
>
> This is one of the great debates in computer science and it is not
> settled.  There are various tradeoffs around using a composite key
> derived from the data (aka natural key) vs generated identifiers. It's
> a complex topic with many facets: performance, organization,
> validation, and correctness are all relevant considerations.  I would
> never use UUIDS for keys though.
>
> merlin
>
>
And, pray tell, for what exactly would you use universally unique
identifiers.

ULID perform better than UUID with btree indexes. The first part is based on a timestamp, so if you don't mind exposing this information, they are a better choice. There is an implementation for postgres here: https://github.com/geckoboard/pgulid
Maybe there are others.

Re: Primary keys and composite unique keys(basic question)

From
Laurenz Albe
Date:
On Thu, 2021-04-01 at 21:28 -0500, Merlin Moncure wrote:
> I would never use UUIDS for keys though.

That makes me curious for your reasons.

I see the following disadvantages:

- A UUID requires twice as much storage space as a bigint.

- B-tree indexes are space optimized for inserting at the
  rightmost leaf page, but UUIDs are random.

- UUIDs are more expensive to generate.

On the other hand, many processes trying to insert into
the same index page might lead to contention.

Is there anything I have missed?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




Re: Primary keys and composite unique keys(basic question)

From
Merlin Moncure
Date:
On Thu, Apr 1, 2021 at 10:26 PM Rob Sargent <robjsargent@gmail.com> wrote:
>
> On 4/1/21 8:28 PM, Merlin Moncure wrote:
> >
> > This is one of the great debates in computer science and it is not
> > settled.  There are various tradeoffs around using a composite key
> > derived from the data (aka natural key) vs generated identifiers. It's
> > a complex topic with many facets: performance, organization,
> > validation, and correctness are all relevant considerations.  I would
> > never use UUIDS for keys though.
> >
> > merlin
> >
> >
> And, pray tell, for what exactly would you use universally unique
> identifiers.

I don't disagree that UUID are an ok choice in that scenario.  I'll
tell you what though, that scenario comes up fairly rarely.  However,
there are a couple of alternatives if you're curious.

*) Generate ids from a generator service.  This pattern is fairly
common. It has some downsides (slower, more complicated inserts
mainly) but works well in other ways.  You can mitigate the
performance downsides by allocated identifiers in blocks.

*) Use sequences, but with  a sequence id  added as a composite or
maksed into the integer. This works pretty well in practice.

merlin



Re: Primary keys and composite unique keys(basic question)

From
Merlin Moncure
Date:
On Fri, Apr 2, 2021 at 3:40 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Thu, 2021-04-01 at 21:28 -0500, Merlin Moncure wrote:
> > I would never use UUIDS for keys though.
>
> That makes me curious for your reasons.
>
> I see the following disadvantages:
>
> - A UUID requires twice as much storage space as a bigint.
>
> - B-tree indexes are space optimized for inserting at the
>   rightmost leaf page, but UUIDs are random.
>
> - UUIDs are more expensive to generate.
>
> On the other hand, many processes trying to insert into
> the same index page might lead to contention.
>
> Is there anything I have missed?

It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

Performance-wise, UUIDS are absolutely horrible for data at scale as
Tom rightly points out.  Everything is randomized, just awful.  There
are some alternate implementations of UUID that mitigate this but I've
never seen them used in the wild in actual code.

merlin



Re: Primary keys and composite unique keys(basic question)

From
Rob Sargent
Date:



It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

Performance-wise, UUIDS are absolutely horrible for data at scale as
Tom rightly points out.  Everything is randomized, just awful.  There
are some alternate implementations of UUID that mitigate this but I've
never seen them used in the wild in actual code.


That b-tree’s have been optimized to handle serial ints might be a considered a reaction to that popular (and distasteful) choice.  Perhaps there should be a ’non-optimized’ option.


Re: Primary keys and composite unique keys(basic question)

From
Merlin Moncure
Date:
On Mon, Apr 5, 2021 at 9:37 PM Rob Sargent <robjsargent@gmail.com> wrote:
>
> It's a small thing, but UUIDs are absolutely not memorizable by
> humans; they have zero semantic value.  Sequential numeric identifiers
> are generally easier to transpose and the value gives some clues to
> its age (of course, in security contexts this can be a downside).
>
> I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just
happenedto perfectly match the wrong thing. 
>
> Performance-wise, UUIDS are absolutely horrible for data at scale as
> Tom rightly points out.  Everything is randomized, just awful.  There
> are some alternate implementations of UUID that mitigate this but I've
> never seen them used in the wild in actual code.
>
>
> That b-tree’s have been optimized to handle serial ints might be a considered a reaction to that popular (and
distasteful)choice.  Perhaps there should be a ’non-optimized’ option. 

It's not just the BTree, but the heap as well.   For large tables, you
are pretty much guaranteed to read a page for each record you want to
load via the key regardless of the pattern of access.  It's incredibly
wasteful regardless of the speed of the underlying storage fabric.
Very few developers actually understand this.

If computers were infinitely fast this wouldn't matter, but they aren't :-).

merlin



Re: Primary keys and composite unique keys(basic question)

From
Ron
Date:
On 4/5/21 9:37 PM, Rob Sargent wrote:
It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

People seem to have stopped appending check digits to identifiers about 20 years ago, and I'm not sure why.

--
Angular momentum makes the world go 'round.

Re: Primary keys and composite unique keys(basic question)

From
Rob Sargent
Date:


On Apr 7, 2021, at 10:17 AM, Ron <ronljohnsonjr@gmail.com> wrote:

 On 4/5/21 9:37 PM, Rob Sargent wrote:
It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

People seem to have stopped appending check digits to identifiers about 20 years ago, and I'm not sure why.

No the problem is “start from one”. User has item/I’d 10875 in hand and types in 10785 which of course in a sequence supplied ID steam is perfectly valid and wrong.  Really hard to track down. 
--
Angular momentum makes the world go 'round.

Re: Primary keys and composite unique keys(basic question)

From
Ron
Date:
On 4/7/21 11:35 AM, Rob Sargent wrote:

On Apr 7, 2021, at 10:17 AM, Ron <ronljohnsonjr@gmail.com> wrote:

 On 4/5/21 9:37 PM, Rob Sargent wrote:
It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

People seem to have stopped appending check digits to identifiers about 20 years ago, and I'm not sure why.

No the problem is “start from one”. User has item/I’d 10875 in hand and types in 10785 which of course in a sequence supplied ID steam is perfectly valid and wrong.  Really hard to track down.

That's my point.  Adding a check digit (turning 10875 into 108753) would have caught that, since 107853 does not match 107854 (which is 10785 with a check digit added).

--
Angular momentum makes the world go 'round.

Re: Primary keys and composite unique keys(basic question)

From
Rob Sargent
Date:
On 4/7/21 11:59 AM, Ron wrote:
On 4/7/21 11:35 AM, Rob Sargent wrote:

On Apr 7, 2021, at 10:17 AM, Ron <ronljohnsonjr@gmail.com> wrote:

 On 4/5/21 9:37 PM, Rob Sargent wrote:
It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

People seem to have stopped appending check digits to identifiers about 20 years ago, and I'm not sure why.

No the problem is “start from one”. User has item/I’d 10875 in hand and types in 10785 which of course in a sequence supplied ID steam is perfectly valid and wrong.  Really hard to track down.

That's my point.  Adding a check digit (turning 10875 into 108753) would have caught that, since 107853 does not match 107854 (which is 10785 with a check digit added).
Well you forget that 108753 is also a number in the series from 1 to maxint.   Maybe you're on to something though: a checksum dispensing sequence!

Re: Primary keys and composite unique keys(basic question)

From
Ron
Date:
On 4/7/21 1:16 PM, Rob Sargent wrote:
On 4/7/21 11:59 AM, Ron wrote:
On 4/7/21 11:35 AM, Rob Sargent wrote:

On Apr 7, 2021, at 10:17 AM, Ron <ronljohnsonjr@gmail.com> wrote:

 On 4/5/21 9:37 PM, Rob Sargent wrote:
It's a small thing, but UUIDs are absolutely not memorizable by
humans; they have zero semantic value.  Sequential numeric identifiers
are generally easier to transpose and the value gives some clues to
its age (of course, in security contexts this can be a downside).

I take the above as a definite plus.  Spent too much of my life correcting others’ use of “remembered” id’s that just happened to perfectly match the wrong thing.

People seem to have stopped appending check digits to identifiers about 20 years ago, and I'm not sure why.

No the problem is “start from one”. User has item/I’d 10875 in hand and types in 10785 which of course in a sequence supplied ID steam is perfectly valid and wrong.  Really hard to track down.

That's my point.  Adding a check digit (turning 10875 into 108753) would have caught that, since 107853 does not match 107854 (which is 10785 with a check digit added).
Well you forget that 108753 is also a number in the series from 1 to maxint.   Maybe you're on to something though: a checksum dispensing sequence!

Call a function which reads the next value from the sequence, appends the check digit and returns that number.  We were doing that 25 years ago.

--
Angular momentum makes the world go 'round.

Re: Primary keys and composite unique keys(basic question)

From
"Peter J. Holzer"
Date:
On 2021-04-07 10:35:55 -0600, Rob Sargent wrote:
>     On Apr 7, 2021, at 10:17 AM, Ron <ronljohnsonjr@gmail.com> wrote:
>
>     On 4/5/21 9:37 PM, Rob Sargent wrote:
>
>             It's a small thing, but UUIDs are absolutely not memorizable by
>             humans; they have zero semantic value.  Sequential numeric
>             identifiers
>             are generally easier to transpose and the value gives some clues to
>             its age (of course, in security contexts this can be a downside).
>
>
>         I take the above as a definite plus.  Spent too much of my life
>         correcting others’ use of “remembered” id’s that just happened to
>         perfectly match the wrong thing.
>
>
>     People seem to have stopped appending check digits to identifiers about 20
>     years ago, and I'm not sure why.
>
>
> No the problem is “start from one”. User has item/I’d 10875 in hand and types
> in 10785 which of course in a sequence supplied ID steam is perfectly valid and
> wrong.  Really hard to track down.

What I've often done when writing software for an Oracle RDBMS is to use
a global sequence instead of one sequence per table (Oracle didn't have
a SERIAL type, so you had to use an explicit sequence and write trigger
anyway). That caught wrong joins (an id value used in one table is never
used in another table) as well as most typos (since ids in a single
table were not dense).

With PostgreSQL I've stopped doing this since the SERIAL type makes it
much more convenient to have a separate sequence per table. But of
course that means that almost any table will have a row with id 10785
and one with 10875.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Attachment