Thread: Advice on Contiguous IDs

Advice on Contiguous IDs

From
"Brian McKiernan"
Date:
Hi Folks,

Looking for some help/advice - not sure if this is the appropriate channel.

My Issue:
My primary keys in a certain table are not contiguous.

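What I have done so far:
I have checked the documentation and found:
https://wiki.postgresql.org/wiki/FAQ#Why_are_there_gaps_in_the_numbering_of_my_sequence.2FSERIAL_column.3F_Why_aren.27t_my_sequence_numbers_reused_on_transaction_abort.3F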

My Question:
1) What event would cause the CACHE clause in CREATE SEQUENCE to make an out of sequence next number?
2) In all cases am I correct in my thinking that in order to create contiguous primary key IDs then performance will greatly suffer? Do we have an idea of how bad this will generally be or what does that depend upon?

Many thanks in advance,
Brian

Re: Advice on Contiguous IDs

From
Alvaro Herrera
Date:
Brian McKiernan wrote:

> My Issue:
> My primary keys in a certain table are not contiguous.

If you have a need to have values that are contiguous, you need to ask
yourself why and then see what mechanism provides the semantics you
need.  An easy way is to lock the table containing the column, for
example, which of course means only one transaction can do it at a time.
For many use cases this is good enough.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
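As an illustration of the locking approach suggested above, here is a minimal sketch. It assumes an in-memory SQLite database as a stand-in for PostgreSQL, a hypothetical table named invoice, and a hypothetical helper insert_gapless; BEGIN IMMEDIATE plays the role of the table lock. It is not a recommended production pattern, only a demonstration that assigning max(id) + 1 under a lock leaves no gap when a transaction rolls back.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption for this
# sketch); isolation_level=None lets us issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE invoice (id INTEGER PRIMARY KEY, note TEXT)")


def insert_gapless(conn, note, fail=False):
    """Insert one row with id = max(id) + 1 while holding the write lock."""
    # BEGIN IMMEDIATE takes SQLite's write lock up front, playing the role
    # of locking the table: only one writer can be in this section at a time.
    conn.execute("BEGIN IMMEDIATE")
    try:
        next_id = conn.execute(
            "SELECT COALESCE(MAX(id), 0) + 1 FROM invoice"
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO invoice (id, note) VALUES (?, ?)", (next_id, note)
        )
        if fail:
            raise RuntimeError("simulated failure before commit")
        conn.execute("COMMIT")
        return next_id
    except Exception:
        # The rollback also undoes the id assignment, so no gap is left.
        conn.execute("ROLLBACK")
        return None


insert_gapless(conn, "first")
insert_gapless(conn, "second", fail=True)  # rolled back, id 2 is not consumed
insert_gapless(conn, "third")
ids = [row[0] for row in conn.execute("SELECT id FROM invoice ORDER BY id")]
print(ids)
```

The cost is the one named above: every writer waits for the lock, so only one transaction can assign an id at a time.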


Re: Advice on Contiguous IDs

From
"David G. Johnston"
Date:
On Tue, Jan 9, 2018 at 2:06 AM, Brian McKiernan <brian.mckiernan@firstcircle.com> wrote:
> 1) What event would cause the CACHE clause in CREATE SEQUENCE to make an out of sequence next number?

None - it will always issue the next sequential value when asked.  But the transaction asking doesn't have to use the provided value as the PK for the table and, even if it does, if the transaction fails and rolls back, the sequence value it received is discarded/lost.
David J.
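The rollback case described above can be sketched with a toy model. This is plain Python, not PostgreSQL code; ToySequence and run_transaction are hypothetical names invented for the illustration. The only point is that a value handed out by the sequence is never returned to it, so a rolled-back transaction leaves a hole.

```python
import itertools


class ToySequence:
    """Hands out increasing values; a value handed out is never taken back."""

    def __init__(self):
        self._counter = itertools.count(1)

    def nextval(self):
        return next(self._counter)


def run_transaction(seq, table, commit):
    """Fetch a value, then either commit the row or roll back."""
    new_id = seq.nextval()
    if commit:
        table.append(new_id)
    # On rollback nothing is stored, but the sequence has already advanced.
    return new_id


seq = ToySequence()
table = []
for commit in (True, False, True):
    run_transaction(seq, table, commit)
print(table)
```

The second transaction consumed a value and then rolled back, so the stored ids skip it.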

Re: Advice on Contiguous IDs

From
Steve Atkins
Date:
> On Jan 9, 2018, at 1:06 AM, Brian McKiernan <brian.mckiernan@firstcircle.com> wrote:
>
>
> Hi Folks,
>
> Looking for some help/advice - not sure if this is the appropriate channel.

pgsql-general would be a better bet.

>
> My Issue:
> My primary keys in a certain table are not contiguous.

That itself isn't a problem at all. If there's a business requirement for them to be contiguous, that's the issue to
consider first.

>
> What I have done so far:
> I have checked the documentation and found:
> https://wiki.postgresql.org/wiki/FAQ#Why_are_there_gaps_in_the_numbering_of_my_sequence.2FSERIAL_column.3F_Why_aren.27t_my_sequence_numbers_reused_on_transaction_abort.3F
>
> My Question:
> 1) What event would cause the CACHE clause in CREATE SEQUENCE to make an out of sequence next number?

It causes PostgreSQL to assign batches of numbers to each connection that needs one, making it more likely that they'll
be used out of order or that some won't be used at all.

Using cache just makes it more obvious, though. There's no guarantee that a sequence will give you consecutive numbers,
nor that they'll be ordered, in general. About the only thing that is guaranteed is that they'll be unique.

> 2) In all cases am I correct in my thinking that in order to create contiguous primary key IDs then performance will
> greatly suffer? Do we have an idea of how bad this will generally be or what does that depend upon?

Yes. You will have to effectively serialize all inserts into those tables, eliminating any concurrency.

You'd need to have a pretty compelling hard business requirement for consecutive numbers before it'd be worth
considering.

Cheers,
  Steve
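The batching effect of CACHE described above can be sketched with a toy model. It assumes a cache size of 10 and two sessions; ToyCachedSequence and ToySession are hypothetical names, and this is not how PostgreSQL implements sequences internally. Each session takes a whole batch on its first request and then serves values from its own batch.

```python
class ToyCachedSequence:
    """Shared sequence that hands out whole batches of values."""

    def __init__(self, cache):
        self.cache = cache
        self.next_unallocated = 1

    def allocate_batch(self):
        start = self.next_unallocated
        self.next_unallocated += self.cache
        return list(range(start, start + self.cache))


class ToySession:
    """One connection with its own private batch of preallocated values."""

    def __init__(self, seq):
        self.seq = seq
        self.local = []

    def nextval(self):
        if not self.local:
            # Batch exhausted (or first call): fetch a fresh one.
            self.local = self.seq.allocate_batch()
        return self.local.pop(0)


seq = ToyCachedSequence(cache=10)
a, b = ToySession(seq), ToySession(seq)
# The two sessions alternate their requests.
issued = [a.nextval(), b.nextval(), a.nextval(), b.nextval()]
print(issued)
```

The values are unique but not in issue order, and whatever remains in a session's batch when it disconnects is never used.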



Re: Advice on Contiguous IDs

From
Vik Fearing
Date:
On 01/09/2018 10:06 AM, Brian McKiernan wrote:
> Hi Folks,
> 
> Looking for some help/advice - not sure if this is the appropriate channel.

It is not.  You want the pgsql-general list, or perhaps pgsql-novice.

> My Issue:
> My primary keys in a certain table are not contiguous.

Is that really an issue?  The only valid case of gapless sequences I've
ever seen is invoice numbers.  If you're not doing that, why do you care?

> My Question:
> 1) What event would cause the CACHE clause in CREATE SEQUENCE to make an
> out of sequence next number?

If the server crashes, it can jump ahead by up to 32 values.  This is so
sequences don't have to be WAL logged every single time which could be
quite slow.

> 2) In all cases am I correct in my thinking that in order to create
> contiguous primary key IDs then performance will greatly suffer? Do we
> have an idea of how bad this will generally be or what does that depend
> upon?

Performance itself doesn't really suffer, concurrency does.  If you have
a lot of concurrent inserts on this table, then global performance will
indeed be worse than if you didn't care about gaps.  If it's just one
process doing the insert, you won't notice any performance drop at all.
-- 
Vik Fearing                                          +33 6 46 75 15 36
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
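The crash behaviour described above can be sketched with a toy model. It assumes the log-ahead amount is the 32 mentioned in the message; ToyLoggedSequence and its fields are hypothetical and greatly simplified compared with the real WAL machinery. The sequence writes a durable record only occasionally, recording a value ahead of the one it actually handed out, and after a crash it resumes from that recorded value.

```python
LOG_AHEAD = 32  # assumed from the "up to 32 values" figure quoted above


class ToyLoggedSequence:
    """Sequence that logs ahead so most nextval calls need no log write."""

    def __init__(self):
        self.last_value = 0    # in-memory state, lost in a crash
        self.logged_value = 0  # durable state, survives a crash
        self.log_writes = 0

    def nextval(self):
        self.last_value += 1
        if self.last_value > self.logged_value:
            # Record a value well ahead of what was actually issued.
            self.logged_value = self.last_value + LOG_AHEAD
            self.log_writes += 1
        return self.last_value

    def crash_and_recover(self):
        # Only the durable value is known after a crash; resuming from it
        # guarantees that no previously issued value is handed out again.
        self.last_value = self.logged_value


seq = ToyLoggedSequence()
before = [seq.nextval() for _ in range(3)]
writes_before_crash = seq.log_writes
seq.crash_and_recover()
after = seq.nextval()
skipped = after - before[-1] - 1
print(before, after, skipped, writes_before_crash)
```

In this model three values cost a single log write, and the price is a jump of at most LOG_AHEAD values after a crash.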


Re: Advice on Contiguous IDs

From
"Brian McKiernan"
Date:
Thanks folks - extremely insightful.

Much appreciated.

Brian


On Wed 10 Jan 2018 at 01:33 Vik Fearing wrote:
