Re: Best approach for a "gap-less" sequence - Mailing list pgsql-general

From Dawid Kuroczko
Subject Re: Best approach for a "gap-less" sequence
Date
Msg-id 758d5e7f0608171414l548db1e9x43f2372c560c3c2@mail.gmail.com
Whole thread Raw
In response to Re: Best approach for a "gap-less" sequence  ("Merlin Moncure" <mmoncure@gmail.com>)
Responses Re: Best approach for a "gap-less" sequence  (elein <elein@varlena.com>)
List pgsql-general
On 8/17/06, Merlin Moncure <mmoncure@gmail.com> wrote:
> On 8/17/06, Brad Nicholson <bnichols@ca.afilias.info> wrote:
> > > > Hmm, I think you are wrong.  There is a SELECT ... FOR UPDATE;
> > > > The first-to-obtain the gapless sequence transaction will establish
> > > > a lock onthe "tax_id" row.  The other transaction will block until
> > > > the first transaction finishes (and the row is updated) and will
> > > > establish the row lock on it.
> > >
> > > yes, you are right...i didnt think the problem through properly.
> >
> > Lets just hope the performance on a concurrent system is not a
> > requirement of such a system...
>
> right, if the transations are long running, there is a big problem as
> they are serialized around access to the sequence.  however this is
> better than the control record approach because control record have
> problems with mvcc bloat.  concurrent performance will of course be
> awful.
>
> a good compomise in some cases is to save off canceled transactions
> ids' in a free list   you would still have to deal with transactions
> that were not gracefully cancelled though.

I believe there is no ON ROLLBACK trigger...  One might periodicaly
check the table for "gaps" and put them into "reuse me" table,
say a left outer join between generate_series and a real table,
withing "last known continuous id" and "max(id) from table".

However, if "gapless sequence" table bloat while long running transactions
is a problem, the other approach might be minimalising the size of this
"gapless sequence" table, say:

CREATE TABLE gapless_seq (
  gseq_value int NOT NULL;
);
INSER INTO gapless_seq VALUES(1);

...and then SELECT gseq_vaule FROM gapless_seq FOR UPDATE and
UPDATE gapless_seq SET gseq_value = gseq_value+1;

...this saves a few bytes per each update.  If we keep only a single
row, there is no need for an index (there is no vilibility information
in the index, so index would be useless in our case), and no need
for unique constraint checking (assuming we are sure noone will
ever ever insert additional rows -- it would be a disaster -- CREATE RULE
to ignore any INSERTs/DELETEs just to be sure.

Of course this does not solve the concurency issue, but I'm assuming
that if someone wants a gapless sequence, she wants its generation
serialized (or would assign sequences in post-transaction batches).

   Regards,
       Dawid

pgsql-general by date:

Previous
From: "Fabio Victora Hecht"
Date:
Subject: count and limit
Next
From: gustavo halperin
Date:
Subject: Re: Dynamic Partial Index