Thread: How to create "auto-increment" field WITHOUT a sequence object?
From: Dmitry Koterov <dmitry@koterov.ru>
To: Postgres General <pgsql-general@postgresql.org>
Sent: Thu, June 30, 2011 8:40:39 PM
Subject: [GENERAL] How to create "auto-increment" field WITHOUT a sequence object?
On Jun 30, 2011, at 2:40 PM, Dmitry Koterov wrote:
> Hello.
>
> I need to create an auto-increment field on a table WITHOUT using sequences:

This problem pops up a lot for invoice sequence numbers for the tax office and related cases. (Search for "gapless sequence".)

Since the numbers are really only needed for an external system (as you mention), it may make sense to generate the gapless IDs when necessary and map the generated IDs to the rows later. The drawback is that some rows in the table will not have the gapless ID until the batch job is run, but all rows will still be addressable by the real sequence ID.
Cheers,
M
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Dmitry Koterov
Sent: Thursday, June 30, 2011 2:41 PM
To: Postgres General
Subject: [GENERAL] How to create "auto-increment" field WITHOUT a sequence object?
Hello.
I need to create an auto-increment field on a table WITHOUT using sequences:
CREATE TABLE tbl(
name TEXT,
uniq_id INTEGER
);
Each INSERT to this table must generate a new uniq_id which is distinct from all others.
The problem is that these INSERTs are often rolled back (they are executed within a transaction block which is rolled back from time to time); this is the existing design of the current architecture, and unfortunately we have to live with it. And I need the uniq_id generation to be as compact as possible (with a minimum of "holes") - this is a VERY important requirement (we export these values into external systems which accept only IDs from 1 to 100000).
So I cannot use sequences: a sequence value is obviously not rolled back, so if I insert nextval(...) as uniq_id, I will get large holes (because of the frequent transaction rollbacks) and exhaust the 100000 uniq_ids very fast. How can I deal with all this without sequences?
I tried
BEGIN;
LOCK TABLE tbl;
INSERT INTO tbl(uniq_id) VALUES((SELECT max(uniq_id) FROM tbl) + 1);
COMMIT;
but it seems this locking is too heavy - from time to time this query times out (or sometimes deadlocks with other queries).
Is there any other, less aggressive, locking which would let me guarantee that no INSERTs are performed into tbl between the max() calculation and the INSERT itself, but which does not lock the whole table?
>>>>>>>>>>>>>>>>>>>>
Why not have an internal and an external id? The internal one would use the sequence and wouldn't care about being gapless. The external one would be assigned post-INSERT and thus, ignoring deletes, can be gapless. Depending upon how frequently/quickly the external identifier needs to be present, you have various options for actually assigning the external identifier value.
CREATE TABLE tbl(
name TEXT,
uniq_id serial,
external_id integer NULL
);
Upon creating a new record you have a record with a NULL external_id. At some point in the future, prior to export, you can replace all the NULLs with actual values using a sequence. Depending on whether the transaction can still be rolled back at that point, you can add the "UPDATE" statement as the last statement of the transaction so that it only fires if the transaction is otherwise going to complete successfully.
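A minimal sketch of this backfill approach, using Python's built-in sqlite3 as a stand-in for PostgreSQL (the table and column names follow the example above; the `assign_external_ids` helper name is an assumption for illustration, not anything from the thread):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl ("
    " uniq_id INTEGER PRIMARY KEY AUTOINCREMENT,"  # plays the role of serial
    " name TEXT,"
    " external_id INTEGER)"                        # NULL until backfilled
)

# Normal inserts: the internal id may acquire gaps on rollback, which is fine.
for name in ("a", "b", "c"):
    conn.execute("INSERT INTO tbl (name) VALUES (?)", (name,))
conn.commit()

def assign_external_ids(conn):
    """Backfill gapless external ids for committed rows lacking one.

    Run as a single transaction: only committed rows are visible here,
    so rolled-back inserts can no longer create holes in external_id.
    """
    next_id = conn.execute(
        "SELECT COALESCE(MAX(external_id), 0) FROM tbl"
    ).fetchone()[0] + 1
    rows = conn.execute(
        "SELECT uniq_id FROM tbl WHERE external_id IS NULL ORDER BY uniq_id"
    ).fetchall()
    for (uniq_id,) in rows:
        conn.execute(
            "UPDATE tbl SET external_id = ? WHERE uniq_id = ?",
            (next_id, uniq_id),
        )
        next_id += 1
    conn.commit()

assign_external_ids(conn)
print(conn.execute(
    "SELECT name, external_id FROM tbl ORDER BY external_id"
).fetchall())
# → [('a', 1), ('b', 2), ('c', 3)]
```

In PostgreSQL the backfill job would typically take a lock (or run as the single exporter) so that two concurrent backfills cannot both read the same MAX(external_id).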
Without more detail about the how and why of your restrictions it is difficult to provide solutions.
David J.
> Original Message
> From: Dmitry Koterov
>
> I need to create an auto-increment field on a table WITHOUT using
> sequences:

You may get some ideas at http://www.varlena.com/GeneralBits/130.php

George
On Thu, Jun 30, 2011 at 12:40 PM, Dmitry Koterov <dmitry@koterov.ru> wrote:
> Hello.
> I need to create an auto-increment field on a table WITHOUT using sequences:
> CREATE TABLE tbl(
> name TEXT,
> uniq_id INTEGER
> );
> Each INSERT to this table must generate a new uniq_id which is distinct from
> all others.

Do they need to be in order? It looks like you only need a few, since the range you mention is 1 to 100000. You could put those numbers in another table. Then, in a transaction, you grab a number from that table and try to delete it. If it's not there, you lost a race condition; exit and try again. If you can delete it, then you try the insert transaction. If it fails, everything rolls back and the lock on the row is released. If the transaction works, you commit the whole thing. Are the transactions really long-running?
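A sketch of this number-pool idea, again using Python's sqlite3 in place of PostgreSQL (the `id_pool` table and `claim_id` helper are hypothetical names for illustration; in PostgreSQL the DELETE would row-lock, and a rollback would return the number to the pool automatically):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_pool (n INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE tbl (name TEXT, uniq_id INTEGER)")

# Pre-populate the pool with the allowed range (1..100000 in the thread;
# a small range here for the demo).
conn.executemany("INSERT INTO id_pool VALUES (?)", [(n,) for n in range(1, 6)])
conn.commit()

def claim_id(conn):
    """Grab the lowest free number by deleting it from the pool.

    A concurrent transaction that loses the race deletes zero rows
    and simply retries with the next candidate number.
    """
    while True:
        (n,) = conn.execute("SELECT MIN(n) FROM id_pool").fetchone()
        if n is None:
            raise RuntimeError("pool exhausted")
        cur = conn.execute("DELETE FROM id_pool WHERE n = ?", (n,))
        if cur.rowcount == 1:  # we won the race for this number
            return n

uniq_id = claim_id(conn)
conn.execute("INSERT INTO tbl (name, uniq_id) VALUES (?, ?)", ("first", uniq_id))
conn.commit()
print(uniq_id)  # → 1
```

The key property is that claiming and using the number happen in one transaction, so a rollback of the INSERT also un-deletes the pool row and the number is never wasted.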
OK.

A possible next solution is an ON AFTER UPDATE trigger:

BEGIN
LOCK TABLE tbl IN SHARE UPDATE EXCLUSIVE MODE;
UPDATE tbl SET uniq_id=(SELECT MAX(uniq_id) + 1 FROM tbl) WHERE id = NEW.id;
END;

It seems SHARE UPDATE EXCLUSIVE is the minimum lock level which conflicts with itself but does not conflict with pg_dump. (Unfortunately it conflicts with the VACUUM performed by the autovacuum process.) SHARE UPDATE EXCLUSIVE is better than the default LOCK mode (which also blocks reading).

On Thu, Jun 30, 2011 at 11:38 PM, A.M. <agentm@themactionfaction.com> wrote:
> No - advisory locks are managed by the application, so that is exactly what you don't want. The exclusive table lock is still exactly what you need unless you can postpone the generation of the secondary IDs.
On Jun 30, 2011, at 3:36 PM, Dmitry Koterov wrote:
> ...possibly within ON AFTER INSERT trigger:
>
> BEGIN
> pg_advisory_lock(0xDEADBEEF);
> UPDATE tbl SET uniq_id=(SELECT MAX(uniq_id) + 1 FROM tbl) WHERE id =
> NEW.id;
> END;
>
> Would it work without explicit pg_advisory_unlock() - would the locking be
> released on COMMIT/ROLLBACK?
Cheers,
M
On 06/30/2011 03:01 PM, Dmitry Koterov wrote:
> It may not be "fully" gapless. The main cause is to keep uniq_id as
> low as it could be, to not exhaust the 100000 values too fast.
> I think solutions with additional tables look too complicated for this
> case; is there a possibility to not use an additional table?

You may think it's possible to build a gapless design that is less complicated by writing some application or server code to enforce it. You've already tried this and learned that it's much harder than it seems. Doing this correctly without causing timeout and deadlock issues is a hard problem.

Meanwhile, generating a gapless translation table that only includes things that have been committed is easy, and you're not likely to run into really strange and unexpected bugs in that implementation later.

Given those are the two situations you're comparing here, I would say using the extra table is less complicated in every way. Sure, you're adding another table, but the process happening against it is really easy. The alternative doesn't have the extra table, but that doesn't make it less complicated. Complexity needs to consider how difficult a program is going to be to debug and maintain. And in those areas, making a single table gapless is quite complicated.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
Comprehensive and Customized PostgreSQL Training Classes:
http://www.2ndquadrant.us/postgresql-training/