Thread: serial column vs. explicit sequence question

serial column vs. explicit sequence question

From

Charlie Toohey

Date:

13 June 2002, 16:12:50

I'm having a problem and there seems to be 2 solutions. It is simple and 
straighforward, but will take several paragraphs to explain.

I have a schema with a master-detail design. The master table does not have 
an expicit id, so I have a column of type serial. 

Lets say I need to insert a row into the master table and N rows into the 
detail table. After inserting a row into master, and before detail, I need to 
read the master table to obtain the value of the id for the row just 
inserted, so I can insert this id as the foreign key value for the N rows in 
the detail table. 

This seems like a poor solution because I have to write and then read the 
master table each time. With lot of activity on these tables, I don't know 
how well this will scale. Additionally, the only way that I can guarantee 
that I am getting the id of the most recent row inserted into master is to 
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE --- because otherwise, if other 
processes are inserting rows into master/detail concurrently, I may pick up 
the id from an incorrect row (the technique used to get the correct id is to 
include a timestamp column on the insert into master and then query for the 
latest row).

A better solution would seem to use a sequence explicitly, rather than a id 
column of type serial. I would obtain the id value from the sequence, and 
then insert this id into the master table and into the detail table. This 
way, I wouldn't be writing/reading the same table constantly -- I would only 
be writing to it, and, I would guarantee that I would be using the correct id 
in both master and detail without have to SET TRANSACTION ISOLATION LEVEL 
SERIALIZEABLE.

Any comments on which solution you would choose, or is there a better 
solution ?

Thanks,
Charlie

Re: serial column vs. explicit sequence question

From

Josh Berkus

Date:

13 June 2002, 17:29:44

Charlie,

> Lets say I need to insert a row into the master table and N rows into the
> detail table. After inserting a row into master, and before detail, I need
to
> read the master table to obtain the value of the id for the row just
> inserted, so I can insert this id as the foreign key value for the N rows in
> the detail table.

Hey, what about CURRVAL('sequence_name')?

--
-Josh Berkus

Re: serial column vs. explicit sequence question

From

Stephan Szabo

Date:

13 June 2002, 17:30:42

On Thu, 13 Jun 2002, Charlie Toohey wrote:

> I'm having a problem and there seems to be 2 solutions. It is simple and
> straighforward, but will take several paragraphs to explain.
>
> I have a schema with a master-detail design. The master table does not have
> an expicit id, so I have a column of type serial.
>
> Lets say I need to insert a row into the master table and N rows into the
> detail table. After inserting a row into master, and before detail, I need to
> read the master table to obtain the value of the id for the row just
> inserted, so I can insert this id as the foreign key value for the N rows in
> the detail table.
>
> This seems like a poor solution because I have to write and then read the
> master table each time. With lot of activity on these tables, I don't know
> how well this will scale. Additionally, the only way that I can guarantee
> that I am getting the id of the most recent row inserted into master is to
> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE --- because otherwise, if other
> processes are inserting rows into master/detail concurrently, I may pick up
> the id from an incorrect row (the technique used to get the correct id is to
> include a timestamp column on the insert into master and then query for the
> latest row).
>
> A better solution would seem to use a sequence explicitly, rather than a id
> column of type serial. I would obtain the id value from the sequence, and
> then insert this id into the master table and into the detail table. This
> way, I wouldn't be writing/reading the same table constantly -- I would only
> be writing to it, and, I would guarantee that I would be using the correct id
> in both master and detail without have to SET TRANSACTION ISOLATION LEVEL
> SERIALIZEABLE.
>
> Any comments on which solution you would choose, or is there a better
> solution ?

Well, serial really is just an integer with a default value pulling from a
sequence, so right now you can use currval on the sequence (which I think
gets named something like <table>_<column>_seq

Re: serial column vs. explicit sequence question

From

Tom Lane

Date:

13 June 2002, 18:10:21

Charlie Toohey <ctoohey@pacbell.net> writes:
> A better solution would seem to use a sequence explicitly, rather than a id 
> column of type serial. I would obtain the id value from the sequence, and 
> then insert this id into the master table and into the detail table.

Yup.  But there's no need to change how you create the id column; serial
is just fine, since all it is is a sequence and a DEFAULT nextval('foo').
You just do something like
select nextval('name-of-id-columns-sequence') into $masterid;insert into master(id, ...) values ($masterid, ...);insert
intodetail ... $masterid ...;

rather than letting the default expression do it for you.
        regards, tom lane

Re: serial column vs. explicit sequence question

From

Jason Earl

Date:

13 June 2002, 18:45:01

Charlie Toohey <ctoohey@pacbell.net> writes:

> I'm having a problem and there seems to be 2 solutions. It is simple
> and straighforward, but will take several paragraphs to explain.
> 
> I have a schema with a master-detail design. The master table does
> not have an expicit id, so I have a column of type serial.
> 
> Lets say I need to insert a row into the master table and N rows
> into the detail table. After inserting a row into master, and before
> detail, I need to read the master table to obtain the value of the
> id for the row just inserted, so I can insert this id as the foreign
> key value for the N rows in the detail table.
> 
> This seems like a poor solution because I have to write and then
> read the master table each time. With lot of activity on these
> tables, I don't know how well this will scale. Additionally, the
> only way that I can guarantee that I am getting the id of the most
> recent row inserted into master is to SET TRANSACTION ISOLATION
> LEVEL SERIALIZABLE --- because otherwise, if other processes are
> inserting rows into master/detail concurrently, I may pick up the id
> from an incorrect row (the technique used to get the correct id is
> to include a timestamp column on the insert into master and then
> query for the latest row).
> 
> A better solution would seem to use a sequence explicitly, rather
> than a id column of type serial. I would obtain the id value from
> the sequence, and then insert this id into the master table and into
> the detail table. This way, I wouldn't be writing/reading the same
> table constantly -- I would only be writing to it, and, I would
> guarantee that I would be using the correct id in both master and
> detail without have to SET TRANSACTION ISOLATION LEVEL
> SERIALIZEABLE.
> 
> Any comments on which solution you would choose, or is there a
> better solution ?
> 
> Thanks,
> Charlie

The SERIAL type is a thin veneer over an underlying conglomeration of
a unique index and a sequence, nothing more, nothing less.  I still
prefer to use the old syntax that spelled this out explicitly (mostly
because it reminded me that I needed to drop the sequences as well as
the table if I made changes during the development phases of my
project).  Instead of using a serial type I have a whole pile of
scripts that contain bits that look like this:

DROP TABLE prod_journal;
DROP SEQUENCE prod_journal_id_seq;

CREATE SEQUENCE prod_journal_id_seq;

CREATE TABLE prod_journal (      id            int PRIMARY KEY                       DEFAULT
nextval('prod_journal_id_seq'),     ...

);

The SERIAL type does precisely the same sort of thing.  The only
difference is that PostgreSQL thinks up the sequence name for you
(currently PostgreSQL tries to choose a name that looks precisely like
the one I have chosen in this example).  The reason that I bring this
up is A) it makes me happy to think that I have been using PostgreSQL
long enough that my PostgreSQL memories predate the SERIAL type, and
B) to point out that there is not really a difference between using
the SERIAL type and using sequences explicitly.

What you *really* need is to get acquainted with the nifty sequence
functions currval and nextval.  They hold the secret to sequence
Nirvana.  See Chapter 4 Section 11 of the PostgreSQL User's Guide for
the full scoop.  The short story is that curval gives the current
value of the sequence (for whichever backend you are connected to) and
nextval will give you the next value of the sequence.

Now let's say that you had two simple tables foo for the master record
and bar for the detail records.

test=# create table foo (id serial primary key, name text);
NOTICE:  CREATE TABLE will create implicit sequence 'foo_id_seq' for SERIAL column 'foo.id'
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'foo_pkey' for table 'foo'
CREATE

test=# create table bar (master int references foo, detail text);
NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
CREATE

You could then insert into these tables using something like this:

test=# begin;
BEGIN
test=# insert into foo (name) values ('Jason');
INSERT 67423220 1
test=# insert into bar (master, detail) values (currval('foo_id_seq'), 'Does this work');
INSERT 67423221 1
test=# insert into bar (master, detail) values (currval('foo_id_seq'), 'Apparently So!');
INSERT 67423222 1
test=# commit;
COMMIT

As long as you hang onto your connection to the back end you don't
even have to wrap this as one transaction.  Currval is connection
dependent, and so as long as you have the same connection currval will
give the correct answer, and currval is very very fast.

Hope this was helpful,
Jason

Re: serial column vs. explicit sequence question

From

Charlie Toohey

Date:

13 June 2002, 18:47:52

If session A and session B are concurrently doing the same master-detail 
transaction, wouldn't currval possibly reflect the sequence value used by the 
other session ? Or are you saying that since this will be an explicit 
transaction that currval won't reflect the fact that the sequence may have 
been incremented by another session ?


On Thursday 13 June 2002 02:06 pm, Stephan Szabo wrote:
> On Thu, 13 Jun 2002, Charlie Toohey wrote:
> > I'm having a problem and there seems to be 2 solutions. It is simple and
> > straighforward, but will take several paragraphs to explain.
> >
> > I have a schema with a master-detail design. The master table does not
> > have an expicit id, so I have a column of type serial.
> >
> > Lets say I need to insert a row into the master table and N rows into the
> > detail table. After inserting a row into master, and before detail, I
> > need to read the master table to obtain the value of the id for the row
> > just inserted, so I can insert this id as the foreign key value for the N
> > rows in the detail table.
> >
> > This seems like a poor solution because I have to write and then read the
> > master table each time. With lot of activity on these tables, I don't
> > know how well this will scale. Additionally, the only way that I can
> > guarantee that I am getting the id of the most recent row inserted into
> > master is to SET TRANSACTION ISOLATION LEVEL SERIALIZABLE --- because
> > otherwise, if other processes are inserting rows into master/detail
> > concurrently, I may pick up the id from an incorrect row (the technique
> > used to get the correct id is to include a timestamp column on the insert
> > into master and then query for the latest row).
> >
> > A better solution would seem to use a sequence explicitly, rather than a
> > id column of type serial. I would obtain the id value from the sequence,
> > and then insert this id into the master table and into the detail table.
> > This way, I wouldn't be writing/reading the same table constantly -- I
> > would only be writing to it, and, I would guarantee that I would be using
> > the correct id in both master and detail without have to SET TRANSACTION
> > ISOLATION LEVEL SERIALIZEABLE.
> >
> > Any comments on which solution you would choose, or is there a better
> > solution ?
>
> Well, serial really is just an integer with a default value pulling from a
> sequence, so right now you can use currval on the sequence (which I think
> gets named something like <table>_<column>_seq

Re: serial column vs. explicit sequence question

From

Stephan Szabo

Date:

13 June 2002, 18:53:27

On Thu, 13 Jun 2002, Charlie Toohey wrote:

> If session A and session B are concurrently doing the same master-detail
> transaction, wouldn't currval possibly reflect the sequence value used by the
> other session ? Or are you saying that since this will be an explicit
> transaction that currval won't reflect the fact that the sequence may have
> been incremented by another session ?

currval is defined to give the last value given to your session. The only
cases you have to watch out for are other potential nextvals in your
session (triggers/rules), but usually you can find those pretty easily.

Re: serial column vs. explicit sequence question

From

Charlie Toohey

Date:

13 June 2002, 19:00:29

that definitely helps ! thank you Jason --- the key thing that I didn't 
undertand, and you have now enlightened me, is that currval was connection 
dependent --- I didn't think this would be guaranteed to work with concurrent 
transactions, but now I understand.

Just prior to receiving your message, I posted a reply basically asking how 
currval would work if there were concurrent updates --- please ignore that 
response.

thanks everyone --- I now feel "empowered" to carry on with my project


On Thursday 13 June 2002 03:01 pm, Jason Earl wrote:
> Charlie Toohey <ctoohey@pacbell.net> writes:
> > I'm having a problem and there seems to be 2 solutions. It is simple
> > and straighforward, but will take several paragraphs to explain.
> >
> > I have a schema with a master-detail design. The master table does
> > not have an expicit id, so I have a column of type serial.
> >
> > Lets say I need to insert a row into the master table and N rows
> > into the detail table. After inserting a row into master, and before
> > detail, I need to read the master table to obtain the value of the
> > id for the row just inserted, so I can insert this id as the foreign
> > key value for the N rows in the detail table.
> >
> > This seems like a poor solution because I have to write and then
> > read the master table each time. With lot of activity on these
> > tables, I don't know how well this will scale. Additionally, the
> > only way that I can guarantee that I am getting the id of the most
> > recent row inserted into master is to SET TRANSACTION ISOLATION
> > LEVEL SERIALIZABLE --- because otherwise, if other processes are
> > inserting rows into master/detail concurrently, I may pick up the id
> > from an incorrect row (the technique used to get the correct id is
> > to include a timestamp column on the insert into master and then
> > query for the latest row).
> >
> > A better solution would seem to use a sequence explicitly, rather
> > than a id column of type serial. I would obtain the id value from
> > the sequence, and then insert this id into the master table and into
> > the detail table. This way, I wouldn't be writing/reading the same
> > table constantly -- I would only be writing to it, and, I would
> > guarantee that I would be using the correct id in both master and
> > detail without have to SET TRANSACTION ISOLATION LEVEL
> > SERIALIZEABLE.
> >
> > Any comments on which solution you would choose, or is there a
> > better solution ?
> >
> > Thanks,
> > Charlie
>
> The SERIAL type is a thin veneer over an underlying conglomeration of
> a unique index and a sequence, nothing more, nothing less.  I still
> prefer to use the old syntax that spelled this out explicitly (mostly
> because it reminded me that I needed to drop the sequences as well as
> the table if I made changes during the development phases of my
> project).  Instead of using a serial type I have a whole pile of
> scripts that contain bits that look like this:
>
> DROP TABLE prod_journal;
> DROP SEQUENCE prod_journal_id_seq;
>
> CREATE SEQUENCE prod_journal_id_seq;
>
> CREATE TABLE prod_journal (
>        id            int PRIMARY KEY
>                        DEFAULT nextval('prod_journal_id_seq'),
>        ...
> );
>
> The SERIAL type does precisely the same sort of thing.  The only
> difference is that PostgreSQL thinks up the sequence name for you
> (currently PostgreSQL tries to choose a name that looks precisely like
> the one I have chosen in this example).  The reason that I bring this
> up is A) it makes me happy to think that I have been using PostgreSQL
> long enough that my PostgreSQL memories predate the SERIAL type, and
> B) to point out that there is not really a difference between using
> the SERIAL type and using sequences explicitly.
>
> What you *really* need is to get acquainted with the nifty sequence
> functions currval and nextval.  They hold the secret to sequence
> Nirvana.  See Chapter 4 Section 11 of the PostgreSQL User's Guide for
> the full scoop.  The short story is that curval gives the current
> value of the sequence (for whichever backend you are connected to) and
> nextval will give you the next value of the sequence.
>
> Now let's say that you had two simple tables foo for the master record
> and bar for the detail records.
>
> test=# create table foo (id serial primary key, name text);
> NOTICE:  CREATE TABLE will create implicit sequence 'foo_id_seq' for SERIAL
> column 'foo.id' NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit
> index 'foo_pkey' for table 'foo' CREATE
>
> test=# create table bar (master int references foo, detail text);
> NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY
> check(s) CREATE
>
> You could then insert into these tables using something like this:
>
> test=# begin;
> BEGIN
> test=# insert into foo (name) values ('Jason');
> INSERT 67423220 1
> test=# insert into bar (master, detail) values (currval('foo_id_seq'),
> 'Does this work'); INSERT 67423221 1
> test=# insert into bar (master, detail) values (currval('foo_id_seq'),
> 'Apparently So!'); INSERT 67423222 1
> test=# commit;
> COMMIT
>
> As long as you hang onto your connection to the back end you don't
> even have to wrap this as one transaction.  Currval is connection
> dependent, and so as long as you have the same connection currval will
> give the correct answer, and currval is very very fast.
>
> Hope this was helpful,
> Jason