Thread: Re: [NOVICE] Last ID Problem

Re: [NOVICE] Last ID Problem

From

"Mark Cave-Ayland"

Date:

02 February 2005, 09:10:38

Hi Tom and others,

> I think the correct solution is not to mess with what's admittedly a
legacy aspect of
> our client API.  Instead we should invent the "INSERT RETURNING" and
"UPDATE RETURNING"
> commands that have been discussed repeatedly (see the pghackers archives).
That would
> allow people to get what they want, and do so in only one network round
trip, without
> any artificial dependencies on OIDs or TIDs or anything else.  It'd be
unportable, but
> surely no more so than relying on OIDs or TIDs ...

Just off the top of my head, would it not be feasible to add a column to
pg_class called lastinsert that points to the OID of the pg_attribute column
to return after an insert? It could be changed using something similar to
"ALTER TABLE x SET LASTINSERT TO y", but by default it would be set to the
OID of the primary key of the table if the table specified WITHOUT OIDS at
creation time, or the first column of the table otherwise. After the INSERT
command, the value of the resulting is column is passed back to the client.

I see that INSERT...RETURNING is a solution to the problem, but it seems
somewhat strange to have to use an unportable command just to be able to
return an identifier for the last inserted record...


Kind regards,

Mark.

------------------------
WebBased Ltd
South West Technology Centre
Tamar Science Park
Plymouth
PL6 8BT

T: +44 (0)1752 791021
F: +44 (0)1752 791023
W: http://www.webbased.co.uk

Re: [NOVICE] Last ID Problem

From

Tom Lane

Date:

02 February 2005, 15:35:03

"Mark Cave-Ayland" <m.cave-ayland@webbased.co.uk> writes:
> Just off the top of my head, would it not be feasible to add a column to
> pg_class called lastinsert that points to the OID of the pg_attribute column
> to return after an insert?

No.  The thing everyone is ignoring here is that the INSERT command tag
format is not something we can just go and change.  You certainly could
not put anything in it that wasn't an integer, and I'm not sure it would
even be safe to put a bigint.  So most of the cases you might actually
want (timestamp, bigserial, etc) would be ruled out.  Hardly worth
inventing such a feature.

> I see that INSERT...RETURNING is a solution to the problem, but it seems
> somewhat strange to have to use an unportable command just to be able to
> return an identifier for the last inserted record...

How is what you're suggesting more portable?
        regards, tom lane

Re: [NOVICE] Last ID Problem

From

Greg Stark

Date:

02 February 2005, 16:57:48

Tom Lane <tgl@sss.pgh.pa.us> writes:

> How is what you're suggesting more portable?

Well, the driver would be free to implement $sth->last_insert_id() using
whatever proprietary extensions it has available. The non-portableness would
at least be hidden in the driver layer. Switch out the driver and the right
thing would happen.

"INSERT/UPDATE ... RETURNING" isn't something a driver can take advantage of.
It would require it to modify your statements which it can't do safely. So
your application would have such non-portable SQL code written into it. Switch
databases and your application code needs to be ported.

-- 
greg

Re: [NOVICE] Last ID Problem

From

Tom Lane

Date:

02 February 2005, 17:18:54

Greg Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> How is what you're suggesting more portable?

> Well, the driver would be free to implement $sth->last_insert_id() using
> whatever proprietary extensions it has available. The non-portableness would
> at least be hidden in the driver layer.

Are you asserting that last_insert_id() is a portable function?  I doubt
it.
        regards, tom lane

Re: [NOVICE] Last ID Problem

From

Greg Stark

Date:

02 February 2005, 19:35:48

Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Tom Lane <tgl@sss.pgh.pa.us> writes:
> >> How is what you're suggesting more portable?
> 
> > Well, the driver would be free to implement $sth->last_insert_id() using
> > whatever proprietary extensions it has available. The non-portableness would
> > at least be hidden in the driver layer.
> 
> Are you asserting that last_insert_id() is a portable function?  I doubt
> it.

Well I'm not sure what you mean by "portable". It's part of the DBI driver
definition, so in theory it is. Not all drivers will implement it though, or
implement it properly, and for some it may be more efficient than others.

For postgres it looks like currently it requires you to pass in the table and
field might even need a "driver-specific hint" telling it the sequence name.

At least an application using it has a hope of working on a new driver. An
application using RETURNING will only work on Oracle and one day Postgres.

So it would be nice if the Postgres driver could efficiently implement it
without having to do a second SELECT and without having to know out of band
info like a sequence name.


This is from the DBI documentation -- that is, the non-driver-specific
abstract interface documentation.

      "last_insert_id"            $rv = $dbh->last_insert_id($catalog, $schema, $table, $field);            $rv =
$dbh->last_insert_id($catalog,$schema, $table, $field, \%attr);
 
          Returns a value 'identifying' the row just inserted, if possible.          Typically this would be a value
assignedby the database server to          a column with an auto_increment or serial type. Returns undef if
thedriver does not support the method or can't determine the          value.
 
          The $catalog, $schema, $table, and $field parameters may be          required for some drivers (see below).
Ifyou don't know the          parameter values and your driver does not need them, then use          "undef" for each.
 
          There are several caveats to be aware of with this method if you          want to use it for portable
applications:
          * For some drivers the value may only available immediately after            the insert statement has
executed(e.g., mysql, Informix).
 
          * For some drivers the $catalog, $schema, $table, and $field            parameters are required (e.g., Pg),
forothers they are ignored            (e.g., mysql).
 
          * Drivers may return an indeterminate value if no insert has been            performed yet.
          * For some drivers the value may only be available if placeholders            have not been used (e.g.,
Sybase,MS SQL). In this case the value            returned would be from the last non-placeholder insert statement.
 
          * Some drivers may need driver-specific hints about how to get the            value. For example, being told
thename of the database            'sequence' object that holds the value. Any such hints are passed            as
driver-specificattributes in the \%attr parameter.
 
          * If the underlying database offers nothing better, then some            drivers may attempt to implement
thismethod by executing            ""select max($field) from $table"". Drivers using any approach            like this
shouldissue a warning if "AutoCommit" is true because            it is generally unsafe - another process may have
modifiedthe            table between your insert and the select. For situations where            you know it is safe,
suchas when you have locked the table, you            can silence the warning by passing "Warn" => 0 in \%attr.
 
          * If no insert has been performed yet, or the last insert failed,            then the value is implementation
defined.
          Given all the caveats above, it's clear that this method must be          used with care.
          The "last_insert_id" method was added in DBI 1.38.



-- 
greg

Re: [NOVICE] Last ID Problem

From

Oliver Jowett

Date:

02 February 2005, 20:45:33

Tom Lane wrote:
> Greg Stark <gsstark@mit.edu> writes:
> 
>>Tom Lane <tgl@sss.pgh.pa.us> writes:
>>
>>>How is what you're suggesting more portable?
> 
> 
>>Well, the driver would be free to implement $sth->last_insert_id() using
>>whatever proprietary extensions it has available. The non-portableness would
>>at least be hidden in the driver layer.
> 
> 
> Are you asserting that last_insert_id() is a portable function?  I doubt
> it.

I'm not familiar with the Perl interface, but JDBC has a standardized 
interface for this:

http://java.sun.com/j2se/1.5.0/docs/api/java/sql/Statement.html#executeUpdate(java.lang.String,%20int)
http://java.sun.com/j2se/1.5.0/docs/api/java/sql/Statement.html#getGeneratedKeys()

I tend to agree that a protocol-level change is easier to support in a 
driver. If it's done by extending INSERT/UPDATE, the driver will need to 
parse and modify queries which is hairy at the best of times.

-O

Re: [NOVICE] Last ID Problem

From

Tom Lane

Date:

02 February 2005, 21:04:42

Greg Stark <gsstark@mit.edu> writes:
> This is from the DBI documentation -- that is, the non-driver-specific
> abstract interface documentation.

>            Returns a value 'identifying' the row just inserted, if possible.
>            Typically this would be a value assigned by the database server to
>            a column with an auto_increment or serial type.

Aside from the numerous serious problems pointed out in the
documentation, this has an even more fatal objection, which is that it's
unspecified what the result value is and thus there is no portable way
of *using* the result after you have it.  (If the PG driver returns an
OID you certainly couldn't use that the same way as some other driver
that returns a primary key ... especially a multicolumn primary key ...)

This "portable" function is so unportable that I see no reason to
accept it as precedent.
        regards, tom lane

Re: [NOVICE] Last ID Problem

From

"Greg Sabino Mullane"

Date:

03 February 2005, 02:09:38

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Merlin Moncur wrote:
> That is a shortcoming of the DBD::pg driver which really should be
> returning a key (comprised of columns, some or none of which may be
> defaulted by the server).
Actually, the spec comes from DBI, not DBD::Pg, and is inspired by
MySQL's last_insert_id function. It is a poorly-speced function,
but we've done our best in the upcoming version of DBD::Pg,
which will support it.
Greg Stark wrote:
> For postgres it looks like currently it requires you to pass in
> the table and field might even need a "driver-specific hint" telling
> it the sequence name.
For the record, the only required field for DBD::Pg will be the table
name, although the name of the sequence is highly encouraged. Here's
the docs for the next version, the first which supports lii:
last_insert_id $rv = $dbh->last_insert_id($catalog, $schema, $table, $field); $rv = $dbh->last_insert_id($catalog,
$schema,$table, $field, \%attr);

Attempts to return the id of the last value to be inserted into a table.
You can either provide a sequence name (preferred) or provide a table
name with optional schema. The $catalog and $field arguments are always ignored.
The current value of the sequence is returned by a call to the
'currval' PostgreSQL function. This will fail if the sequence has not yet
been used in the current database connection.
If you do not know the name of the sequence, you can provide a table name and
DBD::Pg will attempt to return the correct value. To do this, there must be at
least one column in the table with a C<NOT NULL> constraint, that has a unique
constraint, and which uses a sequence as a default value. If more than one column
meets these conditions, the primary key will be used. This involves some
looking up of things in the system table, so DBD::Pg will cache the sequence
name for susequent calls. If you need to disable this caching for some reason,
you can control it via the 'pg_cache' attribute.
Please keep in mind that this method is far from foolproof, so make your
script use it properly. Specifically, make sure that it is called
immediately after the insert, and that the insert does not add a value
to the column that is using the sequence as a default value.
Some examples: $dbh->do("CREATE SEQUENCE lii_seq START 1"); $dbh->do("CREATE TABLE lii (   foobar INTEGER NOT NULL
UNIQUEDEFAULT nextval('lii_seq'),   baz VARCHAR)"); $SQL = "INSERT INTO lii(baz) VALUES (?)"; $sth =
$dbh->prepare($SQL);for (qw(uno dos tres quattro)) {   $sth->execute($_);   my $newid =
$dbh->last_insert_id(C<undef>,undef,undef,undef,{sequence=>'lii_seq'});  print "Last insert id was $newid\n"; }

If you did not want to worry about the sequence name: $dbh->do("CREATE TABLE lii2 (   foobar SERIAL UNIQUE,   baz
VARCHAR)");$SQL = "INSERT INTO lii2(baz) VALUES (?)"; $sth = $dbh->prepare($SQL); for (qw(uno dos tres quattro)) {
$sth->execute($_);  my $newid = $dbh->last_insert_id(undef,undef,"lii2",undef);   print "Last insert id was $newid\n";
}
- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200502022110
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----
iD8DBQFCAYgSvJuQZxSWSsgRAgg3AJ4id98pta0CQR2w3xgwkxnph7qW4wCeMAJH
g/eXhtcmvXei9mESDDXg/s8=
=QaUa
-----END PGP SIGNATURE-----

Re: [NOVICE] Last ID Problem

From

John Hansen

Date:

03 February 2005, 02:50:29

> Attempts to return the id of the last value to be inserted into a table.
> You can either provide a sequence name (preferred) or provide a table
> name with optional schema. The $catalog and $field arguments are always ignored.
> The current value of the sequence is returned by a call to the
> 'currval' PostgreSQL function. This will fail if the sequence has not yet
> been used in the current database connection.


This suffers from the same problems that currval does when using
connection pools tho.

I previously suggested a function similar to last_insert_id in behaviour,
and have attached it to this email for reference.

Even so, this also suffers from the same problems when using a connection pool.

The solution I proposed, namely having the tuple returned by
inserts/updates (perhaps even deletes?) would only mean changing the
client library to handle this, and as an example, libpg could easily
figure out the OID of said tuple and return that if it's present for
PQExec() (for backwards compatibility just as it does today,) and add a
separate PQExecSelect() that instead returns the tuple(s) as if they had
been SELECTed.

--
John Hansen <john@geeknet.com.au>
GeekNET

Attachment

Re: [NOVICE] Last ID Problem

From

"Greg Sabino Mullane"

Date:

03 February 2005, 05:16:36

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
> This suffers from the same problems that currval does when using
> connection pools tho.
I still don't see this as much of a real world problem however,
more of a "doctor, it hurts when I do this" variety. As the DBD::Pg docs
point out, you should not separate the calls to insert and currval
far apart, and certainly not across connections. I have a hard time
visualizing a case where an app would ever need to worry about the
problem anyway, unless they were using pooling in a strange way and
had a very poorly written application.
> The solution I proposed, namely having the tuple returned by
> inserts/updates (perhaps even deletes?) would only mean changing the
> client library to handle this, and as an example, libpg could easily
> figure out the OID of said tuple and return that if it's present for
> PQExec() (for backwards compatibility just as it does today,) and add a
> separate PQExecSelect() that instead returns the tuple(s) as if they had
> been SELECTed.
There's a few issues with the above, however, the most important of which
is that OIDs are going away, and then what do you use? Also, it does not
handle cases where the insert necessarily happened with a direct INSERT
via PQexec: the insert could have happened inside of a called function, or
a trigger could have inserted into three different tables. Truth be told, I
don't think the whole last_insert_id() in DBI is a very useful function
anyway. It's mainly (at this point) a quick abstraction of a nextval call
between Oracle and PostgreSQL. We do go out of our way to be more compatible
to MySQL by accepting just a table name, but one should really use the
sequence directly, IMO.
- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200502030012
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----
iD8DBQFCAbQRvJuQZxSWSsgRArYMAKC4Kgsv153HHbC05AtraAh4O7oL9wCgtDmR
zoucziPs5cyC1at00M8MC9w=
=PDUD
-----END PGP SIGNATURE-----

Re: [NOVICE] Last ID Problem

From

"Mark Cave-Ayland"

Date:

03 February 2005, 11:19:08

Hi Tom,

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 02 February 2005 15:35
> To: Mark Cave-Ayland
> Cc: 'Alvaro Herrera'; 'Michael Fuhr'; 'Mitch Pirtle'; 'Tatsuo
> Ishii'; pgsql-hackers@postgresql.org; operationsengineer1@yahoo.com
> Subject: Re: [NOVICE] Last ID Problem

(cut)

> No.  The thing everyone is ignoring here is that the INSERT
> command tag format is not something we can just go and
> change.  You certainly could not put anything in it that
> wasn't an integer, and I'm not sure it would even be safe to
> put a bigint.  So most of the cases you might actually want
> (timestamp, bigserial, etc) would be ruled out.  Hardly worth
> inventing such a feature.

OK, I didn't realise it was the command tag that was the issue here. I took
a look at the libpq source to see how the INSERT tag works and it looks like
it currently assumes a string of numbers. So as a minimum it would need some
form of protocol extension to get this work (which I see from the archives
that you were not keen to pursue).

> How is what you're suggesting more portable?

I was hoping that it would only require minimal change (but obviously that
is not the case). I think, out of the remaining options, that keeping with
currval() is going to be the best approach - I can't really see the benefit
of using a non-SQL standard command just for SQL insertion, mainly as you
and others have suggested for portability reasons :(


Kind regards,

Mark.

------------------------
WebBased Ltd
South West Technology Centre
Tamar Science Park
Plymouth
PL6 8BT

T: +44 (0)1752 791021
F: +44 (0)1752 791023
W: http://www.webbased.co.uk