Thread: "RETURNING PRIMARY KEY" syntax extension
Hi, The JDBC API provides the getGeneratedKeys() method as a way of retrieving primary key values without the need to explicitly specify the primary key column(s). This is a widely-used feature, however the implementation has significant performance drawbacks. Currently this feature is implemented in the JDBC driver by appending "RETURNING *" to the supplied statement. However this means all columns of affected rows will be returned to the client, which causes significant performance problems, particularly on wide tables. To mitigate this, it would be desirable to enable the JDBC driver to request only the primary key value(s). One possible solution would be to have the driver request the primary key for a table, but this could cause a race condition where the primary key could change, and even if it does not, it would entail extra overhead. A more elegant and universal solution, which would allow the JDBC driver to request the primary key in a single request, would be to extend the RETURNING clause syntax with the option PRIMARY KEY. This resolves during parse analysis into the columns of the primary key, which can be done unambiguously because the table is already locked by that point and the primary key cannot change. A patch is attached which implements this, and will be added to the next commitfest. A separate patch will be submitted to the JDBC project. Example usage shown below. Regards Ian Barwick /* ---------------------------------------------- */ postgres=# CREATE TABLE foo (id SERIAL PRIMARY KEY); CREATE TABLE postgres=# INSERT INTO foo VALUES(DEFAULT) RETURNING PRIMARY KEY; id ---- 1 (1 row) INSERT 0 1 postgres=# CREATE TABLE bar (id1 INT NOT NULL, id2 INT NOT NULL, PRIMARY KEY(id1, id2)); CREATE TABLE postgres=# INSERT INTO bar VALUES(1,2) RETURNING PRIMARY KEY; id1 | id2 -----+----- 1 | 2 (1 row) INSERT 0 1 postgres=# INSERT INTO bar VALUES(2,1),(2,2) RETURNING PRIMARY KEY; id1 | id2 -----+----- 2 | 1 2 | 2 (2 rows) INSERT 0 2 postgres=# CREATE TABLE no_pkey (id SERIAL NOT NULL); CREATE TABLE postgres=# INSERT INTO no_pkey VALUES(DEFAULT) RETURNING id; id ---- 1 (1 row) INSERT 0 1 postgres=# INSERT INTO no_pkey VALUES(DEFAULT) RETURNING PRIMARY KEY; ERROR: Relation does not have any primary key(s) /* ---------------------------------------------- */ -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Ian Barwick wrote > Hi, > > The JDBC API provides the getGeneratedKeys() method as a way of retrieving > primary key values without the need to explicitly specify the primary key > column(s). This is a widely-used feature, however the implementation has > significant > performance drawbacks. > > Currently this feature is implemented in the JDBC driver by appending > "RETURNING *" to the supplied statement. However this means all columns of > affected rows will be returned to the client, which causes significant > performance problems, particularly on wide tables. To mitigate this, it > would > be desirable to enable the JDBC driver to request only the primary key > value(s). Seems like a good idea. > ERROR: Relation does not have any primary key(s) "Relation does not have a primary key." or "Relation has no primary key." (preferred) By definition it cannot have more than one so it must have none. It could have multiple unique constraints but I do not believe they are considered if not tagged as primary. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/RETURNING-PRIMARY-KEY-syntax-extension-tp5806462p5806463.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
David G Johnston wrote > > Ian Barwick wrote >> Hi, >> >> The JDBC API provides the getGeneratedKeys() method as a way of >> retrieving >> primary key values without the need to explicitly specify the primary key >> column(s). This is a widely-used feature, however the implementation has >> significant >> performance drawbacks. >> >> Currently this feature is implemented in the JDBC driver by appending >> "RETURNING *" to the supplied statement. However this means all columns >> of >> affected rows will be returned to the client, which causes significant >> performance problems, particularly on wide tables. To mitigate this, it >> would >> be desirable to enable the JDBC driver to request only the primary key >> value(s). > Seems like a good idea. >> ERROR: Relation does not have any primary key(s) > "Relation does not have a primary key." > or > "Relation has no primary key." (preferred) > > By definition it cannot have more than one so it must have none. > > It could have multiple unique constraints but I do not believe they are > considered if not tagged as primary. Also, I did see where you account for auto-updatable views but what about complex views with instead of triggers? These can still be the target of DML queries but are not guaranteed (though can possibly) to return a well-defined primary key. At worse an explicit error about the view itself, not the apparent lack of primary key, should be emitted. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/RETURNING-PRIMARY-KEY-syntax-extension-tp5806462p5806464.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.
On 09/06/14 14:47, David G Johnston wrote: > Ian Barwick wrote >> Hi, >> >> The JDBC API provides the getGeneratedKeys() method as a way of retrieving >> primary key values without the need to explicitly specify the primary key >> column(s). This is a widely-used feature, however the implementation has >> significant >> performance drawbacks. >> >> Currently this feature is implemented in the JDBC driver by appending >> "RETURNING *" to the supplied statement. However this means all columns of >> affected rows will be returned to the client, which causes significant >> performance problems, particularly on wide tables. To mitigate this, it >> would >> be desirable to enable the JDBC driver to request only the primary key >> value(s). > > Seems like a good idea. > > >> ERROR: Relation does not have any primary key(s) > > "Relation does not have a primary key." > or > "Relation has no primary key." (preferred) > > By definition it cannot have more than one so it must have none. Ah yes, amazing what a fresh pair of eyes does :). The plural is the vestige of an earlier iteration which said something about the relation not having any primary key column(s). Will fix, thanks. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Monday, June 9, 2014, Ian Barwick <ian@2ndquadrant.com> wrote:
Then again the extra returning checks may just amount noise.
On 09/06/14 14:47, David G Johnston wrote:Ian Barwick wroteHi,
The JDBC API provides the getGeneratedKeys() method as a way of retrieving
primary key values without the need to explicitly specify the primary key
column(s). This is a widely-used feature, however the implementation has
significant
performance drawbacks.
Currently this feature is implemented in the JDBC driver by appending
"RETURNING *" to the supplied statement. However this means all columns of
affected rows will be returned to the client, which causes significant
performance problems, particularly on wide tables. To mitigate this, it
would
be desirable to enable the JDBC driver to request only the primary key
value(s).
ISTM that having a non-null returning clause variable when no returning is present in the command makes things more complicated and introduces unnecessary checks in the not uncommon case of multiple non-returning commands being issued in series.
returningList was able to be null and so should returningClause. Then if non-null first check for the easy column listing and then check for the more expensive PK lookup request.
David J.
On 09/06/14 17:47, David G Johnston wrote: > Ian Barwick wrote >> Hi, >> >> The JDBC API provides the getGeneratedKeys() method as a way of retrieving >> primary key values without the need to explicitly specify the primary key >> column(s). This is a widely-used feature, however the implementation has >> significant >> performance drawbacks. >> >> Currently this feature is implemented in the JDBC driver by appending >> "RETURNING *" to the supplied statement. However this means all columns of >> affected rows will be returned to the client, which causes significant >> performance problems, particularly on wide tables. To mitigate this, it >> would >> be desirable to enable the JDBC driver to request only the primary key >> value(s). > Seems like a good idea. > > >> ERROR: Relation does not have any primary key(s) > "Relation does not have a primary key." > or > "Relation has no primary key." (preferred) > > By definition it cannot have more than one so it must have none. > > It could have multiple unique constraints but I do not believe they are > considered if not tagged as primary. > > David J. > > > > > > -- > View this message in context: http://postgresql.1045698.n5.nabble.com/RETURNING-PRIMARY-KEY-syntax-extension-tp5806462p5806463.html > Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. > >From memory all unique keys can be considered 'candidate primary keys', but only one can be designated 'the PRIMARY KEY'. I also like your preferred error message, and to the full extent of my decidedly Non-Authority, I hereby authorise it! :-) Cheers, Gavin
On 06/09/2014 09:06 AM, Gavin Flower wrote: > From memory all unique keys can be considered 'candidate primary keys', > but only one can be designated 'the PRIMARY KEY'. Almost. Candidate keys are also NOT NULL. -- Vik
David G Johnston <david.g.johnston@gmail.com> wrote: >> ERROR: Relation does not have any primary key(s) > > "Relation does not have a primary key." > or > "Relation has no primary key." (preferred) Project style says that the primary message should not capitalize the first word, nor should it end in a period. Detail and hints should be in sentence style, but not the message itself. http://www.postgresql.org/docs/9.3/interactive/error-style-guide.html#AEN100914 -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 06/09/2014 06:58 AM, Ian Barwick wrote: > Hi, > > The JDBC API provides the getGeneratedKeys() method as a way of > retrieving > primary key values without the need to explicitly specify the primary key > column(s). Is it defined by the standard, to return _only_ generated primary keys, and not for example generated alternate keys ? Cheers -- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
Ian Barwick <ian@2ndquadrant.com> writes: > [ RETURNING PRIMARY KEY ] It looks to me like this is coded to have the expansion of the "primary key" done at parse time, which seems like fundamentally the wrong thing. Consider a view or rule containing this clause; the pkey might be different by the time execution rolls around. It'd be better probably if the rewriter or planner did the expansion (and threw the error for no-primary-key, if necessary). Alternatively, we could do it like this and consider that the view is dependent on the primary key constraint, but that seems inflexible. BTW, it seems like your representation of the clause was rather poorly chosen: it forces changing a whole lot of code that otherwise would not need to be changed. I'd have left returningList alone and put the returningPrimaryKey flag someplace else. regards, tom lane
On 09/06/14 23:42, Vik Fearing wrote: > On 06/09/2014 09:06 AM, Gavin Flower wrote: >> From memory all unique keys can be considered 'candidate primary keys', >> but only one can be designated 'the PRIMARY KEY'. > Almost. Candidate keys are also NOT NULL. Yeah, obviously! (Except, I did actually forget that - me bad.)
On 2014-06-09 13:42:22 +0200, Vik Fearing wrote: > On 06/09/2014 09:06 AM, Gavin Flower wrote: > > From memory all unique keys can be considered 'candidate primary keys', > > but only one can be designated 'the PRIMARY KEY'. > > Almost. Candidate keys are also NOT NULL. The list actually is a bit longer. They also cannot be partial. There's generally also the restriction that for some contexts - like e.g. foreign keys - primary/candidate keys may not be deferrable.. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 6/9/14, 8:35 AM, Hannu Krosing wrote: > On 06/09/2014 06:58 AM, Ian Barwick wrote: >> Hi, >> >> The JDBC API provides the getGeneratedKeys() method as a way of >> retrieving >> primary key values without the need to explicitly specify the primary key >> column(s). > Is it defined by the standard, to return _only_ generated primary keys, > and not > for example generated alternate keys ? I was wondering that myself. I think it's certainly reasonable to expect someone would wan RETURNING SEQUENCE VALUES, whichwould return the value of every column that owned a sequence (ie: ALTER SEQUENCE ... OWNED BY). ISTM that would certainlyhandle the performance aspect of this, and it sounds more in line with what I'd expect getGeneratedKeys() to do. -- Jim C. Nasby, Data Architect jim@nasby.net 512.569.9461 (cell) http://jim.nasby.net
A definite +1 on this feature. A while ago I got partway through hacking the hibernate postgres dialect to make it issue a RETURNING clause to spit out the primary key before I realised that the driver was already doing a RETURNING * internally.
On 10 June 2014 05:53, Jim Nasby <jim@nasby.net> wrote:
> I was wondering that myself. I think it's certainly reasonable to expect
> someone would wan RETURNING SEQUENCE VALUES, which would return the value of
> every column that owned a sequence (ie: ALTER SEQUENCE ... OWNED BY). ISTM
> that would certainly handle the performance aspect of this, and it sounds
> more in line with what I'd expect getGeneratedKeys() to do.
Keep in mind that not all generated keys come from sequences. Many people have custom key generator functions, including UUIDs and other exotic things like Instagram's setup [1].
RETURNING GENERATED KEYS perhaps, but then how do we determine that? Any column that was filled with a default value? But that's potentially returning far more values than the user will want - I bet 99% of users just want their generated primary key.
The spec is a bit vague [2]:
Retrieves any auto-generated keys created as a result of executing
On 10 June 2014 05:53, Jim Nasby <jim@nasby.net> wrote:
> I was wondering that myself. I think it's certainly reasonable to expect
> someone would wan RETURNING SEQUENCE VALUES, which would return the value of
> every column that owned a sequence (ie: ALTER SEQUENCE ... OWNED BY). ISTM
> that would certainly handle the performance aspect of this, and it sounds
> more in line with what I'd expect getGeneratedKeys() to do.
Keep in mind that not all generated keys come from sequences. Many people have custom key generator functions, including UUIDs and other exotic things like Instagram's setup [1].
RETURNING GENERATED KEYS perhaps, but then how do we determine that? Any column that was filled with a default value? But that's potentially returning far more values than the user will want - I bet 99% of users just want their generated primary key.
The spec is a bit vague [2]:
Retrieves any auto-generated keys created as a result of executing
this Statement object. If this Statement object did not generate any
keys, an empty ResultSet object is returned.
Note:If the columns which represent the auto-generated keys were
The second paragraph refers to [3] and [4] where the application can specify which columns it's after. Given that there's a mechanism to specify which keys the application wants returned in the driver, and the driver in that case can just issue a RETURNING clause with a column list, my gut feel would be to just support returning primary keys as that will handle most cases of e.g. middleware like ORMs fetching that without needing to know the specific column names.
Note:If the columns which represent the auto-generated keys were
not specified, the JDBC driver implementation will determine the
columns which best represent the auto-generated keys.
The second paragraph refers to [3] and [4] where the application can specify which columns it's after. Given that there's a mechanism to specify which keys the application wants returned in the driver, and the driver in that case can just issue a RETURNING clause with a column list, my gut feel would be to just support returning primary keys as that will handle most cases of e.g. middleware like ORMs fetching that without needing to know the specific column names.
Cheers
Tom
On 06/10/2014 03:19 AM, Tom Dunstan wrote:
What about RETURNING CHANGED FIELDS ?A definite +1 on this feature. A while ago I got partway through hacking the hibernate postgres dialect to make it issue a RETURNING clause to spit out the primary key before I realised that the driver was already doing a RETURNING * internally.
On 10 June 2014 05:53, Jim Nasby <jim@nasby.net> wrote:
> I was wondering that myself. I think it's certainly reasonable to expect
> someone would wan RETURNING SEQUENCE VALUES, which would return the value of
> every column that owned a sequence (ie: ALTER SEQUENCE ... OWNED BY). ISTM
> that would certainly handle the performance aspect of this, and it sounds
> more in line with what I'd expect getGeneratedKeys() to do.
Keep in mind that not all generated keys come from sequences. Many people have custom key generator functions, including UUIDs and other exotic things like Instagram's setup [1].
RETURNING GENERATED KEYS perhaps, but then how do we determine that?
Might be quite complicated technically, but this is what is probably wanted.
Any column that was filled with a default value? But that's potentially returning far more values than the user will want - I bet 99% of users just want their generated primary key.
Probably not true - you would want your ORM model to be in sync with what is database after you save it if you plan to do any further processing using it.
At least I would :)
Why not then just leave the whole thing as it is on server side, and let the ORM specify which "generated keys" it wants ?
The spec is a bit vague [2]:
Retrieves any auto-generated keys created as a result of executingthis Statement object. If this Statement object did not generate anykeys, an empty ResultSet object is returned.
Note:If the columns which represent the auto-generated keys werenot specified, the JDBC driver implementation will determine thecolumns which best represent the auto-generated keys.
The second paragraph refers to [3] and [4] where the application can specify which columns it's after. Given that there's a mechanism to specify which keys the application wants returned in the driver, and the driver in that case can just issue a RETURNING clause with a column list, my gut feel would be to just support returning primary keys as that will handle most cases of e.g. middleware like ORMs fetching that without needing to know the specific column names.
-- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
On 10 June 2014 17:49, Hannu Krosing <hannu@2ndquadrant.com> wrote:
Seems to be getting further away from something that describes the main use case - changed fields sounds like something that would apply to an update statement.
RETURNING GENERATED KEYS perhaps, but then how do we determine that?What about RETURNING CHANGED FIELDS ?
Might be quite complicated technically, but this is what is probably wanted.
Seems to be getting further away from something that describes the main use case - changed fields sounds like something that would apply to an update statement.
Probably not true - you would want your ORM model to be in sync with what is database after you save it if you plan to do any further processing using it.Any column that was filled with a default value? But that's potentially returning far more values than the user will want - I bet 99% of users just want their generated primary key.
Well, yes, but since RETURNING is non-standard most ORMs are unlikely to support fetching other generated values that way anyway. The ones that I've dealt with will do an insert, then a select to get the extra fields. I don't know if other JDBC drivers allow applications to just specify any old list of non-key columns to the execute method, but I suspect not, given that the way they fetch those columns is rather less general-purpose than our RETURNING syntax.
Why not then just leave the whole thing as it is on server side, and let the ORM specify which "generated keys" it wants ?The second paragraph refers to [3] and [4] where the application can specify which columns it's after. Given that there's a mechanism to specify which keys the application wants returned in the driver, and the driver in that case can just issue a RETURNING clause with a column list, my gut feel would be to just support returning primary keys as that will handle most cases of e.g. middleware like ORMs fetching that without needing to know the specific column names.
Because java-based ORMs (at least) mostly don't have to - other server/driver combos manage to implement getGeneratedKeys() without being explicitly given a column list, they just do the sane thing and return the appropriate identity column or whatever for the inserted row.
I agree that in hand-crafted JDBC there's no particular problem in making a user specify a column list, (although I don't think I've EVER seen anyone actually do that in the wild), but most middleware will expect getGeneratedKeys() to just work and we should try to do something about making that case work a bit more efficiently than it does now.
Cheers
Tom
On 06/10/2014 11:02 AM, Tom Dunstan wrote:
Not really - it applies to both INSERT and UPDATE if there are any triggers and/or default valuesOn 10 June 2014 17:49, Hannu Krosing <hannu@2ndquadrant.com> wrote:RETURNING GENERATED KEYS perhaps, but then how do we determine that?What about RETURNING CHANGED FIELDS ?
Might be quite complicated technically, but this is what is probably wanted.
Seems to be getting further away from something that describes the main use
case - changed fields sounds like something that would apply to an update statement.
The use-case is an extended version of getting the key, with the main aim of making sure
that your ORM model is the same as what is saved in database.
But does the ORM already not "know" the names of auto-generated keys and thus could easily replace them for * in RETURNING ?Probably not true - you would want your ORM model to be in sync with what is database after you save it if you plan to do any further processing using it.Any column that was filled with a default value? But that's potentially returning far more values than the user will want - I bet 99% of users just want their generated primary key.Well, yes, but since RETURNING is non-standard most ORMs are unlikely to support fetching other generated values that way anyway. The ones that I've dealt with will do an insert, then a select to get the extra fields. I don't know if other JDBC drivers allow applications to just specify any old list of non-key columns to the execute method, but I suspect not, given that the way they fetch those columns is rather less general-purpose than our RETURNING syntax.Why not then just leave the whole thing as it is on server side, and let the ORM specify which "generated keys" it wants ?The second paragraph refers to [3] and [4] where the application can specify which columns it's after. Given that there's a mechanism to specify which keys the application wants returned in the driver, and the driver in that case can just issue a RETURNING clause with a column list, my gut feel would be to just support returning primary keys as that will handle most cases of e.g. middleware like ORMs fetching that without needing to know the specific column names.Because java-based ORMs (at least) mostly don't have to - other server/driver combos manage to implement getGeneratedKeys() without being explicitly given a column list, they just do the sane thing and return the appropriate identity column or whatever for the inserted row.I agree that in hand-crafted JDBC there's no particular problem in making a user specify a column list, (although I don't think I've EVER seen anyone actually do that in the wild), but most middleware will expect getGeneratedKeys() to just work and we should try to do something about making that case work a bit more efficiently than it does now.
CheersTom
-- Hannu Krosing PostgreSQL Consultant Performance, Scalability and High Availability 2ndQuadrant Nordic OÜ
On 06/09/2014 07:13 PM, Andres Freund wrote: > On 2014-06-09 13:42:22 +0200, Vik Fearing wrote: >> On 06/09/2014 09:06 AM, Gavin Flower wrote: >>> From memory all unique keys can be considered 'candidate primary keys', >>> but only one can be designated 'the PRIMARY KEY'. >> >> Almost. Candidate keys are also NOT NULL. > > The list actually is a bit longer. They also cannot be partial. What? AFAIK, that only applies to an index. How can the data itself be partial? > There's generally also the restriction that for some contexts - like > e.g. foreign keys - primary/candidate keys may not be deferrable.. Again, what is deferrable data? -- Vik
On 2014-06-11 00:21:58 +0200, Vik Fearing wrote: > On 06/09/2014 07:13 PM, Andres Freund wrote: > > On 2014-06-09 13:42:22 +0200, Vik Fearing wrote: > >> On 06/09/2014 09:06 AM, Gavin Flower wrote: > >>> From memory all unique keys can be considered 'candidate primary keys', > >>> but only one can be designated 'the PRIMARY KEY'. > >> > >> Almost. Candidate keys are also NOT NULL. > > > > The list actually is a bit longer. They also cannot be partial. > > What? AFAIK, that only applies to an index. How can the data itself be > partial? I don't follow? Gavin above talked about unique keys - which in postgres you can create using CREATE UNIQUE INDEX. And if you make those partial they can't be used for this purpose. > > There's generally also the restriction that for some contexts - like > > e.g. foreign keys - primary/candidate keys may not be deferrable.. > > Again, what is deferrable data? You can define primary/unique constraints to be deferrable. c.f. CREATE TABLE docs. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-06-11 00:21:58 +0200, Vik Fearing wrote: >> What? AFAIK, that only applies to an index. How can the data itself be >> partial? > I don't follow? Gavin above talked about unique keys - which in postgres > you can create using CREATE UNIQUE INDEX. And if you make those partial > they can't be used for this purpose. I have a feeling this conversation is going in the wrong direction. ISTM that to be useful at all, the set of columns that would be returned by a clause like this has to be *extremely* predictable; otherwise the application won't know what to do with the results. If the app has to examine the table's metadata to identify what it's getting, what's the point of the feature at all as opposed to just listing the columns you want explicitly? So I doubt that the use-case for anything more complicated than returning the primary key, full stop, is really there. I'm not even 100% sold that automatically returning the primary key is going to save any application logic. Could somebody point out *exactly* where an app is going to save effort with this type of syntax, compared to requesting the columns it wants by name? Is it going to save enough to justify depending on a syntax that won't be universal for a long time to come? regards, tom lane
On 11 June 2014 10:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I'm not even 100% sold that automatically returning the primary keyis going to save any application logic. Could somebody point out
*exactly* where an app is going to save effort with this type of
syntax, compared to requesting the columns it wants by name?
Is it going to save enough to justify depending on a syntax that won't
be universal for a long time to come?
Well, in e.g. Hibernate there's piece of code which calls getGeneratedKeys() to fetch the inserted primary key (it only supports fetching a single generated key column in this way) if the underlying database supports that. The postgresql dialect specifies that it does support that code path, so at the moment any hibernate users who aren't explicitly specifying the "sequence" type for their id generation will be calling that, and the JDBC driver will be appending "RETURNING *" under the hood for all inserts.
Looking at the oracle hibernate dialect is instructive as to the state of support for the explicit-column-list variation:
// Oracle driver reports to support getGeneratedKeys(), but they only// support the version taking an array of the names of the columns to// be returned (via its RETURNING clause). No other driver seems to// support this overloaded version.
And so hibernate doesn't support the explicit-column-list version at all since apparently no-one else supports it, and just marks that code path as unsupported for oracle. I presume that the situation is similar in other java-based ORMs.
Looking at some other drivers that I would expect to support getGeneratedKeys() in a sane way given their identity/auto-increment semantics reveals:
- JTDS driver for MSSQL/Sybase piggybacks a second query to do "SELECT SCOPE_IDENTITY() AS blah" / "SELECT @@IDENTITY AS blah" to fetch the key if that was requested. It looks like this driver does support specifying the column name, but it only allows a single column to be given, and it promptly ignores the passed in value and calls the non-specified version.
- MySQL driver internally returns a single ID with the query result, and the driver then appears to add an auto-increment amount to calculate the rest of the values. I guess MySQL must allocate the ids in guaranteed-sequential chunks. MySQL only supports a single auto-increment key. If called with the explicit column version, the passed-in column names are ignored.
So looks like other JDBC driver/server combos only support this for single-column primary keys. But for those cases things pretty much work as expected. It would be nice to be able to support at least primary keys with this feature.
We could try to teach every ORM out there to call the explicit column-list version, but given other lack of support for it I doubt they'll be interested, especially if the reason is because we don't want to add enough support to make getGeneratedKeys() work efficiently.
FWIW I reckon for most users of ORMs at least it will be enough to support this for direct inserts to tables - views is a nice-to-have but I'd take tables-only over not at all.
Cheers
Tom
Is it going to save enough to justify depending on a syntax that won't
be universal for a long time to come?
Oh, and on the won't-be-universal-for-a-while point - the status quo works fine, it's just less efficient than it should be. Once someone upgrades to 9.5 or whatever, and upgrades to the matching JDBC driver version, they'll get the newer efficient call for free.
In the python world PEP249 has a lastrowid property that drivers can implement, but I don't know how much it's used or our support for it. Any python devs out there want to chime in? I don't know about other drivers.
Obviously anyone hand-crafting their queries won't be able to do that until they know it's safe. But that's always the case with new syntax.
Cheers
Tom
On Wed, Jun 11, 2014 at 2:39 AM, Tom Lane wrote:
I'm not even 100% sold that automatically returning the primary key
is going to save any application logic. Could somebody point out
*exactly* where an app is going to save effort with this type of
syntax, compared to requesting the columns it wants by name?
I haven't checked the code, but I am hoping it will help with the problem where a RETURNING * is added to a statement that is not an insert or update by the JDBC driver. That has been reported on the JDBC list at least twice, and the proposed workaround is neither very elegant nor very robust: https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ
Jochem
Jochem van Dieten
http://jochem.vandieten.net/
On 14/06/12 18:46, Jochem van Dieten wrote: > On Wed, Jun 11, 2014 at 2:39 AM, Tom Lane wrote: > > I'm not even 100% sold that automatically returning the primary key > is going to save any application logic. Could somebody point out > *exactly* where an app is going to save effort with this type of > syntax, compared to requesting the columns it wants by name? > > > I haven't checked the code, but I am hoping it will help with the problem > where a RETURNING * is added to a statement that is not an insert or update > by the JDBC driver. That has been reported on the JDBC list at least twice, > and the proposed workaround is neither very elegant nor very robust: > https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ Unfortunately that seems to be a JDBC-specific issue, which is outside of the scope of this particular patch (which proposes additional server-side syntax intended to make RETURNING * operations more efficient for certain use cases, but which is in itself not a JDBC change). Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jun 12, 2014 at 12:25 PM, Ian Barwick wrote:
Unfortunately that seems to be a JDBC-specific issue, which is outsideOn 14/06/12 18:46, Jochem van Dieten wrote:
> I haven't checked the code, but I am hoping it will help with the problem
> where a RETURNING * is added to a statement that is not an insert or update
> by the JDBC driver. That has been reported on the JDBC list at least twice,
> and the proposed workaround is neither very elegant nor very robust:
> https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ
of the scope of this particular patch (which proposes additional server-side
syntax intended to make RETURNING * operations more efficient for
certain use cases, but which is in itself not a JDBC change).
But the obvious way to fix the JDBC issue is not to fix it by adding a 'mini parser' on the JDBC side, but to make SELECT ... RETURNING PRIMARY KEY a regular select that silently ignores the returning clause and doesn't throw an error on the server-side.
That might still be outside the scope of this particular patch, but it would provide (additional) justification if it were supported.
Jochem
Jochem van Dieten
http://jochem.vandieten.net/
On 2014-06-12 13:58:31 +0200, Jochem van Dieten wrote: > But the obvious way to fix the JDBC issue is not to fix it by adding a > 'mini parser' on the JDBC side, but to make SELECT ... RETURNING PRIMARY > KEY a regular select that silently ignores the returning clause and doesn't > throw an error on the server-side. Brr. Then it'd need to be added to all statements, not just SELECT. I've my doubts that's going to happen. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 14/06/12 20:58, Jochem van Dieten wrote: > On Thu, Jun 12, 2014 at 12:25 PM, Ian Barwick wrote: > > On 14/06/12 18:46, Jochem van Dieten wrote: > > I haven't checked the code, but I am hoping it will help with the problem > > where a RETURNING * is added to a statement that is not an insert or update > > by the JDBC driver. That has been reported on the JDBC list at least twice, > > and the proposed workaround is neither very elegant nor very robust: > > https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ > > Unfortunately that seems to be a JDBC-specific issue, which is outside > of the scope of this particular patch (which proposes additional server-side > syntax intended to make RETURNING * operations more efficient for > certain use cases, but which is in itself not a JDBC change). > > > But the obvious way to fix the JDBC issue is not to fix it by adding a 'mini parser' on > the JDBC side, but to make SELECT ... RETURNING PRIMARY KEY a regular select that silently > ignores the returning clause and doesn't throw an error on the server-side. > > That might still be outside the scope of this particular patch, but it would provide > (additional) justification if it were supported. That would be adding superfluous, unused and unusable syntax of no potential value (there is no SELECT ... RETURNING and it wouldn't make any sense if there was) as a workaround for a driver issue - not going to happen. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hello All,
--
Rushabh Lathia
I assigned my self as reviewer of the patch. I gone through the
mail chain discussion and in that question has been raised about
the feature and its implementation, so would like to know what is
the current status of this project/patch.
Regards,
On Thu, Jun 12, 2014 at 5:53 PM, Ian Barwick <ian@2ndquadrant.com> wrote:
That would be adding superfluous, unused and unusable syntax of no potential valueOn 14/06/12 20:58, Jochem van Dieten wrote:
> On Thu, Jun 12, 2014 at 12:25 PM, Ian Barwick wrote:
>
> On 14/06/12 18:46, Jochem van Dieten wrote:
> > I haven't checked the code, but I am hoping it will help with the problem
> > where a RETURNING * is added to a statement that is not an insert or update
> > by the JDBC driver. That has been reported on the JDBC list at least twice,
> > and the proposed workaround is neither very elegant nor very robust:
> > https://groups.google.com/forum/#!msg/pgsql.interfaces.jdbc/7WY60JX3qyo/-v1fqDqLQKwJ
>
> Unfortunately that seems to be a JDBC-specific issue, which is outside
> of the scope of this particular patch (which proposes additional server-side
> syntax intended to make RETURNING * operations more efficient for
> certain use cases, but which is in itself not a JDBC change).
>
>
> But the obvious way to fix the JDBC issue is not to fix it by adding a 'mini parser' on
> the JDBC side, but to make SELECT ... RETURNING PRIMARY KEY a regular select that silently
> ignores the returning clause and doesn't throw an error on the server-side.
>
> That might still be outside the scope of this particular patch, but it would provide
> (additional) justification if it were supported.
(there is no SELECT ... RETURNING and it wouldn't make any sense if there was) as a
workaround for a driver issue - not going to happen.
Regards
Ian Barwick
--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Rushabh Lathia
Hi On 14/06/25 15:13, Rushabh Lathia wrote: > Hello All, > > I assigned my self as reviewer of the patch. I gone through the > mail chain discussion and in that question has been raised about > the feature and its implementation, so would like to know what is > the current status of this project/patch. > > Regards, I'll be submitting a revised version of this patch very shortly. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 25/06/14 16:04, Ian Barwick wrote: > Hi > > On 14/06/25 15:13, Rushabh Lathia wrote: >> Hello All, >> >> I assigned my self as reviewer of the patch. I gone through the >> mail chain discussion and in that question has been raised about >> the feature and its implementation, so would like to know what is >> the current status of this project/patch. >> >> Regards, > > I'll be submitting a revised version of this patch very shortly. Revised version of the patch attached, which implements the expansion of "primary key" in the rewrite phase per Tom Lane's suggestion upthread [*] [*] http://www.postgresql.org/message-id/28583.1402325169@sss.pgh.pa.us Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Thanks for sharing latest patch.
For me this syntax is limiting the current returning clause implementation.
Because its not allowing the returning PRIMARY KEYS with other columns, and
if user or application require that they need to do RETURNING *. Following
syntax not working, which i think is very useful in term if someone want to use
returning clause with PRIMARY KEY as well as some other table columns.
INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary key, dname;
I think allowing other columns with PRIMARY KEY would be more useful syntax.
Even in later versions if we want to extend this syntax to return UNIQUE KEY,
SEQUENCE VALUES, etc.. comma separation syntax will be more handy.
For me this syntax is limiting the current returning clause implementation.
Because its not allowing the returning PRIMARY KEYS with other columns, and
if user or application require that they need to do RETURNING *. Following
syntax not working, which i think is very useful in term if someone want to use
returning clause with PRIMARY KEY as well as some other table columns.
INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary key, dname;
I think allowing other columns with PRIMARY KEY would be more useful syntax.
Even in later versions if we want to extend this syntax to return UNIQUE KEY,
SEQUENCE VALUES, etc.. comma separation syntax will be more handy.
Others please share your inputs/thoughts.
On Thu, Jun 26, 2014 at 8:11 AM, Ian Barwick <ian@2ndquadrant.com> wrote:
On 25/06/14 16:04, Ian Barwick wrote:Revised version of the patch attached, which implements the expansionHi
On 14/06/25 15:13, Rushabh Lathia wrote:Hello All,
I assigned my self as reviewer of the patch. I gone through the
mail chain discussion and in that question has been raised about
the feature and its implementation, so would like to know what is
the current status of this project/patch.
Regards,
I'll be submitting a revised version of this patch very shortly.
of "primary key" in the rewrite phase per Tom Lane's suggestion upthread [*]
[*] http://www.postgresql.org/message-id/28583.1402325169@sss.pgh.pa.us
Regards
Ian Barwick
--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Rushabh Lathia
On 27/06/14 00:12, Rushabh Lathia wrote: > Thanks for sharing latest patch. > > For me this syntax is limiting the current returning clause > implementation. > Because its not allowing the returning PRIMARY KEYS with other > columns, and > if user or application require that they need to do RETURNING *. Following > syntax not working, which i think is very useful in term if someone > want to use > returning clause with PRIMARY KEY as well as some other table columns. > > INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary > key, dname; > > I think allowing other columns with PRIMARY KEY would be more useful > syntax. > Even in later versions if we want to extend this syntax to return > UNIQUE KEY, > SEQUENCE VALUES, etc.. comma separation syntax will be more handy. > > Others please share your inputs/thoughts. > > [...] I agree 100%. I note that DELETE & INSERT have a RETURNING clause. So the semantics and syntax should be the same, as much as is practicable. Cheers, Gavin
On 27 June 2014 06:14, Gavin Flower <GavinFlower@archidevsys.co.nz> wrote:
On 27/06/14 00:12, Rushabh Lathia wrote:INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary key, dname;
I think allowing other columns with PRIMARY KEY would be more useful syntax.
Even in later versions if we want to extend this syntax to return UNIQUE KEY,
SEQUENCE VALUES, etc.. comma separation syntax will be more handy.
I agree 100%.
If the query is being hand-crafted, what's to stop the query writer from just listing the id columns in the returning clause? And someone specifying RETURNING * is getting all the columns anyway.
The target use-case for this feature is a database driver that has just done an insert and doesn't know what the primary key columns are - in that case mixing them with any other columns is actually counter-productive as the driver won't know which columns are which. What use cases are there where the writer of the query knows enough to write specific columns in the RETURNING clause but not enough to know which column is the id column?
Consistency is nice, and I can understand wanting to treat the PRIMARY KEY bit as just another set of columns in the list to return, but I'd hate to see this feature put on the back-burner to support use-cases that are already handled by the current RETURNING feature. Maybe it's easy to do, though.. I haven't looked into the implementation at all.
Cheers
Tom
On 27/06/14 09:09, Tom Dunstan wrote: > On 27 June 2014 06:14, Gavin Flower <GavinFlower@archidevsys.co.nz <mailto:GavinFlower@archidevsys.co.nz>> wrote: > > On 27/06/14 00:12, Rushabh Lathia wrote: > > INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary key, dname; > > I think allowing other columns with PRIMARY KEY would be more useful syntax. > Even in later versions if we want to extend this syntax to return UNIQUE KEY, > SEQUENCE VALUES, etc.. comma separation syntax will be more handy. > > > I agree 100%. > > > If the query is being hand-crafted, what's to stop the query writer from just listing the> id columns in the returningclause? And someone specifying RETURNING * is getting all the> columns anyway. > > The target use-case for this feature is a database driver that has just done an insert and> doesn't know what the primarykey columns are - in that case mixing them with any other> columns is actually counter-productive as the driver won'tknow which columns are which.> What use cases are there where the writer of the query knows enough to write specificcolumns> in the RETURNING clause but not enough to know which column is the id column? > > Consistency is nice, and I can understand wanting to treat the PRIMARY KEY bit as just > another set of columns in the list to return, but I'd hate to see this feature put on> the back-burner to support use-casesthat are already handled by the current RETURNING> feature. Maybe it's easy to do, though.. I haven't looked intothe implementation at all. Normal columns are injected into the query's returning list at parse time, whereas this version of the patch handles expansion of PRIMARY KEY at the rewrite stage, which would make handling a mix of PRIMARY KEY and normal output expressions somewhat tricky to handle. (In order to maintain the columns in their expected position you'd have to add some sort of placeholder/dummy TargetEntry to the returning list at parse time, then rewrite it later with the expanded primary key columns, or something equally messy). On the other hand, it should be fairly straightforward to handle a list of keywords for expansion (e.g. "RETURNING PRIMARY KEY, UNIQUE KEYS, SEQUENCE VALUES") should the need arise. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
.) Patch gets applied through patch -p1 (git apply fails)
.) trailing whitespace in the patch at various places
.) Unnecessary new line + and - in the patch.
(src/backend/rewrite/rewriteManip.c::getInsertSelectQuery())
(src/include/rewrite/rewriteManip.h)
.) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY
bitmap to get the keycols. In IndexAttrBitmapKind there is also
INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and
INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in
the code. Later with use of testcase and debugging found confirmed that
INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key.
.) At present in patch when RETURNING PRIMARY KEY is used on table having no
primary key it throw an error. If I am not wrong JDBC will be using that into
getGeneratedKeys(). Currently this feature is implemented in the JDBC driver by
appending "RETURNING *" to the supplied statement. With this implementation
it will replace "RETURNING *" with "RETURNING PRIMARY KEY", right ? So just
wondering what JDBC expect getGeneratedKeys() to return when query don't
have primary key and user called executeUpdate() with
Statement.RETURN_GENERATED_KEYS? I looked at JDBC specification but its not
clear what it will return when table don't have keys. Can you please let us
know your finding on this ?
.) Already raised my concern regarding syntax limiting the current returning
clause implementation.
bitmap to get the keycols. In IndexAttrBitmapKind there is also
INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and
INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in
the code. Later with use of testcase and debugging found confirmed that
INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key.
.) At present in patch when RETURNING PRIMARY KEY is used on table having no
primary key it throw an error. If I am not wrong JDBC will be using that into
getGeneratedKeys(). Currently this feature is implemented in the JDBC driver by
appending "RETURNING *" to the supplied statement. With this implementation
it will replace "RETURNING *" with "RETURNING PRIMARY KEY", right ? So just
wondering what JDBC expect getGeneratedKeys() to return when query don't
have primary key and user called executeUpdate() with
Statement.RETURN_GENERATED_KEYS? I looked at JDBC specification but its not
clear what it will return when table don't have keys. Can you please let us
know your finding on this ?
.) Already raised my concern regarding syntax limiting the current returning
clause implementation.
On Fri, Jun 27, 2014 at 6:22 AM, Ian Barwick <ian@2ndquadrant.com> wrote:
On 27/06/14 09:09, Tom Dunstan wrote:Normal columns are injected into the query's returning list at parse time, whereasOn 27 June 2014 06:14, Gavin Flower <GavinFlower@archidevsys.co.nz <mailto:GavinFlower@archidevsys.co.nz>> wrote:> id columns in the returning clause? And someone specifying RETURNING * is getting all the
On 27/06/14 00:12, Rushabh Lathia wrote:
INSERT INTO dept VALUES (10,'ACCOUNTING','NEW YORK') returning primary key, dname;
I think allowing other columns with PRIMARY KEY would be more useful syntax.
Even in later versions if we want to extend this syntax to return UNIQUE KEY,
SEQUENCE VALUES, etc.. comma separation syntax will be more handy.
I agree 100%.
If the query is being hand-crafted, what's to stop the query writer from just listing the
> columns anyway.> doesn't know what the primary key columns are - in that case mixing them with any other
The target use-case for this feature is a database driver that has just done an insert and
> columns is actually counter-productive as the driver won't know which columns are which.
> What use cases are there where the writer of the query knows enough to write specific columns
> in the RETURNING clause but not enough to know which column is the id column?> the back-burner to support use-cases that are already handled by the current RETURNING
Consistency is nice, and I can understand wanting to treat the PRIMARY KEY bit as just
another set of columns in the list to return, but I'd hate to see this feature put on
> feature. Maybe it's easy to do, though.. I haven't looked into the implementation at all.
this version of the patch handles expansion of PRIMARY KEY at the rewrite stage, which
would make handling a mix of PRIMARY KEY and normal output expressions somewhat tricky
to handle. (In order to maintain the columns in their expected position you'd
have to add some sort of placeholder/dummy TargetEntry to the returning list at parse
time, then rewrite it later with the expanded primary key columns, or something
equally messy).
On the other hand, it should be fairly straightforward to handle a list of keywords
for expansion (e.g. "RETURNING PRIMARY KEY, UNIQUE KEYS, SEQUENCE VALUES") should
the need arise.
Regards
Ian Barwick
--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Rushabh Lathia
On Tue, Jul 1, 2014 at 8:00 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote: > .) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY > bitmap to get the keycols. In IndexAttrBitmapKind there is also > INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and > INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in > the code. Later with use of testcase and debugging found confirmed that > INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key. Actually, that depends on how REPLICA IDENTITY is set. IOW, you can't assume that will give you the primary key. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 14/07/01 23:13, Robert Haas wrote: > On Tue, Jul 1, 2014 at 8:00 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote: >> .) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY >> bitmap to get the keycols. In IndexAttrBitmapKind there is also >> INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and >> INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in >> the code. Later with use of testcase and debugging found confirmed that >> INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key. > > Actually, that depends on how REPLICA IDENTITY is set. IOW, you can't > assume that will give you the primary key. Damn, fooled by the name. Thanks for the info; I'll rework the patch accordingly. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/07/14 15:16, Ian Barwick wrote: > On 14/07/01 23:13, Robert Haas wrote: >> On Tue, Jul 1, 2014 at 8:00 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote: >>> .) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY >>> bitmap to get the keycols. In IndexAttrBitmapKind there is also >>> INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and >>> INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in >>> the code. Later with use of testcase and debugging found confirmed that >>> INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key. >> >> Actually, that depends on how REPLICA IDENTITY is set. IOW, you can't >> assume that will give you the primary key. > > Damn, fooled by the name. Thanks for the info; I'll rework the patch > accordingly. Attached version implements an IndexAttrBitmapKind "INDEX_ATTR_BITMAP_PRIMARY_KEY", which will return the primary key column(s). Note this would require a catalog version bump. Regards Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On 01/07/14 21:00, Rushabh Lathia wrote: > > I spent some more time on the patch and here are my review comments. > > .) Patch gets applied through patch -p1 (git apply fails) > > .) trailing whitespace in the patch at various places Not sure where you see this, unless it's in the tests, where it's required. > .) Unnecessary new line + and - in the patch. > (src/backend/rewrite/rewriteManip.c::getInsertSelectQuery()) > (src/include/rewrite/rewriteManip.h) Fixed. > .) patch has proper test coverage and regression running cleanly. > > .) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY > bitmap to get the keycols. In IndexAttrBitmapKind there is also > INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and > INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in > the code. Later with use of testcase and debugging found confirmed that > INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key. Revised patch version (see other mail) fixes this by introducing INDEX_ATTR_BITMAP_PRIMARY_KEY. > .) At present in patch when RETURNING PRIMARY KEY is used on table having no > primary key it throw an error. If I am not wrong JDBC will be using that into > getGeneratedKeys(). Currently this feature is implemented in the JDBC driver by > appending "RETURNING *" to the supplied statement. With this implementation > it will replace "RETURNING *" with "RETURNING PRIMARY KEY", right ? So just > wondering what JDBC expect getGeneratedKeys() to return when query don't > have primary key and user called executeUpdate() with > Statement.RETURN_GENERATED_KEYS? I looked at JDBC specification but its not > clear what it will return when table don't have keys. Can you please let us > know your finding on this ? The spec [*] is indeed frustratingly vague: The method Statement.getGeneratedKeys, which can be called to retrieve the generated value, returns a ResultSet objectwith a column for each automatically generated value. The methods execute, executeUpdate or Connection.prepareStatementaccept an optional parameter, which can be used to indicate that any auto generated valuesshould be returned when the statement is executed or prepared. [*] http://download.oracle.com/otn-pub/jcp/jdbc-4_1-mrel-spec/jdbc4.1-fr-spec.pdf I understand this to mean that no rows will be returned if no auto-generated values are not present. As-is, the patch will raise an error if the target table does not have a primary key, which makes sense from the point of view of the proposed syntax, but which will make it impossible for the JDBC driver to implement the above understanding of the spec (i.e. return nothing if no primary key exists). It would be simple enough not to raise an error in this case, but that means the query would be effectively failing silently and I don't think that's desirable behaviour. A better solution would be to have an optional "IF EXISTS" clause: RETURNING PRIMARY KEY [ IF EXISTS ] which would be easy enough to implement. Thoughts? Ian Barwick -- Ian Barwick http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Jul 3, 2014 at 6:56 AM, Ian Barwick <ian@2ndquadrant.com> wrote:
Hmm liked the idea about adding [ IF EXISTS ]. Looking at the grammar so far we
having [ IF EXISTS ] option only for DDL commands (mainly DROP) not sure whether
its ok to use such syntax for DML commands.
On 01/07/14 21:00, Rushabh Lathia wrote:Not sure where you see this, unless it's in the tests, where it's
I spent some more time on the patch and here are my review comments.
.) Patch gets applied through patch -p1 (git apply fails)
.) trailing whitespace in the patch at various places
required.
Please ignore the whitespace comment. Don't know somehow that was due to
my local changes in the branch.
Fixed..) Unnecessary new line + and - in the patch.
(src/backend/rewrite/rewriteManip.c::getInsertSelectQuery())
(src/include/rewrite/rewriteManip.h)Revised patch version (see other mail) fixes this by introducing.) patch has proper test coverage and regression running cleanly.
.) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY
bitmap to get the keycols. In IndexAttrBitmapKind there is also
INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and
INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in
the code. Later with use of testcase and debugging found confirmed that
INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key.
INDEX_ATTR_BITMAP_PRIMARY_KEY.
Looks good.
The spec [*] is indeed frustratingly vague:.) At present in patch when RETURNING PRIMARY KEY is used on table having no
primary key it throw an error. If I am not wrong JDBC will be using that into
getGeneratedKeys(). Currently this feature is implemented in the JDBC driver by
appending "RETURNING *" to the supplied statement. With this implementation
it will replace "RETURNING *" with "RETURNING PRIMARY KEY", right ? So just
wondering what JDBC expect getGeneratedKeys() to return when query don't
have primary key and user called executeUpdate() with
Statement.RETURN_GENERATED_KEYS? I looked at JDBC specification but its not
clear what it will return when table don't have keys. Can you please let us
know your finding on this ?
The method Statement.getGeneratedKeys, which can be called to retrieve the generated
value, returns a ResultSet object with a column for each automatically generated value.
The methods execute, executeUpdate or Connection.prepareStatement accept an optional
parameter, which can be used to indicate that any auto generated values should be
returned when the statement is executed or prepared.
[*] http://download.oracle.com/otn-pub/jcp/jdbc-4_1-mrel-spec/jdbc4.1-fr-spec.pdf
I understand this to mean that no rows will be returned if no auto-generated values
are not present.
As-is, the patch will raise an error if the target table does not have a primary key,
which makes sense from the point of view of the proposed syntax, but which will
make it impossible for the JDBC driver to implement the above understanding of the spec
(i.e. return nothing if no primary key exists).
The basic reason for this patch is to allow JDBC to use this syntax for getGeneratedKeys().
Now because this patch throw an error when table don't have any primary key, this patch
won't be useful for the getGeneratedKeys() implementation.
won't be useful for the getGeneratedKeys() implementation.
It would be simple enough not to raise an error in this case, but that means the
query would be effectively failing silently and I don't think that's desirable
behaviour.
Yes, not to raise an error won't be desirable behavior.
A better solution would be to have an optional "IF EXISTS" clause:
RETURNING PRIMARY KEY [ IF EXISTS ]
which would be easy enough to implement.
Thoughts?
having [ IF EXISTS ] option only for DDL commands (mainly DROP) not sure whether
its ok to use such syntax for DML commands.
Others please share your thoughts/comments.
Ian Barwick
--
Ian Barwick http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Rushabh Lathia
Rushabh Lathia <rushabh.lathia@gmail.com> writes: > On Thu, Jul 3, 2014 at 6:56 AM, Ian Barwick <ian@2ndquadrant.com> wrote: >> A better solution would be to have an optional "IF EXISTS" clause: >> RETURNING PRIMARY KEY [ IF EXISTS ] > Hmm liked the idea about adding [ IF EXISTS ]. Looking at the grammar so > far we > having [ IF EXISTS ] option only for DDL commands (mainly DROP) not sure > whether > its ok to use such syntax for DML commands. > Others please share your thoughts/comments. TBH, I thought that RETURNING PRIMARY KEY was already fairly iffy in that the application would have little idea what it was getting back. IF EXISTS would make it so spongy as to be useless, IMO. It sounds to me like designing this for JDBC's getGeneratedKeys method is a mistake. There isn't going to be any way that the driver can support that without having looked at the table's metadata for itself, and if it's going to do that then it doesn't need a crutch that only partially solves the problem anyhow. regards, tom lane
On Thu, Jul 3, 2014 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > TBH, I thought that RETURNING PRIMARY KEY was already fairly iffy > in that the application would have little idea what it was getting back. > IF EXISTS would make it so spongy as to be useless, IMO. Seems easy enough to look at the count of columns in the result set. But it seems like noise words -- if you don't put IF EXISTS then surely you'll get the same behaviour anyways, no? Fwiw when I wrote ORM-like layers I used to describe the output of the query, including sometimes issuing WHERE 1<>0 queries and looking at the output column types when I needed that before executing the query. Using table metadata would have required a much more in depth understanding of how the database worked and also wouldn't handle expressions, joins, set operations, etc. -- greg
Greg Stark <stark@mit.edu> writes: > On Thu, Jul 3, 2014 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> TBH, I thought that RETURNING PRIMARY KEY was already fairly iffy >> in that the application would have little idea what it was getting back. >> IF EXISTS would make it so spongy as to be useless, IMO. > Seems easy enough to look at the count of columns in the result set. Sure, if you know how many columns are in the table to begin with; but that goes back to having table metadata. If you don't, you're probably going to be issuing "RETURNING *", and AFAICS "RETURNING *, PRIMARY KEY" would be entirely useless (even without IF EXISTS :-(). I'm still unconvinced that there's a robust use-case here. regards, tom lane
On 4 July 2014 00:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
TBH, I thought that RETURNING PRIMARY KEY was already fairly iffyin that the application would have little idea what it was getting back.
IF EXISTS would make it so spongy as to be useless, IMO.
IF EXISTS is pretty pointless - while the behaviour of getGeneratedKeys() isn't defined for cases where there aren't any, it's only meaningful if the caller has previously asked for the keys to be returned, and someone asking to do that where it doesn't make sense can get an error as far as I'm concerned. No-one does this in practice.
Looking at the feature as a more general SQL construct, ISTM that if someone requests RETURNING PRIMARY KEY where there isn't one, an error is appropriate. And for the IF EXISTS case, when on earth will someone request a primary key even if they're not sure one exists?
It sounds to me like designing this for JDBC's getGeneratedKeys method
is a mistake. There isn't going to be any way that the driver can support
that without having looked at the table's metadata for itself, and if
it's going to do that then it doesn't need a crutch that only partially
solves the problem anyhow.
Sure it can - it append RETURNING PRIMARY KEY and hand back a ResultSet from whatever was returned. It's CURRENTLY doing that, but it's appending RETURNING * and leaving it up to the caller of getGeneratedKeys() to work out which columns the caller is interested in.
Turns out that's actually ok - most Java-based ORMs have more than enough metadata about the tables they manage to know which columns are the primary key. It's the driver that doesn't have that information, which is where RETURNING PRIMARY KEY can help by not forcing the driver to request that every single column is returned.
The only downside that I see is cases where someone requests the keys to be returned but already has a RETURNING clause in their insert statement - what if the requested columns don't include the primary key? IMO it's probably enough to document that if the caller wants to issue a RETURNING clause then they better make sure it contains the columns they're interested in. This is already way outside anything that standard ORMs will do (they don't know about RETURNING) so anyone doing this is hand-crafting anyway.
Cheers
Tom
Tom Dunstan <pgsql@tomd.cc> writes: > On 4 July 2014 00:07, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> It sounds to me like designing this for JDBC's getGeneratedKeys method >> is a mistake. There isn't going to be any way that the driver can support >> that without having looked at the table's metadata for itself, and if >> it's going to do that then it doesn't need a crutch that only partially >> solves the problem anyhow. > Sure it can - it append RETURNING PRIMARY KEY and hand back a ResultSet > from whatever was returned. It's CURRENTLY doing that, but it's appending > RETURNING * and leaving it up to the caller of getGeneratedKeys() to work > out which columns the caller is interested in. I might be mistaken, but as I read the spec for getGeneratedKeys, whether or not a column is in a primary key has got nothing to do with whether that column should be returned. It looks to me like what you're supposed to return is columns with computed default values, eg, serial columns. (Whether we should define it as being *exactly* columns created by the SERIAL macro is an interesting question, but for sure those ought to be included.) Now, the pkey might be a serial column ... but it equally well might not be; it might not have any default expression at all. And there certainly could be serial columns that weren't in the pkey. The fact that the spec is kinda fuzzy doesn't entitle us to ignore the plain meaning of the term "generated key". regards, tom lane
On 4 July 2014 11:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Sure it can - it append RETURNING PRIMARY KEY and hand back a ResultSetI might be mistaken, but as I read the spec for getGeneratedKeys, whether
> from whatever was returned. It's CURRENTLY doing that, but it's appending
> RETURNING * and leaving it up to the caller of getGeneratedKeys() to work
> out which columns the caller is interested in.
or not a column is in a primary key has got nothing to do with whether
that column should be returned. It looks to me like what you're supposed
to return is columns with computed default values, eg, serial columns.
(Whether we should define it as being *exactly* columns created by the
SERIAL macro is an interesting question, but for sure those ought to be
included.) Now, the pkey might be a serial column ... but it equally
well might not be; it might not have any default expression at all.
And there certainly could be serial columns that weren't in the pkey.
The fact that the spec is kinda fuzzy doesn't entitle us to ignore the
plain meaning of the term "generated key".
Well, strictly speaking no, but in reality it doesn't look like other databases that support this feature support anything other than returning auto-increment fields on primary keys, often only supporting a single column (see [1]). Oracle doesn't support even that - callers have to call the variant that specifies a column list, since historically there was no real support for recognising such columns, oracle being sequence based.
As such, there can't be any callers in the wild that will expect this to return anything other than the primary key, because there's no support for doing anything else. And if the caller really DOES want other columns returned, they can always call the variant that allows them to specify those columns, or just issue a normal query with a returning clause. This feature really exists to support the default case where those columns aren't specified by the caller, and given current driver support (and sanity) the only thing any caller will expect from such a result is the primary key.
Cheers
Tom
On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> It sounds to me like designing this for JDBC's getGeneratedKeys methodI might be mistaken, but as I read the spec for getGeneratedKeys, whether
>> is a mistake. There isn't going to be any way that the driver can support
>> that without having looked at the table's metadata for itself, and if
>> it's going to do that then it doesn't need a crutch that only partially
>> solves the problem anyhow.
> Sure it can - it append RETURNING PRIMARY KEY and hand back a ResultSet
> from whatever was returned. It's CURRENTLY doing that, but it's appending
> RETURNING * and leaving it up to the caller of getGeneratedKeys() to work
> out which columns the caller is interested in.
or not a column is in a primary key has got nothing to do with whether
that column should be returned. It looks to me like what you're supposed
to return is columns with computed default values, eg, serial columns.
(Whether we should define it as being *exactly* columns created by the
SERIAL macro is an interesting question, but for sure those ought to be
included.) Now, the pkey might be a serial column ... but it equally
well might not be; it might not have any default expression at all.
And there certainly could be serial columns that weren't in the pkey.
100% agree with Tom.
The fact that the spec is kinda fuzzy doesn't entitle us to ignore the
plain meaning of the term "generated key".
regards, tom lane
--
Rushabh Lathia
On Fri, Jul 4, 2014 at 12:36 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote: > On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Tom Dunstan <pgsql@tomd.cc> writes: >> > On 4 July 2014 00:07, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> It sounds to me like designing this for JDBC's getGeneratedKeys method >> >> is a mistake. There isn't going to be any way that the driver can >> >> support >> >> that without having looked at the table's metadata for itself, and if >> >> it's going to do that then it doesn't need a crutch that only partially >> >> solves the problem anyhow. >> >> > Sure it can - it append RETURNING PRIMARY KEY and hand back a ResultSet >> > from whatever was returned. It's CURRENTLY doing that, but it's >> > appending >> > RETURNING * and leaving it up to the caller of getGeneratedKeys() to >> > work >> > out which columns the caller is interested in. >> >> I might be mistaken, but as I read the spec for getGeneratedKeys, whether >> or not a column is in a primary key has got nothing to do with whether >> that column should be returned. It looks to me like what you're supposed >> to return is columns with computed default values, eg, serial columns. >> (Whether we should define it as being *exactly* columns created by the >> SERIAL macro is an interesting question, but for sure those ought to be >> included.) Now, the pkey might be a serial column ... but it equally >> well might not be; it might not have any default expression at all. >> And there certainly could be serial columns that weren't in the pkey. > > 100% agree with Tom. Well, we could have a RETURNING GENERATED COLUMNS construct, or something like that. I can certainly see the usefulness of such a thing. As a general note not necessarily related to this specific issue, I think we should be careful not to lose sight of the point of this feature, which is to allow pgsql-jdbc to more closely adhere to the JDBC specification. While it's great if this feature has general utility, if it doesn't help pgsql-jdbc, then it fails to achieve the author's original intention. We need to make sure we're not throwing up too many obstacles in the way of better driver compliance if we want to have good drivers. As a code-level comment, I am not too find of adding yet another value for IndexAttrBitmapKind; every values we compute in RelationGetIndexAttrBitmap adds overhead for everyone, even people not using the feature. Admittedly, it's only paid once per relcache load, but does map_primary_key_to_list() really need this information cached, or can it just look through the index list for the primary key, and then work directly from that? Also, map_primary_key_to_list() tacitly assumes that an auto-updatable view has only 1 from-list item. That seems like a good thing to check with an Assert(), just in case we should support some other case in the future. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jul 7, 2014 at 5:50 PM, Robert Haas <robertmhaas@gmail.com> wrote:
+ /* Handle RETURNING PRIMARY KEY */
+ if(query->returningPK)
+ {
+ RangeTblEntry *rte = (RangeTblEntry *) list_nth(query->rtable, query->resultRelation - 1);
+ Relation rel = heap_open(rte->relid, RowExclusiveLock);
+ query->returningList = map_primary_key_to_list(rel, query);
+ heap_close(rel, NoLock);
+ }
Well, we could have a RETURNING GENERATED COLUMNS construct, orOn Fri, Jul 4, 2014 at 12:36 AM, Rushabh Lathia
<rushabh.lathia@gmail.com> wrote:
> On Fri, Jul 4, 2014 at 7:37 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Tom Dunstan <pgsql@tomd.cc> writes:
>> > On 4 July 2014 00:07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> >> It sounds to me like designing this for JDBC's getGeneratedKeys method
>> >> is a mistake. There isn't going to be any way that the driver can
>> >> support
>> >> that without having looked at the table's metadata for itself, and if
>> >> it's going to do that then it doesn't need a crutch that only partially
>> >> solves the problem anyhow.
>>
>> > Sure it can - it append RETURNING PRIMARY KEY and hand back a ResultSet
>> > from whatever was returned. It's CURRENTLY doing that, but it's
>> > appending
>> > RETURNING * and leaving it up to the caller of getGeneratedKeys() to
>> > work
>> > out which columns the caller is interested in.
>>
>> I might be mistaken, but as I read the spec for getGeneratedKeys, whether
>> or not a column is in a primary key has got nothing to do with whether
>> that column should be returned. It looks to me like what you're supposed
>> to return is columns with computed default values, eg, serial columns.
>> (Whether we should define it as being *exactly* columns created by the
>> SERIAL macro is an interesting question, but for sure those ought to be
>> included.) Now, the pkey might be a serial column ... but it equally
>> well might not be; it might not have any default expression at all.
>> And there certainly could be serial columns that weren't in the pkey.
>
> 100% agree with Tom.
something like that. I can certainly see the usefulness of such a
thing.
As a general note not necessarily related to this specific issue, I
think we should be careful not to lose sight of the point of this
feature, which is to allow pgsql-jdbc to more closely adhere to the
JDBC specification. While it's great if this feature has general
utility, if it doesn't help pgsql-jdbc, then it fails to achieve the
author's original intention. We need to make sure we're not throwing
up too many obstacles in the way of better driver compliance if we
want to have good drivers.
As a code-level comment, I am not too find of adding yet another value
for IndexAttrBitmapKind; every values we compute in
RelationGetIndexAttrBitmap adds overhead for everyone, even people not
using the feature. Admittedly, it's only paid once per relcache load,
but does
map_primary_key_to_list() really need this information cached, or can
it just look through the index list for the primary key, and then work
directly from that?
Also, map_primary_key_to_list() tacitly assumes that an auto-updatable
view has only 1 from-list item. That seems like a good thing to check
with an Assert(), just in case we should support some other case in
the future.
Another code-level comment is, why we need RowExclusiveLock before calling
map_primary_key_to_list() ? I think we should have explanation for that.
+ /* Handle RETURNING PRIMARY KEY */
+ if(query->returningPK)
+ {
+ RangeTblEntry *rte = (RangeTblEntry *) list_nth(query->rtable, query->resultRelation - 1);
+ Relation rel = heap_open(rte->relid, RowExclusiveLock);
+ query->returningList = map_primary_key_to_list(rel, query);
+ heap_close(rel, NoLock);
+ }
--
Rushabh Lathia