Thread: Proposal: RETURNING primary_key()

Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

07 March 2016, 23:32:43

THE ISSUE:

In JDBC there is a flag called RETURN_GENERATED_KEYS -- 
https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#RETURN_GENERATED_KEYS

Which is left quite ambiguous, but in general it is used to return the 
"generated" Primary Key on INSERT/UPDATE/DELETE -- which is mostly 
useful in the case of INSERT, of course, as the other commands do not 
generate a key.

We can already add RETURNING after an INSERT, e.g.
  CREATE TABLE test (name TEXT, id SERIAL PRIMARY KEY);  INSERT INTO test VALUES ('PostgresQL') RETURNING id;

But the problem is that we need to know in advance the name of the "id" 
column, because if we had created the table like so:
  CREATE TABLE test (name TEXT, test_id SERIAL PRIMARY KEY);

Then we would need to use RETURNING "test_id" instead of "id".

The problem is that we do not always know in advance what the Primary 
Key is, and therefore a solution that was implemented in the pgjdbc 
driver was to append " RETURNING * " to the query, but that has its own 
problems, like returning a lot of data that is not needed, etc.  (you 
can see a longer discussion at https://github.com/pgjdbc/pgjdbc/issues/488 )


THE PROPOSAL:

The proposal is to allow something like RETURNING primary_key() (it can 
be a keyword, not necessarily a function), e.g.
  INSERT INTO test VALUES ('PostgresQL') RETURNING primary_key();

Which will return a record set according to the PRIMARY KEY that is set 
on the table.  So if the primary is "id", then you would get a column 
named "id", and if it is "test_id" you would get a column named 
"test_id" with the correct values.

If the PRIMARY KEY is made of multiple column, then all of those columns 
will be returned.

If the table does not have a PRIMARY KEY constraint then NULL will be 
returned with some arbitrary column name.

I would go further and suggest to add a function that will return the 
last primary key from a table, e.g.:
  SELECT last_primary_key() FROM test;

This of course can be beneficial for many users, and not only the JDBC 
community.

Thank you for your time and consideration,


Igal

Re: Proposal: RETURNING primary_key()

From

"Joshua D. Drake"

Date:

07 March 2016, 23:45:29

On 03/07/2016 12:32 PM, Igal @ Lucee.org wrote:

> The problem is that we do not always know in advance what the Primary
> Key is, and therefore a solution that was implemented in the pgjdbc

I agree that the problem is that you don't always know what the primary 
key is.

I would argue the solution is to check before you write the query.

JD

-- 
Command Prompt, Inc.                  http://the.postgres.company/                        +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

07 March 2016, 23:51:42

On 3/7/2016 12:45 PM, Joshua D. Drake wrote:
>
> I agree that the problem is that you don't always know what the 
> primary key is.
>
> I would argue the solution is to check before you write the query.

Sure, that would be great, but perhaps I should have give some more context:

We have an application server which allows our developers to query 
databases with simplified syntax.  Our code is written in a generic way 
to allow the developers that use our application server to pass whatever 
query they want into the database server, whether it's SQL Server, 
MySQL, Oracle, etc.

The code that we use to wrap the SQL statement, as well as the JDBC 
code, has no idea about the table or its constraints, so it's not like 
I'm writing my own queries, and am just being lazy at checking what the 
primary key is.  I just can't know what the developer has in his database.

Sure, I can probably query it via metadata tables, etc., but that would 
be a much slower process.

Igal

Re: Proposal: RETURNING primary_key()

From

Tom Lane

Date:

08 March 2016, 00:20:07

"Igal @ Lucee.org" <igal@lucee.org> writes:
> On 3/7/2016 12:45 PM, Joshua D. Drake wrote:
>> I agree that the problem is that you don't always know what the 
>> primary key is.
>> I would argue the solution is to check before you write the query.

Yeah.  I'm rather suspicious of this proposal; I do not think it's
actually very useful to return a primary-key value without any indication
of what the primary key is.  There are also corner cases where it seems
pretty ill-defined.  For example, suppose you do this on an inheritance
parent table that has a pkey defined, but not all its child tables do
(or maybe they do but their pkeys aren't identical to the parent's).
What should happen then?

> Sure, that would be great, but perhaps I should have give some more context:
> We have an application server which allows our developers to query 
> databases with simplified syntax.  Our code is written in a generic way 
> to allow the developers that use our application server to pass whatever 
> query they want into the database server, whether it's SQL Server, 
> MySQL, Oracle, etc.

That's an exceptionally weak use-case to argue for this with.  Unless
you can get *all* those DBMS suppliers to invent equivalent features,
you're going to have to have pkey-querying logic anyway.  The argument
for bespoke syntax for it in just one DBMS seems pretty weak.

I am fairly sure, also, that all of those systems have support for the
SQL-standard information_schema views.  So if you write a pkey-identifying
query against those views, you'd have some chance of a solution that
actually did work everywhere.
        regards, tom lane

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

08 March 2016, 00:36:21

On 3/7/2016 1:20 PM, Tom Lane wrote:
>
> Yeah.  I'm rather suspicious of this proposal; I do not think it's
> actually very useful to return a primary-key value without any indication
> of what the primary key is.  There are also corner cases where it seems
> pretty ill-defined.  For example, suppose you do this on an inheritance
> parent table that has a pkey defined, but not all its child tables do
> (or maybe they do but their pkeys aren't identical to the parent's).
> What should happen then?
First, thank you for your reply.  I appreciate it.  I do not know the 
answer to that question.
>
> That's an exceptionally weak use-case to argue for this with.  Unless
> you can get *all* those DBMS suppliers to invent equivalent features,
> you're going to have to have pkey-querying logic anyway.  The argument
> for bespoke syntax for it in just one DBMS seems pretty weak.
Fair enough, but my idea was that this will be used by the JDBC driver 
in this case.  The other DBMS suppliers have their JDBC driver return a 
value, usually it is SERIAL type.  But there is no standard for the 
column name.  In SQL Server, for example, it is IDENTITYCOL while in 
MySQL it is GENERATED_KEY.

The thing is that in SQL Server I can do, for example, "SELECT 
@@identity" and get the last value that was inserted.  In SQL Server, 
however, Microsoft took the easy way and enforced only a single 
auto-generated identity column per table.  The closest thing I can do in 
PostgreSQL is "SELECT lastval()" but what if there are multiple 
sequences in that table?
> I am fairly sure, also, that all of those systems have support for the
> SQL-standard information_schema views.  So if you write a pkey-identifying
> query against those views, you'd have some chance of a solution that
> actually did work everywhere.
The other JDBC drivers return the last SERIAL value from the table that 
had the insert, so there's no issue there.  Querying the 
information_schema views with each INSERT will probably cause a major 
performance hit.

Anyway, I trust that you know much more about databases than I do, so if 
you don't think that it's a good idea, I accept that.

Best,


Igal

Re: Proposal: RETURNING primary_key()

From

Ian Barwick

Date:

08 March 2016, 03:17:40

Hi

On 08/03/16 05:32, Igal @ Lucee.org wrote:
> THE ISSUE:
> 
> In JDBC there is a flag called RETURN_GENERATED_KEYS --
https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#RETURN_GENERATED_KEYS
> 
(...)
> THE PROPOSAL:
> 
> The proposal is to allow something like RETURNING primary_key() (it can be a keyword, not
> necessarily a function), e.g.
> 
> INSERT INTO test VALUES ('PostgresQL') RETURNING primary_key();

FYI something similar has been proposed before:
 http://www.postgresql.org/message-id/53953EFB.8070701@2ndquadrant.com

The linked thread might provide more insights into the issues surrounding
this proposal.


Regards

Ian Barwick

-- Ian Barwick                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

08 March 2016, 03:56:14

Ian,

On 3/7/2016 4:17 PM, Ian Barwick wrote:
>
> FYI something similar has been proposed before:
>
>    http://www.postgresql.org/message-id/53953EFB.8070701@2ndquadrant.com
>
> The linked thread might provide more insights into the issues surrounding
> this proposal.

It's funny how I've encountered the same issue and reached the same 
conclusion as you did.  The main difference is that I suggested 
returning NULL values instead of throwing an error.

I read through the whole thread and it seems to me like there was quite 
a bit of support for that feature, with Tom still unconvinced that this 
feature is useful -- but quite a few others who see the benefit in it, 
especially Java users who experience that problem first hand -- and 
Rushabh complaining about white space in the patch?

I'm not sure why it was not accepted at the end?

Igal

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

08 March 2016, 07:02:23

On 8 March 2016 at 08:56, Igal @ Lucee.org <igal@lucee.org> wrote:

I'm not sure why it was not accepted at the end?

The biggest issue, though it might not be clear from that thread, is that what exactly it means to "return generated keys" is poorly defined by JDBC, and not necessarily the same thing as "return the PRIMARY KEY".

Should we return the DEFAULT on a UNIQUE column, for example?

IMO other vendors' drivers should be tested for behaviour in a variety of cases. Ideally the JDBC test suite too. Then specify the exact behaviour of what we need to satisfy the JDBC driver's requirements and anything else that might be related.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

Tom Lane

Date:

08 March 2016, 07:19:04

Craig Ringer <craig@2ndquadrant.com> writes:
> On 8 March 2016 at 08:56, Igal @ Lucee.org <igal@lucee.org> wrote:
>> I'm not sure why it was not accepted at the end?

> The biggest issue, though it might not be clear from that thread, is that
> what exactly it means to "return generated keys" is poorly defined by JDBC,
> and not necessarily the same thing as "return the PRIMARY KEY".
>
> Should we return the DEFAULT on a UNIQUE column, for example?
>
> IMO other vendors' drivers should be tested for behaviour in a variety of
> cases.

Yeah.  It was asserted in the earlier thread that other vendors implement
this feature as "return the pkey", but that seems to conflict with the
plain language of the JDBC spec: generated columns are an entirely
different thing than primary key columns.  So really what I'd like to see
is some work on surveying other implementations to confirm exactly what
behavior they implement.  If we're to go against what the spec seems to
say, I want to see a whole lot of evidence that other people do it
consistently in a different way.
        regards, tom lane

Re: Proposal: RETURNING primary_key()

From

Robert Haas

Date:

08 March 2016, 23:12:59

On Mon, Mar 7, 2016 at 11:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Craig Ringer <craig@2ndquadrant.com> writes:
>> On 8 March 2016 at 08:56, Igal @ Lucee.org <igal@lucee.org> wrote:
>>> I'm not sure why it was not accepted at the end?
>
>> The biggest issue, though it might not be clear from that thread, is that
>> what exactly it means to "return generated keys" is poorly defined by JDBC,
>> and not necessarily the same thing as "return the PRIMARY KEY".
>>
>> Should we return the DEFAULT on a UNIQUE column, for example?
>>
>> IMO other vendors' drivers should be tested for behaviour in a variety of
>> cases.
>
> Yeah.  It was asserted in the earlier thread that other vendors implement
> this feature as "return the pkey", but that seems to conflict with the
> plain language of the JDBC spec: generated columns are an entirely
> different thing than primary key columns.  So really what I'd like to see
> is some work on surveying other implementations to confirm exactly what
> behavior they implement.  If we're to go against what the spec seems to
> say, I want to see a whole lot of evidence that other people do it
> consistently in a different way.

I agree that some research should be done on how this works in other
systems, but I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver.  And I think it would be good to work on figuring out how
to fix that.  I feel that some of the replies on this thread were
rather hostile considering that the goal -- good connectors for the
database server -- is extremely important.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

09 March 2016, 00:40:32

On 3/8/2016 12:12 PM, Robert Haas wrote:
> I agree that some research should be done on how this works in other
> systems, but I think we have a general problem with the server lacking
> certain capabilities that make it easy to implement a high-quality
> JDBC driver.  And I think it would be good to work on figuring out how
> to fix that.
I will try to gather more information about the other DBMSs and drivers 
and will post my findings here when I have them.

Best,


Igal

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

09 March 2016, 03:42:58

On 9 March 2016 at 05:40, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 12:12 PM, Robert Haas wrote:
I agree that some research should be done on how this works in other
systems, but I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver. And I think it would be good to work on figuring out how
to fix that.
I will try to gather more information about the other DBMSs and drivers and will post my findings here when I have them.

Thanks. I know that's not the most fun thing to do in the world, but it's often needed when implementing something where part of the goal is being compatible with other vendors, etc.

Currently I suggest using Connection.prepareStatement(..., String[] generatedKeyColumns) where possible. I realise that's not practical for all apps, which is why supporting the int flag form better is desirable, and we just have to figure out what exactly we should be returning...

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

09 March 2016, 04:12:31

On 9 March 2016 at 04:12, Robert Haas <robertmhaas@gmail.com> wrote:

I think we have a general problem with the server lacking
certain capabilities that make it easy to implement a high-quality
JDBC driver. And I think it would be good to work on figuring out how
to fix that.

There are a few frustrations, to be sure, but I'm not sure there's actually a ton server-side that drastically limits the driver.

One of the worst problems (IMO) is in the driver architecture its self. It attempts to prevent blocking by guestimating the server's send buffer state and its recv buffer state, trying to stop them filling and causing the server to block on writes. It should just avoid blocking on its own send buffer, which it can control with confidence. Or use some of Java's rather good concurrency/threading features to simultaneously consume data from the receive buffer and write to the send buffer when needed, like pgjdbc-ng does. This makes making use of the pipelining features in Pg's protocol way harder and less efficient than it should be - but then, PgJDBC still does this better than libpq, which can't pipeline queries at all.

There certainly are server/protocol frustrations.

QUERY CANCEL RACES

---

Query cancellation sucks badly. Not because it requires a new connection, though that's unfortunate, but because cancel is backend-level not statement-level. A statement cancellation key returned as an immediate response to the Execute message would be quite handy, so we could include it in cancel requests and eliminate the race by having the cancel request be a no-op if the statement cancel key doesn't match the currently running statement.

EARLY CONNECTION CHARSETS

---

There's no way to know the charset of early connection error messages, which is a flaw in the protocol that isn't specific to PgJDBC its self. Similarly, you can't specify the text encoding of usernames, passwords, etc sent to the server.

PER-QUERY GUCs

---

We also have no way to set GUCs per-query, and we need it for statement_timeout. I really wish Parse and Execute messages allowed statement-scoped GUCs to be passed at the protocol level. This would be very, very helpful. The driver can probably work around it by fetching and then SETing statement_timeout, running the query, then reSETing it afterwards in a piplelined set of queries, but .... yuck. Also, log spam galore.

GENERATED KEYS AND RETURNING

---

To get generated keys we have to hack the statement text. There's no protocol-level equivalent, like we have for row-count limits in the v3 protocol. The ability to specify the set of returned columns at the protocol level would be very nice. That said, hacking the statement text isn't *too* bad, mostly because few people are going to do their own RETURNING statement *and* request generated keys from the driver, the only time this becomes an issue.

STRING TYPE ISSUES

---

PgJDBC can work around Pg's IMO somewhat overzealous type checks for string types by passing string parameters as being of unknown-type. The JDBC interface offers us no easy way to differentiate between "this parameter is a real textual value" and "this parameter is a string representation of something that might be another type". We can do it with setObject and extension class wrappers, but then the user has to import the JDBC driver's classes directly, use PgJDBC-specific API, etc. The people who have the most problem with our current behaviour are those least able to do that, users who're behind a query generation layer or ORM. I'd like to just make stringtype=unspecified the default in PgJDBC and be done with it; users can still specify an explicit cast to 'text' in the SQL if they want

PROTOCOL-LEVEL SAVEPOINTS

---

psqlODBC would benefit from protocol-level SAVEPOINT and ROLLBACK TO SAVEPOINT, mostly to reduce logspam and parser overhead. PgJDBC would be able to use this to emulate other DBMSes error handling behaviour too, when requested by a client. (Yes, I know about the correctness and performance issues, but you tell that to someone who just wants to Port Their Stuff From Oracle But Can't Change The Code).

SERVER_VERSION_NUM

---

server_version_num should be GUC_REPORT and it's really annoying that it isn't. I never agreed with the arguments about why that wasn't changed, and I still want it changed.

LOST TYPMOD, NULLABILITY INFO

---

The server throws away typmod and nullability knowledge as soon as you do anything with a column. This is frustrating for the driver's metadata API support. Having result columns marked non-null in Describe would be handy.

LAZY BYTEA

---

The protocol offers no way to lazily fetch large values like BYTEA. Many vendors can fetch small results and return a handle that gets larger results from the server on-demand. This means that many clients expect that

SELECT * FROM my_table_with_100MB_bytea_column;

will not fetch all those bytea values to the client until/unless they're actually accessed. They don't have to generate new and different queries each time. ORMs in particular benefit from this. Ideally we'd have the protocol-level ability to return a handle to the relevant TOAST entry that clients can then fetch using further protocol messages on-demand so long as they're on the same session, haven't committed or rolled back, and haven't run another statement. This would make working with big binary objects in the DB considerably more practical.

I'm sure there are others I haven't remembered or run into in there too.

See https://wiki.postgresql.org/wiki/Todo#Wire_Protocol_Changes for some.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

Robert Haas

Date:

09 March 2016, 16:30:53

On Tue, Mar 8, 2016 at 8:12 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> There certainly are server/protocol frustrations.

I'm sympathetic to all of these and think we should work on fixing
them, particularly...

> STRING TYPE ISSUES
> ---
>
> PgJDBC can work around Pg's IMO somewhat overzealous type checks ...

This.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

09 March 2016, 17:27:09

On 9 March 2016 at 21:30, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Mar 8, 2016 at 8:12 PM, Craig Ringer <craig@2ndquadrant.com> wrote:
> There certainly are server/protocol frustrations.

I'm sympathetic to all of these and think we should work on fixing
them, particularly...

> STRING TYPE ISSUES
> ---
>
> PgJDBC can work around Pg's IMO somewhat overzealous type checks ...

This.

I've raised that multiple times and got nowhere. More importantly, I'm reasonably convinced that passing string types as UNKNOWNOID is what users generally want and expect anyway, and is consistent with what happens when you write string literals directly in SQL. I think we should just change PgJDBC to default to this already-optional behaviour, which is currently controlled by the stringtype=unspecified JDBC parameter.

Other drivers have similar issues, and can fix it the same way. Maybe we should document it somewhere, but I think this is in many ways the least deserving of attention. Partly because clients can work around it easily, partly because the energy input required for any change will be prohibitive and is better spent elsewhere.

I'd *much* rather have things like query cancel cookies, per-query GUCs at the protocol level, etc.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

09 March 2016, 19:41:28

On 3/8/2016 5:12 PM, Craig Ringer wrote:
> One of the worst problems (IMO) is in the driver architecture its 
> self. It attempts to prevent blocking by guestimating the server's 
> send buffer state and its recv buffer state, trying to stop them 
> filling and causing the server to block on writes. It should just 
> avoid blocking on its own send buffer, which it can control with 
> confidence. Or use some of Java's rather good concurrency/threading 
> features to simultaneously consume data from the receive buffer and 
> write to the send buffer when needed, like pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

10 March 2016, 04:49:51

On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:

On 3/8/2016 5:12 PM, Craig Ringer wrote:
One of the worst problems (IMO) is in the driver architecture its self. It attempts to prevent blocking by guestimating the server's send buffer state and its recv buffer state, trying to stop them filling and causing the server to block on writes. It should just avoid blocking on its own send buffer, which it can control with confidence. Or use some of Java's rather good concurrency/threading features to simultaneously consume data from the receive buffer and write to the send buffer when needed, like pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for anything except new releases) and older JDBC specs, completeness of support for some extensions. TBH I haven't done a ton with -ng yet.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

10 March 2016, 22:07:45

On 3/8/2016 4:42 PM, Craig Ringer wrote:

On 9 March 2016 at 05:40, Igal @ Lucee.org <igal@lucee.org> wrote:

I will try to gather more information about the other DBMSs and drivers and will post my findings here when I have them.

Thanks. I know that's not the most fun thing to do in the world, but it's often needed when implementing something where part of the goal is being compatible with other vendors, etc.

It seems that the implementations vary by the driver, and not the server, as evidenced by the Microsoft SQL Server drivers -- I tested both the official MS driver and the open sourced jTDS driver.

I noticed that you usually don't put html in the emails here, but I think that it's appropriate here to show the information in a clear way (also, according to my computer it's 2016). I hope that it will be rendered properly:

	MySQL	DB2	SQL Server (MS)	SQL Server (jTDS)	Oracle
Returned Type	SET	SET	ROW	ROW	ROW
Column Name	GENERATED_KEY	[name of identity col]	GENERATED_KEYS	ID	ROWID
Column Type	Unknown (numeric)	integer	numeric	numeric	ROWID
Value	Each inserted value to identity column	Each inserted value to identity column	Last inserted value to identity column	Last inserted value to identity column	internal address location that does not change on UPDATE
Example	(1), (2)	(1), (2)	(2)	(2)	AAAE5nAABAAALCxAAM

Some notes and observations:

It's the Wild West! Each implementation does something completely different. Even when something looks similar, e.g. the returned column name from MySQL and SQL Server (MS), it's not: notice the plural in SQL Server's column name, which is ironic as they only return a single value, as opposed to MySQL which returns a SET.

This has been an "interesting experience" as it was my first exposure to some of those DBMSs. It only reinforced my decision to choose PostgreSQL moving forward, over the alternatives (after using SQL Server for about 20 years).

More notes on the different DBMSs:

The first thing that I tested was against MySQL:

    CREATE TABLE IF NOT EXISTS test_jdbc(name VARCHAR(64), id SERIAL);

An insert to that table via JDBC, with int flag RETURN_GENERATED_KEYS returns a result set with a column named "GENERATED_KEY " and type "UNKNOWN" (as per ResultSetMetaData's getColumnTypeName()), each row in the result set corresponded with an inserted record, so for example:

    INSERT INTO test_jdbc(name) VALUES ('JDBC'), ('PostgreSQL');

returned two rows with the value of the "id" column for the inserted row in each, e.g.

GENERATED_KEY-------------78
Trying to add multiple SERIAL columns to a table results in an error:

    CREATE TABLE IF NOT EXISTS jdbc(j_name VARCHAR(64), j_id SERIAL, id2 SERIAL)

Error Code: 1075. Incorrect table definition; there can be only one auto column and it must be defined as a key

SQL Server: via the Microsoft driver

Created table with the command:

    CREATE TABLE dbo.jdbc (
        j_name varchar(64) NOT NULL,
        j_id int IDENTITY(1,1) NOT NULL
    )

Generated Keys return a single row with a column named "GENERATED_KEYS" of type numeric, and the value is the last inserted id (i.e. sequence). This is different from MySQL which returns a row with the id for each inserted record.

SQL Server: via the jTDS driver

Generated Keys return a single row with a column named "ID" of type numeric, and the value is the last inserted id (i.e. sequence). The behavior is similar to the Microsoft driver, but the column name is different.

Oracle:

Oracle returns the column ROWID which is of type ROWID as well:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/pseudocolumns008.htm

This seems to be similar to PostgreSQL's ctid, but unlike ctid -- when I UPDATE the record the ROWID remains unchanged.

In my test I got the value "AAAE5nAABAAALCxAAM", and when I later ran:

    SELECT * FROM jdbc WHERE ROWID='AAAE5nAABAAALCxAAM';

I got the information back from that row. Updating that row does not change its ROWID.

When I tried to insert multiple values with RETURN_GENERATED_KEYS I got an error: java.sql.SQLSyntaxErrorException: ORA-00900: invalid SQL statement

    INSERT INTO jdbc(j_name) SELECT 'PG 9.5.0' FROM DUAL UNION SELECT 'PG 9.5.1' FROM DUAL

The rows are, however, inserted into the table. Running the same INSERT command without RETURN_GENERATED_KEYS works without error.

(Side note: This was my first, and hopefully my last, experience with Oracle database, and it's been a real PITA. If I had tried it out some 20 years ago then the experience would have probably led me to sell the stock short, which would have probably ended with my bankruptcy. Go figure...)

IBM DB2:

    CREATE TABLE jdbc(j_name VARCHAR(64), j_id INT NOT NULL GENERATED ALWAYS AS IDENTITY)

Generated Keys return a set with the column named "J_ID" of type integer. One row for each inserted row.

(Side note: after wasting almost a full day setting up and connecting to the DB2 server I realized why Oracle was so successful)

Re: Proposal: RETURNING primary_key()

From

Robert Haas

Date:

11 March 2016, 06:31:45

On Mar 10, 2016, at 2:07 PM, Igal @ Lucee.org <igal@lucee.org> wrote:

> (Side note:  This was my first, and hopefully my last, experience with Oracle database, and it's been a real PITA.
IfI had tried it out some 20 years ago then the experience would have probably led me to sell the stock short, which
wouldhave probably ended with my bankruptcy.  Go figure...) 
>
> (Side note: after wasting almost a full day setting up and connecting to the DB2 server I realized why Oracle was so
successful)

This email made me laugh.

...Robert

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

11 March 2016, 07:28:39

On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org> wrote:

I noticed that you usually don't put html in the emails here, but I think that it's appropriate here to show the information in a clear way (also, according to my computer it's 2016).

Pretty sure we have at least one person here using mailreader software that's old enough to vote in most countries, but I tend to share the sentiment. At least when there's actually a functional reason like this :)

Thanks so much for doing this testing.

I hope that it will be rendered properly:

MySQL DB2 SQL Server (MS) SQL Server (jTDS) Oracle
Returned Type SET SET ROW ROW ROW
Column Name GENERATED_KEY [name of identity col] GENERATED_KEYS ID ROWID
Column Type Unknown (numeric) integer numeric numeric ROWID
Value Each inserted value to identity column Each inserted value to identity column Last inserted value to identity column Last inserted value to identity column internal address location that does not change on UPDATE
Example (1), (2) (1), (2) (2) (2) AAAE5nAABAAALCxAAM

Some notes and observations:

It's the Wild West! Each implementation does something completely different.

I honestly didn't expect that. I knew Oracle returned ROWID, but I have to admit I thought the others would probably just return the key column(s).

When you supply the column type, does that (with the exception of Oracle) match the column type of the generated key col?

Did you try GENERATED ALWAYS cols (where supported), UNIQUE columns with DEFAULTs, composite columns, etc? Part of the question for Pg is what exactly we should and should not be returning.

(Side note: This was my first, and hopefully my last, experience with Oracle database, and it's been a real PITA. If I had tried it out some 20 years ago then the experience would have probably led me to sell the stock short, which would have probably ended with my bankruptcy. Go figure...)

I rather less than fondly recall my own attempts to get Oracle Express installed and running for some test or another a while ago. Amazing that it can be that fiddly. MS-SQL on the other hand "just worked" and dropped me into the most gorgeously wonderful admin tool and SQL editor ever.

I wonder if any of these drivers have extension options and compat flags that you have to turn on to get better behaviour like returning a set? Or if they're just that limited?

Anyway, from the sounds of this we have a fair bit of freedom to define what we want at both the Pg and driver level so long as we satisfy the basic constraint that we should return a set of generated keys in the case where a statement does an insert that adds rows to a table with a SERIAL (or an owned SEQUENCE). Seems like we could do pretty much whatever we want for multiple-generated-columns cases etc.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

"Joshua D. Drake"

Date:

11 March 2016, 07:52:15

On 03/10/2016 08:28 PM, Craig Ringer wrote:
> On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org
> <mailto:igal@lucee.org>> wrote:
>
>
>     I noticed that you usually don't put html in the emails here, but I
>     think that it's appropriate here to show the information in a clear
>     way (also, according to my computer it's 2016).
>
>
> Pretty sure we have at least one person here using mailreader software
> that's old enough to vote in most countries, but I tend to share the
> sentiment. At least when there's actually a functional reason like this :)

That person needs to suck it up. Email is no longer just fixed width 
text and hasn't been in a decade.

JD

-- 
Command Prompt, Inc.                  http://the.postgres.company/                     +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.

Re: Proposal: RETURNING primary_key()

From

Vladimir Sitnikov

Date:

11 March 2016, 11:00:16

Igal, thanks for the analysis.

Craig>Part of the question for Pg is what exactly we should and should not be returning.

I think the following might be a good starting point: return set of columns that would identify the inserted row(s).

E.g. columns of any primary key would do. Columns of any unique key would do as well.

"returning *" would do as well, however it would return excessive columns, thus it would be less efficient.

I do not think it makes sense to tie "generated keys" to sequences or things like that.

For example:

1) Consider Pg returns column_name==ABC, value==42. That would mean client could locate exactly that row via "where ABC=42"

2) Same for multicolumn keys: Pg just returns (col1, col2) == (42, 146). Then client would be able to locate the row via "where col1=42 and col2=146

3) If multiple unique keys present, it is fine if Pg returns one or the another depending on the phase of the moon. Yet more compact key would be preferable to save on bandwidth.

Does that make sense?

I think naming the resulting column(s) like "generated_key" / "generated_keys" does not make much sense. Especially, for multi-column keys.

If ctid was update-proof, it could could do. Unfortunately, ctid might easily get changed.

Theoretical end-to-end (it is the only use of "generated keys" I can imagine at the moment):

1) Client issues an insert statement, asking "generated keys"

2) Pg inserts the row, and returns resultset with "primary key" (or unique key) columns.

3) Client stores it somewhere. For instance, passes that to UI.

4) As UI wants to update the row, client just uses those keys to identify the row to update.

PS. Frankly speaking, I feel "generated keys" is more like a "plug & pray" kind of API. ORMs should know the column names of the primary keys => ORMs should use "return specific column names" API, not just "return something generated".

Vladimir

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

11 March 2016, 11:40:16

On 11 March 2016 at 16:00, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:

Igal, thanks for the analysis.

Craig>Part of the question for Pg is what exactly we should and should not be returning.

I think the following might be a good starting point: return set of columns that would identify the inserted row(s).
E.g. columns of any primary key would do. Columns of any unique key would do as well.
"returning *" would do as well, however it would return excessive columns, thus it would be less efficient.

I do not think it makes sense to tie "generated keys" to sequences or things like that.

That's why (sorry, Igal) I'd like to see some more tests for cases other than identity columns. How is GENERATED ALWAYS handled, if supported? What about if it's on a UNIQUE column? How about a PRIMARY KEY whose value is assigned by a DEFAULT or by a trigger?

Based on the rather funky behaviour Igal found I suspect the answer will be "nothing much" for all of those, i.e. it just doesn't work with other drivers/vendors. But I'd like to know. I

2) Same for multicolumn keys: Pg just returns (col1, col2) == (42, 146). Then client would be able to locate the row via "where col1=42 and col2=146

Yeah, I was wondering about composite PKs. I think Igal focused only on generated synthetic keys, which are after all overwhelmingly common case when getting generated keys.

3) If multiple unique keys present, it is fine if Pg returns one or the another depending on the phase of the moon. Yet more compact key would be preferable to save on bandwidth.

I disagree there. Behavour must be well-defined and predictable unless it's really unavoidable.

I think naming the resulting column(s) like "generated_key" / "generated_keys" does not make much sense. Especially, for multi-column keys.

Yeah. At least in PgJDBC where it's a separate resultset (IIRC), so you have metadata that means you don't have to guess column names etc.

If ctid was update-proof, it could could do. Unfortunately, ctid might easily get changed.

Indeed. Especially since many of the apps that want to fetch generated keys will be connection-pool oriented apps doing optimistic concurrency control - ORMs and the like. So they won't be able to hold the transaction that added the row open (to hold locks and protect against vacuum) while fetching more info about the row. That'd be quite undesirable for performance anyway, since it'd force at least one extra round-trip; you couldn't pipeline the query for more info about the row until you knew the ctid of the inserted row.

using ctid is a nonstarter IMO, at least as far as the client goes.

PS. Frankly speaking, I feel "generated keys" is more like a "plug & pray" kind of API. ORMs should know the column names of the primary keys => ORMs should use "return specific column names" API, not just "return something generated".

Yep. There are many "should"s. I absolutely agree that this is one of them.

One reason some clients do it this way is that earlier versions of the JDBC API didn't have the String[] generatedKeys form of prepareStatement. So they had to cope with not being able to ask for specific cols and getting whatever the DB handed them.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

11 March 2016, 19:50:43

On 3/11/2016 12:40 AM, Craig Ringer wrote:

That's why (sorry, Igal) I'd like to see some more tests for cases other than identity columns. How is GENERATED ALWAYS handled, if supported? What about if it's on a UNIQUE column? How about a PRIMARY KEY whose value is assigned by a DEFAULT or by a trigger?

I was using Oracle 11g XE, GENERATED ALWAYS was not available. This is the code I used for Oracle:

CREATE TABLE jdbc (j_name VARCHAR2(64) NOT NULL, j_id NUMBER(10) NOT NULL); CREATE SEQUENCE jdbc_seq; CREATE OR REPLACE TRIGGER jdbc_seq_trigger BEFORE INSERT ON jdbc FOR EACH ROW WHEN (new.j_id IS NULL) BEGIN SELECT jdbc_seq.NEXTVAL INTO :new.j_id FROM dual; END; /

For DB2 the type is indeed GENERATED ALWAYS AS IDENTITY:

j_id INT GENERATED ALWAYS AS IDENTITY

Originally the name was ID but when both DB2 and MS/jTDS returned a column named "ID" I realized that it might come from the column name, so I modified the column name. DB2 was indeed returning the column name, while MS/jTDS returns a column named "ID" regardless of the actual column name.

Based on the rather funky behaviour Igal found I suspect the answer will be "nothing much" for all of those, i.e. it just doesn't work with other drivers/vendors. But I'd like to know.

I agree, but I can test it if you give me the SQL commands. I do want to remove all of that horrible software from my workstation as soon as possible, but it can wait if more testing is required.

2) Same for multicolumn keys: Pg just returns (col1, col2) == (42, 146). Then client would be able to locate the row via "where col1=42 and col2=146

Yeah, I was wondering about composite PKs. I think Igal focused only on generated synthetic keys, which are after all overwhelmingly common case when getting generated keys.

If you give me the code that you want to test I will test it.

3) If multiple unique keys present, it is fine if Pg returns one or the another depending on the phase of the moon. Yet more compact key would be preferable to save on bandwidth.

I disagree there. Behavour must be well-defined and predictable unless it's really unavoidable.

I agree with Craig.

I think naming the resulting column(s) like "generated_key" / "generated_keys" does not make much sense. Especially, for multi-column keys.

Yeah. At least in PgJDBC where it's a separate resultset (IIRC), so you have metadata that means you don't have to guess column names etc.

I'm not sure how multi-column keys work. In both MySQL and SQL Server for example, you can not have more than one SEQUENCE column, so perhaps that's their "solution".

Igal

Re: Proposal: RETURNING primary_key()

From

Michael Paquier

Date:

12 March 2016, 13:28:05

On Fri, Mar 11, 2016 at 5:28 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
> On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org> wrote:
>> I noticed that you usually don't put html in the emails here, but I think that it's appropriate here to show the
informationin a clear way (also, according to my computer it's 2016).
 
>
> Pretty sure we have at least one person here using mailreader software that's old enough to vote in most countries,
butI tend to share the sentiment. At least when there's actually a functional reason like this :)
 

This one made me smile. Extracting a quote from another project's
slogan: All mail clients s**k. Some of them just s**k less.
-- 
Michael

Re: Proposal: RETURNING primary_key()

From

Gavin Flower

Date:

12 March 2016, 13:38:00

On 12/03/16 23:27, Michael Paquier wrote:
> On Fri, Mar 11, 2016 at 5:28 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
>> On 11 March 2016 at 03:07, Igal @ Lucee.org <igal@lucee.org> wrote:
>>> I noticed that you usually don't put html in the emails here, but I think that it's appropriate here to show the
informationin a clear way (also, according to my computer it's 2016).
 
>> Pretty sure we have at least one person here using mailreader software that's old enough to vote in most countries,
butI tend to share the sentiment. At least when there's actually a functional reason like this :)
 
> This one made me smile. Extracting a quote from another project's
> slogan: All mail clients s**k. Some of them just s**k less.

s**k == sulk???

Re: Proposal: RETURNING primary_key()

From

Dave Cramer

Date:

03 April 2016, 18:22:20

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:

On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:
On 3/8/2016 5:12 PM, Craig Ringer wrote:
One of the worst problems (IMO) is in the driver architecture its self. It attempts to prevent blocking by guestimating the server's send buffer state and its recv buffer state, trying to stop them filling and causing the server to block on writes. It should just avoid blocking on its own send buffer, which it can control with confidence. Or use some of Java's rather good concurrency/threading features to simultaneously consume data from the receive buffer and write to the send buffer when needed, like pgjdbc-ng does.

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for anything except new releases) and older JDBC specs, completeness of support for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more performant than ng.

Dave Cramer

davec@postgresintl.com

www.postgresintl.com

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

03 April 2016, 19:19:08

On 4/3/2016 8:21 AM, Dave Cramer wrote:

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:
On 3/8/2016 5:12 PM, Craig Ringer wrote:

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for anything except new releases) and older JDBC specs, completeness of support for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more performant than ng.

That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being performant if, for example, I INSERT a large chunk of data like an image data or an uploaded file.

Igal

Re: Proposal: RETURNING primary_key()

From

Stephen Frost

Date:

03 April 2016, 22:35:33

* Dave Cramer (pg@fastcrypt.com) wrote:
> On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:
>
> > On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:
> >
> >> On 3/8/2016 5:12 PM, Craig Ringer wrote:
> >>
> >>> One of the worst problems (IMO) is in the driver architecture its self.
> >>> It attempts to prevent blocking by guestimating the server's send buffer
> >>> state and its recv buffer state, trying to stop them filling and causing
> >>> the server to block on writes. It should just avoid blocking on its own
> >>> send buffer, which it can control with confidence. Or use some of Java's
> >>> rather good concurrency/threading features to simultaneously consume data
> >>> from the receive buffer and write to the send buffer when needed, like
> >>> pgjdbc-ng does.
> >>>
> >>
> >> Are there good reasons to use pgjdbc over pgjdbc-ng then?
> >>
> >>
> > Maturity, support for older versions (-ng just punts on support for
> > anything except new releases) and older JDBC specs, completeness of support
> > for some extensions. TBH I haven't done a ton with -ng yet.
>
> I'd like to turn this question around. Are there good reasons to use -ng
> over pgjdbc ?

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I just
ran into with a client.  I understand that -ng has a way to avoid that,
even for SSL connections.

> As to your question, you may be interested to know that pgjdbc is more
> performant than ng.

Interesting, good to know.

Thanks!

Stephen

Re: Proposal: RETURNING primary_key()

From

Dave Cramer

Date:

04 April 2016, 02:33:28

On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:

* Dave Cramer (pg@fastcrypt.com) wrote:
> On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:
>
> > On 10 March 2016 at 00:41, Igal @ Lucee.org <igal@lucee.org> wrote:
> >
> >> On 3/8/2016 5:12 PM, Craig Ringer wrote:
> >>
> >>> One of the worst problems (IMO) is in the driver architecture its self.
> >>> It attempts to prevent blocking by guestimating the server's send buffer
> >>> state and its recv buffer state, trying to stop them filling and causing
> >>> the server to block on writes. It should just avoid blocking on its own
> >>> send buffer, which it can control with confidence. Or use some of Java's
> >>> rather good concurrency/threading features to simultaneously consume data
> >>> from the receive buffer and write to the send buffer when needed, like
> >>> pgjdbc-ng does.
> >>>
> >>
> >> Are there good reasons to use pgjdbc over pgjdbc-ng then?
> >>
> >>
> > Maturity, support for older versions (-ng just punts on support for
> > anything except new releases) and older JDBC specs, completeness of support
> > for some extensions. TBH I haven't done a ton with -ng yet.
>
> I'd like to turn this question around. Are there good reasons to use -ng
> over pgjdbc ?

Not generally much of a JDBC user myself, but the inability to avoid
polling for LISTEN notifications is a pretty big annoyance, which I just
ran into with a client. I understand that -ng has a way to avoid that,
even for SSL connections.

Yes, it is a custom api. Easy enough to add. Is this something of interest ?

Dave Cramer

davec@postgresintl.com

www.postgresintl.com

Re: Proposal: RETURNING primary_key()

From

Dave Cramer

Date:

04 April 2016, 02:34:57

On 3 April 2016 at 12:18, Igal @ Lucee.org <igal@lucee.org> wrote:

On 4/3/2016 8:21 AM, Dave Cramer wrote:

On 9 March 2016 at 20:49, Craig Ringer <craig@2ndquadrant.com> wrote:
On 3/8/2016 5:12 PM, Craig Ringer wrote:

Are there good reasons to use pgjdbc over pgjdbc-ng then?

Maturity, support for older versions (-ng just punts on support for anything except new releases) and older JDBC specs, completeness of support for some extensions. TBH I haven't done a ton with -ng yet.

I'd like to turn this question around. Are there good reasons to use -ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more performant than ng.
That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being performant if, for example, I INSERT a large chunk of data like an image data or an uploaded file.

Thanks for the reminder!

So I"m guessing the reason to use ng is to avoid returning * ?

Dave Cramer

davec@postgresintl.com

www.postgresintl.com

Igal

Re: Proposal: RETURNING primary_key()

From

Stephen Frost

Date:

04 April 2016, 04:56:33

Dave,

* Dave Cramer (pg@fastcrypt.com) wrote:
> On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:
> > Not generally much of a JDBC user myself, but the inability to avoid
> > polling for LISTEN notifications is a pretty big annoyance, which I just
> > ran into with a client.  I understand that -ng has a way to avoid that,
> > even for SSL connections.
>
> Yes, it is a custom api. Easy enough to add. Is this something of interest ?

I'd say that there is definite interest in this and there's a lot of
conversation about it on the interwebs (stackoverflow, etc).

My understanding is that the problem is actually with the SSL library
that the JDBC driver uses and that it basically lies about if there are
bytes available for reading (claiming that there never is by always
returning zero).  The -ng driver, as I understand it, uses a newer SSL
library which better supports asking if there are bytes available to
read.

Thanks!

Stephen

Re: Proposal: RETURNING primary_key()

From

Dave Cramer

Date:

04 April 2016, 05:14:15

On 3 April 2016 at 21:56, Stephen Frost <sfrost@snowman.net> wrote:

Dave,

* Dave Cramer (pg@fastcrypt.com) wrote:
> On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:
> > Not generally much of a JDBC user myself, but the inability to avoid
> > polling for LISTEN notifications is a pretty big annoyance, which I just
> > ran into with a client. I understand that -ng has a way to avoid that,
> > even for SSL connections.
>
> Yes, it is a custom api. Easy enough to add. Is this something of interest ?

I'd say that there is definite interest in this and there's a lot of
conversation about it on the interwebs (stackoverflow, etc).

My understanding is that the problem is actually with the SSL library
that the JDBC driver uses and that it basically lies about if there are
bytes available for reading (claiming that there never is by always
returning zero). The -ng driver, as I understand it, uses a newer SSL
library which better supports asking if there are bytes available to
read.

Hmmm. that complicates things...

Async notification is the easier part, I wasn't aware that the ssl library had this problem though

Dave Cramer

davec@postgresintl.com

www.postgresintl.com

Re: Proposal: RETURNING primary_key()

From

Craig Ringer

Date:

04 April 2016, 05:17:58

On 4 April 2016 at 10:13, Dave Cramer <pg@fastcrypt.com> wrote:

Async notification is the easier part, I wasn't aware that the ssl library had this problem though

AFAIK the issue is that even if there are bytes available on the underlying socket, the SSL lib doesn't know if that means there are bytes readable from the wrapped SSL socket. The traffic on the underlying socket could be renegotiation messages or whatever.

We really need non-blocking reads.

Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: Proposal: RETURNING primary_key()

From

Stephen Frost

Date:

04 April 2016, 05:19:16

Dave,

* Dave Cramer (pg@fastcrypt.com) wrote:
> On 3 April 2016 at 21:56, Stephen Frost <sfrost@snowman.net> wrote:
> > * Dave Cramer (pg@fastcrypt.com) wrote:
> > > On 3 April 2016 at 15:35, Stephen Frost <sfrost@snowman.net> wrote:
> > > > Not generally much of a JDBC user myself, but the inability to avoid
> > > > polling for LISTEN notifications is a pretty big annoyance, which I
> > just
> > > > ran into with a client.  I understand that -ng has a way to avoid that,
> > > > even for SSL connections.
> > >
> > > Yes, it is a custom api. Easy enough to add. Is this something of
> > interest ?
> >
> > I'd say that there is definite interest in this and there's a lot of
> > conversation about it on the interwebs (stackoverflow, etc).
> >
> > My understanding is that the problem is actually with the SSL library
> > that the JDBC driver uses and that it basically lies about if there are
> > bytes available for reading (claiming that there never is by always
> > returning zero).  The -ng driver, as I understand it, uses a newer SSL
> > library which better supports asking if there are bytes available to
> > read.
>
> Hmmm. that complicates things...
>
> Async notification is the easier part, I wasn't aware that the ssl library
> had this problem though

Right.  It's not sufficient to simply poll the JDBC driver to see if
there are notifications currently, you have to actually generate traffic
between the client and the server, to force the driver to read from the
SSL library and discover any notifications which have arrived from the
server.  That can be done by issuing an all-whitespace command, which
the server will respond to with an EmptyQueryMessage (iirc), but you
can't simply have the Java side sit in a select() loop or similar
waiting for notifications to arrive.

Thanks!

Stephen

Re: Proposal: RETURNING primary_key()

From

Stephen Frost

Date:

04 April 2016, 05:20:22

* Craig Ringer (craig@2ndquadrant.com) wrote:
> On 4 April 2016 at 10:13, Dave Cramer <pg@fastcrypt.com> wrote:
> > Async notification is the easier part, I wasn't aware that the ssl library
> > had this problem though
>
> AFAIK the issue is that even if there are bytes available on the underlying
> socket, the SSL lib doesn't know if that means there are bytes readable
> from the wrapped SSL socket. The traffic on the underlying socket could be
> renegotiation messages or whatever.
>
> We really need non-blocking reads.

That would certainly be a good way to address this, but I'm guessing
it's non-trivial to implement.

Thanks!

Stephen

Re: Proposal: RETURNING primary_key()

From

Dave Cramer

Date:

04 April 2016, 05:23:22

On 3 April 2016 at 22:20, Stephen Frost <sfrost@snowman.net> wrote:

* Craig Ringer (craig@2ndquadrant.com) wrote:
> On 4 April 2016 at 10:13, Dave Cramer <pg@fastcrypt.com> wrote:
> > Async notification is the easier part, I wasn't aware that the ssl library
> > had this problem though
>
> AFAIK the issue is that even if there are bytes available on the underlying
> socket, the SSL lib doesn't know if that means there are bytes readable
> from the wrapped SSL socket. The traffic on the underlying socket could be
> renegotiation messages or whatever.
>
> We really need non-blocking reads.

That would certainly be a good way to address this, but I'm guessing
it's non-trivial to implement.

AFAICT, the ng driver still has to generate traffic as well.

Dave Cramer

davec@postgresintl.com

www.postgresintl.com

Thanks!

Stephen

Re: Proposal: RETURNING primary_key()

From

"Igal @ Lucee.org"

Date:

04 April 2016, 07:50:22

On 4/3/2016 4:34 PM, Dave Cramer wrote:

On 4/3/2016 8:21 AM, Dave Cramer wrote:

I'd like to turn this question around. Are there good reasons to use -ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more performant than ng.
That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being performant if, for example, I INSERT a large chunk of data like an image data or an uploaded file.

Thanks for the reminder!

So I"m guessing the reason to use ng is to avoid returning * ?

I'm not sure if you're serious or if you're just trying to be "cute". This ticket should still be fixed. It really doesn't make any sense to me that the driver will just blindly append "RETURNING *" to the query.

If I want to return all of the columns from an UPDATE or an INSERT -- then I will add "RETURNING *" myself. And if I don't add it, then I probably don't want the driver to second guess me, or to think that it knows better than I do what I want. If I wanted software that thinks that it knows what I want better than I do -- then I would stick with SQL Server rather than switch to Postgres.

The driver used to work until someone decided to append "RETURNING *" to the SQL code and make it unusable in many cases.

Was there any discussion on this before it was added?

Igal Sapir
Lucee Core Developer
Lucee.org

Re: Proposal: RETURNING primary_key()

From

"David G. Johnston"

Date:

04 April 2016, 08:52:25

On Sun, Apr 3, 2016 at 9:49 PM, Igal @ Lucee.org <igal@lucee.org> wrote:

On 4/3/2016 4:34 PM, Dave Cramer wrote:

On 4/3/2016 8:21 AM, Dave Cramer wrote:

I'd like to turn this question around. Are there good reasons to use -ng over pgjdbc ?

As to your question, you may be interested to know that pgjdbc is more performant than ng.
That's good to know, but unfortunately pgjdbc is unusable for us until
https://github.com/pgjdbc/pgjdbc/issues/488 is fixed.

Also, as I mentioned in the ticket, I can't imagine RETURNING * being performant if, for example, I INSERT a large chunk of data like an image data or an uploaded file.

Thanks for the reminder!

So I"m guessing the reason to use ng is to avoid returning * ?

I'm not sure if you're serious or if you're just trying to be "cute". This ticket should still be fixed. It really doesn't make any sense to me that the driver will just blindly append "RETURNING *" to the query.

If I want to return all of the columns from an UPDATE or an INSERT -- then I will add "RETURNING *" myself. And if I don't add it, then I probably don't want the driver to second guess me, or to think that it knows better than I do what I want. If I wanted software that thinks that it knows what I want better than I do -- then I would stick with SQL Server rather than switch to Postgres.

The driver used to work until someone decided to append "RETURNING *" to the SQL code and make it unusable in many cases.

Was there any discussion on this before it was added?

Except the main problem you describe is one where you WANT the driver to be smart and understand that even though you've asked it to return generated keys the statement you've provided it is one that incapable of doing so. Thus you do want it to interpret what you've told it and to do what you mean and not what you say.

Obviously the problem is solvable - you yourself have said other's have solved it. That is one piece of good news - the other piece is that PostgreSQL, and the JDBC driver in question, is open source software.

Somehow the driver needs to determine, reliably and ideally inexpensively, how to effect:

"This parameter is ignored if the SQL statement is not an INSERT statement, or an SQL statement able to return auto-generated keys (the list of such statements is vendor-specific)."

https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#prepareStatement(java.lang.String,%20int)

Discussions and patches exploring how to go about that are welcomed.

I do think that issue 488 needs to separate out and fix the non-conformance to the API that is present - namely not ignoring the "int" argument when the supplied statement is not capable (i.e., not an INSERT statement) - and posted such (and a bit more) on the issue itself.

David J.