Thread: Insert Documentation - Returning Clause and Order
Hey,
Would it be accurate to add the following sentence to the INSERT documentation under "Outputs"?
"...inserted or updated by the command." For a multiple-values insertion, the order of output rows will match the order that rows are presented in the values or query clause.
David J.
On Wed, Dec 9, 2020 at 9:10 PM David G. Johnston <david.g.johnston@gmail.com> wrote: > > Hey, > > Would it be accurate to add the following sentence to the INSERT documentation under "Outputs"? > > "...inserted or updated by the command." For a multiple-values insertion, the order of output rows will match the orderthat rows are presented in the values or query clause. Postgres's current implementation may be doing so, but I don't think that can be guaranteed in possible implementations. I don't think restricting choice of implementation to guarantee that is a good idea either. -- Best Wishes, Ashutosh Bapat
On Thursday, December 10, 2020, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Wed, Dec 9, 2020 at 9:10 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> Hey,
>
> Would it be accurate to add the following sentence to the INSERT documentation under "Outputs"?
>
> "...inserted or updated by the command." For a multiple-values insertion, the order of output rows will match the order that rows are presented in the values or query clause.
Postgres's current implementation may be doing so, but I don't think
that can be guaranteed in possible implementations. I don't think
restricting choice of implementation to guarantee that is a good idea
either.
Yeah, the ongoing work on parallel inserts would seem to be an issue. We should probably document that though. And maybe as part of parallel inserts patch provide a user-specifiable way to ask for such a guarantee if needed. ‘Insert returning ordered”
David J.
On Thu, Dec 10, 2020 at 7:49 PM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Thursday, December 10, 2020, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: >> >> On Wed, Dec 9, 2020 at 9:10 PM David G. Johnston >> <david.g.johnston@gmail.com> wrote: >> > >> > Hey, >> > >> > Would it be accurate to add the following sentence to the INSERT documentation under "Outputs"? >> > >> > "...inserted or updated by the command." For a multiple-values insertion, the order of output rows will match the orderthat rows are presented in the values or query clause. >> >> Postgres's current implementation may be doing so, but I don't think >> that can be guaranteed in possible implementations. I don't think >> restricting choice of implementation to guarantee that is a good idea >> either. >> > > Yeah, the ongoing work on parallel inserts would seem to be an issue. We should probably document that though. And maybeas part of parallel inserts patch provide a user-specifiable way to ask for such a guarantee if needed. ‘Insert returningordered” I am curious about the usecase which needs that guarantee? Don't you have a column on which you can ORDER BY so that it returns the same order as INSERT? -- Best Wishes, Ashutosh Bapat
On Fri, Dec 11, 2020 at 6:24 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Thu, Dec 10, 2020 at 7:49 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Yeah, the ongoing work on parallel inserts would seem to be an issue. We should probably document that though. And maybe as part of parallel inserts patch provide a user-specifiable way to ask for such a guarantee if needed. ‘Insert returning ordered”
I am curious about the usecase which needs that guarantee? Don't you
have a column on which you can ORDER BY so that it returns the same
order as INSERT?
This comes up periodically in the context of auto-generated keys being returned - specifically on the JDBC project list (maybe elsewhere...). If one adds 15 VALUES entries to an insert and then sends them in bulk to the server it would be helpful if the generated keys could be matched up one-to-one with the keyless objects in the client. Basically "pipelining" the client and server.
David J.
On Friday, December 11, 2020, David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Fri, Dec 11, 2020 at 6:24 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote: >> >> On Thu, Dec 10, 2020 at 7:49 PM David G. Johnston >> <david.g.johnston@gmail.com> wrote: >> >> > Yeah, the ongoing work on parallel inserts would seem to be an issue. We should probably document that though. Andmaybe as part of parallel inserts patch provide a user-specifiable way to ask for such a guarantee if needed. ‘Insertreturning ordered” >> >> I am curious about the usecase which needs that guarantee? Don't you >> have a column on which you can ORDER BY so that it returns the same >> order as INSERT? > > > This comes up periodically in the context of auto-generated keys being returned - specifically on the JDBC project list(maybe elsewhere...). If one adds 15 VALUES entries to an insert and then sends them in bulk to the server it wouldbe helpful if the generated keys could be matched up one-to-one with the keyless objects in the client. Basically "pipelining"the client and server. That’s a great use case. It’s not so much about ordering, per se, but about identity. Certainly almost every ORM, and maybe even other forms of application code, need to be able to associate the serial column value returned with what it inserted. I'd expect something like that (whether by ordering explicitly or by providing some kind of mapping between indexes in the statement data and the inserted/returned row values). James
On Sat, Dec 12, 2020 at 7:02 AM James Coleman <jtc331@gmail.com> wrote:
Certainly almost every ORM, and maybe even other forms of application
code, need to be able to associate the serial column value returned
with what it inserted.
Yet most ORM would perform single inserts at a time, not in bulk, making such a feature irrelevant to them.
I don't think having such a feature is all that important personally, but the question comes every so often and it would be nice to be able to point at the documentation for a definitive answer - not just one inferred from a lack of documentation - especially since the observed behavior is that order is preserved today.
David J.
On Sat, Dec 12, 2020 at 10:11 AM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Sat, Dec 12, 2020 at 7:02 AM James Coleman <jtc331@gmail.com> wrote: >> >> >> Certainly almost every ORM, and maybe even other forms of application >> code, need to be able to associate the serial column value returned >> with what it inserted. > > > Yet most ORM would perform single inserts at a time, not in bulk, making such a feature irrelevant to them. I think that's a pretty hasty generalization. It's the majority of use cases in an ORM, sure, but plenty of ORMs (and libraries or applications using them) support inserting batches where performance requires it. Rails/ActiveRecord is actually integrating that feature into core (though many Ruby libraries already add that support, as does, for example, the application I spend the majority of time working on). James
On Sat, Dec 12, 2020 at 8:41 PM David G. Johnston <david.g.johnston@gmail.com> wrote: > > On Sat, Dec 12, 2020 at 7:02 AM James Coleman <jtc331@gmail.com> wrote: >> >> >> Certainly almost every ORM, and maybe even other forms of application >> code, need to be able to associate the serial column value returned >> with what it inserted. > > > Yet most ORM would perform single inserts at a time, not in bulk, making such a feature irrelevant to them. > > I don't think having such a feature is all that important personally, but the question comes every so often and it wouldbe nice to be able to point at the documentation for a definitive answer - not just one inferred from a lack of documentation- especially since the observed behavior is that order is preserved today. > That's a valid usecase, but adding such a guarantee in documentation would restrict implementation. So at best we can say "no order is guaranteed". But we write what's guaranteed. Anything not written in the documents is not guaranteed. There are ways to get it working, but let's not go into those details in this thread. -- Best Wishes, Ashutosh Bapat
On Mon, Dec 14, 2020 at 7:09 AM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
But we write what's guaranteed. Anything not written in
the documents is not guaranteed.
In the case of LIMIT we go to great lengths to write what isn't guaranteed. I suggest that this is similar enough in nature to warrant the same emphasis.
"Thus, using different LIMIT/OFFSET values to select different subsets of a query result will give inconsistent results unless you enforce a predictable result ordering with ORDER BY. This is not a bug; it is an inherent consequence of the fact that SQL does not promise to deliver the results of a query in any particular order unless ORDER BY is used to constrain the order.
It is even possible for repeated executions of the same LIMIT query to return different subsets of the rows of a table, if there is not an ORDER BY to enforce selection of a deterministic subset. Again, this is not a bug; determinism of the results is simply not guaranteed in such a case."
It is even possible for repeated executions of the same LIMIT query to return different subsets of the rows of a table, if there is not an ORDER BY to enforce selection of a deterministic subset. Again, this is not a bug; determinism of the results is simply not guaranteed in such a case."
I'd go so far as to say that it's more important here since the observed behavior is that things are ordered, and expected to be ordered, while with limit the non-determinism seems more obvious.
David J.