On 13 January 2017 at 15:20, Dave Cramer <pg@fastcrypt.com> wrote:
On 13 January 2017 at 06:01, Christopher BROWN <brown@reflexe.fr> wrote:
Hello,
I've at last had an opportunity to use PostgreSQL 9.6 (instead of 9.4) for a project, and have been able to use the ON CONFLICT DO UPDATE clause. As it can involve repeating parameter values (including, as is my case, large binary files), I'm wondering if there's any performance or resource-usage side effects (or workarounds) when doing something like this:
INSERT INTO foo (c_uuid, file_data, file_name) VALUES (?, ?, ?)
ON CONFLICT (c_uuid) DO UPDATE SET file_data = ?, file_name = ?
On the Java side, that means invoking PreparedStatement::setBinaryStream TWICE for the same data.
I'm not noticing side effects on localhost (but haven't profiled either). Is this fine, or a potential problem? Is there a better approach?
So to summarize you would like some sort of optimization where we figured out that two of the values were the same in the initial insert and the on conflict and only send it once ?
To summarize, I'd like to avoid sending the same data twice, if that has any potentially significant performance issues and/or resource-usage issues. I don't know if that's the case or not.
The SQL syntax (and the query workflow, and parameter order) is specific to PostgreSQL, so I'm not against something that varies from the basic "PreparedStatement" syntax. If there's a safe way of figuring out the intention, that'd be excellent. Maybe if the driver has "seen" the parameter set (via PreparedStatement::setXXX) in the INSERT part for field "X", and it "sees" the field "X" is unset in the ON CONFLICT part, or if you could have some sort of back-reference (either by index, e.g. $1, $2 or whatever, or by name if that's possible).