Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE - Mailing list pgsql-hackers

From Robert Haas
Subject Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Date
Msg-id CA+TgmoY94DRtdnyGfC0bg3Dy+nzqavCTOYX57Sf6YY0=szxKZA@mail.gmail.com
Whole thread Raw
In response to Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE  (Peter Geoghegan <pg@heroku.com>)
Responses Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
List pgsql-hackers
On Thu, Sep 26, 2013 at 3:07 PM, Peter Geoghegan <pg@heroku.com> wrote:
> When you consider that the feature will frequently be used with the
> assumption that updating is a much more likely outcome, it becomes
> clear that we need to be careful about this sort of interplay.

I think one thing that's pretty clear at this point is that almost any
version of this feature could be optimized for either the insert case
or the update case.  For example, my proposal could be modified to
search for a conflicting tuple first, potentially wasting an index
probes (or multiple index probes, if you want to search for potential
conflicts in multiple indexes) if we're inserting, but winning heavily
in the update case.  As written, it's optimized for the insert case.

In fact, I don't know how to know which of these things we should
optimize for.  I wrote part of the code for an EDB proprietary feature
that can do insert-or-update loads about 6 months ago[1], and we
optimized it for updates.  That was not, however, a matter of
principal; it just turned out to be easier to implement that way.  In
fact, I would have assumed that the insert-mostly case was more
likely, but I think the real answer is that some environments will be
insert-mostly and some will be update-mostly and some will be a mix.

If we really want to squeeze out every last drop of possible
performance, we might need two modes: one that assumes we'll mostly
insert, and another that assumes we'll mostly update.  That seems a
frustrating amount of detail to have to expose to the user; an
implementation that was efficient in both cases would be very
desirable, but I do not have a good idea how to get there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] In case you're wondering, attempting to use that feature to upsert
an invisible tuple will result in the load failing with a unique index
violation.



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Next
From: Robert Haas
Date:
Subject: Re: Minmax indexes