Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE - Mailing list pgsql-hackers

From: Heikki Linnakangas
Subject: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
Date:
Msg-id: 52D3A8AC.9040509@vmware.com
In response to: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE (Peter Geoghegan <pg@heroku.com>)
Responses: Re: INSERT...ON DUPLICATE KEY LOCK FOR UPDATE
List: pgsql-hackers
On 01/11/2014 12:39 PM, Peter Geoghegan wrote:
> In any case, my patch is bound to win decisively for the other
> extreme, the insert-only case, because the overhead of doing an index
> scan first is always wasted there with your approach, and the overhead
> of extended btree leaf page locking has been shown to be quite low.

Quite possibly. Run the benchmark, and we'll see how big a difference 
we're talking about.

> In
> the past you've spoken of avoiding that overhead through an adaptive
> strategy based on statistics, but I think you'll have a hard time
> beating a strategy where the decision comes as late as possible, and
> is informed by highly localized page-level metadata already available.
> My implementation can abort an attempt to just read an existing
> would-be duplicate very inexpensively (with no strong locks), going
> back to just after the _bt_search() to get a heavyweight lock if just
> reading doesn't work out (if there is no duplicate found), so as to
> not waste all of its prior work. Doing one of the two extremes of
> insert-mostly or update-only well is relatively easy; dynamically
> adapting to one or the other is much harder. Especially if it's a
> consistent mix of inserts and updates, where general observations
> aren't terribly useful.

Another way to optimize it is to keep the b-tree page pinned after doing 
the pre-check. Then you don't need to descend the tree again when doing 
the insert. That would require small indexam API changes, but wouldn't 
be too invasive, I think.
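
To illustrate what I mean, here's a rough standalone sketch - not real
indexam or nbtree code, every type and helper below is a made-up stand-in -
of the shape of it: the pre-check remembers which leaf page it looked at and
keeps it pinned, and the insert step reuses that page instead of descending
the tree a second time (falling back to a fresh descent if the pin was given
up):

/*
 * Standalone sketch only: every type and function here is a hypothetical
 * stand-in, not the actual PostgreSQL indexam/nbtree interface.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;

/* State a (hypothetically) extended indexam API would carry across calls. */
typedef struct PrecheckResult
{
    bool        conflict_found;  /* pre-check found an existing duplicate */
    BlockNumber pinned_blkno;    /* leaf page kept pinned after the pre-check */
    bool        have_pin;        /* is pinned_blkno still usable? */
} PrecheckResult;

/* Stand-ins for "descend the tree" and "work on this leaf page". */
static BlockNumber descend_to_leaf(int64_t key) { return (BlockNumber) (key % 16); }
static bool leaf_contains_key(BlockNumber blk, int64_t key) { (void) blk; return key == 42; }
static void insert_on_leaf(BlockNumber blk, int64_t key)
{
    printf("insert key %lld on page %u\n", (long long) key, (unsigned) blk);
}

/* Pre-check: look for a duplicate, keep the leaf page "pinned". */
static PrecheckResult
precheck(int64_t key)
{
    PrecheckResult r;

    r.pinned_blkno = descend_to_leaf(key);      /* first (ideally only) descent */
    r.conflict_found = leaf_contains_key(r.pinned_blkno, key);
    r.have_pin = true;                          /* pin held across the two calls */
    return r;
}

/* Insert: reuse the pinned page instead of descending again. */
static void
insert(int64_t key, PrecheckResult *r)
{
    BlockNumber target = r->have_pin ? r->pinned_blkno : descend_to_leaf(key);

    insert_on_leaf(target, key);
    r->have_pin = false;                        /* "release" the pin */
}

int
main(void)
{
    PrecheckResult r = precheck(7);

    if (!r.conflict_found)
        insert(7, &r);
    return 0;
}

In the real thing the carried state would presumably be the pinned Buffer
itself, and the insert would still have to verify that the page hasn't been
split out from under it before reusing it, falling back to a full descent
otherwise.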

> All other concerns of mine still remain, including the concern over
> the extra locking of the proc array - I'm concerned about the
> performance impact of that on other parts of the system not exercised
> by this test.

Yeah, I'm not thrilled about that part either. Fortunately there are 
other ways to implement that. In fact, I think you could just not bother 
taking the ProcArrayLock when setting the fields. The danger is that 
another backend sees a mixed state of the fields, but that's OK. The 
worst that can happen is that it will do an unnecessary lock/release on 
the heavy-weight lock. And to reduce the overhead when reading the 
fields, you could merge the SpeculativeInsertionIsInProgress() check 
into TransactionIdIsInProgress(): the call site in tqual.c always calls 
the two together, and TransactionIdIsInProgress() already scans the proc 
array while holding the lock.
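
To sketch what I mean by not bothering with the lock - standalone code
again, with made-up field names (spec_xid, spec_token) standing in for
whatever the patch actually stores in PGPROC - the writer just publishes
the fields with plain atomic stores, and a reader that happens to catch a
mixed state only pays for an unnecessary heavyweight lock acquire/release:

/*
 * Standalone sketch only: field names and functions are hypothetical
 * stand-ins, not the patch's actual PGPROC fields or routines.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-backend shared state (think: a couple of PGPROC fields). */
typedef struct SpeculativeState
{
    _Atomic uint32_t spec_xid;    /* xact doing the speculative insertion */
    _Atomic uint32_t spec_token;  /* which speculative insertion (0 = none) */
} SpeculativeState;

static SpeculativeState shared;

/* Writer: no ProcArrayLock-style lock taken, just two independent stores. */
static void
set_speculative(uint32_t xid, uint32_t token)
{
    atomic_store(&shared.spec_xid, xid);
    atomic_store(&shared.spec_token, token);
}

/*
 * Reader: may see the new xid with the old token or vice versa.  Treat
 * "true" as "go wait on the heavyweight lock"; a mixed or stale answer
 * only costs an unnecessary lock and release, never a wrong visibility
 * result.
 */
static bool
speculative_insertion_in_progress(uint32_t xid)
{
    return atomic_load(&shared.spec_xid) == xid &&
           atomic_load(&shared.spec_token) != 0;
}

int
main(void)
{
    set_speculative(1234, 1);
    printf("in progress for 1234? %d\n", (int) speculative_insertion_in_progress(1234));

    set_speculative(0, 0);        /* speculative insertion finished */
    printf("in progress for 1234? %d\n", (int) speculative_insertion_in_progress(1234));
    return 0;
}

Merging the check into TransactionIdIsInProgress() would then amount to a
couple of extra reads inside the proc array scan it already does.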

- Heikki


