On Fri, Oct 3, 2014 at 2:03 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> That lowers the bar from what I thought everyone agreed on. Namely, if two
> backends run a similar UPSERT command concurrently on a table that has more
> than one unique constraint, they might deadlock, causing one of them to
> throw an error instead of INSERTing or UPDATEing anything.
It lowers the bar to a level that I am not willing to limbo dance
under. You don't even need two unique constraints. Nothing as
"complicated" as that is required.
When this happens with MySQL, they have the good sense to call it a
bug [1], and even fix it. I find the comparison with conventional
insertion entirely unconvincing.
> I'm sure that's useful enough in many applications, but I'd like to have a
> more robust implementation. The shorter we can keep the list of caveats, the
> better.
INSERT and UPDATE are supposed to be fairly well balanced here.
Conflicts are the norm.
>> The value in the index needs to be protected by a block level lock, so
>> we can check it quickly, but the eventual heap work is serialized by
>> transactional semantics.
>>
>> I think a little perspective is due here and we should stick to the
>> main use case, not cater for bizarre edge cases.
>
>
> I'm trying to bisect your thoughts on exactly what use cases you think we
> must support, and which ones you consider bizarre edge cases, and what
> exactly is acceptable behavior in those edge cases.
"Lots of concurrency" is not an edge-case.
>> Any form of tuple locking that uses the general lock manager will not
>> be usable. I can't see it is worth the overhead of doing that to
>> protect against deadlocks that would only be experienced by people
>> doing foolish things.
>
> Maybe, maybe not, but let's define the acceptable behavior first, and think
> about the implementation second.
+1. Updating a lot with UPSERT is not foolish. That's all it took to
make earlier prototypes deadlock.
> I'm pretty sure all of the approaches
> discussed so far can be made fast enough, and the bloat issues can be made
> small enough, that it doesn't matter much which one we choose from a
> performance point of view. The differences are in what use cases they can
> support, and the maintainability of the code.
+1
What do we get for giving up on not having unprincipled deadlocks
here? What's the advantage? Assuming that this is a bizarre edge-case
(note: it isn't), what do we get in return for giving up on fixing it?
[1] MySQL bug #52020
--
Peter Geoghegan