Re: Efficiently advancing a sequence without risking it going backwards. - Mailing list pgsql-general
From | Christopher Browne |
---|---|
Subject | Re: Efficiently advancing a sequence without risking it going backwards. |
Date | |
Msg-id | CAFNqd5XT=961hETVmzyzHHJeW=wv+KsWXk54TwUCsgX0PP-VRg@mail.gmail.com Whole thread Raw |
In response to | Re: Efficiently advancing a sequence without risking it going backwards. (Jeremy Schneider <schneider@ardentperf.com>) |
Responses |
Re: Efficiently advancing a sequence without risking it going backwards.
Re: Efficiently advancing a sequence without risking it going backwards. |
List | pgsql-general |
On Thu, 9 Jul 2020 at 12:59, Jeremy Schneider <schneider@ardentperf.com> wrote:
> On Jul 6, 2020, at 19:06, Paul McGarry <paul@paulmcgarry.com> wrote:
>
> I don't think I can use setval(), because it risks making sequences go backwards, eg:
>
> 1) Check values
> DB1sequence: 1234
> DB2sequence: 1233 (1 behind)
> 2) setval('DB2sequence',1234);
>
> but if between (1) and (2) there are 2 nextval(DB2sequence) calls on another process, (2) would take the sequence back from 1235 to 1234 and I would end up trying to create a duplicate key ID from the sequence.
An ability to “lock” the sequence momentarily would give you the tool you need, but I don’t think it’s there.
Total hack, but if your application or users can retry when the rare error is encountered then one idea is to rename the sequence momentarily while you do the setval() then rename it back. Do an initial check without renaming, then re-check after renaming and before the setval() call.
If you put retry logic into your application then make sure to include back-off logic so you don’t get an outage induced by thundering herd.
This is increasingly looking like a set of attempts to intentionally abuse what sequences were designed for.
The use-case where you need a lock on the value so that there can't possibly be a hole in the sequence points at the notion of having some other kind of a function that takes out a lock on a table, and serially gives out "MAX+1" as the next value.
That isn't a very difficult function to write; the problem with it is that that sort of function will forcibly serialize all inserts through the function+table lock that is giving out "MAX+1" values. That's going to be WAY slower than using a sequence object, and about 98% of the time, people will prefer the sequence object, particularly because it's about 98% faster.
I'm not quite sure if anyone has put out there a standard-ish idiom for this; that seems like a not TOO difficult "exercise for the user."
There will definitely be more failure cases, and *wildly* more fighting, in a concurrent environment, over tuple locks.
- An obvious failure is that if one connection asks for the new MAX+1, gets it, and then the transaction fails, for some later, out-of-relevant-scope, reason, you'll still potentially get some "holes" in the series of values.
- If there are 10 connections trying to get MAX+1 concurrently, only one can get it at a time, and that connection can't relinquish the lock until its transaction has completed, and the 9 must wait, regardless of how much work the "winner" still has to do.
These are amongst the reasons why people conclude they *don't* want that kind of functionality.
It makes me think that the problem needs to be taken back to that initial point of "I think I need some somewhat coordinated sequences", and poke at what the *real* requirement is there, and why someone thinks that the values should be "somewhat coordinated." Something seems off there.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"
question, "How would the Lone Ranger handle this?"
pgsql-general by date: