Re: Issue with the PRNG used by Postgres - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Issue with the PRNG used by Postgres
Date	April 10 20:12:21
Msg-id	4085126.1712769141@sss.pgh.pa.us
In response to	Re: Issue with the PRNG used by Postgres (Parag Paul <parag.paul@gmail.com>)
List	pgsql-hackers

Parag Paul <parag.paul@gmail.com> writes:
> Yes, the probability of this happening is astronomical, but in production
> with 128 core servers with 7000 max_connections, with petabyte scale data,
> this did repro 2 times in the last month. We had to move to a local
> approach to manage our rate-limiting counters.
> This is not reproducible very easily. I feel that we should at least shield
> ourselves with the following change, so that we at least increase the delay
> by 1000us every time. We will follow a linear back off, but better than no
> backoff.

I still say you are proposing to band-aid the wrong thing.  Moreover:

* the proposed patch will cause the first few cur_delay values to grow
much faster than before, with direct performance impact to everyone,
whether they are on 128-core servers or not;

* if we are in a regime where xoroshiro repeatedly returns zero
across multiple backends, your patch doesn't improve the situation
AFAICS, because the backends will still choose the same series
of cur_delay values and thus continue to exhibit thundering-herd
behavior.  Indeed, as coded I think the patch makes it *more*
likely that the same series of cur_delay values would be chosen
by multiple backends.
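
The two objections above can be illustrated with a minimal sketch. It is not the actual s_lock.c code and not the actual proposed patch: the function names, the constants (1000us minimum, 1s maximum), and the exact form of the "add 1000us every time" change are assumptions made for illustration. The random fraction r, which would normally come from the PRNG, is passed in explicitly so that the "PRNG keeps returning zero" case can be simulated.

```c
#include <assert.h>

/* Assumed bounds for the sleep delay, in microseconds. */
#define MIN_DELAY_USEC 1000
#define MAX_DELAY_USEC 1000000

/*
 * Sketch of a randomized backoff step: grow the delay by a random
 * fraction r in [0, 1) of its current value, wrapping back to the
 * minimum once the maximum is exceeded.
 */
int
next_delay_current(int cur_delay, double r)
{
    assert(r >= 0.0 && r < 1.0);
    cur_delay += (int) (cur_delay * r + 0.5);
    if (cur_delay > MAX_DELAY_USEC)
        cur_delay = MIN_DELAY_USEC;
    return cur_delay;
}

/*
 * Hypothetical rendering of the proposed change: the same step, plus
 * an unconditional MIN_DELAY_USEC increment on every call.
 */
int
next_delay_patched(int cur_delay, double r)
{
    assert(r >= 0.0 && r < 1.0);
    cur_delay += (int) (cur_delay * r + 0.5) + MIN_DELAY_USEC;
    if (cur_delay > MAX_DELAY_USEC)
        cur_delay = MIN_DELAY_USEC;
    return cur_delay;
}
```

Under these assumptions, with r = 0.5 the unpatched step goes 1000, 1500 while the patched step goes 1000, 2500, 4750, so the early delays grow faster for every caller. With r stuck at 0 in every backend, the unpatched delay stays at 1000 and the patched delay goes 1000, 2000, 3000, and so on; in both cases all backends compute the identical series, so they would still wake up together.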

            regards, tom lane
