Re: MultiXact\SLRU buffers configuration - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: MultiXact\SLRU buffers configuration |
Date | |
Msg-id | 20201028233243.ygm6yqlynkqpzekr@development |
In response to | Re: MultiXact\SLRU buffers configuration (Andrey Borodin <x4mmm@yandex-team.ru>) |
Responses | Re: MultiXact\SLRU buffers configuration |
List | pgsql-hackers |
Hi,

On Wed, Oct 28, 2020 at 12:34:58PM +0500, Andrey Borodin wrote:
>Tomas, thanks for looking into this!
>
>> On 28 Oct 2020, at 06:36, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>
>> This thread started with a discussion about making the SLRU sizes
>> configurable, but this patch version only adds a local cache. Does this
>> achieve the same goal, or would we still gain something by having GUCs
>> for the SLRUs?
>>
>> If we're claiming this improves performance, it'd be good to have some
>> workload demonstrating that and measurements. I don't see anything like
>> that in this thread, so it's a bit hand-wavy. Can someone share details
>> of such workload (even synthetic one) and some basic measurements?
>
>All patches in this thread aim at the same goal: improve performance in
>the presence of MultiXact lock contention.
>I could not build a synthetic reproduction of the problem, but I did some
>MultiXact stressing here [0]. It's a clumsy test program, because it is
>still not clear to me which parameters of the workload trigger MultiXact
>lock contention. In the generic case I was encountering other locks like
>*GenLock: XidGenLock, MultixactGenLock etc. Yet our production system has
>hit this problem roughly once a month throughout this year.
>
>The test program locks different sets of tuples for share in the presence
>of concurrent full scans.
>To produce a set of locks we choose one of 14 bits. If a row number has
>this bit set to 0, we lock that row.
>I measured the time to lock all rows 3 times for each of the 14 bits,
>observing the total time to set all locks.
>During the test I was watching locks in pg_stat_activity; if they did not
>contain enough MultiXact locks, I tuned the parameters further (number of
>concurrent clients, number of bits, select queries etc).
>
>Why is it so complicated? It seems that other reproductions of the
>problem were encountering other locks.
>

It's not my intention to be mean or anything like that, but to me this
means we don't really understand the problem we're trying to solve. Had
we understood it, we should be able to construct a workload reproducing
the issue ...

I understand what the individual patches are doing, and maybe those
changes are desirable in general. But without any benchmarks from a
plausible workload I find it hard to convince myself that:

(a) it actually will help with the issue you're observing on production

and

(b) it's actually worth the extra complexity (e.g. the lwlock changes)

I'm willing to invest some of my time into reviewing/testing this, but
I think we badly need better insight into the issue, so that we can
build a workload reproducing it. Perhaps collecting some perf profiles
and a sample of the queries might help, but I assume you already tried
that.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
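For readers following the thread, a minimal sketch of the kind of workload Andrey describes, along with the pg_stat_activity check he mentions, might look like the SQL below. The table name, row count, and bit choice are hypothetical illustrations, not the actual test program from [0]; the idea is simply that overlapping SELECT ... FOR SHARE locks from several concurrent sessions force MultiXacts to be created.

```sql
-- Hypothetical setup: a table large enough that concurrent sessions
-- overlap on the rows they share-lock.
CREATE TABLE t (id integer PRIMARY KEY, payload text);
INSERT INTO t SELECT g, 'x' FROM generate_series(1, 1000000) g;

-- Each locking session picks one of the 14 low bits and share-locks every
-- row whose id has that bit set to 0. With several sessions doing this
-- concurrently on overlapping sets, the same tuples end up locked by
-- multiple transactions, which is what creates MultiXacts.
BEGIN;
SELECT id FROM t WHERE (id & (1 << 3)) = 0 FOR SHARE;  -- bit number varies per run
COMMIT;

-- Other sessions run concurrent full scans:
SELECT count(*) FROM t;

-- While the test runs, sample wait events to see whether the contention is
-- actually on MultiXact-related locks rather than e.g. XidGenLock
-- (exact wait event names vary by PostgreSQL version):
SELECT wait_event_type, wait_event, count(*)
  FROM pg_stat_activity
 WHERE wait_event IS NOT NULL
 GROUP BY 1, 2
 ORDER BY count(*) DESC;
```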