Re: Let's make PostgreSQL multi-threaded - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Let's make PostgreSQL multi-threaded |
Date | |
Msg-id | 20230613.174658.548424684295647548.horikyota.ntt@gmail.com |
In response to | Re: Let's make PostgreSQL multi-threaded (Konstantin Knizhnik <knizhnik@garret.ru>) |
Responses | Re: Let's make PostgreSQL multi-threaded |
List | pgsql-hackers |
At Tue, 13 Jun 2023 11:20:56 +0300, Konstantin Knizhnik <knizhnik@garret.ru> wrote in
> On 13.06.2023 10:55 AM, Kyotaro Horiguchi wrote:
> > At Tue, 13 Jun 2023 09:55:36 +0300, Konstantin Knizhnik
> > <knizhnik@garret.ru> wrote in
> >> A Postgres backend is "thick" not because of a large number of
> >> local variables, but because of its local caches: the catalog
> >> cache, relation cache, prepared statement cache, ...
> >> If they are not rewritten, the backend may still consume a lot of
> >> memory even if it is a thread rather than a process.
> >> But threads simplify the development of global caches, although it
> >> can also be done with DSM.
> > With the process model, that local state is flushed out upon
> > reconnection. If we switch to the thread model, we will need an
> > expiration mechanism for that state.
>
> We already have an invalidation mechanism. It will also be used in
> the case of a shared cache, but we would not need to send
> invalidations to all backends.

Invalidation is not expiration.

> I do not completely understand your point.
> Right now, the caches (for example the catalog cache) are not limited
> at all. So if you have a very large database schema, this cache will
> consume a lot of memory (multiplied by the number of backends). The
> fact that it is flushed out upon reconnection cannot help much: what
> if backends are not going to disconnect?

Right now, if one backend out of many builds a huge system catalog
cache, that memory is released upon disconnection. The same client can
repeat the process, but users can at least ensure that such situations
do not persist. With the thread model, however, we would no longer be
able to clear the parts of the cache that none of the active backends
need. (Of course, with threads we can avoid duplication.)

> In the case of a shared cache we will have to address the same
> problem: whether this cache should be limited (with some replacement
> discipline such as LRU), or unlimited. In the case of a shared cache,
> its size is less critical because it is not multiplied by the number
> of backends.

Yes.

> So we can assume that the catalog and relation caches should always
> fit in memory (otherwise significant rewriting of all the Postgres
> code working with relations would be needed).

I'm not sure that is true, but it probably is.

> But Postgres also has temporary tables, and for them we may need a
> backend-local cache in any case.
> The global temp table patch was not approved, so we still have to
> deal with these awful temp tables.
>
> In any case, I do not understand why we need some expiration
> mechanism for these caches.

I don't think it is efficient for PostgreSQL to consume a large amount
of memory for seldom-used content. While we may not need an expiration
mechanism for moderate use cases, I have observed instances where a
single process hogs a significant amount of memory, particularly for
intermittent tasks.

> If some relation exists, then information about it should be kept in
> the cache as long as the relation is alive.
> If there is not enough memory to cache information about all
> relations, then we may need some replacement algorithm.
> But I do not think there is any sense in removing an item from the
> cache just because it is too old.

Ah, I see. I am fine with a replacement mechanism, but the eviction
algorithm seems almost identical to an expiration algorithm. It would
not be driven simply by object age, but I'm not sure we need anything
more than access frequency.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
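PS: To make the kind of replacement I have in mind a bit more
concrete, here is a minimal, self-contained sketch in plain C of
clock-sweep (second chance) eviction over a bounded cache. It is not
PostgreSQL code and all names are hypothetical; it only illustrates
that survival can be driven by access recency rather than entry age.

/* clock_cache.c - minimal clock-sweep eviction sketch.
 * Hypothetical stand-in for a bounded shared catalog cache;
 * not PostgreSQL code.  Build: cc -o clock_cache clock_cache.c
 */
#include <stdio.h>
#include <string.h>

#define NSLOTS 4

typedef struct CacheEntry
{
    char key[32];       /* e.g. a catalog lookup key */
    int  valid;         /* slot in use? */
    int  referenced;    /* set on access, cleared by the sweep */
} CacheEntry;

static CacheEntry slots[NSLOTS];
static int clock_hand = 0;

/* Free one slot: skip (and clear) referenced entries, take the first
 * unreferenced one.  Frequently accessed entries keep getting a
 * second chance, so eviction tracks usage, not insertion age. */
static int
evict_one(void)
{
    for (;;)
    {
        CacheEntry *e = &slots[clock_hand];
        int victim = clock_hand;

        clock_hand = (clock_hand + 1) % NSLOTS;
        if (!e->valid || !e->referenced)
        {
            if (e->valid)
                printf("evicted %s from slot %d\n", e->key, victim);
            e->valid = 0;
            return victim;
        }
        e->referenced = 0;      /* second chance */
    }
}

/* Look up key; on a miss, "load" it into a free or evicted slot. */
static CacheEntry *
cache_lookup(const char *key)
{
    int i;

    for (i = 0; i < NSLOTS; i++)
    {
        if (slots[i].valid && strcmp(slots[i].key, key) == 0)
        {
            slots[i].referenced = 1;    /* mark as recently used */
            return &slots[i];
        }
    }

    i = evict_one();
    snprintf(slots[i].key, sizeof(slots[i].key), "%s", key);
    slots[i].valid = 1;
    slots[i].referenced = 1;
    printf("loaded %s into slot %d\n", key, i);
    return &slots[i];
}

int
main(void)
{
    /* "hot" is touched repeatedly and stays resident; the one-off
     * keys cycle out even though some are newer than "hot". */
    const char *accesses[] = { "t1", "hot", "t2", "t3", "hot", "t4",
                               "hot", "t5", "hot", "t6", "hot" };
    size_t i;

    for (i = 0; i < sizeof(accesses) / sizeof(accesses[0]); i++)
        cache_lookup(accesses[i]);
    return 0;
}

With something like this, an entry for a busy relation stays resident
regardless of its age, while entries touched once by an intermittent
task are reclaimed. That is all I meant by "expiration" above.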