Re: Global shared meta cache - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: Global shared meta cache
Date
Msg-id 78466f46-e3a0-e958-434f-955104b0f84c@postgrespro.ru
In response to Global shared meta cache  ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Responses Re: Global shared meta cache  (Robert Haas <robertmhaas@gmail.com>)
RE: Global shared meta cache  ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
List pgsql-hackers

On 26.06.2018 09:48, Ideriha, Takeshi wrote:
> Hi, hackers!
>
> My customer created hundreds of thousands of partition tables and tried to select data from hundreds of applications,
> which resulted in enormous memory consumption, because the total is the number of backends multiplied by the size of
> each backend's local cache (e.g. 100 backends x 1GB = 100GB).
> Relation caches are loaded into each backend's local memory.
>
> To address this issue I'm trying to move meta caches like catcache or relcache into shared memory.
>
> This topic seems to have been discussed several times.
> For instance this thread:
> https://www.postgresql.org/message-id/CA%2BTgmobjDw_SWsxyJwT9z-YOwWv0ietuQx5fb%3DWEYdDfvCbzGQ%40mail.gmail.com
>
> In my understanding, it discussed moving catcache and relcache from the current backend-local memory to shared memory,
> and was mostly concerned with the performance overhead.
>
> Robert Haas wrote:
>> I think it would be interesting for somebody to build a prototype here
>> that ignores all the problems but the first and uses some
>> straightforward, relatively unoptimized locking strategy for the first
>> problem. Then benchmark it. If the results show that the idea has
>> legs, then we can try to figure out what a real implementation would
>> look like.
>> (One possible approach: use Thomas Munro's DHT stuff to build the shared cache.)
> Inspired by this comment, I'm now developing a prototype (please see attached),
> but I haven't yet put the cache structure itself into shared memory.
> Instead, I put dummy data into shared memory, initialized at startup,
> and acquire/release a lock just before/after searching for or creating a catcache entry.
>
> I haven't dealt with the relcache or the catcache lists yet either.
> It is difficult for me to do everything at once and get the direction right,
> so I'm starting with a small prototype to check whether I'm on the right track.
>
> I ran pgbench to compare the master branch with my patch.
>
> 0) Environment
>     - RHEL 7.4
>     - 16 cores
>     - 128 GB memory
>
> 1) Initialized with pgbench -i -s10
>
> 2) Benchmarked 3 times for each condition and took the average TPS.
>                                      | master branch | prototype | proto/master (%)
>     --------------------------------------------------------------------------------
>     pgbench -c48 -T60 -Msimple -S    | 131297        | 130541    | 101%
>     pgbench -c48 -T60 -Msimple       | 4956          | 4965      | 95%
>     pgbench -c48 -T60 -Mprepared -S  | 129688        | 132538    | 97%
>     pgbench -c48 -T60 -Mprepared     | 5113          | 4615      | 84%
>
>    This result seems to show that, except for the prepared protocol on the read/write workload (not only SELECT), there isn't much difference.
>     
>
> What do you think about it?
> Before I dig deeper, I want to hear your thoughts.
>
> Best regards,
> Takeshi Ideriha
>

Hi,
I really think that we need to move to global caches (and especially
catalog caches) in Postgres.
Modern NUMA servers may have hundreds of cores, and to be able to utilize
all of them we may need to start a large number (hundreds) of backends.
The memory overhead of the local cache multiplied by 1000 backends can be quite significant.
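To put a number on it: even a fairly conservative 100MB of catcache/relcache
per backend (this per-backend figure is just my rough assumption) turns into
1000 x 100MB = 100GB with a thousand backends, the same order of magnitude as
the 100GB reported above.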

But I am not sure that just using an RW lock will be enough to replace the
local cache with a global one.
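I have not studied the attached patch in detail, but I assume the
"straightforward, relatively unoptimized locking strategy" looks roughly like
the sketch below; all names here (SharedCatCacheLock, shared_catcache_find,
shared_catcache_load) are hypothetical and not taken from the patch:

#include "postgres.h"
#include "storage/lwlock.h"
#include "utils/catcache.h"

/* hypothetical helpers operating on a catcache kept in shared memory */
extern HeapTuple shared_catcache_find(CatCache *cache,
                                      Datum v1, Datum v2, Datum v3, Datum v4);
extern HeapTuple shared_catcache_load(CatCache *cache,
                                      Datum v1, Datum v2, Datum v3, Datum v4);

/* hypothetical: one global LWLock protecting the whole shared catcache */
static LWLock *SharedCatCacheLock;

static HeapTuple
SearchSharedCatCache(CatCache *cache, Datum v1, Datum v2, Datum v3, Datum v4)
{
    HeapTuple   tuple;

    /* a shared lock is enough for a read-only lookup */
    LWLockAcquire(SharedCatCacheLock, LW_SHARED);
    tuple = shared_catcache_find(cache, v1, v2, v3, v4);
    LWLockRelease(SharedCatCacheLock);
    if (tuple != NULL)
        return tuple;

    /* cache miss: read the row from the catalog and publish it in the
     * shared cache, this time under an exclusive lock */
    LWLockAcquire(SharedCatCacheLock, LW_EXCLUSIVE);
    tuple = shared_catcache_load(cache, v1, v2, v3, v4);
    LWLockRelease(SharedCatCacheLock);
    return tuple;
}

Even in this read-mostly form the single lock is taken for every catalog
lookup, so it is easy to see how it becomes a point of contention once dozens
of backends parse and plan queries concurrently.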
I am quite skeptical about the performance results you have provided.
Once the dataset completely fits in memory (which is true in your case),
select-only pgbench with prepared statements should be about two times faster
than without prepared statements. And in your case the performance with
prepared statements is even worse.

I wonder if you have repeated each measurement multiple times, to make
sure that it is not just a fluctuation.
Also, which postgresql configuration did you use? If it is the default
postgresql.conf with a 128MB shared_buffers setting,
then you are mostly measuring disk access time, and the catalog cache is not
relevant for performance in this case.
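For a catalog-cache benchmark the configuration should make the run CPU-bound,
e.g. something like this (the values are only illustrative; the point is that
the scale-10 or scale-100 dataset has to fit in shared buffers):

# postgresql.conf -- illustrative values, not a recommendation
shared_buffers = 8GB           # pgbench scale 100 is roughly 1.5GB of data
max_connections = 100          # default, enough for -c48

# optionally prewarm the main table so the first run does not measure disk reads:
#   CREATE EXTENSION pg_prewarm;
#   SELECT pg_prewarm('pgbench_accounts');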

Below are the results I got with pgbench scale 100 (with scale 10 the results
are slightly better) on my desktop with just 16GB of RAM and 4 cores:

                                     | master branch | prototype | proto/master (%)
    --------------------------------------------------------------------------------
    pgbench -c10 -T60 -Msimple -S    | 187189        | 182123    | 97%
    pgbench -c10 -T60 -Msimple       | 15495         | 15112     | 97%
    pgbench -c10 -T60 -Mprepared -S  | 98273         | 92810     | 94%
    pgbench -c10 -T60 -Mprepared     | 25796         | 25169     | 97%

As you can see, there are no surprises here: the negative effect of the shared
cache is largest for non-prepared selects
(because the selects themselves are much faster than updates, and during query
compilation we have to access the relation metadata multiple times).
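
Just to illustrate the per-lookup cost that remains even with a more scalable
structure than a single RW lock: if the shared cache is built on Thomas
Munro's dshash (the "DHT stuff" Robert mentioned), every catalog lookup still
has to take and release at least a partition lock. A minimal sketch, where the
key/entry layout and the shared_catcache_dshash table (assumed to be created
elsewhere with dshash_create) are purely hypothetical:

#include "postgres.h"
#include "access/htup_details.h"    /* heap_copytuple */
#include "lib/dshash.h"

/* hypothetical key/entry layout; nothing here is from the attached patch */
typedef struct SharedCatCacheKey
{
    Oid         cacheid;
    uint32      hashvalue;
} SharedCatCacheKey;

typedef struct SharedCatCacheEntry
{
    SharedCatCacheKey key;
    HeapTuple   tuple;      /* in reality this would have to be a dsa_pointer */
} SharedCatCacheEntry;

static dshash_table *shared_catcache_dshash;    /* attached at backend startup */

static HeapTuple
shared_catcache_lookup(const SharedCatCacheKey *key)
{
    SharedCatCacheEntry *entry;
    HeapTuple   result = NULL;

    /* dshash_find takes the partition lock (shared mode, since exclusive =
     * false) and keeps it held until dshash_release_lock */
    entry = dshash_find(shared_catcache_dshash, key, false);
    if (entry != NULL)
    {
        result = heap_copytuple(entry->tuple);  /* copy out while still locked */
        dshash_release_lock(shared_catcache_dshash, entry);
    }
    return result;
}

Parsing and planning even a simple SELECT performs many such lookups, so some
per-statement locking overhead seems unavoidable, which is consistent with the
numbers above.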

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company