Re: Cache relation sizes? - Mailing list pgsql-hackers
From | andres@anarazel.de |
---|---|
Subject | Re: Cache relation sizes? |
Date | |
Msg-id | 20190206085652.yfmh6d5nm475jkt2@alap3.anarazel.de |
In response to | RE: Cache relation sizes? ("Jamison, Kirk" <k.jamison@jp.fujitsu.com>) |
Responses |
RE: Cache relation sizes?
|
List | pgsql-hackers |
On 2019-02-06 08:50:45 +0000, Jamison, Kirk wrote:
> On February 6, 2019, 08:25AM +0000, Kyotaro HORIGUCHI wrote:
>
> >At Wed, 6 Feb 2019 06:29:15 +0000, "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> wrote:
> >> Although I haven't looked deeply at Thomas's patch yet, there's currently no place to store the size per relation in shared memory. You have to wait for the global metacache that Ideriha-san is addressing. Then, you can store the relation size in the RelationData structure in relcache.
>
> >Just one counter in the patch *seems* to give significant gain comparing to the complexity, given that lseek is so complex or it brings latency, especially on workloads where file is scarcely changed. Though I didn't run it on a test bench.
>
> > > > (2) Is the MdSharedData temporary or permanent in shared memory?
> > > Permanent in shared memory.
> > I'm not sure the duration of the 'permanent' there, but it disappears when server stops. Anyway it doesn't need to be permanent beyond a server restart.
>
>
> Thank you for the insights.
> I did a simple test in the previous email using simple syscall tracing,
> the patch significantly reduced the number of lseek syscall.
> (but that simple test might not be enough to describe the performance benefit)
>
> Regarding Tsunakawa-san's comment,
> in Thomas' patch, he made a place in shared memory that stores the
> relsize_change_counter, so I am thinking of utilizing the same,
> but for caching the relsize itself.
>
> Perhaps I should explain further the intention for the design.
>
> First step, to cache the file size in the shared memory. Similar to the
> intention or purpose of the patch written by Thomas, to reduce the
> number of lseek(SEEK_END) by caching the relation size without using
> additional locks. The difference is by caching a rel size on the shared
> memory itself. I wonder if there are problems that you can see with
> this approach.
>
> Eventually, the next step is to have a structure in shared memory
> that caches file addresses along with their sizes (Andres' idea of
> putting an open relations table in the shared memory). With a
> structure that group buffers into file relation units, we can get
> file size directly from shared memory, so the assumption is it would
> be faster whenever we truncate/extend our relations because we can
> track the offset of the changes in size and use range for managing
> the truncation, etc..
> The next step is a complex direction that needs serious discussion,
> but I am wondering if we can proceed with the first step for now if
> the idea and direction are valid.

Maybe I'm missing something here, but why is it actually necessary to
have the sizes in shared memory, if we're just talking about caching
sizes? It's pretty darn cheap to determine the filesize of a file that
has been recently stat()/lseek()/ed, and introducing per-file shared
data adds *substantial* complexity, because the amount of shared memory
needs to be pre-determined. The reason I want to put per-relation data
into shared memory is different, it's about putting the buffer mapping
into shared memory, and that, as a prerequisite, also need per-relation
data. And there's a limit of the number of relations that need to be
open (one per cached page at max), and space can be freed by evicting
pages.

Greetings,

Andres Freund
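
For readers who want to see the operation being discussed, here is a minimal sketch of a process-local size cache in front of lseek(SEEK_END). It is not code from Thomas's patch or from PostgreSQL: the names SizeCache, size_cache_init, size_cache_get and size_cache_invalidate are hypothetical and exist only for illustration. It assumes a POSIX system and a file descriptor that supports seeking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>      /* not used below; handy for demo callers (tmpfile, printf) */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-process cache entry for one open file descriptor. */
typedef struct SizeCache
{
    int   fd;           /* descriptor whose size is cached */
    bool  valid;        /* false until the first successful lseek */
    off_t size;         /* last size observed via lseek(SEEK_END) */
    long  lseek_calls;  /* instrumentation: how many syscalls were issued */
} SizeCache;

static void
size_cache_init(SizeCache *c, int fd)
{
    assert(c != NULL);
    c->fd = fd;
    c->valid = false;
    c->size = 0;
    c->lseek_calls = 0;
}

/*
 * Return the file size, issuing lseek(SEEK_END) only on a cache miss.
 * Returns (off_t) -1 if lseek fails. Note that lseek moves the file
 * offset to the end of the file as a side effect.
 */
static off_t
size_cache_get(SizeCache *c)
{
    assert(c != NULL);
    if (!c->valid)
    {
        off_t end = lseek(c->fd, 0, SEEK_END);

        c->lseek_calls++;
        if (end < 0)
            return (off_t) -1;
        c->size = end;
        c->valid = true;
    }
    return c->size;
}

/* Forget the cached size, e.g. after this process extends or truncates. */
static void
size_cache_invalidate(SizeCache *c)
{
    assert(c != NULL);
    c->valid = false;
}
```

A private cache like this cannot notice when another process extends or truncates the file, so it returns a stale size until it is invalidated. That staleness is the gap that a shared signal such as the relsize_change_counter mentioned above would have to close, and it is the extra machinery being weighed against simply calling lseek() again.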