Thread: Dynamically sizing FSM?

Dynamically sizing FSM?

From
Josh Berkus
Date:
All,

Hey, is there any good reason why FSM is sized by a static GUC variable?   
Why couldn't we just automatically have the system use as much memory as 
it needs for FSM, provided that it's not more than some reasonable limit, 
like 15% of shared memory?

Seems like that would eliminate one area of user confusion, as well as 
over-allocation.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco


Re: Dynamically sizing FSM?

From
Bruce Momjian
Date:
Josh Berkus wrote:
> All,
> 
> Hey, is there any good reason why FSM is sized by a static GUC variable?   
> Why couldn't we just automatically have the system use as much memory as 
> it needs for FSM, provided that it's not more than some reasonable limit, 
> like 15% of shared memory?
> 
> Seems like that would eliminate one area of user confusion, as well as 
> over-allocation.

I don't think any of our shared memory segments auto-size.  What would
you take memory from to increase FSM?

--  Bruce Momjian   bruce@momjian.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Dynamically sizing FSM?

From
Gregory Stark
Date:
"Bruce Momjian" <bruce@momjian.us> writes:

> Josh Berkus wrote:
>> All,
>> 
>> Hey, is there any good reason why FSM is sized by a static GUC variable?   
>> Why couldn't we just automatically have the system use as much memory as 
>> it needs for FSM, provided that it's not more than some reasonable limit, 
>> like 15% of shared memory?
>> 
>> Seems like that would eliminate one area of user confusion, as well as 
>> over-allocation.
>
> I don't think any of our shared memory segments auto-size.  What would
> you take memory from to increase FSM?

The obvious answer to this question is the shared buffer cache.

The real problem is that we don't have, and don't particularly want, a memory
manager for the shared memory. So where and how do you keep track of which
memory is being used for what?

You could sort of get it for free by just using the buffer manager to open FSM
data files -- even getting spilling to disk of FSM data for rarely used
relations for free. But then you would be fighting so much machinery, for
example, flushing the log before flushing buffers, that it might be easier to
just have a separate data structure.

I think replacing the FSM with something more flexible is on several
developers' long-term todo lists, but it's not entirely clear yet -- at least
to me -- what features we need. Someone working on vacuum or bgwriter
improvements will probably find the FSM a stumbling block along the way and
know better what needs to be done to it.

--  Gregory Stark EnterpriseDB          http://www.enterprisedb.com


Re: Dynamically sizing FSM?

From
Tom Lane
Date:
Gregory Stark <stark@enterprisedb.com> writes:
> The real problem is that we don't have, and don't particularly want, a memory
> manager for the shared memory.

No, the real problem is that you can't re-size a SysV shared memory
segment on the fly --- there's no portable API for that, anyway.
Therefore there's not much point in having dynamic memory management
within the segment: you pretty much have to predetermine the total size
of each structure you want to have in shared memory, so that you know
what size segment to create in the first place.

I'm of the opinion that the solution to FSM being fixed-size is to keep
it somewhere else, ie, on disk (possibly with some sort of cache in
shared memory for currently-used entries).
        regards, tom lane
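
To illustrate the point about predetermining sizes, here is a minimal Python sketch of a segment that is sized once and then carved up, with no way to grow afterwards. The class name, the area names, and the page counts are all hypothetical and chosen only for illustration; this is a toy model of the constraint, not PostgreSQL's actual shared memory code.

```python
PAGE = 8192  # bytes; the 8kB page size mentioned in this thread


class FixedSegment:
    """A region whose total size is fixed at creation (toy model)."""

    def __init__(self, total_bytes):
        self.total = total_bytes
        self.used = 0
        self.areas = {}  # name -> (offset, size)

    def alloc(self, name, size):
        # The segment cannot be resized, so a request that does not fit fails.
        if self.used + size > self.total:
            raise MemoryError(f"no room for {name}: segment cannot grow")
        self.areas[name] = (self.used, size)
        self.used += size
        return self.areas[name]


def required_size(requests):
    """Sizing pass: add up every structure's request before creating the segment."""
    return sum(size for _, size in requests)


# Hypothetical structure sizes, for illustration only.
requests = [("buffers", 1000 * PAGE), ("fsm", 20 * PAGE), ("clog", 8 * PAGE)]
seg = FixedSegment(required_size(requests))
for name, size in requests:
    seg.alloc(name, size)
```

Once the loop finishes the segment is full, so any later request (for example, a larger FSM) raises `MemoryError`. That is the sense in which each structure's size has to be known up front.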


Re: Dynamically sizing FSM?

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I'm of the opinion that the solution to FSM being fixed-size is to keep
> it somewhere else, ie, on disk (possibly with some sort of cache in
> shared memory for currently-used entries).

What do you think about dynamic allocation from shared_buffers? That is,
remove a buffer page from the shared buffer pool and use the 8kB of memory
for another purpose. To be sure, that doesn't free us from running out of
FSM memory, but it removes the need to decide the amount of FSM buffers
in advance.

I think we could use the above as a "shared memory allocator".
It would be useful for the Dead Space Map, shared prepared statements, and so on.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
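
The proposal above can be pictured with a small Python sketch: a pool of page slots in which a slot can be taken away from the page cache and handed to another module. Everything here (the `BufferPool` class, `donate_page`, the `writes` counter standing in for a write-back) is a hypothetical model, not an existing PostgreSQL interface.

```python
class BufferPool:
    """Toy pool of 8kB page slots; each slot belongs to the cache or to a module."""

    def __init__(self, n_pages):
        self.owner = ["cache"] * n_pages   # who currently uses each slot
        self.dirty = [False] * n_pages     # only meaningful for cache pages
        self.writes = 0                    # stand-in for write-backs performed

    def donate_page(self, module):
        """Take one slot away from the page cache and give it to `module`."""
        candidates = [i for i, o in enumerate(self.owner) if o == "cache"]
        if not candidates:
            raise MemoryError("no cache pages left to donate")
        # Prefer a clean page; min() returns the first clean candidate if any.
        idx = min(candidates, key=lambda i: self.dirty[i])
        if self.dirty[idx]:
            # A dirty page must be written out before its memory is reused.
            self.writes += 1
            self.dirty[idx] = False
        self.owner[idx] = module
        return idx


pool = BufferPool(4)
pool.dirty[0] = True           # pretend slot 0 holds a modified page
slot = pool.donate_page("fsm")
```

In this run slot 0 is skipped because it is dirty, so the FSM receives slot 1 and no write-back is needed. The cache is left with three slots, which is the trade-off the proposal accepts.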




Re: Dynamically sizing FSM?

From
Tom Lane
Date:
ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I'm of the opinion that the solution to FSM being fixed-size is to keep
>> it somewhere else, ie, on disk (possibly with some sort of cache in
>> shared memory for currently-used entries).

> What do you think about dynamic allocation from shared_buffers? That is,
> remove a buffer page from the shared buffer pool and use the 8kB of memory
> for another purpose.

The problem with that is that (a) it creates more contention load on the
shared buffer pool's management structures, and (b) if the chosen buffer
is dirty then you have a different subsystem trying to do buffer I/O,
which is at best a modularity bug and at worst a correctness or deadlock
problem.

We use separate buffer areas for xlog, clog, subtrans, etc than for the
main buffer arena.  I think it's a good idea to keep that approach for
any buffer space created for FSM.  It might represent a slightly
inefficient use of the shared memory as a whole, but it helps preserve
the developers' sanity ;-)
        regards, tom lane
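
The alternative discussed here, keeping FSM data on disk with its own small fixed-size cache for currently-used entries, could look roughly like the Python sketch below. The `FsmCache` class, the least-recently-used eviction policy, and the dictionary standing in for on-disk storage are all assumptions made for illustration; the thread does not specify any of them.

```python
from collections import OrderedDict


class FsmCache:
    """Fixed-capacity cache of free-space entries, backed by 'disk' (a dict here)."""

    def __init__(self, capacity, disk):
        self.capacity = capacity
        self.disk = disk            # relation name -> free bytes; stand-in for a file
        self.cache = OrderedDict()  # least recently used entry comes first

    def get(self, rel):
        if rel in self.cache:
            self.cache.move_to_end(rel)   # mark as recently used
            return self.cache[rel]
        value = self.disk.get(rel, 0)     # cache miss: read from disk
        self._put(rel, value)
        return value

    def set(self, rel, free_bytes):
        self._put(rel, free_bytes)

    def _put(self, rel, value):
        self.cache[rel] = value
        self.cache.move_to_end(rel)
        if len(self.cache) > self.capacity:
            # Spill the least recently used entry to disk instead of losing it.
            old_rel, old_val = self.cache.popitem(last=False)
            self.disk[old_rel] = old_val


disk = {}
fsm = FsmCache(capacity=2, disk=disk)
fsm.set("a", 100)
fsm.set("b", 200)
fsm.set("c", 300)      # cache is full, so "a" is spilled to disk
free_a = fsm.get("a")  # read back from disk; "b" is spilled in turn
```

The cache size stays fixed no matter how many relations exist, and rarely used entries simply live on disk, so there is nothing to resize at runtime.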


Re: Dynamically sizing FSM?

From
"Takayuki Tsunakawa"
Date:
From: "ITAGAKI Takahiro" <itagaki.takahiro@oss.ntt.co.jp>
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> I'm of the opinion that the solution to FSM being fixed-size is to keep
>> it somewhere else, ie, on disk (possibly with some sort of cache in
>> shared memory for currently-used entries).
>
> What do you think about dynamic allocation from shared_buffers? That is,
> remove a buffer page from the shared buffer pool and use the 8kB of memory
> for another purpose. To be sure, that doesn't free us from running out of
> FSM memory, but it removes the need to decide the amount of FSM buffers
> in advance.
> I think we could use the above as a "shared memory allocator".
> It would be useful for the Dead Space Map, shared prepared statements,
> and so on.

Yes! I'm completely in favor of Itagaki-san's idea.  Separating the cache
for FSM may produce a new configuration parameter like fsm_cache_size,
which ordinary users would not want (unless they enjoy a difficult DBMS).
I think that integrating the handling of the space management structure
with the data area is good.  That means, for example, implementing the
"Free Space Table" described in section 14.2.2.1 of Jim Gray's book
"Transaction Processing: Concepts and Techniques", though it may have
been discussed in the PostgreSQL community long ago (really?).  Of
course, some refinements may be necessary to fit PostgreSQL's
design, say, creating one free space table file for each data file to
make the implementation easy.  It would reduce the amount of source code
dedicated solely to FSM.

In addition, it would provide transactional space management.  If
I understand correctly, in the current implementation, updates to FSM
are lost when the server crashes, aren't they?  The current design assumes
that FSM will be rebuilt by vacuum because vacuum is inevitable.  If
updates to the space management area were made transactional, it might
provide the infrastructure for a "vacuumless PostgreSQL."







Re: Dynamically sizing FSM?

From
ITAGAKI Takahiro
Date:
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> > What do you think about dynamic allocation from shared_buffers? That is,
> > remove a buffer page from the shared buffer pool and use the 8kB of memory
> > for another purpose.
> 
> The problem with that is that (a) it creates more contention load on the
> shared buffer pool's management structures, and (b) if the chosen buffer
> is dirty then you have a different subsystem trying to do buffer I/O,
> which is at best a modularity bug and at worst a correctness or deadlock
> problem.

(a) I'm thinking that another hash table would manage the removed buffers.
Those buffers would be marked with a new BM_SPECIAL flag or something
similar in BufferDesc->flags. We would look them up through module-specific
hash tables, so that the buffer management hash table (BufTable) is not used.

(b) Maybe we need a new abstraction layer under the buffer cache module.
A new "memory pool" subsystem will preserve our sanity.

+-- shared memory pool    <- no more than "a bank of memory"
    +-- page cache        <- currently called "shared buffers"
    +-- other modules using shared buffers


> It might represent a slightly
> inefficient use of the shared memory as a whole, but it helps preserve
> the developers' sanity ;-)

Yeah, I see. That's a bother :-)
But are there any requests to resize memory resources at runtime?
I want to use the dynamic shmem allocator for FSM and DSM
if it is available. If anyone wants to use it for another purpose,
it would be good to design it in a generalized form.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center