Thread: Does RelCache/SysCache shrink except when relations are deleted?

Does RelCache/SysCache shrink except when relations are deleted?

From
"MauMau"
Date:
Hello,

Please let me ask you some questions about RelCache/SysCache/CatCache 
design. I know I should post this to pgsql-general, but I decided to post 
here because the content includes design questions.

<<Background>>
My customer is facing a "out of memory" problem during a batch job. I'd like 
to know the cause and solutions. PostgreSQL version is 8.2.7 (32-bit on 
Linux).

The batch job consists of two steps in a single psql session:

1. call some PL/pgSQL function (say "somefunc" here)
2. VACUUM tables (at this time, maintenance_work_mem=256MB)

Step 2 emitted the following messages in syslog.

ERROR:  out of memory
DETAIL:  Failed on request of size 268435452.
STATEMENT:  VACUUM some_table_name

somefunc copies rows from a single table to 100,000 tables (table_1 - 
table_100000) as follows:

[somefunc]
FOR id in 1 .. 100000 LOOP
  check if the table "table_${ID}" exists by searching pg_class
  if the table exists
      INSERT INTO table_${id} SELECT * FROM some_table
          WHERE pk = id;
  else /* the table does not exist */
      CREATE TABLE table_${id} AS SELECT * FROM some_table
          WHERE pk = id;
END LOOP;
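
In case it is useful, here is a minimal runnable PL/pgSQL sketch of the 
pseudocode above (simplified: the real function is more complex, and the 
unqualified pg_class lookup ignores schemas):

  CREATE OR REPLACE FUNCTION somefunc() RETURNS void AS $$
  BEGIN
      FOR id IN 1 .. 100000 LOOP
          -- check if the table exists by searching pg_class
          PERFORM 1 FROM pg_class
              WHERE relname = 'table_' || id AND relkind = 'r';
          IF FOUND THEN
              EXECUTE 'INSERT INTO table_' || id ||
                      ' SELECT * FROM some_table WHERE pk = ' || id;
          ELSE
              EXECUTE 'CREATE TABLE table_' || id ||
                      ' AS SELECT * FROM some_table WHERE pk = ' || id;
          END IF;
      END LOOP;
  END;
  $$ LANGUAGE plpgsql;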

Before starting somefunc, the virtual memory of the backend postgres is 
1.6GB, as reported in the "VIRT" column of the top command. When somefunc 
completes, it becomes 2.6GB. So VACUUM cannot allocate 256MB, because the 
virtual memory space is nearly full.

This is all the information I have now. I asked the customer to collect the 
PostgreSQL server log so that memory context statistics can be obtained when 
"out of memory" occurs. Plus, I asked for the result of "SHOW ALL" and the 
minimal procedure to reproduce the problem. However, I'd like to ask your 
opinions rather than waiting for the problem to happen again.


<<Question>>
I'm guessing that CacheMemoryContext might be using a lot of memory, because 
somefunc accesses as many as 100,000 tables. But I don't understand 
RelCache/SysCache implementation yet.

Q1: When are the RelCache/SysCache entries removed from CacheMemoryContext?
Are they removed only when the corresponding relations are deleted? If so, 
"many tables and indexes" is not friendly for the current PostgreSQL?

Q2: somefunc increased virtual memory by 1GB after accessing 100,000 tables. 
That means each table uses about 10KB of local memory.
Is it common that this much memory is used for RelCache/SysCache or other 
control information?
Does the number of attributes in a table affect local memory usage much?
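
(To illustrate Q2's last point: the catalog rows that have to be cached for 
one table can be counted directly, for example:

  SELECT count(*) FROM pg_attribute WHERE attrelid = 'table_1'::regclass;
  SELECT count(*) FROM pg_index WHERE indrelid = 'table_1'::regclass;

so I would guess that wider tables, and tables with more indexes, cost 
proportionally more local cache memory. That is only my guess, though.)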

Q3: I think one solution is to run VACUUM in a separate psql session.
Are there any other solutions you can think of?

Q4: The customer says one strange thing. If the 100,000 tables exist before 
somefunc starts (i.e., somefunc just copies records), the virtual memory of 
postgres does not increase.
Is there any way to explain this behavior?

Regards
MauMau



Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Merlin Moncure
Date:
2011/9/28 MauMau <maumau307@gmail.com>:
> Hello,
>
> Please let me ask you some questions about RelCache/SysCache/CatCache
> design. I know I should post this to pgsql-general, but I decided to post
> here because the content includes design questions.
>
> <<Background>>
> My customer is facing an "out of memory" problem during a batch job. I'd like
> to know the cause and solutions. PostgreSQL version is 8.2.7 (32-bit on
> Linux).
>
> The batch job consists of two steps in a single psql session:
>
> 1. call some PL/pgSQL function (say "somefunc" here)
> 2. VACUUM tables (at this time, maintenance_work_mem=256MB)
>
> Step 2 emitted the following messages in syslog.
>
> ERROR:  out of memory
> DETAIL:  Failed on request of size 268435452.
> STATEMENT:  VACUUM some_table_name
>
> somefunc copies rows from a single table to 100,000 tables (table_1 -
> table_100000) as follows:
>
> [somefunc]
> FOR id in 1 .. 100000 LOOP
>   check if the table "table_${ID}" exists by searching pg_class
>   if the table exists
>       INSERT INTO table_${id} SELECT * FROM some_table
>           WHERE pk = id;
>   else /* the table does not exist */
>       CREATE TABLE table_${id} AS SELECT * FROM some_table
>           WHERE pk = id;
> END LOOP;
>
> Before starting somefunc, the virtual memory of the backend postgres is
> 1.6GB, as reported in the "VIRT" column of the top command. When somefunc
> completes, it becomes 2.6GB. So VACUUM cannot allocate 256MB, because the
> virtual memory space is nearly full.
>
> This is all the information I have now. I asked the customer to collect the
> PostgreSQL server log so that memory context statistics can be obtained when
> "out of memory" occurs. Plus, I asked for the result of "SHOW ALL" and the
> minimal procedure to reproduce the problem. However, I'd like to ask your
> opinions rather than waiting for the problem to happen again.
>
>
> <<Question>>
> I'm guessing that CacheMemoryContext might be using a lot of memory, because
> somefunc accesses as many as 100,000 tables. But I don't understand
> RelCache/SysCache implementation yet.
>
> Q1: When are the RelCache/SysCache entries removed from CacheMemoryContext?
> Are they removed only when the corresponding relations are deleted? If so,
> "many tables and indexes" is not friendly for the current PostgreSQL?
>
> Q2: somefunc increased virtual memory by 1GB after accessing 100,000 tables.
> That means each table uses about 10KB of local memory.
> Is it common that this much memory is used for RelCache/SysCache or other
> control information?
> Does the number of attributes in a table affect local memory usage much?
>
> Q3: I think one solution is to run VACUUM in a separate psql session.
> Are there any other solutions you can think of?
>
> Q4: The customer says one strange thing. If the 100,000 tables exist before
> somefunc starts (i.e., somefunc just copies records), the virtual memory of
> postgres does not increase.
> Is there any way to explain this behavior?

can we see all of your memory settings plus physical memory?  the
solution is probably going to be reducing shared buffers and/or adding
physical memory.

merlin


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
"MauMau"
Date:
From: "Merlin Moncure" <mmoncure@gmail.com>
can we see all of your memory settings plus physical memory?  the
solution is probably going to be reducing shared buffers and/or adding
physical memory.

Thank you for your response.

The amount of physical memory is 8GB, which is enough for the workload. I 
asked the customer for the output of "SHOW ALL", but I haven't received it 
yet. However, shared_buffers should be less than 1.6GB because, as I wrote 
in the previous mail, the top command showed 1.6GB in the "VIRT" column 
before executing the somefunc() PL/pgSQL function.

The direct cause of "out of memory" is that the virtual memory became full. 
32-bit Linux can allocate 3GB of user space in the virtual address space of 
each process. somefunc() used 1.0GB, which led to 2.6GB of virtual memory. 
After somefunc(), VACUUM tried to allocate 256MB of maintenance_work_mem. 
That allocation failed because the virtual address space was almost full.

As you mentioned, decreasing shared_buffers will be one of the solutions. 
However, we want to know why somefunc() uses so much memory. Therefore, the 
following is the core question. Q2 and Q3 are supplementary ones. It is just 
my guess that RelCache/SysCache may be the cause.

2011/9/28 MauMau <maumau307@gmail.com>:
> Q1: When are the RelCache/SysCache entries removed from 
> CacheMemoryContext?
> Are they removed only when the corresponding relations are deleted? If so,
> "many tables and indexes" is not friendly for the current PostgreSQL?

Regards
MauMau



Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Alvaro Herrera
Date:
Excerpts from MauMau's message of Thu Sep 29 09:23:48 -0300 2011:

> The amount of physical memory is 8GB, which is enough for the workload. I 
> asked the customer for the output of "SHOW ALL", but I haven't received it 
> yet. However, shared_buffers should be less than 1.6GB because, as I wrote 
> in the previous mail, the top command showed 1.6GB in the "VIRT" column
> before executing the somefunc() PL/pgSQL function.

You don't really know this; some operating systems (Linux in particular)
do not show shared memory as in use by a process until it is accessed.
It may very well have well over 1.6 GB of shared_buffers, yet not show
that in VIRT.

-- 
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
"MauMau"
Date:
From: "Alvaro Herrera" <alvherre@commandprompt.com>
> You don't really know this; some operating systems (Linux in particular)
> do not show shared memory as in use by a process until it is accessed.
> It may very well have well over 1.6 GB of shared_buffers, yet not show
> that in VIRT.

Oh, really? When I started psql just after setting shared_buffers to 2500MB 
and running pg_ctl start, "ps -o vsz -p postgres_PID" showed about 2500MB 
plus some overhead. ps's vsz reports virtual memory, like top's VIRT. In any 
case, I want to confirm the actual shared_buffers setting.
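
Once I can reach the instance, that is a one-liner from any session (both 
forms work on 8.2):

  SHOW shared_buffers;
  -- or:
  SELECT name, setting FROM pg_settings WHERE name = 'shared_buffers';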

Anyway, I'd appreciate it if anyone could tell me about RelCache/SysCache. As 
far as I can tell from reading the code, PostgreSQL seems to use memory for 
RelCache/SysCache without limit until the relations are dropped.

Regards
MauMau




Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Merlin Moncure
Date:
On Thu, Sep 29, 2011 at 7:23 AM, MauMau <maumau307@gmail.com> wrote:
> From: "Merlin Moncure" <mmoncure@gmail.com>
> can we see all of your memory settings plus physical memory?  the
> solution is probably going to be reducing shared buffers and/or adding
> physical memory.
>
> Thank you for your response.
>
> The amount of physical memory is 8GB, which is enough for the workload. I
> asked the customer for the output of "SHOW ALL", but I haven't received it
> yet. However, shared_buffers should be less than 1.6GB because, as I wrote
> in the previous mail, the top command showed 1.6GB in the "VIRT" column
> before executing the somefunc() PL/pgSQL function.
>
> The direct cause of "out of memory" is that the virtual memory became full.
> 32-bit Linux can allocate 3GB of user space in the virtual address space of
> each process. somefunc() used 1.0GB, which led to 2.6GB of virtual memory.
> After somefunc(), VACUUM tried to allocate 256MB of maintenance_work_mem.
> That allocation failed because the virtual address space was almost full.
>
> As you mentioned, decreasing shared_buffers will be one of the solutions.
> However, we want to know why somefunc() uses so much memory. Therefore, the
> following is the core question. Q2 and Q3 are supplementary ones. It is just
> my guess that RelCache/SysCache may be the cause.

Oh -- I missed earlier that this was a 32-bit o/s.  Well, I'd consider
drastically reducing shared buffers, down to, say, the 256-512MB range.
Postgres function plans and various other structures (tables,
attributes) are indeed cached and can use up a considerable amount of
memory in pathological cases; this largely depends on the number of
tables/views, number of functions, and number of connections.  I
briefly looked at the relcache etc. a little while back on a related
complaint, and the takeaway is that the caching is heavy-handed and
fairly brute force, but legit and a huge win for most cases.  This
stuff lives in the cache memory context, and a couple of users (not
that many) have bumped into high memory usage.  Solutions tend to
include:

*) don't rely on an implementation that requires 100,000 tables
*) use connection pooler
*) reset connections
*) go to a 64-bit o/s
*) reduce shared_buffers for a leaner memory profile (especially on a 32-bit o/s)


Like I said, this doesn't really come up that often, but the 'real'
solution in terms of postgres is probably some kind of upper bound on
the amount of cache memory used, plus some intelligence in the cache
implementation.  This is tricky stuff, though; so far no credible
proposals have been made, and the demand for the feature is not very
high.

merlin


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Robert Haas
Date:
On Thu, Sep 29, 2011 at 9:39 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Like I said, this doesn't really come up that often, but the 'real'
> solution in terms of postgres is probably some kind of upper bound on
> the amount of cache memory used, plus some intelligence in the cache
> implementation.  This is tricky stuff, though; so far no credible
> proposals have been made, and the demand for the feature is not very
> high.

We (i.e. $EMPLOYER) have a customer who ran into this problem (i.e.
relcache/syscache memory usage shooting through the roof) in testing,
so I'm somewhat motivated to see if we can't come up with a fix.  I am
fairly sure that was on a 64-bit build, so the issue wasn't just that
they didn't have enough address space.  It seems that we used to have
some kind of LRU algorithm to prevent excessive memory usage, but we
ripped it out because it was too expensive (see commit
8b9bc234ad43dfa788bde40ebf12e94f16556b7f).  I don't have a brilliant
idea at the moment, but I wonder if we could come up with something
that's cheap enough to manage that it doesn't materially affect
performance in normal cases, but just kicks in when things get really
out of control.

A trivial algorithm would be - if you're about to run out of memory,
flush all the caches; or evict 10% of the entries at random.  Of
course, the problem with anything like this is that it's hard to know
when you're about to run out of memory before you actually do, and any
hard-coded limit you care to set will sometimes be wrong.  So maybe
that's not the right approach.  At the same time, I don't think that
simply hoping the user has enough memory is an adequate answer.

One thing to consider is that in some cases a user may plan to do
something like touch every table in the database exactly once and then
exit.  In that case, if we knew in advance what the user's intentions
were, we'd want to use an MRU eviction algorithm rather than LRU.
Again, we don't know that in advance.  But in such a use case it's
reasonable for the user to expect that the amount of backend-private
memory used for caching will not grow without bound.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
"MauMau"
Date:
From: "Merlin Moncure" <mmoncure@gmail.com>
--------------------------------------------------
Oh -- I missed earlier that this was a 32-bit o/s.  Well, I'd consider
drastically reducing shared buffers, down to, say, the 256-512MB range.
Postgres function plans and various other structures (tables,
attributes) are indeed cached and can use up a considerable amount of
memory in pathological cases; this largely depends on the number of
tables/views, number of functions, and number of connections.  I
briefly looked at the relcache etc. a little while back on a related
complaint, and the takeaway is that the caching is heavy-handed and
fairly brute force, but legit and a huge win for most cases.  This
stuff lives in the cache memory context, and a couple of users (not
that many) have bumped into high memory usage.  Solutions tend to
include:

*) don't rely on an implementation that requires 100,000 tables
*) use connection pooler
*) reset connections
*) go to a 64-bit o/s
*) reduce shared_buffers for a leaner memory profile (especially on a 32-bit o/s)


Like I said, this doesn't really come up that often, but the 'real'
solution in terms of postgres is probably some kind of upper bound on
the amount of cache memory used, plus some intelligence in the cache
implementation.  This is tricky stuff, though; so far no credible
proposals have been made, and the demand for the feature is not very
high.
--------------------------------------------------


Thank you very much. I'm relieved to understand the reason now. I will 
report it to the customer and ask him to consider taking the following 
measures:

* reduce shared_buffers
* run somefunc() and VACUUM in different psql sessions
* process 100,000 tables in multiple psql sessions

Regards
MauMau




Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Merlin Moncure
Date:
On Thu, Sep 29, 2011 at 8:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Sep 29, 2011 at 9:39 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Like I said, this doesn't really come up that often, but the 'real'
>> solution in terms of postgres is probably some kind of upper bound on
>> the amount of cache memory used, plus some intelligence in the cache
>> implementation.  This is tricky stuff, though; so far no credible
>> proposals have been made, and the demand for the feature is not very
>> high.
>
> We (i.e. $EMPLOYER) have a customer who ran into this problem (i.e.
> relcache/syscache memory usage shooting through the roof) in testing,
> so I'm somewhat motivated to see if we can't come up with a fix.  I am
> fairly sure that was on a 64-bit build, so the issue wasn't just that
> they didn't have enough address space.  It seems that we used to have
> some kind of LRU algorithm to prevent excessive memory usage, but we
> ripped it out because it was too expensive (see commit
> 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).  I don't have a brilliant
> idea at the moment, but I wonder if we could come up with something
> that's cheap enough to manage that it doesn't materially affect
> performance in normal cases, but just kicks in when things get really
> out of control.
>
> A trivial algorithm would be - if you're about to run out of memory,
> flush all the caches; or evict 10% of the entries at random.  Of
> course, the problem with anything like this is that it's hard to know
> when you're about to run out of memory before you actually do, and any
> hard-coded limit you care to set will sometimes be wrong.  So maybe
> that's not the right approach.  At the same time, I don't think that
> simply hoping the user has enough memory is an adequate answer.
>
> One thing to consider is that in some cases a user may plan to do
> something like touch every table in the database exactly once and then
> exit.  In that case, if we knew in advance what the user's intentions
> were, we'd want to use an MRU eviction algorithm rather than LRU.
> Again, we don't know that in advance.  But in such a use case it's
> reasonable for the user to expect that the amount of backend-private
> memory used for caching will not grow without bound.

I think this (cache memory usage) is a reasonable setting for a GUC.
Maybe if you keep it very simple, say only activating cache cleanup when
the limit is exceeded, you have more freedom to dump cache using
fancier methods like a calculated benefit.  You'd probably have to
expose another knob to guarantee maximum cache sweep runtime, though.
Perhaps even user-visible cache management features (an extension of
DISCARD?) could be exposed...

Hm, what might make this complicated is that you'd probably want all
the various caches to live under the same umbrella with a central
authority making decisions about what stays and what goes.

On Thu, Sep 29, 2011 at 9:22 AM, MauMau <maumau307@gmail.com> wrote:
> * reduce shared_buffers
> * run somefunc() and VACUUM in different psql sessions
> * process 100,000 tables in multiple psql sessions

that's a start.  don't be afraid to reset the connection after
somefunc() and at appropriate times from the 'processors'.
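
(as far as i know there's no SQL-level command that flushes the
relcache/syscache, so "reset" really means reconnect -- e.g. in psql,
something like:

  \connect your_database

after somefunc() returns, which gives you a fresh backend with empty
caches.  'your_database' is just a placeholder, of course.)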

merlin


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Tom Lane
Date:
"MauMau" <maumau307@gmail.com> writes:
> Anyway, I'd appreciate it if anyone could tell me about RelCache/SysCache. As
> far as I can tell from reading the code, PostgreSQL seems to use memory for
> RelCache/SysCache without limit until the relations are dropped.

That's correct.  We used to have a limit on the size of catcache
(if memory serves, it was something like 5000 entries).  We got rid of
it after observing that performance fell off a cliff as soon as you had
a working set larger than the cache limit.  Trust me, if we had a limit,
you'd still be here complaining; the complaint would just take a
different form ;-)

I concur with Merlin's advice to rethink your schema.  100000 tables is
far beyond what any sane design could require, and is costing you on
many levels (I'm sure the OS and filesystem aren't that happy with it
either).
        regards, tom lane


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> ... It seems that we used to have
> some kind of LRU algorithm to prevent excessive memory usage, but we
> ripped it out because it was too expensive (see commit
> 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).

Not only was it too expensive, but performance fell off a cliff as soon
as you had a catalog working set large enough to cause the code to
actually do something.  I'm not in favor of putting anything like that
back in --- people who have huge catalogs will just start complaining
about something different, i.e., why did their apps get so much slower.

The short answer here is "if you want a database with 100000 tables,
you'd better be running it on more than desktop-sized hardware".
        regards, tom lane


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
"MauMau"
Date:
From: "Tom Lane" <tgl@sss.pgh.pa.us>
> That's correct.  We used to have a limit on the size of catcache
> (if memory serves, it was something like 5000 entries).  We got rid of
> it after observing that performance fell off a cliff as soon as you had
> a working set larger than the cache limit.  Trust me, if we had a limit,
> you'd still be here complaining; the complaint would just take a
> different form ;-)

Yes, I can imagine. Now I see that caching catalog entries in local 
memory without bound is a deliberate performance decision in PostgreSQL. 
64-bit computing makes that approach legitimate. Oracle avoids duplicate 
catalog entries by storing them in shared memory, but that should 
necessitate some kind of locking when accessing the shared catalog 
entries. PostgreSQL's approach, which does not require locking, is better 
for many-core environments.

> I concur with Merlin's advice to rethink your schema.  100000 tables is
> far beyond what any sane design could require, and is costing you on
> many levels (I'm sure the OS and filesystem aren't that happy with it
> either).

I agree. I'll suggest that to the customer, too. Thank you very much.

Regards
MauMau




Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Robert Haas
Date:
On Thu, Sep 29, 2011 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> ... It seems that we used to have
>> some kind of LRU algorithm to prevent excessive memory usage, but we
>> ripped it out because it was too expensive (see commit
>> 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).
>
> Not only was it too expensive, but performance fell off a cliff as soon
> as you had a catalog working set large enough to cause the code to
> actually do something.  ...

Sure, a big working set is going to cause a performance problem if you
start flushing cache entries that are being regularly used.  But the
point is that just because you have, at some time, accessed 100,000 tables
during a session does not mean that your working set is that large.
The working set is the set of things that you are actually using
regularly, not the things you've *ever* accessed.

In addition to the problem of blowing out memory, there are a number
of other things about the current code that don't seem well-suited to
dealing with large numbers of tables.  For example, catcache hash
tables can't be resized, so for very large numbers of entries you can
potentially have to walk a very long chain.  And, you can exhaust the
shared memory space for the primary lock table, leading to, for
example, inability to back up the database using pg_dump (ouch!).
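
(On that last point: the shared lock table has room for roughly
max_locks_per_transaction * (max_connections + max_prepared_transactions)
locks, and you can watch it fill up from another session with an ordinary
query, for example:

  SELECT locktype, count(*) FROM pg_locks GROUP BY locktype;

Raising max_locks_per_transaction requires a server restart.)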

I can't really explain why people seem to keep wanting to create
hundreds of thousands or even millions of tables, but it's not like
MauMau's customer is the first one to try to do this, and I'm sure
they won't be the last.  I don't want to de-optimize the more common
(and sensible) cases too much, but "slow" still trumps "fails
outright".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Does RelCache/SysCache shrink except when relations are deleted?

From
Merlin Moncure
Date:
On Thu, Sep 29, 2011 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I can't really explain why people seem to keep wanting to create
> hundreds of thousands or even millions of tables, but it's not like
> MauMau's customer is the first one to try to do this, and I'm sure
> they won't be the last.  I don't want to de-optimize the more common
> (and sensible) cases too much, but "slow" still trumps "fails
> outright".

Yeah -- maybe baby steps in the right direction would be to track cache
memory usage and add instrumentation so the user could get a readout
on usage -- this would also help us diagnose memory issues in the
field.  Also, thinking about it more, a DISCARD-based cache flush
(DISCARD CACHES TO xyz) wrapping a monolithic LRU sweep could help
users deal with these cases without having to figure out how to make
an implementation that pleases everyone.

merlin