Thread: Does RelCache/SysCache shrink except when relations are deleted?
Hello,

Please let me ask you some questions about RelCache/SysCache/CatCache design. I know I should post this to pgsql-general, but I decided to post here because the content includes design questions.

<<Background>>
My customer is facing an "out of memory" problem during a batch job. I'd like to know the cause and possible solutions. The PostgreSQL version is 8.2.7 (32-bit on Linux).

The batch job consists of two steps in a single psql session:

1. call some PL/pgSQL function (say "somefunc" here)
2. VACUUM tables (at this time, maintenance_work_mem=256MB)

Step 2 emitted the following messages in syslog:

ERROR:  out of memory
DETAIL:  Failed on request of size 268435452.
STATEMENT:  VACUUM some_table_name

somefunc copies rows from a single table to 100,000 tables (table_1 - table_100000) as follows:

[somefunc]
FOR id IN 1 .. 100000 LOOP
    check if the table "table_${id}" exists by searching pg_class
    if the table exists
        INSERT INTO table_${id} SELECT * FROM some_table WHERE pk = id;
    else  /* the table does not exist */
        CREATE TABLE table_${id} AS SELECT * FROM some_table WHERE pk = id;
END LOOP;

Before starting somefunc, the virtual memory of the backend postgres process is 1.6GB, as reported in the "VIRT" column of top. When somefunc completes, it becomes 2.6GB. So VACUUM cannot allocate 256MB because the virtual memory space is full.

This is all the information I have now. I have asked the customer to collect the PostgreSQL server log so that memory context statistics can be obtained when the "out of memory" error occurs. I also asked for the output of "SHOW ALL" and a minimal procedure to reproduce the problem. However, I'd like to ask your opinions rather than wait for the problem to happen again.

<<Question>>
I'm guessing that CacheMemoryContext might be using much of the memory, because somefunc accesses as many as 100,000 tables. But I don't understand the RelCache/SysCache implementation yet.

Q1: When are RelCache/SysCache entries removed from CacheMemoryContext? Are they removed only when the corresponding relations are deleted? If so, does that mean "many tables and indexes" is unfriendly to current PostgreSQL?

Q2: somefunc increased virtual memory by 1GB after accessing 100,000 tables, which means each table uses about 10KB of local memory. Is it common for this much memory to be used for RelCache/SysCache or other control information? Does the number of attributes in a table affect local memory usage much?

Q3: I think one solution is to run the VACUUM in a separate psql session. Are there any other solutions you can think of?

Q4: The customer reports one strange thing: if the 100,000 tables already exist before somefunc starts (i.e., somefunc only copies records), the virtual memory of the postgres process does not increase. Can you think of an explanation for this?

Regards
MauMau

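P.S. For reference, here is a minimal runnable PL/pgSQL reconstruction of what somefunc is described as doing above. It is only my sketch: the exact existence check against pg_class, the lack of schema qualification, and the function name are assumptions, and the customer's real code certainly differs in detail.

    CREATE OR REPLACE FUNCTION somefunc() RETURNS void AS $$
    BEGIN
        FOR id IN 1 .. 100000 LOOP
            -- check whether table_<id> already exists by searching pg_class
            -- (no schema qualification here, for simplicity)
            IF EXISTS (SELECT 1 FROM pg_class
                       WHERE relname = 'table_' || id AND relkind = 'r') THEN
                EXECUTE 'INSERT INTO table_' || id
                     || ' SELECT * FROM some_table WHERE pk = ' || id;
            ELSE
                EXECUTE 'CREATE TABLE table_' || id
                     || ' AS SELECT * FROM some_table WHERE pk = ' || id;
            END IF;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;

The batch session then simply runs SELECT somefunc(); followed by the VACUUM step.
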
2011/9/28 MauMau <maumau307@gmail.com>:
> My customer is facing an "out of memory" problem during a batch job.
> [...]
> Before starting somefunc, the virtual memory of the backend postgres
> process is 1.6GB, as reported in the "VIRT" column of top. When somefunc
> completes, it becomes 2.6GB. So VACUUM cannot allocate 256MB because the
> virtual memory space is full.

can we see all of your memory settings plus physical memory? the solution is probably going to be reducing shared buffers and/or adding physical memory.

merlin

From: "Merlin Moncure" <mmoncure@gmail.com> can we see all of your memory settings plus physical memory? the solution is probably going to be reducing shared buffers an/or adding physical memory. Thank you for your response. The amount of physical memory is 8GB, which is enough for the workload. I asked the customer for the output of "SHOW ALL", but I haven't received it yet. However, shared_buffers should be less than 1.6GB because, as I wrote in the previous mail, top command showed 1.6GB in "VIRT" column before executing somefunc() PL/pgSQL function. The direct cause of "out of memory" is that the virtual memory became full. 32-bit Linux can allocate 3GB of user space in the virtual address space of each process. somefunc() used 1.0GB, which led to 2.6GB of virtual memory. After somefunc(), VACUUM tried to allocate 256MB of maintenance_work_mem. That allocation failed because the virtual address space was almost full. As you mentioned, decreasing shared_buffers will be one of the solutions. However, we want to know why somefunc() uses so much memory. Therefore, the following is the core question. Q2 and Q3 are supplementary ones. It is just my guess that RelCache/SysCache may be the cause. 2011/9/28 MauMau <maumau307@gmail.com>: > Q1: When are the RelCache/SysCache entries removed from > CacheMemoryContext? > Are they removed only when the corresponding relations are deleted? If so, > "many tables and indexes" is not friendly for the current PostgreSQL? Regards MauMau
Excerpts from MauMau's message of Thu Sep 29 09:23:48 -0300 2011:
> The amount of physical memory is 8GB, which is enough for the workload. I
> asked the customer for the output of "SHOW ALL", but I haven't received it
> yet. However, shared_buffers should be less than 1.6GB because, as I wrote
> in the previous mail, top showed 1.6GB in the "VIRT" column before
> executing the somefunc() PL/pgSQL function.

You don't really know this; some operating systems (Linux in particular) do not show shared memory as in use by a process until it is accessed. It may very well have well over 1.6 GB of shared_buffers, yet not show that in VIRT.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

From: "Alvaro Herrera" <alvherre@commandprompt.com> > You don't really know this; some operating systems (Linux in particular) > does not show shared memory as in use by a process until it is accessed. > It may very well have well over 1.6 GB of shared_buffers, yet not show > that in VIRT. Oh, really? When I started psql just after I set shared_buffers to 2500MB and ran pg_ctl start, "ps -o vsz -p postgres_PID" showed about 2500MB+some. ps's vsz is also the amount of virtual memory. But I want to know the shared_buffers setting. Anyway, I'd appreciate if anyone could tell me about RelCache/SysCache. As far as I read the code, PostgreSQL seems to use memory for RelCache/SysCache without limit until the relations are dropped. Regards MauMau
On Thu, Sep 29, 2011 at 7:23 AM, MauMau <maumau307@gmail.com> wrote:
> The direct cause of the "out of memory" error is that the virtual memory
> became full. On 32-bit Linux, each process has only 3GB of user address
> space. somefunc() used 1.0GB, which brought the backend to 2.6GB of
> virtual memory. When VACUUM then tried to allocate 256MB of
> maintenance_work_mem, the allocation failed because the virtual address
> space was almost full.
>
> As you mentioned, decreasing shared_buffers is one possible solution.
> However, we want to know why somefunc() uses so much memory.

Oh -- I missed earlier that this was a 32 bit o/s. Well, I'd consider drastically reducing shared buffers, down to say the 256-512mb range.

Postgres function plans and various other structures (tables, attributes) are indeed cached and can use up a considerable amount of memory in pathological cases; this depends largely on the number of tables/views, number of functions, and number of connections. I briefly looked at the relcache etc. a little while back on a related complaint, and the takeaway is that the caching is heavy-handed and fairly brute force, but legit and a huge win for most cases. This stuff lives in the cache memory context, and a couple of users (not that many) have bumped into high memory usage. Solutions tend to include:

*) not relying on an implementation that requires 100,000 tables
*) using a connection pooler
*) resetting connections
*) going to a 64-bit o/s
*) reducing shared_buffers for a leaner memory profile (especially on a 32-bit o/s)

Like I said, this doesn't really come up all that often, but the 'real' solution in terms of postgres is probably some kind of upper bound on the amount of cache memory used, plus some intelligence in the cache implementation. This is tricky stuff though, and so far no credible proposals have been made and the demand for the feature is not very high.

merlin

On Thu, Sep 29, 2011 at 9:39 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Like I said, this doesn't really come up all that often, but the 'real'
> solution in terms of postgres is probably some kind of upper bound on
> the amount of cache memory used, plus some intelligence in the cache
> implementation. This is tricky stuff though, and so far no credible
> proposals have been made and the demand for the feature is not very high.

We (i.e. $EMPLOYER) have a customer who ran into this problem (i.e. relcache/syscache memory usage shooting through the roof) in testing, so I'm somewhat motivated to see if we can't come up with a fix. I am fairly sure that was on a 64-bit build, so the issue wasn't just that they didn't have enough address space. It seems that we used to have some kind of LRU algorithm to prevent excessive memory usage, but we ripped it out because it was too expensive (see commit 8b9bc234ad43dfa788bde40ebf12e94f16556b7f). I don't have a brilliant idea at the moment, but I wonder if we could come up with something that's cheap enough to manage that it doesn't materially affect performance in normal cases, but just kicks in when things get really out of control.

A trivial algorithm would be: if you're about to run out of memory, flush all the caches, or evict 10% of the entries at random. Of course, the problem with anything like this is that it's hard to know when you're about to run out of memory before you actually do, and any hard-coded limit you care to set will sometimes be wrong. So maybe that's not the right approach. At the same time, I don't think that simply hoping the user has enough memory is an adequate answer.

One thing to consider is that in some cases a user may plan to do something like touch every table in the database exactly once and then exit. In that case, if we knew in advance what the user's intentions were, we'd want to use an MRU eviction algorithm rather than LRU. Again, we don't know that in advance. But in such a use case it's reasonable for the user to expect that the amount of backend-private memory used for caching will not grow without bound.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

From: "Merlin Moncure" <mmoncure@gmail.com> -------------------------------------------------- Oh -- I missed earlier that this was 32 bit o/s. Well, I'd consider drastically reducing shared buffers, down to say 256-512mb range. Postgres function plans and various other structures, tables, attributes are indeed cached and can use up a considerable amount of memory in pathological cases -- this is largely depending on the number of tables/views, number of functions and number of connections.I briefly looked at the relcache etc a little whileback on a related complaint and the takeaway is that the caching is heavy handed and fairly brute force but legit and a huge win for most cases. This stuff lives in the cache memory context and a couple of users (not that many) have bumped into high memory usage. Solutions tend to include: *) not rely on implementation that requires 100000 tables *) use connection pooler *) reset connections *) go to 64 bit o/s *) reduce shared_buffers for leaner memory profile (especially in 32 bit os) Like I said, this doesn't really come up this often but the 'real' solution in terms of postgrs is probably some kind of upper bound in the amount of cache memory used plus some intelligence in the cache implementation. This is tricky stuff though and so far no credible proposals have been made and the demand for the feature is not very high. -------------------------------------------------- Thank you very much. I'm relieved I could understand the reason. I will report it to the customer and ask him to consider taking the following measures: * reduce shared_buffers * run somefunc() and VACUUM in different psql sessions * process 100,000 tables in multiple psql sessions Regards MauMau
On Thu, Sep 29, 2011 at 8:59 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't have a brilliant idea at the moment, but I wonder if we could
> come up with something that's cheap enough to manage that it doesn't
> materially affect performance in normal cases, but just kicks in when
> things get really out of control.

I think this (cache memory usage) is a reasonable setting for a GUC. Maybe if you keep it very simple, say only activating cache cleanup when the limit is exceeded, you have more freedom to dump cache using fancier methods like a calculated benefit. You'd probably have to expose another knob to guarantee maximum cache sweep runtime though. Perhaps even user-visible cache management features (an extension of DISCARD?) could be exposed...

Hm, what might make this complicated is that you'd probably want all the various caches to live under the same umbrella, with a central authority making decisions about what stays and what goes.

On Thu, Sep 29, 2011 at 9:22 AM, MauMau <maumau307@gmail.com> wrote:
> * reduce shared_buffers
> * run somefunc() and VACUUM in different psql sessions
> * process the 100,000 tables in multiple psql sessions

that's a start. don't be afraid to reset the connection after somefunc() and at appropriate times from the 'processors'.

merlin

"MauMau" <maumau307@gmail.com> writes: > Anyway, I'd appreciate if anyone could tell me about RelCache/SysCache. As > far as I read the code, PostgreSQL seems to use memory for RelCache/SysCache > without limit until the relations are dropped. That's correct. We used to have a limit on the size of catcache (if memory serves, it was something like 5000 entries). We got rid of it after observing that performance fell off a cliff as soon as you had a working set larger than the cache limit. Trust me, if we had a limit, you'd still be here complaining, the complaint would just take a different form ;-) I concur with Merlin's advice to rethink your schema. 100000 tables is far beyond what any sane design could require, and is costing you on many levels (I'm sure the OS and filesystem aren't that happy with it either). regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes:
> ... It seems that we used to have some kind of LRU algorithm to prevent
> excessive memory usage, but we ripped it out because it was too expensive
> (see commit 8b9bc234ad43dfa788bde40ebf12e94f16556b7f).

Not only was it too expensive, but performance fell off a cliff as soon as you had a catalog working set large enough to cause the code to actually do something. I'm not in favor of putting anything like that back in --- people who have huge catalogs will just start complaining about something different, i.e., why did their apps get so much slower.

The short answer here is "if you want a database with 100000 tables, you'd better be running it on more than desktop-sized hardware".

			regards, tom lane

From: "Tom Lane" <tgl@sss.pgh.pa.us> > That's correct. We used to have a limit on the size of catcache > (if memory serves, it was something like 5000 entries). We got rid of > it after observing that performance fell off a cliff as soon as you had > a working set larger than the cache limit. Trust me, if we had a limit, > you'd still be here complaining, the complaint would just take a > different form ;-) Yes, I can imagine. Now I'll believe that caching catalog entries in local memory without bound is one of PostgreSQL's elaborations for performance. 64-bit computing makes that approach legit. Oracle avoids duplicate catalog entries by storing them in a shared memory, but that should necessate some kind of locking when accessing the shared catalog entries. PostgreSQL's approach, which does not require locking, is better for many-core environments. > I concur with Merlin's advice to rethink your schema. 100000 tables is > far beyond what any sane design could require, and is costing you on > many levels (I'm sure the OS and filesystem aren't that happy with it > either). I agree. I'll suggest that to the customer, too. Thank you very much. Regards MauMau
On Thu, Sep 29, 2011 at 10:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Not only was it too expensive, but performance fell off a cliff as soon
> as you had a catalog working set large enough to cause the code to
> actually do something.

Sure, a big working set is going to cause a performance problem if you start flushing cache entries that are being regularly used. But the point is that just because you have, at some time, accessed 100,000 tables during a session does not mean that your working set is that large. The working set is the set of things that you are actually using regularly, not the things you've *ever* accessed.

In addition to the problem of blowing out memory, there are a number of other things about the current code that don't seem well-suited to dealing with large numbers of tables. For example, catcache hash tables can't be resized, so for very large numbers of entries you can potentially have to walk a very long chain. And you can exhaust the shared memory space for the primary lock table, leading to, for example, an inability to back up the database using pg_dump (ouch!).

I can't really explain why people seem to keep wanting to create hundreds of thousands or even millions of tables, but it's not like MauMau's customer is the first one to try to do this, and I'm sure they won't be the last. I don't want to de-optimize the more common (and sensible) cases too much, but "slow" still trumps "fails outright".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

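P.S. For anyone who hits the pg_dump failure in practice: pg_dump takes a lock on every table it dumps, and the shared lock table has room for roughly max_locks_per_transaction * (max_connections + max_prepared_transactions) locks, so with this many tables the setting usually has to be raised. A sketch (1024 is an arbitrary example value, not a recommendation):

    -- check the current value; the default is 64
    SHOW max_locks_per_transaction;
    -- raising it means editing postgresql.conf and restarting the server,
    -- because the setting determines the size of a shared memory structure:
    --   max_locks_per_transaction = 1024
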
On Thu, Sep 29, 2011 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I can't really explain why people seem to keep wanting to create
> hundreds of thousands or even millions of tables, but it's not like
> MauMau's customer is the first one to try to do this, and I'm sure
> they won't be the last. I don't want to de-optimize the more common
> (and sensible) cases too much, but "slow" still trumps "fails outright".

Yeah -- maybe a baby step in the right direction would be to track cache memory usage and add instrumentation so the user could get a readout on usage -- this would also help us diagnose memory issues in the field. Also, thinking about it more, a DISCARD-based cache flush (DISCARD CACHES TO xyz) wrapping a monolithic LRU sweep could help users deal with these cases without having to figure out how to make an implementation that pleases everyone.

merlin

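P.S. To be clear about what exists today: DISCARD ALL resets prepared plans, temporary tables, session GUCs and so on, but it does not shrink the relcache/syscache, which is why the DISCARD syntax above is hypothetical rather than something you can already do:

    -- clears session state (plans, temp tables, GUCs, advisory locks),
    -- but leaves relcache/syscache entries untouched
    DISCARD ALL;
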
Hello,
It has been over a decade since this discussion. I would like to know whether there has been any proposal since then regarding eviction of relcache/syscache entries.
We have recently encountered a similar memory problem: relcache/syscache keeps growing because connections are held open for a long time (an hour or more) and the database has more than 100,000 tables.
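For what it's worth, on recent releases we can at least measure the growth; the view and function below exist in PostgreSQL 14 and later, so this does not apply to older servers:

    -- memory context sizes for the current backend; CacheMemoryContext and
    -- its children are where relcache/syscache entries live
    SELECT name, total_bytes, used_bytes
      FROM pg_backend_memory_contexts
     ORDER BY total_bytes DESC
     LIMIT 10;

    -- for another long-lived backend, the same information can be written to
    -- the server log (12345 is a placeholder pid)
    SELECT pg_log_backend_memory_contexts(12345);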
Regards
Bowen Shi
From: Bowen Shi <zxwsbg12138@gmail.com>
> It has been over a decade since this discussion. I would like to know
> whether there has been any proposal since then regarding eviction of
> relcache/syscache entries.
Ideriha proposed the Global Catalog Cache or something like that (I don't remember the precise name) several years ago, which tried to put relation metadata in shared memory, but it was so difficult that it never got committed.
Regards,
MauMau