Thread: Postgres Connections Requiring Large Amounts of Memory
PostgreSQL Version: 7.2.3
OS: Red Hat 7.3 with kernel 2.4.18-5 and SGI_XFS

I currently have two processes which create several persistent connections to the database. One process primarily does inserts and the other primarily does selects. Both processes run 24/7.

My problem is that the memory used by the connections appears to grow over time, especially when the amount of data entering the system is increased. The connections sometimes take upwards of 450 MB of memory, causing other applications on the system to swap.

Is there any way to limit the amount of memory used by a given connection, or is there something I may be doing that is requiring the connection to need more memory?

-Dawn
Dawn Hollingsworth <dmh@airdefense.net> writes:
> PostgreSQL Version: 7.2.3

> My problem is that the memory used by the connections appears to grow
> over time, especially when the amount of data entering the system is
> increased.

We have fixed memory-leak problems in the past, and I wouldn't be surprised if some remain, but you'll have to give a lot more detail about what you're doing if you want help. A leak that persists across transaction boundaries is fairly surprising --- I think I can safely say that there are none in the normal code paths. I'm guessing you must be using some off-the-beaten-path feature.

regards, tom lane
We have just recently hired a database consultant familiar with Postgres, and on his cursory glance we are not doing anything really crazy. There are two things which might be considered off the beaten path, though:

1. We have tables that have over 500 columns which we continually insert into and select from.

2. Our stored procedures take more than 16 parameters, so in the file config.h the value INDEX_MAX_KEYS was increased to 100.

-Dawn

On Mon, 2003-06-16 at 20:45, Tom Lane wrote:
> Dawn Hollingsworth <dmh@airdefense.net> writes:
> > PostgreSQL Version: 7.2.3
>
> > My problem is that the memory used by the connections appears to grow
> > over time, especially when the amount of data entering the system is
> > increased.
>
> We have fixed memory-leak problems in the past, and I wouldn't be
> surprised if some remain, but you'll have to give a lot more detail
> about what you're doing if you want help. A leak that persists across
> transaction boundaries is fairly surprising --- I think I can safely
> say that there are none in the normal code paths. I'm guessing you must
> be using some off-the-beaten-path feature.
>
> regards, tom lane
Dawn Hollingsworth <dmh@airdefense.net> writes:
> There are two things which might be considered off the beaten path though:
> 1. We have tables that have over 500 columns which we continually insert
> into and select from.
> 2. Our stored procedures take more than 16 parameters so in the file
> config.h the value INDEX_MAX_KEYS was increased to 100.

Neither of those raises a red flag with me.

What would be useful to try to narrow things down is to look at the output of "MemoryContextStats(TopMemoryContext)" in a backend that's grown to a large size. This is a fairly primitive routine unfortunately; there is no built-in way to invoke it other than by calling it manually with a debugger, and it is only bright enough to write to stderr, not syslog. If you have stderr going somewhere useful (not /dev/null) and you built with debugging symbols, then you could attach to a running backend right now with gdb and get some useful info.

If you don't have debugging symbols then you'll need to either rebuild with 'em, or create some other way to call the function. (There is a bit of stub code marked #ifdef SHOW_MEMORY_STATS in postgres.c that might be worth enabling, but I think it's only a sketch and won't compile as-is, since I don't see a ShowStats variable anywhere.)

regards, tom lane
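Concretely, after finding the bloated backend's PID with ps or top, the procedure amounts to something like the gdb session below. The binary path and PID here are placeholders; the MemoryContextStats report is written to the backend's stderr (its log file), not to the gdb prompt.

$ gdb /usr/local/pgsql/bin/postgres 12345
(gdb) call MemoryContextStats(TopMemoryContext)
(gdb) detach
(gdb) quit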
I installed postgres with debug compiled in and ran the same tests.
I attached gdb to a connection using just over 400 MB (according to top) and ran "MemoryContextStats(TopMemoryContext)".
Here's the output:
TopMemoryContext: 49176 total in 6 blocks; 16272 free (44 chunks); 32904 used
TopTransactionContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
TransactionCommandContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 3072 total in 2 blocks; 864 free (0 chunks); 2208 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 3072 total in 2 blocks; 720 free (0 chunks); 2352 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 261120 total in 8 blocks; 20416 free (0 chunks); 240704 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 261120 total in 8 blocks; 18456 free (0 chunks); 242664 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 31744 total in 5 blocks; 15024 free (0 chunks); 16720 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 523264 total in 9 blocks; 80504 free (0 chunks); 442760 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 523264 total in 9 blocks; 79992 free (0 chunks); 443272 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 7168 total in 3 blocks; 1816 free (0 chunks); 5352 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 261120 total in 8 blocks; 130824 free (3 chunks); 130296 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 261120 total in 8 blocks; 130032 free (0 chunks); 131088 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 130048 total in 7 blocks; 36512 free (0 chunks); 93536 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 130048 total in 7 blocks; 37976 free (0 chunks); 92072 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 130048 total in 7 blocks; 34688 free (0 chunks); 95360 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 3072 total in 2 blocks; 864 free (0 chunks); 2208 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 3072 total in 2 blocks; 536 free (0 chunks); 2536 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 31744 total in 5 blocks; 15024 free (0 chunks); 16720 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 130048 total in 7 blocks; 26120 free (0 chunks); 103928 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 130048 total in 7 blocks; 24232 free (0 chunks); 105816 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 3072 total in 2 blocks; 128 free (0 chunks); 2944 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 7168 total in 3 blocks; 1816 free (0 chunks); 5352 used
SPI Plan: 7168 total in 3 blocks; 1816 free (0 chunks); 5352 used
SPI Plan: 7168 total in 3 blocks; 1816 free (0 chunks); 5352 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 64512 total in 6 blocks; 24688 free (0 chunks); 39824 used
SPI Plan: 64512 total in 6 blocks; 23792 free (0 chunks); 40720 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 64512 total in 6 blocks; 23464 free (0 chunks); 41048 used
SPI Plan: 31744 total in 5 blocks; 10088 free (0 chunks); 21656 used
SPI Plan: 7168 total in 3 blocks; 3904 free (4 chunks); 3264 used
SPI Plan: 31744 total in 5 blocks; 8576 free (0 chunks); 23168 used
SPI Plan: 7168 total in 3 blocks; 4016 free (5 chunks); 3152 used
SPI Plan: 3096 total in 2 blocks; 8 free (0 chunks); 3088 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
SPI Plan: 1024 total in 1 blocks; 184 free (0 chunks); 840 used
SPI Plan: 31744 total in 5 blocks; 6672 free (0 chunks); 25072 used
SPI Plan: 3072 total in 2 blocks; 1840 free (0 chunks); 1232 used
SPI Plan: 1024 total in 1 blocks; 256 free (0 chunks); 768 used
QueryContext: 24576 total in 2 blocks; 15304 free (56 chunks); 9272 used
DeferredTriggerSession: 0 total in 0 blocks; 0 free (0 chunks); 0 used
PortalMemory: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
CacheMemoryContext: 2108952 total in 10 blocks; 1070136 free (3338 chunks); 1038816 used
bss_pkey: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_10_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_9_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_8_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_7_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_6_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_5_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_4_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_3_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_2_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sensor_1_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_bss_id_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_epochint_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_epoch_pkey: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_sensor_10_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_9_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_8_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_7_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_6_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_5_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_4_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_3_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_2_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_sensor_1_idx: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_lastseenint_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_lastseen_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_firstseenint_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_firstseen_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_pkey_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
station_cfg_view_pkey: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
bss_cfg_view_pkey: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
sensor_cfg_view_pk: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
station_epoch_sum_pk_idx: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_index_indrelid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_relcheck_rcrelid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_shadow_usesysid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_trigger_tgrelid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_language_oid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_proc_oid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_aggregate_name_type_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_type_oid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_proc_proname_narg_type_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_amop_opc_strategy_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_operator_oid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_amproc_opc_procnum_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_type_typname_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_class_oid_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_class_relname_index: 1024 total in 1 blocks; 680 free (0 chunks); 344 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 392 free (0 chunks); 632 used
MdSmgr: 8192 total in 1 blocks; 4072 free (1 chunks); 4120 used
DynaHash: 8192 total in 1 blocks; 6944 free (0 chunks); 1248 used
DynaHashTable: 8192 total in 1 blocks; 5080 free (0 chunks); 3112 used
DynaHashTable: 42008 total in 2 blocks; 6112 free (0 chunks); 35896 used
DynaHashTable: 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used
DynaHashTable: 8192 total in 1 blocks; 3000 free (0 chunks); 5192 used
DynaHashTable: 8192 total in 1 blocks; 3000 free (0 chunks); 5192 used
DynaHashTable: 24576 total in 2 blocks; 13224 free (4 chunks); 11352 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
DynaHashTable: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
ErrorContext: 8192 total in 1 blocks; 8176 free (1 chunks); 16 used
Is there any other information I could provide that would be useful? I'm going to try to enable SHOW_MEMORY_STATS next.
-Dawn
On Mon, 2003-06-16 at 21:34, Tom Lane wrote:
> What would be useful to try to narrow things down is to look at the
> output of "MemoryContextStats(TopMemoryContext)" in a backend that's
> grown to a large size. This is a fairly primitive routine
> unfortunately; there is no built-in way to invoke it other than by
> calling it manually with a debugger, and it is only bright enough
> to write to stderr, not syslog. If you have stderr going somewhere
> useful (not /dev/null) and you built with debugging symbols, then you
> could attach to a running backend right now with gdb and get some
> useful info. If you don't have debugging symbols then you'll need to
> either rebuild with 'em, or create some other way to call the function.
> (There is a bit of stub code marked #ifdef SHOW_MEMORY_STATS in
> postgres.c that might be worth enabling, but I think it's only a sketch
> and won't compile as-is, since I don't see a ShowStats variable
> anywhere.)
>
> regards, tom lane
Dawn Hollingsworth <dmh@airdefense.net> writes:
> I attached gdb to a connection using just over 400 MB (according to top)
> and ran "MemoryContextStats(TopMemoryContext)".

Hmm. This only seems to account for about 5 meg of space, which means either that lots of space is being used and released, or that the leak is coming from direct malloc calls rather than palloc. I doubt the latter though; we don't use too many direct malloc calls.

On the former theory, could it be something like updating a large number of tuples in one transaction in a table with foreign keys? The pending-triggers list could have swelled up and then gone away again.

The large number of SPI Plan contexts seems a tad fishy, and even more so the fact that some of them are rather large. They still only account for a couple of meg, so they aren't directly the problem, but perhaps they are related to the problem. I presume these came from either foreign-key triggers or something you've written in PL functions. Can you tell us more about what you use in that line?

regards, tom lane
The database is used to store information for a network management application. Almost all the Primary Keys are MACADDR or (MACADDR, TIMESTAMPTZ), and the Foreign Keys are almost always on one MACADDR column with "ON UPDATE CASCADE ON DELETE CASCADE". It's not very complicated. I have not written any triggers of my own.
The connection I was looking at only does inserts and updates, no deletes. All database access is made through stored procedures using plpgsql. The stored procedures all work like:
table1( id MACADDR, ... Primary Key(id) )
table2( id MACADDR, mytime TIMESTAMPTZ, .... Primary Key(id, mytime), FOREIGN KEY(id) REFERENCES table1 ON UPDATE CASCADE ON DELETE CASCADE)
Update table1
if update row count = 0 then
insert into table1
end if
insert into table2
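A minimal plpgsql sketch of that pattern might look like the following; the function, table, and column names are hypothetical stand-ins, since the real tables have 300 to 500 columns and the procedures take up to 100 parameters:

CREATE FUNCTION store_sample(p_id MACADDR, p_time TIMESTAMPTZ, p_val INT4)
RETURNS INTEGER AS '
DECLARE
    n INTEGER;
BEGIN
    -- Try to update the existing parent row first.
    UPDATE table1 SET val = p_val WHERE id = p_id;
    GET DIAGNOSTICS n = ROW_COUNT;

    -- If no row was updated, the parent row does not exist yet.
    IF n = 0 THEN
        INSERT INTO table1 (id, val) VALUES (p_id, p_val);
    END IF;

    -- Always append the time-series row; its foreign key on id
    -- references table1 ON UPDATE CASCADE ON DELETE CASCADE.
    INSERT INTO table2 (id, mytime, val) VALUES (p_id, p_time, p_val);

    RETURN 1;
END;
' LANGUAGE 'plpgsql';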
I'm not starting any of my own transactions, and I'm not calling stored procedures from within stored procedures. The stored procedures do have large parameter lists, up to 100 parameters. The tables are from 300 to 500 columns, and 90% of the columns are either INT4 or INT8. Some of these tables are inherited. Could that be causing problems?
- Dawn
> Hmm. This only seems to account for about 5 meg of space, which means
> either that lots of space is being used and released, or that the leak
> is coming from direct malloc calls rather than palloc. I doubt the
> latter though; we don't use too many direct malloc calls.
>
> On the former theory, could it be something like updating a large
> number of tuples in one transaction in a table with foreign keys?
> The pending-triggers list could have swelled up and then gone away
> again.
>
> The large number of SPI Plan contexts seems a tad fishy, and even more
> so the fact that some of them are rather large. They still only account
> for a couple of meg, so they aren't directly the problem, but perhaps
> they are related to the problem. I presume these came from either
> foreign-key triggers or something you've written in PL functions. Can
> you tell us more about what you use in that line?
>
> regards, tom lane
Dawn Hollingsworth <dmh@airdefense.net> writes:
> The database is used to store information for a network management
> application. Almost all the Primary Keys are MACADDR or
> (MACADDR, TIMESTAMPTZ), and the Foreign Keys are almost always on one
> MACADDR column with "ON UPDATE CASCADE ON DELETE CASCADE". It's not very
> complicated. I have not written any triggers of my own.

> The connection I was looking at only does inserts and updates, no
> deletes. All database access is made through stored procedures using
> plpgsql. The stored procedures all work like:

> table1( id MACADDR, ... Primary Key(id) )
> table2( id MACADDR, mytime TIMESTAMPTZ, ... Primary Key(id, mytime),
>   FOREIGN KEY(id) REFERENCES table1 ON UPDATE CASCADE ON DELETE CASCADE)

> Update table1
> if update row count = 0 then
>     insert into table1
> end if
> insert into table2

> I'm not starting any of my own transactions, and I'm not calling stored
> procedures from within stored procedures. The stored procedures do have
> large parameter lists, up to 100 parameters. The tables are from 300 to
> 500 columns, and 90% of the columns are either INT4 or INT8. Some of
> these tables are inherited. Could that be causing problems?

The only theory I can come up with is that the deferred trigger list is getting out of hand. Since you have foreign keys in all the tables, each insert or update is going to add a trigger event to the list of stuff to check at commit. The event entries aren't real large, but they could add up if you insert or update a lot of stuff in a single transaction. How many rows do you process per transaction?

regards, tom lane
Each stored procedure only updates one row and inserts one row.
I just connected the user interface to the database. It only does selects on startup. Its connection jumped to a memory usage of 256 MB. It's not getting any larger, but it's not getting any smaller either.
I'm going to compile Postgres with SHOW_MEMORY_STATS enabled; I'm assuming I can just set ShowStats equal to 1. I'll also pare down the application to use only one of the stored procedures, for less noise, and maybe I can track where the memory might be going. In the meantime I'll get a test going with Postgres 7.3 to see if I get the same behavior.
Any other suggestions?
-Dawn
On Tue, 2003-06-17 at 22:03, Tom Lane wrote:
> The only theory I can come up with is that the deferred trigger list is
> getting out of hand. Since you have foreign keys in all the tables,
> each insert or update is going to add a trigger event to the list of
> stuff to check at commit. The event entries aren't real large but they
> could add up if you insert or update a lot of stuff in a single
> transaction. How many rows do you process per transaction?
>
> regards, tom lane
Dawn Hollingsworth <dmh@airdefense.net> writes:
> I just connected the user interface to the database. It only does
> selects on startup. Its connection jumped to a memory usage of 256 MB.
> It's not getting any larger, but it's not getting any smaller either.

Um, are you sure that's actual memory usage? On some platforms "top" seems to count the Postgres shared memory block as part of the address space of each backend. How big is your shared memory block? (ipcs may help here)

regards, tom lane
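On Linux, the check Tom suggests looks roughly like the command below; the segment owned by the database user is the one of interest, and its size should be roughly shared_buffers times 8 kB plus some overhead for other shared structures.

$ ipcs -m    # the "bytes" column shows each shared memory segment's size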
----- Original Message -----
From: "Dawn Hollingsworth"
Sent: Tuesday, June 17, 2003 11:42 AM

> I'm not starting any of my own transactions and I'm not calling stored
> procedures from within stored procedures. The stored procedures do have
> large parameter lists, up to 100. The tables are from 300 to 500

Geez! I don't think it'll help you find the memory leak (if any), but couldn't you normalize the tables into smaller ones? That may be a pain when updating (views and rules), but I think it would be worth it in resources (time and memory, though maybe not disk space). I wonder what the maximum number of updated columns is, and how little correlation there is between their semantics within a single transaction (i.e. one function call), since there are "only" 100 parameters per procedure.

> columns. 90% of the columns are either INT4 or INT8. Some of these
> tables are inherited. Could that be causing problems?

Huh. It's still 30-50 columns (the size of a fairly large table for me) of other types :)

G.