Thread: postgres 8.4.9: running out of memory (malloc fails) when running a transaction that runs a LOT of selects

Yesterday we encountered a program that in a degenerate case issued a huge
number of selects (~400,000) in a single transaction, each select in a
separate call to PGExec.

That transaction would continue to eat memory up until a point where
calls to malloc (in aset.c) would fail and log for example:

,"out of memory","Failed on request of size 11."

Both from that transaction and from random others.  An additional observation
of interest to us was that while both VIRT and RES were growing, VIRT was
always roughly double RES, which seems to indicate that not all of the
memory allocated was actually touched (the server is a CentOS 6 box).

So I have two questions:
  - Is that expected behaviour?  The transaction was in READ COMMITTED mode,
    and my best guess is that this implies that some snapshot is taken before
    each subselect and all of them are only freed once the transaction is
    finished.
  - Any recommendations on the use of overcommit?  We had it disabled on the
    assumption that with overcommit the OOM killer might kill a random
    process, and that it is better to instead have the process that is
    actually allocating fail (and on the additional assumption that any
    memory allocated by postgres would actually be touched).
 

This is not a huge production issue for us as we can fix the
program to no longer issue huge numbers of selects.  But we
would like to understand.

Thank you very much,

Bene

PS:
The machine has huge amounts of memory and during normal operations
it looks like this:

-bash-4.1$ cat /proc/meminfo
MemTotal:       49413544 kB
MemFree:         1547604 kB
Buffers:            5808 kB
Cached:         43777988 kB
SwapCached:            0 kB
Active:         18794732 kB
Inactive:       27309980 kB
Active(anon):   13786796 kB
Inactive(anon):  1411296 kB
Active(file):    5007936 kB
Inactive(file): 25898684 kB
Unevictable:       71928 kB
Mlocked:           55576 kB
SwapTotal:       2047992 kB
SwapFree:        2047992 kB
Dirty:             12100 kB
Writeback:         79684 kB
AnonPages:       2393372 kB
Mapped:         12887392 kB
Shmem:          12871676 kB
Slab:            1050468 kB
SReclaimable:     190068 kB
SUnreclaim:       860400 kB
KernelStack:        4832 kB
PageTables:       450584 kB
NFS_Unstable:      23312 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    26754764 kB
Committed_AS:   17394312 kB
VmallocTotal:   34359738367 kB
VmallocUsed:      120472 kB
VmallocChunk:   34333956900 kB
HardwareCorrupted:     0 kB
AnonHugePages:   1599488 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        5604 kB
DirectMap2M:     2078720 kB
DirectMap1G:    48234496 kB


On 20.07.2012 10:19, Benedikt Grundmann wrote:
> We yesterday encountered a program that in a degenerate case issued in
> a single transaction a huge number of selects (in a single transaction
> but each select in a separate call to PGExec) (huge = ~ 400,000).
>
> That transaction would continue to eat memory up until a point where
> calls to malloc (in aset.c) would fail and log for example:
>
> ,"out of memory","Failed on request of size 11."
>
> ...
>
>     - Is that expected expected behaviour?  The transaction was
>       in READ_COMMITED mode, and my best guess is that this implies
>       that some snapshot is taken before each subselect and all
>       of them are only freed once the transaction is finished

In more complicated scenarios, with e.g. subtransactions, it's normal 
that there's some growth in memory usage like that. But with simple 
SELECTs, I don't think there should be.
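
By subtransactions I mean SAVEPOINTs, or things that use them under the hood
such as plpgsql EXCEPTION blocks. A rough sketch of the kind of pattern where
some growth within one transaction is expected (the table name is made up):

BEGIN;
SAVEPOINT s1;
UPDATE some_table SET n = n + 1;   -- each savepoint starts a subtransaction
RELEASE SAVEPOINT s1;
-- ... many thousands of savepoints inside a single transaction can make
-- the backend's memory use grow until COMMIT ...
COMMIT;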

Can you write a simple self-contained test script that reproduces this? 
That would make it easier to see where the memory is going.

PS, you should upgrade, the latest minor version in the 8.4.x series is 
8.4.12. If possible, upgrading to a more recent major version would be a 
good idea too. I don't know if it will help with this problem, but it 
might..

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com


Hi,

On Friday, July 20, 2012 09:19:31 AM Benedikt Grundmann wrote:
> We yesterday encountered a program that in a degenerate case issued in
> a single transaction a huge number of selects (in a single transaction
> but each select in a separate call to PGExec) (huge = ~ 400,000).
> 
> That transaction would continue to eat memory up until a point where
> calls to malloc (in aset.c) would fail and log for example:
> 
> ,"out of memory","Failed on request of size 11."
Could you show us the different statements you're running in that transaction? 
Are you using any user defined functions, deferred foreign keys, or anything  
like that?

After you've got that "out of memory" message, the log should show a list of 
memory contexts with the amount of memory allocated in each. Could you include 
that in a mail?

Greetings,

Andres
--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


First of all thanks to everyone who has replied so far.

On Fri, Jul 20, 2012 at 10:04 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>
> Hi,
>
> On Friday, July 20, 2012 09:19:31 AM Benedikt Grundmann wrote:
> > We yesterday encountered a program that in a degenerate case
> > issued in a single transaction a huge number of selects (in a
> > single transaction but each select in a separate call to PGExec)
> > (huge = ~ 400,000).

> > That transaction would continue to eat memory up until a point
> > where calls to malloc (in aset.c) would fail and log for example:

> > ,"out of memory","Failed on request of size 11."
> Could you show us the different statements youre running in that transaction?

They all look like this:

DECLARE sqmlcursor51587 CURSOR FOR select
  entry_time, source, bad_fields, isin, sedol, cusip, bloomberg, reuters,
  exchange_code, currency, description, bbg_instrument_type, instrument_type,
  specifics, definer, primary_exchange, is_primary_security, is_primary_listing,
  tags, bloomberg_id, status
from vw_instruments_v7
where jane_symbol = E'FOO BAR'
  and true
  and effective_until = (select max(effective_until) from vw_instruments_v7)

Sorry, I imagine that the fact that this generates a cursor every time
is important, but it had honestly escaped my attention: the library we use
to query the database uses CURSORs for basically every select, so that it
can process the data in batches (in this particular case that is conceptually
unnecessary, as the query will only return one row, but the library does not
know that).
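
For illustration, the traffic inside one such transaction ends up looking
roughly like this (a sketch: the cursor names, the FETCH size, and the second
symbol are invented, and the column list is shortened to *):

BEGIN;
DECLARE sqmlcursor00001 CURSOR FOR
  select * from vw_instruments_v7
  where jane_symbol = E'FOO BAR'
    and effective_until = (select max(effective_until) from vw_instruments_v7);
FETCH 1000 FROM sqmlcursor00001;
-- the next application-level select opens the next cursor
DECLARE sqmlcursor00002 CURSOR FOR
  select * from vw_instruments_v7 where jane_symbol = E'SOME OTHER SYMBOL';
FETCH 1000 FROM sqmlcursor00002;
-- ... and so on, ~400,000 times, before the eventual COMMIT
COMMIT;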

> Are you using any user defined functions, deferred foreign keys, or anything
> like that?

No.  There are several versions of the above view, and while the one
mentioned above contains calls to regexp_replace, I can reproduce the
same behaviour with a different version of the view that just renames
columns of the underlying table.

> After youve got that "out of memory" message, the log should show
> a list of memory contexts with the amount of memory allocated in
> each. Could you include that in a mail?

We are using csv logging, which threw me off for a moment because I couldn't
find it in there.  But indeed in the .log file I see memory contexts, though
I don't know how to correlate them.

I assume they only get written when out of memory happens, so I have included
below the very first one.

TopMemoryContext: 268528136 total in 31 blocks; 37640 free (160 chunks); 268490496 used
  TopTransactionContext: 24576 total in 2 blocks; 14872 free (4 chunks); 9704 used
  Local Buffer Lookup Table: 2088960 total in 8 blocks; 234944 free (25 chunks); 1854016 used
  Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
  Operator lookup cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
  PL/PgSQL function context: 24576 total in 2 blocks; 14384 free (14 chunks); 10192 used
  CFuncHash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  Rendezvous variable hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PLpgSQL function cache: 24520 total in 2 blocks; 3744 free (0 chunks); 20776 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  MessageContext: 131072 total in 5 blocks; 54792 free (5 chunks); 76280 used
  smgr relation table: 24576 total in 2 blocks; 5696 free (4 chunks); 18880 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7888 free (1 chunks); 304 used
    PortalHeapMemory: 1024 total in 1 blocks; 848 free (0 chunks); 176 used
      ExecutorState: 65600 total in 4 blocks; 33792 free (18 chunks); 31808 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  Relcache by OID: 24576 total in 2 blocks; 11792 free (3 chunks); 12784 used
  CacheMemoryContext: 1342128 total in 21 blocks; 290016 free (11 chunks); 1052112 used
    idx_raw_cfd_bear_commodity_position_records_position_date: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    CachedPlan: 3072 total in 2 blocks; 2008 free (2 chunks); 1064 used
    CachedPlanSource: 3072 total in 2 blocks; 1656 free (0 chunks); 1416 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 3072 total in 2 blocks; 1984 free (1 chunks); 1088 used
    CachedPlanSource: 3072 total in 2 blocks; 1696 free (0 chunks); 1376 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 3072 total in 2 blocks; 1584 free (0 chunks); 1488 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 304 free (0 chunks); 720 used
    CachedPlanSource: 3072 total in 2 blocks; 1960 free (3 chunks); 1112 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 1024 total in 1 blocks; 304 free (0 chunks); 720 used
    CachedPlanSource: 3072 total in 2 blocks; 1960 free (3 chunks); 1112 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    CachedPlan: 3072 total in 2 blocks; 80 free (0 chunks); 2992 used
    CachedPlanSource: 3072 total in 2 blocks; 672 free (0 chunks); 2400 used
    SPI Plan: 1024 total in 1 blocks; 808 free (0 chunks); 216 used
    lu_pnl_fr_fill_file_account_map_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_pnl_fr_fill_file_account_map_fill_file_acronym_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    CachedPlan: 1024 total in 1 blocks; 312 free (0 chunks); 712 used
    CachedPlanSource: 3072 total in 2 blocks; 1584 free (0 chunks); 1488 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    CachedPlan: 1024 total in 1 blocks; 280 free (0 chunks); 744 used
    CachedPlanSource: 3072 total in 2 blocks; 1944 free (2 chunks); 1128 used
    SPI Plan: 1024 total in 1 blocks; 832 free (0 chunks); 192 used
    pg_shdepend_depender_index: 2048 total in 1 blocks; 392 free (0 chunks); 1656 used
    pg_shdepend_reference_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
    pg_depend_reference_index: 2048 total in 1 blocks; 456 free (0 chunks); 1592 used
    pg_depend_depender_index: 2048 total in 1 blocks; 456 free (0 chunks); 1592 used
    pg_toast_2619_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
    lu_tax_treatment_types_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_tax_treatment_types_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_tax_treatment_types_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_msa_groups_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_msa_groups_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_msa_groups_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_account_groups_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_account_groups_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_account_groups_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_business_groups_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_business_groups_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_business_groups_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_strategies_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_strategies_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_strategies_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_sub_strategies_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_sub_strategies_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_sub_strategies_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_margin_types_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_margin_types_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_margin_types_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_clearing_firm_legal_entities_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_clearing_firm_legal_entities_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_clearing_firm_legal_entities_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_account_types_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_account_types_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_account_types_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_partners_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_partners_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_partners_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_internal_firms_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_internal_firms_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_internal_firms_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_internal_firms_legacy_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_internal_firms_legacy_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    idx_lu_clearing_firms_id_mnemonic: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    lu_clearing_firms_report_name_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_clearing_firms_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_clearing_firms_mnemonic_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    idx_lu_pnl_fr_accounts_clearing_firm_id_clearing_firm: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    idx_lu_pnl_fr_account_avg_price_account_code_2: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    idx_lu_pnl_fr_account_avg_price_account_code: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    idx_lu_pnl_fr_account_account_code_key: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    idx_lu_pnl_fr_account_account_code: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    lu_pnl_fr_accounts_pkey: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    pg_constraint_conrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    pg_attrdef_adrelid_adnum_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
    pg_toast_2618_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
    vw_pnl_fr_accounts: 171008 total in 23 blocks; 3152 free (0 chunks); 167856 used
    pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1448 free (4 chunks); 1624 used
    pg_foreign_data_wrapper_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_enum_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_class_relname_nsp_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_foreign_server_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_statistic_relid_att_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_cast_source_target_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_language_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_authid_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
    pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_config_map_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_foreign_data_wrapper_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_auth_members_member_role_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_ts_dict_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_conversion_default_index: 3072 total in 2 blocks; 1432 free (3 chunks); 1640 used
    pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
    pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_enum_typid_label_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_config_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_user_mapping_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_type_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_constraint_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_parser_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_namespace_nspname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_ts_template_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_amop_opr_fam_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_auth_members_role_member_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_type_typname_nsp_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_opfamily_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_proc_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_language_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_namespace_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_database_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
    pg_authid_rolname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_foreign_server_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_conversion_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_user_mapping_user_server_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
  MdSmgr: 8192 total in 1 blocks; 5632 free (0 chunks); 2560 used
  LOCALLOCK hash: 24576 total in 2 blocks; 11872 free (5 chunks); 12704 used
  Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (7 chunks); 32 used
 


On Fri, Jul 20, 2012 at 9:10 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 20.07.2012 10:19, Benedikt Grundmann wrote:
>>
>> We yesterday encountered a program that in a degenerate case issued in
>> a single transaction a huge number of selects (in a single transaction
>> but each select in a separate call to PGExec) (huge = ~ 400,000).
>>
>> That transaction would continue to eat memory up until a point where
>> calls to malloc (in aset.c) would fail and log for example:
>>
>> ,"out of memory","Failed on request of size 11."
>>
>> ...
>>
>>
>>     - Is that expected expected behaviour?  The transaction was
>>       in READ_COMMITED mode, and my best guess is that this implies
>>       that some snapshot is taken before each subselect and all
>>       of them are only freed once the transaction is finished
>
>
> In more complicated scenarios, with e.g subtransactions, it's normal that
> there's some growth in memory usage like that. But with simple SELECTs, I
> don't think there should be.
>
> Can you write a simple self-contained test script that reproduces this? That
> would make it easier to see where the memory is going.
>
Now that I have realized that we generate a cursor every time, the cause may
already be obvious; if it isn't, I will give writing such a script a shot.
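
If a self-contained script is still useful after the cursor handling is
fixed, I imagine something mechanically generated along these lines would
show the growth (an untested sketch; table and cursor names are made up):

create table leak_test (id int);
insert into leak_test values (1);

begin;
declare c000001 cursor for select * from leak_test;
fetch all from c000001;
declare c000002 cursor for select * from leak_test;
fetch all from c000002;
-- ... generate this up to a few hundred thousand cursors, with no CLOSE,
-- and watch the backend's VIRT/RES in top before committing ...
commit;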

> PS, you should upgrade, the latest minor version in 8.4.x series is 8.4.12.
> If possible, upgrading to a more recent major version would be a good idea
> too. I don't know if it will help with this problem, but it might..
>
We are in fact automatically doing an upgrade in testing to 9.1 every day
at the moment.  We plan to pull the trigger in production in a few weeks.

Thanks,

Bene


On Fri, Jul 20, 2012 at 10:46 AM, Benedikt Grundmann
<bgrundmann@janestreet.com> wrote:
>
> DECLARE sqmlcursor51587 CURSOR FOR select
>
entry_time,source,bad_fields,isin,sedol,cusip,bloomberg,reuters,exchange_code,currency,description,bbg_instrument_type,instrument_type,specifics,definer,primary_exchange,is_primary_security,is_primary_listing,tags,bloomberg_id,status
> from vw_instruments_v7 where jane_symbol = E'FOO BAR' and true and
> effective_until = (select max(effective_until) from
> vw_instruments_v7)"
>
> Sorry I imagine that the fact that this generates a cursor every time
> is important
> but it had honestly escaped my attention, because the library we use to query
> the database uses CURSORs basically for every select, so that it can process
> the data in batches (in this particular case that is conceptually unnecessary as
> the query will only return one row, but the library does not know that).
>
Actually I believe this must be it.  I just went back and checked the library,
and it does not CLOSE the cursors.  This is normally not a problem, as most of
our transactions run only one or two queries...  I'll patch the library to
CLOSE each cursor once all the data has been delivered and test whether the
error goes away.
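
For reference, the patched behaviour per select would presumably look
something like this (a sketch; the cursor name and FETCH size are placeholders):

DECLARE sqmlcursor51587 CURSOR FOR
  select * from vw_instruments_v7 where jane_symbol = E'FOO BAR';
FETCH 1000 FROM sqmlcursor51587;
-- last batch delivered; closing the cursor releases its portal now
-- instead of only at COMMIT
CLOSE sqmlcursor51587;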

I also noticed just now that all the TopMemoryContext dumps after the first
one look significantly different.  They contain large PortalMemory sections.
Are those related to cursors?
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
  Portal hash: 8380416 total in 10 blocks; 3345088 free (34 chunks); 5035328 used
  PortalMemory: 16769024 total in 11 blocks; 2737280 free (15 chunks); 14031744 used
    PortalHeapMemory: 56320 total in 9 blocks; 4320 free (0 chunks); 52000 used
      ExecutorState: 57344 total in 3 blocks; 15248 free (3 chunks); 42096 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
    PortalHeapMemory: 56320 total in 9 blocks; 4320 free (0 chunks); 52000 used
      ExecutorState: 57344 total in 3 blocks; 15248 free (3 chunks); 42096 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 8192 total in 1 blocks; 8160 free (0 chunks); 32 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
        ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
...


Thanks everyone,

Bene


On Fri, Jul 20, 2012 at 10:56 AM, Benedikt Grundmann
<bgrundmann@janestreet.com> wrote:
> On Fri, Jul 20, 2012 at 10:46 AM, Benedikt Grundmann
> <bgrundmann@janestreet.com> wrote:
>>
> Actually I believe this must be it.  I just went back and checked the library
> and it does not CLOSE the cursors.  This is normally not a problem as most
> transactions we have run one or two queries only...  I'll patch the library
> to CLOSE the cursor when all the data has been delivered and test if the
> error does not happen then.
>
Just to confirm: that indeed fixed it.  Sorry for the noise.

Bene


Hi,

On Friday, July 20, 2012 11:56:09 AM Benedikt Grundmann wrote:
> I also noticed just know that all TopMemoryContext's after the first one
> look significantly different.  They contain large PortalMemory sections.
> Are those related to cursors?
Yes.

Andres
--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services