Thread: Running out of memory on vacuum

Running out of memory on vacuum

From
Ioana Danes
Date:
Hi all,
 
I have a production database that sometimes runs out of memory at the nightly vacuum.

The application typically runs with around 40 postgres connections, but there are times when the connections increase because of some queries going on. The reason is that the operations are slow, the terminals time out and try to reconnect using new connections. Some time ago I started to have problems with too many connections being open, so I lowered the limit to 300 connections. It was all good until recently, when even with 180 connections I am running out of memory on vacuum... So the connections increase to 180 and the system still runs properly for another 2 days, but then the nightly vacuum runs out of memory.
The fix is to restart postgres... If I only close the connections the problem is still there, so I need to restart postgres.
If I don't restart postgres, then the system will run out of memory on queries at some point...
Another important thing is that during the vacuum at 1am nothing else is going on on that server, so all the connections are idle.
 
2013-05-14 06:53:51.449 CST,"postgres","abrazo",8=41,"[local]",519233dc.2225,3,"VACUUM",2013-05-14 06:53:48 CST,174/67143,0,ERROR,53200,"out of memory","Failed on request of size 668141820.",,,,,"vacuum;",,,"psql"
 
OS:

SUSE Linux Enterprise Server 11 (x86_64) VERSION = 11
PATCHLEVEL = 2

SUSE-compiled postgres version:

                                                          version
--------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.0.3 on x86_64-suse-linux-gnu, compiled by GCC gcc (SUSE Linux) 4.3.4 [gcc-4_3-branch revision 152973], 64-bit

Postgres configuration parameters:

max_connections = 300
shared_buffers = 2GB
maintenance_work_mem = 2GB
effective_cache_size = 4GB
work_mem = 2MB
wal_buffers = 1MB

checkpoint_segments = 16
autovacuum = on

wal_level = hot_standby
archive_mode = on
archive_command = 'test -f /cbnDBscripts/tmp/PITR_primarydb_stop_backup || rsync --timeout=30 -atz %p spsnidb1:/data01/wal_files/%f'
archive_timeout = 60
 
 
free

             total       used       free     shared    buffers     cached
Mem:       7880512    7825136      55376          0      72376    4537644
-/+ buffers/cache:    3215116    4665396
Swap:      2097148          0    2097148
 
/etc/sysctl.conf

kernel.shmmax=2684354560
kernel.shmall=2684354560
vm.swappiness=0
vm.overcommit_memory=2
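With vm.overcommit_memory=2 the kernel refuses allocations once total commit reaches swap plus vm.overcommit_ratio percent of RAM (50 by default), which on this box is roughly 2GB + 3.9GB, so a single 668MB vacuum request can fail even while free shows plenty of cache. A minimal sketch of the check, assuming the default ratio:

# CommitLimit = swap + RAM * vm.overcommit_ratio/100
#             = 2097148 + 7880512 * 50/100  (kB)  ~= 5.8 GB
sysctl vm.overcommit_ratio
grep -i commit /proc/meminfo    # CommitLimit vs. Committed_AS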
 
postgres log:
TopMemoryContext: 179088 total in 18 blocks; 8608 free (13 chunks); 170480 used
  TopTransactionContext: 24576 total in 2 blocks; 21888 free (43 chunks); 2688 used
  Attopt cache: 57344 total in 3 blocks; 25824 free (5 chunks); 31520 used
  Type information cache: 24576 total in 2 blocks; 11888 free (5 chunks); 12688 used
  MessageContext: 8192 total in 1 blocks; 4952 free (1 chunks); 3240 used
  Operator class cache: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  smgr relation table: 57344 total in 3 blocks; 21984 free (10 chunks); 35360 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
  Portal hash: 8192 total in 1 blocks; 1680 free (0 chunks); 6512 used
  PortalMemory: 8192 total in 1 blocks; 7888 free (0 chunks); 304 used
    PortalHeapMemory: 2048 total in 1 blocks; 2016 free (3 chunks); 32 used
      Vacuum: 24576 total in 2 blocks; 15392 free (0 chunks); 9184 used
  Relcache by OID: 24576 total in 2 blocks; 6592 free (3 chunks); 17984 used
  CacheMemoryContext: 2390704 total in 22 blocks; 641048 free (4 chunks); 1749656 used
    pk_cbntransaction: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    rel_transactionretailerid_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    rel_transaction_promotion_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    reference_64_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    reference_49_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    reference_46_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    reference_28_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_transaction_date: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    ix_referenceid_transac: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_onlinetick_cbntrans: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_drawid_transaction: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_cbntransaction_terminalid_transactiondate: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    ix_cbntransaction_purchaseid: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_cardactivity: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_business_type: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_207856_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_bank: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_audit_1: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    pg_toast_207843_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_itshipment: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    pk_itpack: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    pg_toast_25392890_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_drawpatterns: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    pg_toast_50580_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    rel_draw_gameid_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    reference_29_fk: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_draw: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_62768_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_39412_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    ix_a_gamewinner_2006_7: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2006_6: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2006_5: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2006_4: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2006_2: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2006_1: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_gamewinner_2006: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_39404_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    ix_a_gamewinner_2005_7: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2005_6: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2005_5: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2005_4: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2005_2: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_a_gamewinner_2005_1: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_gamewinner_2005: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_39428_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_cbntransaction_2008: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_cashdrawerbalance_1: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_cashdrawerbalance: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_62658_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_cashdrawer: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_account: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    ix_cashdrawertransaction_cashdrawerid_transdate: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_cashdrawertransaction: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_62672_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    ix_cashdrawerbalanceitem_1: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_62666_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_22502278_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_26623904_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_26623898_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_dbversion: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    pg_toast_39501_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_matchgameplay_2009: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_39436_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_matchgameplay_2008: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_39396_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_matchgameplay_2007: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_39388_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_matchgameplay_2006: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_39380_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_matchgameplay_2005: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_accountforms: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_validzreporttypes: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_63590_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    ix_validtransactiontype_userdesc: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_validtransactiontype: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_validtransactionstatus: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pk_address: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_62598_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_62590_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_activitycontext: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_25232699_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_25232693_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_26623892_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_26623886_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_cast_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_507675_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_cbntransaction_2013: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_21922437_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_dbcompare_tables: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_507667_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pk_a_cbntransaction_2012: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_499794_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_inherits_relid_seqno_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_inherits_parent_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_11474_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_11469_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_11464_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_2604_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_attrdef_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_attrdef_adrelid_adnum_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
    pg_toast_11459_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_11454_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_largeobject_loid_pn_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_default_acl_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_2396_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_shdescription_o_c_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_shdepend_reference_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_shdepend_depender_index: 2048 total in 1 blocks; 440 free (0 chunks); 1608 used
    pg_pltemplate_name_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_tablespace_spcname_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_2964_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_depend_reference_index: 2048 total in 1 blocks; 456 free (0 chunks); 1592 used
    pg_depend_depender_index: 2048 total in 1 blocks; 456 free (0 chunks); 1592 used
    pg_toast_2609_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_description_o_c_o_index: 2048 total in 1 blocks; 504 free (0 chunks); 1544 used
    pg_toast_2620_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_trigger_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_trigger_tgconstraint_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_2618_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_rewrite_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_largeobject_metadata_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_amproc_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_amop_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_am_oid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_am_name_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_toast_1262_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_2606_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_constraint_contypid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_constraint_conrelid_index: 2048 total in 1 blocks; 752 free (0 chunks); 1296 used
    pg_constraint_conname_nsp_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_11484_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_11479_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_1255_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_toast_2619_index: 2048 total in 1 blocks; 608 free (0 chunks); 1440 used
    pg_index_indrelid_index: 2048 total in 1 blocks; 704 free (0 chunks); 1344 used
    pg_db_role_setting_databaseid_rol_index: 2048 total in 1 blocks; 656 free (0 chunks); 1392 used
    pg_opclass_am_name_nsp_index: 3072 total in 2 blocks; 1448 free (4 chunks); 1624 used
    pg_foreign_data_wrapper_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_enum_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_class_relname_nsp_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_foreign_server_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_statistic_relid_att_inh_index: 3072 total in 2 blocks; 1448 free (4 chunks); 1624 used
    pg_cast_source_target_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_language_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_amop_fam_strat_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
    pg_index_indexrelid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_ts_template_tmplname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_config_map_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_opclass_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_foreign_data_wrapper_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_ts_dict_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_conversion_default_index: 3072 total in 2 blocks; 1432 free (3 chunks); 1640 used
    pg_operator_oprname_l_r_n_index: 3072 total in 2 blocks; 1432 free (3 chunks); 1640 used
    pg_trigger_tgrelid_tgname_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_enum_typid_label_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_config_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_user_mapping_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_opfamily_am_name_nsp_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_type_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_aggregate_fnoid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_constraint_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_rewrite_rel_rulename_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_parser_prsname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_config_cfgname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_ts_parser_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_operator_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_namespace_nspname_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_ts_template_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_amop_opr_fam_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_default_acl_role_nsp_obj_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_ts_dict_dictname_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_type_typname_nsp_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_opfamily_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_class_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_proc_proname_args_nsp_index: 3072 total in 2 blocks; 1496 free (4 chunks); 1576 used
    pg_attribute_relid_attnum_index: 3072 total in 2 blocks; 1600 free (2 chunks); 1472 used
    pg_proc_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_language_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_namespace_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_amproc_fam_proc_index: 3072 total in 2 blocks; 1384 free (2 chunks); 1688 used
    pg_foreign_server_name_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_attribute_relid_attnam_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_conversion_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_user_mapping_user_server_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_conversion_name_nsp_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_authid_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_auth_members_member_role_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_tablespace_oid_index: 3072 total in 2 blocks; 1744 free (3 chunks); 1328 used
    pg_database_datname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_auth_members_role_member_index: 3072 total in 2 blocks; 1648 free (2 chunks); 1424 used
    pg_database_oid_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
    pg_authid_rolname_index: 3072 total in 2 blocks; 1696 free (2 chunks); 1376 used
  MdSmgr: 24576 total in 2 blocks; 9504 free (0 chunks); 15072 used
  LOCALLOCK hash: 24576 total in 2 blocks; 15984 free (5 chunks); 8592 used
  Timezones: 83472 total in 2 blocks; 3744 free (0 chunks); 79728 used
  ErrorContext: 8192 total in 1 blocks; 8160 free (4 chunks); 32 used

Any suggestions are greatly appreciated.
 
Thanks,
ioana


Re: Running out of memory on vacuum

From
Igor Neyman
Date:


1. You have autovacuum running.  Is there a specific reason to run a "manual" vacuum nightly?
2. Reduce maintenance_work_mem; you may have a manual vacuum and autovacuum running at the same time, each requesting 2GB (your current setting).
3. Use connection pooling software (e.g. PgBouncer) and reduce max_connections.
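For example, a minimal sketch of points 2 and 3 in postgresql.conf (example values for an 8GB box, not tuned recommendations):

maintenance_work_mem = 256MB    # was 2GB; each (auto)vacuum worker may allocate up to this much
max_connections = 100           # with a pooler in front, far fewer backends are needed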

Igor Neyman




Re: Running out of memory on vacuum

From
Ioana Danes
Date:
Hi Igor,

1. I could remove the nightly vacuum but I think that is not the cause. The vacuum is only catching the problem. If I ignore the vacuum message for a few days the system is gonna run out of memory on queries...
2. There is no autovacuum running at the same time. I tried to run vacuum verbose manually and checked what else was going on on the server.
I also reduced the maintenance work mem to 1 GB but I get the same error.
3. I do use connection pooling. I have 1500 terminals selling and at busy times I might need more than 100 active connections but just occasionally...

Thanks for the quick response,





Re: Running out of memory on vacuum

From
Igor Neyman
Date:

You still didn't explain why you need manual vacuuming.
You have autovacuum set "on", so it'll wake up every so often and do its job based on the other autovacuum config parameters.

What kind of connection pooling are you using?
Is it set for session pooling, or transaction, or statement pooling?

Having more than 100 active connections at a time does not mean that all of them are executing queries at the same time.
Unless you have a lot of processor cores (and I mean a LOT!), it does not make sense to set the pool at 180, even less so at 300.
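For example, the settings involved can be listed with:

SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%' OR name = 'maintenance_work_mem';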

Igor Neyman


Re: Running out of memory on vacuum

From
Ioana Danes
Date:

Hi Igor,

I don't need the manual vacuum. I just don't want to remove it now because it gives me a clue that something is wrong and I need to restart postgres to free up the memory. Otherwise I run out of memory later in the day, and that is something I want to avoid.

Even if "I don't need it", I don't think it is "normal" to run out of memory in these conditions. (I had it off for a while and I did run out of memory on pg_dump.)

We use JDBC connection pooling and it does make sense to have it higher than 100 at some points during the day. Anyhow, at the time of the vacuum there is nothing else going on on the database. Sales are off.

Thanks,





Re: Running out of memory on vacuum

From
Igor Neyman
Date:


Ioana,

You are probably correct that manual vacuum just highlights the problem, causing it to occur sooner rather than later.

The real problem is the number of connections; your hardware isn't good enough to support over 100 connections.
A connection pooler is needed, but not JDBC connection pooling, which probably does nothing for you.

I'd suggest that a server-side connection pooler would fix your problem, if set up properly.
Take a look at PgBouncer.  It is "lightweight", very easy to install and configure.  I've been using it for years and have nothing but praise for it.
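As a minimal sketch (example values only, assuming PgBouncer's standard pgbouncer.ini layout and the database name from the log above):

[databases]
abrazo = host=127.0.0.1 port=5432 dbname=abrazo

[pgbouncer]
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction    ; clients queue instead of forcing new backends
default_pool_size = 40     ; ~40 server connections can serve all terminals
max_client_conn = 2000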

Igor Neyman


Re: Running out of memory on vacuum

From
Scott Marlowe
Date:
On Tue, May 14, 2013 at 8:30 AM, Ioana Danes <ioanasoftware@yahoo.ca> wrote:
> Hi Igor,
>
> 1. I could remove the nightly vacuum but I think that is not the cause.
> The vacuum is only catching the problem. If I ignore the vacuum message
> for a few days the system is gonna run out of memory on queries...

You should be able to run vacuum any time really, so yeah no great
need to kill it off right away.

> 2. There is no autovacuum running at the same time. I tried to run
> vacuum verbose manually and checked what else was going on on the server.
> I also reduced the maintenance work mem to 1 GB but I get the same error.

Set it to something MUCH lower, like 256MB or so.
Also set your shared_buffers lower if you can. 2G on an 8G machine (I
think that's what you have) is pretty high, and if you're running out
of memory, it's definitely not helping. By the time you run out of
memory the OS is likely swapping out your shared_buffers (it doesn't
know any better; it's just one more thing to swap out, and if some
part isn't getting accessed a lot it gets swapped out to make room).

How big is your swap? How much is getting used? How much memory can
you put in this machine? My laptop has 12G, my netbook has 8G, my
production servers have 512GB to 1TB of memory. 8GB was a typical
memory size for a medium-sized DB server about 8 or so years ago.
Memory is cheap; downtime and troubleshooting are not.

> 3. I do use connection pooling. I have 1500 terminals selling and at busy
> times I might need more than 100 active connections but just
> occasionally...

You can pool and they should just queue, depending on what method of
pooling you're using. That said, I've seen perfectly reasonable
behaviour on larger machines with ~500 or more connections (when
you've got HUNDREDS of servers needing a dozen persistent connections
each, you just have to deal sometimes, I guess).

So, tl;dr: Get more memory or lower your shared_buffers /
maintenance_work_mem to something like 512MB each and see if that
helps. Also see what ELSE is using all your memory when this happens.
A stack trace is OK, but something like top with M for sorting is
probably more useful.
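For example, assuming a stock procps toolchain, the top memory consumers can be captured non-interactively with:

ps -eo pid,rss,vsz,comm --sort=-rss | head -20    # RSS/VSZ in kB, biggest first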


Re: Running out of memory on vacuum

From
Ioana Danes
Date:

Hi Scott,

I have been running with the same configuration for two years, and this only started to happen 2-3 months ago.

I agree that in some conditions that configuration might not be enough for over 100 connections, and in that case I would expect the system to complain during those queries, but that does not happen.

During the vacuum analyze there is nothing else going on on that server. Top shows only one process running, and that is postgres with vacuum analyze (I also checked pg_stat_activity). The few connections from the application are <IDLE> and the processes are sleeping, so they should not use much memory.
I think it is a memory leak when "too many" connections are open; otherwise I can't explain it.

I will try to lower the parameters as you suggested, but I have a feeling that this is only going to delay the behavior...
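As an illustration of that check (on 9.0, idle sessions report '<IDLE>' in pg_stat_activity.current_query):

SELECT current_query = '<IDLE>' AS idle, count(*)
FROM pg_stat_activity
GROUP BY 1;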

Thanks a lot for your response,
ioana






Re: Running out of memory on vacuum

From
Scott Marlowe
Date:
Well definitely look at getting more memory in it if you can. 8G is
seriously pretty small.


Re: Running out of memory on vacuum

From
Ioana Danes
Date:
I agree and I will do that.
Now let me ask you this: how much memory would you consider decent for a server with 2000 users creating transactions every 4-10 seconds (2 to 20 inserts) at peak times? I know more should be considered when making such a decision, but I would like to know your point of view at first sight...

Thanks






Re: Running out of memory on vacuum

From
John R Pierce
Date:
On 5/14/2013 10:25 AM, Ioana Danes wrote:
> I agree and I will do that.
> Now let me ask you this: how much memory would you consider decent for a
> server with 2000 users creating transactions every 4-10 seconds (2 to 20
> inserts) at peak times? I know more should be considered when making such
> a decision, but I would like to know your point of view at first sight...

how many 100s of CPU cores do you have to execute those 1000+ concurrent
transactions?


--
john r pierce                                      37N 122W
somewhere on the middle of the left coast



Re: Running out of memory on vacuum

From
Scott Marlowe
Date:
On Tue, May 14, 2013 at 11:25 AM, Ioana Danes <ioanasoftware@yahoo.ca> wrote:
> I agree and I will do that.
> Now let me ask you this: how much memory would you consider decent for a
> server with 2000 users creating transactions every 4-10 seconds (2 to 20
> inserts) at peak times? I know more should be considered when making such
> a decision, but I would like to know your point of view at first sight...

2000 users running a transaction every 4 seconds each is 2000/4 tps, or
500 tps.  500 tps is no big deal for most servers with a decent RAID
array and battery-backed controller, or running on a single SSD.
Memory-wise, if you need to have a connection open and just waiting for
the next transaction, you'll need ~6MB free per connection for the basic
backend, plus extra memory for sorts etc. Let's say 10MB. Double that
for a fudge factor. Times 2000. That's 4GB just to hold all that state
in memory. After that you want maintenance work mem and shared buffers;
then add all that up and double it so the OS can do a lot of caching.
So I'd say look at going to at least 16G. Again, I'd fudge-factor that
to 32G just to be sure.

I have built servers that held open ~1000 connections, most idle but
persistent, on 8-core 32G machines with 16 drives on a RAID controller
with a battery-backed cache, and they were plenty fast in that
situation. 32G is pretty darned cheap, assuming your server can hold
that much memory. If it can hold more, great; if it's not too much,
look at 64G or more. How big is your data store? The more of it you can
fit in kernel cache the better. If you're dealing with a 10G database,
great; if it's 500GB, then try to get as much memory as possible, up to
512GB or so, into that machine.

On Tue, May 14, 2013 at 3:32 PM, John R Pierce wrote:

> how many 100s of CPU cores do you have to execute those 1000+ concurrent transactions?

I think you're misreading the OP's post. 2000 clients running a
transaction every 4 seconds == 500 tps. With an SSD my laptop could do
that with 16G RAM probably.


Re: Running out of memory on vacuum

From
Scott Marlowe
Date:
Meant to add: I'd definitely be looking at using pgbouncer if you can,
to pool locally. It makes a huge difference in how the machine behaves
should things go badly (i.e. it starts to slow down and connections
want to pile up).




--
To understand recursion, one must first understand recursion.


Re: Running out of memory on vacuum

From
Ioana Danes
Date:

Hello Scott,

I will look into using pgbouncer at some point. For now I will try to increase the memory. From practice I see that 16GB is not enough unless I lower max_connections to 200. I have another production server with 16 GB and it is stable if the connections open are less than 200; once it crosses that limit it has the same problem at vacuum. One problem, I think, is that the connection pooling we are using (dbcp) is configured with min idle connections set to a huge number, so it keeps all the connections (300) to the db open. I will change that to 25-40.

So I have another question, if you don't mind. How much memory is used by <IDLE> connections? I thought that if the system is quiet then very little should be used for IDLE connections, but apparently I am wrong. Do you have any documentation you can point me to so I can educate myself on this subject?

I had the system running on a test environment for the whole night, and in the morning I stopped all the apache clients so the db is quiet, but I still kept the connections open. There are 297 idle connections.


free shows:

             total       used       free     shared    buffers     cached
Mem:      16793380   13999196    2794184          0     256108    7656180
-/+ buffers/cache:    6086908   10706472
Swap:      4194300          0    4194300


I stopped the application server so all the connections were closed.

free shows:


             total       used       free     shared    buffers     cached
Mem:      16793380    8408604    8384776          0     256372    7558216
-/+ buffers/cache:     594016   16199364
Swap:      4194300          0    4194300

From this I see about 5GB held by the 297 idle connections (6086908 kB used minus 594016 kB after closing them is ~5.4 GB, or roughly 18 MB per idle connection). Is this normal behavior?
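One way to sanity-check that figure, assuming a stock ps, is to look at the per-backend resident sizes directly (bearing in mind that RSS double-counts whatever part of the 2GB of shared buffers each backend has touched, so real private usage is lower):

ps -C postgres -o pid,rss,cmd --sort=-rss | head    # RSS in kB per backend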

I live in Ottawa, so I hope I will see you at PGCon.


Thanks a lot for your reply,
Ioana 

