Thread: Memory usage during vacuum
Version: PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1 (cygming special)

postgresql.conf settings:

tcpip_socket = true
max_connections = 16
shared_buffers = 2048             # min 16, at least max_connections*2, 8KB each
sort_mem = 2048                   # min 64, size in KB
vacuum_mem = 8192                 # min 1024, size in KB
wal_buffers = 16                  # min 4, 8KB each
checkpoint_segments = 9           # in logfile segments, min 1, 16MB each
effective_cache_size = 3000       # typically 8KB each
random_page_cost = 2              # units are one sequential page fetch cost
cpu_index_tuple_cost = 0.0001     # 0.001(same)
default_statistics_target = 300   # range 1-1000
log_timestamp = true
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = false

This is on a workstation so I've purposely limited the amount of memory
that would be used. I would have assumed that some combination of
(shared_buffers*8 + vacuum_mem) plus a little overhead would be in the
neighborhood of the maximum amount of memory used during the vacuum
analyze process. However, I've noticed that when I hit some very large
tables the backend's memory usage will soar as high as 100+ megs.

I'm trying to keep postgresql's memory usage under 40 megs under all
conditions so that other services/applications don't grind to a halt
due to swapping. Is there any way to achieve my goal?

Regards,

Shelby Cain
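(With the settings above, that expectation works out to roughly 2048 shared
buffers × 8 KB = 16 MB, plus vacuum_mem = 8192 KB ≈ 8 MB, so about 24 MB
plus per-backend overhead, comfortably under the 40 MB budget.)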
Shelby Cain <alyandon@yahoo.com> writes:
> I'm trying to keep postgresql's memory usage under 40 megs under all
> conditions so that other services/applications don't grind to a halt
> due to swapping. Is there any way to achieve my goal?

Don't use VACUUM FULL. The vacuum_mem setting only limits the space
consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
free space in the table, and will eat as much memory as it has to to do
that.

        regards, tom lane
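A quick sketch of the distinction Tom describes, with a placeholder table
name (plain VACUUM's free-space tracking is capped by vacuum_mem; VACUUM
FULL's is not):

    -- Plain VACUUM: memory for free-space tracking is bounded by vacuum_mem
    -- (8192 KB in the configuration above)
    VACUUM VERBOSE some_big_table;

    -- VACUUM FULL: compacts the table, but keeps track of all free space
    -- in memory, so its footprint is not bounded by vacuum_mem
    VACUUM FULL VERBOSE some_big_table;

    -- vacuum_mem can also be lowered for just the current session (in KB)
    SET vacuum_mem = 4096;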
How about plugging in more memory? 40 MB seems a bit low for a database
server footprint - well, certainly depends on what you do. But if your
machine starts swapping with an extra 40 MB of memory consumption I'd say
the machine is undersized for the application. I usually have around
500 MB free memory with everything running. Memory is cheap nowadays...

On Thursday 25 March 2004 08:15 am, Tom Lane wrote:
> Shelby Cain <alyandon@yahoo.com> writes:
> > I'm trying to keep postgresql's memory usage under 40 megs under all
> > conditions so that other services/applications don't grind to a halt
> > due to swapping. Is there any way to achieve my goal?
>
> Don't use VACUUM FULL. The vacuum_mem setting only limits the space
> consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
> free space in the table, and will eat as much memory as it has to to do
> that.
>
>         regards, tom lane

--
UC
Open Source Solutions 4U, LLC
I apologize for my original post being unclear. I'm running "vacuum
analyze" and seeing the behavior mentioned. Does specifying the analyze
option imply "vacuum full"?

On a hunch I just ran analyze <really big table> and the backend's memory
usage soared up to 100+ megs. I suspect that means it isn't the vacuum but
the analyze that is eating all my precious ram. :)

Any tips on minimizing the memory footprint during analyze (i.e., backing
off the 300 setting that I'm currently using), or is this just something
I'll have to live with?

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Don't use VACUUM FULL. The vacuum_mem setting only limits the space
> consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
> free space in the table, and will eat as much memory as it has to to do
> that.
>
>         regards, tom lane
I agree in principle that the solution is to run on a server with more
memory instead of my local development box. However, I'm not going to be
able to simply request that additional memory be installed, as these are
"standard" boxes that IT distributes to employees.

Regardless, I'm more curious about whether I was overlooking a setting
that could reduce the memory footprint during a vacuum analyze cycle than
about getting it reduced. If it becomes a major pain I'll simply run the
thing during off hours while I'm not at work. :)

Regards,

Shelby Cain

--- "Uwe C. Schroeder" <uwe@oss4u.com> wrote:
> How about plugging in more memory? 40 MB seems a bit low for a database
> server footprint - well, certainly depends on what you do. But if your
> machine starts swapping with an extra 40 MB of memory consumption I'd
> say the machine is undersized for the application. I usually have
> around 500 MB free memory with everything running. Memory is cheap
> nowadays...
Shelby Cain <alyandon@yahoo.com> writes:
> I apologize for my original post being unclear. I'm running "vacuum
> analyze" and seeing the behavior mentioned. Does specifying the analyze
> option imply "vacuum full"?

No; I just figured you were probably using FULL without saying so.
However ...

> On a hunch I just ran analyze <really big table> and the backend's
> memory usage soared up to 100+ megs.

... the issue is obviously ANALYZE and not VACUUM at all. What
statistics targets are you using? It's hard to believe ANALYZE
would eat that much space unless it's being asked for a really
large target.

        regards, tom lane
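For anyone checking their own setup, the default target can be inspected
and lowered per session like this (the table name is just a placeholder):

    SHOW default_statistics_target;        -- current session default
    SET default_statistics_target = 10;    -- example value; range is 1-1000
    ANALYZE VERBOSE some_big_table;        -- re-sample with the lower target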
On Thursday 25 March 2004 09:12 am, Shelby Cain wrote:
> I agree in principle that the solution is to run on a server with more
> memory instead of my local development box. However, I'm not going to
> be able to simply request that additional memory be installed, as these
> are "standard" boxes that IT distributes to employees.

Too bad that.

> Regardless, I'm more curious about whether I was overlooking a setting
> that could reduce the memory footprint during a vacuum analyze cycle
> than about getting it reduced. If it becomes a major pain I'll simply
> run the thing during off hours while I'm not at work. :)

That's what I would do. Depending on how many transactions
(inserts/deletes) you do a day, it might be perfectly fine to just run a
vacuum full analyze every night via cron.

You can reduce the connection and buffer settings (max_connections as
well as shared_buffers), but that may have a rather significant impact on
performance. So it's probably best to leave those settings alone (when
developing you don't want to wait a minute per query) and rather do the
cleanup at night, or maybe an additional one during lunch (if needed).

--
UC
Open Source Solutions 4U, LLC
Currently my default is 300 (yes - very large I know) but overriding
default_statistics_target with a value of 1 and re-running vacuum analyze
on the same large table results in no change in maximum memory consumption
during the process that I can see.

It should be noted that I see this behavior only with one table. All other
tables, even relatively large ones, only result in a backend that grows to
consume about 30 megabytes.

Here is some sample output:

c1scain=# set default_statistics_target=300;
SET
c1scain=# vacuum verbose analyze inventory_txns;
INFO: vacuuming "public.inventory_txns"
INFO: index "idx_inventory_txns_txndate" now contains 957655 row versions in 5770 pages
DETAIL: 1686 index pages have been deleted, 1686 are currently reusable.
CPU 0.51s/0.21u sec elapsed 5.20 sec.
INFO: index "idx_inventory_txns_locitemnbr" now contains 957655 row versions in 4752 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.53s/0.09u sec elapsed 2.73 sec.
INFO: "inventory_txns": found 0 removable, 957655 nonremovable row versions in 39549 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 15914 unused item pointers.
0 pages are entirely empty.
CPU 2.16s/0.60u sec elapsed 15.76 sec.
INFO: analyzing "public.inventory_txns"
INFO: "inventory_txns": 39549 pages, 150000 rows sampled, 944729 estimated total rows

I notice that postgresql decided to sample 150000 rows. Now, when I try
the following:

c1scain=# set default_statistics_target=1;
SET
c1scain=# vacuum verbose analyze inventory_txns;
INFO: vacuuming "public.inventory_txns"
INFO: index "idx_inventory_txns_txndate" now contains 957655 row versions in 5770 pages
DETAIL: 1686 index pages have been deleted, 1686 are currently reusable.
CPU 0.37s/0.15u sec elapsed 5.29 sec.
INFO: index "idx_inventory_txns_locitemnbr" now contains 957655 row versions in 4752 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable.
CPU 0.54s/0.09u sec elapsed 2.93 sec.
INFO: "inventory_txns": found 0 removable, 957655 nonremovable row versions in 39549 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 15914 unused item pointers.
0 pages are entirely empty.
CPU 2.21s/0.51u sec elapsed 16.37 sec.
INFO: analyzing "public.inventory_txns"
INFO: "inventory_txns": 39549 pages, 150000 rows sampled, 944729 estimated total rows
VACUUM

It still decided to sample 150000 rows. Am I missing something obvious
here? Shouldn't fewer rows be sampled when I set the collection target
to 1?

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... the issue is obviously ANALYZE and not VACUUM at all. What
> statistics targets are you using? It's hard to believe ANALYZE
> would eat that much space unless it's being asked for a really
> large target.
Shelby Cain <alyandon@yahoo.com> writes:
> It still decided to sample 150000 rows. Am I missing something obvious
> here? Shouldn't fewer rows be sampled when I set the collection target
> to 1?

The sample size is 300 rows times the largest per-column analysis
target, where default_statistics_target is used if the recorded
per-column setting is -1. I would say that you have set a target of 500
for at least one of the columns of that table, using ALTER TABLE SET
STATISTICS. Try this to see which:

select attname, attstattarget from pg_attribute
where attrelid = 'table_name_here'::regclass;

        regards, tom lane
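(For the numbers in this thread, that formula lines up exactly: a
per-column target of 500 gives a sample of 300 × 500 = 150,000 rows,
matching the "150000 rows sampled" lines above, while the 300 default on
its own would sample only 300 × 300 = 90,000 rows.)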
I had thought that I had dropped and reloaded this table, but apparently I
hadn't, and I had indeed set the statistics target for one column to 500
while experimenting. Resetting it to -1 and running with a default of 300
now gives roughly a 70 meg memory footprint during the analyze.

Thanks, Tom, for indulging my curiosity on the matter. I've learned
something that I didn't readily pick up from reading the documentation.

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The sample size is 300 rows times the largest per-column analysis
> target, where default_statistics_target is used if the recorded
> per-column setting is -1. I would say that you have set a target of 500
> for at least one of the columns of that table, using ALTER TABLE SET
> STATISTICS. Try this to see which:
>
> select attname, attstattarget from pg_attribute
> where attrelid = 'table_name_here'::regclass;
>
>         regards, tom lane
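The reset Shelby describes would look roughly like this; the column name
below is a stand-in, since the thread never names the column that carried
the 500 (the pg_attribute query above shows which one it is):

    -- placeholder column name: substitute the column that pg_attribute
    -- reports with attstattarget = 500
    ALTER TABLE inventory_txns ALTER COLUMN some_column SET STATISTICS -1;
    ANALYZE VERBOSE inventory_txns;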