Thread: Memory usage during vacuum

Memory usage during vacuum

From: Shelby Cain
Version: PostgreSQL 7.4.1 on i686-pc-cygwin, compiled by GCC gcc (GCC) 3.3.1 (cygming special)

postgresql.conf settings:

tcpip_socket = true
max_connections = 16
shared_buffers = 2048           # min 16, at least max_connections*2, 8KB each
sort_mem = 2048                 # min 64, size in KB
vacuum_mem = 8192               # min 1024, size in KB
wal_buffers = 16                # min 4, 8KB each
checkpoint_segments = 9         # in logfile segments, min 1, 16MB each
effective_cache_size = 3000     # typically 8KB each
random_page_cost = 2            # units are one sequential page fetch cost
cpu_index_tuple_cost = 0.0001   # default 0.001, same units
default_statistics_target = 300 # range 1-1000
log_timestamp = true
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true
stats_reset_on_server_start = false

This is on a workstation so I've purposely limited the
amount of memory that would be used.  I would have
assumed that some combination of (shared_buffers*8 +
vacuum_mem) plus a little overhead would be in the
neighborhood of the maximum amount of memory used
during the vacuum analyze process.  However, I've
noticed that when I hit some very large tables the
backend's memory usage will soar to as high as 100+
megs.  I'm trying to keep postgresql's memory usage
under 40 megs under all conditions so that other
services/applications don't grind to a halt due to
swapping.  Is there any way to achieve my goal?
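
(Working that estimate out with the settings above: 2048 shared buffers * 8 KB
= 16 MB, plus vacuum_mem = 8192 KB = 8 MB, which is roughly 24 MB plus
per-backend overhead.)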

Regards,

Shelby Cain


Re: Memory usage during vacuum

From: Tom Lane
Shelby Cain <alyandon@yahoo.com> writes:
> I'm trying to keep postgresql's memory usage
> under 40 megs under all conditions so that other
> services/applications don't grind to a halt due to
> swapping.  Is there any way to achieve my goal?

Don't use VACUUM FULL.  The vacuum_mem setting only limits the space
consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
free space in the table, and will eat as much memory as it has to to do
that.

            regards, tom lane
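
To illustrate the distinction (a sketch, using the inventory_txns table that
appears later in this thread; vacuum_mem is in KB and can be set per session):

SET vacuum_mem = 8192;                -- caps memory used to track dead tuples

VACUUM ANALYZE inventory_txns;        -- plain VACUUM: bounded by vacuum_mem
VACUUM FULL ANALYZE inventory_txns;   -- must remember all free space in the
                                      -- table, so memory use grows with it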

Re: Memory usage during vacuum

From: "Uwe C. Schroeder"


How about plugging in more memory?
40 MB seems a bit low for a database server footprint - well, it certainly
depends on what you do. But if your machine starts swapping with an extra
40 MB of memory consumption, I'd say the machine is undersized for the
application. I usually have around 500 MB free memory with everything
running. Memory is cheap nowadays...

On Thursday 25 March 2004 08:15 am, Tom Lane wrote:
> Shelby Cain <alyandon@yahoo.com> writes:
> > I'm trying to keep postgresql's memory usage
> > under 40 megs under all conditions so that other
> > services/applications don't grind to a halt due to
> > swapping.  Is there any way to achieve my goal?
>
> Don't use VACUUM FULL.  The vacuum_mem setting only limits the space
> consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
> free space in the table, and will eat as much memory as it has to to do
> that.
>
>             regards, tom lane
>

--
    UC

--
Open Source Solutions 4U, LLC    2570 Fleetwood Drive
Phone:  +1 650 872 2425        San Bruno, CA 94066
Cell:   +1 650 302 2405        United States
Fax:    +1 650 872 2417


Re: Memory usage during vacuum

From: Shelby Cain
I apologize for my original post being unclear.  I'm
running "vacuum analyze" and seeing the behavior
mentioned.  Does specifying the analyze option imply
"vacuum full"?

On a hunch I just ran analyze <really big table> and
the backend's memory usage soared up to 100+ megs.  I
suspect that means it isn't the vacuum but the analyze
that is eating all my precious ram.  :)

Any tips on minimizing the memory footprint during
analyze (i.e. backing off the 300 setting that I'm
currently using), or is this just something I'll have
to live with?

Regards,

Shelby Cain


--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Don't use VACUUM FULL.  The vacuum_mem setting only limits the space
> consumed by plain VACUUM --- VACUUM FULL needs to keep track of all the
> free space in the table, and will eat as much memory as it has to to do
> that.
>
>             regards, tom lane



Re: Memory usage during vacuum

From: Shelby Cain
I agree in principle that the solution is to run on a
server with more memory instead of my local
development box.  However, I'm not going to be able to
simply request that additional memory be installed as
these are "standard" boxes that IT distributes to
employees.

Regardless, I'm more curious about whether I was
overlooking a setting that could reduce the memory
footprint during a vacuum analyze cycle than about
getting it reduced.  If it becomes a major pain I'll
simply run the thing during off hours while I'm not at
work.  :)

Regards,

Shelby Cain


--- "Uwe C. Schroeder" <uwe@oss4u.com> wrote:
>
> How about plugging in more memory?
> 40 MB seems a bit low for a database server footprint - well, it certainly
> depends on what you do. But if your machine starts swapping with an extra
> 40 MB of memory consumption, I'd say the machine is undersized for the
> application. I usually have around 500 MB free memory with everything
> running. Memory is cheap nowadays...



Re: Memory usage during vacuum

From: Tom Lane
Shelby Cain <alyandon@yahoo.com> writes:
> I apologize for my original post being unclear.  I'm
> running "vacuum analyze" and seeing the behavior
> mentioned.  Does specifying the analyze option imply
> "vacuum full"?

No; I just figured you were probably using FULL without saying so.
However ...

> On a hunch I just ran analyze <really big table> and
> the backend's memory usage soared up to 100+ megs.

... the issue is obviously ANALYZE and not VACUUM at all.  What
statistics targets are you using?  It's hard to believe ANALYZE
would eat that much space unless it's being asked for a really
large target.

            regards, tom lane
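
For reference, the setting in question can be checked directly (a minimal
sketch):

SHOW default_statistics_target;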

Re: Memory usage during vacuum

From: "Uwe C. Schroeder"

On Thursday 25 March 2004 09:12 am, Shelby Cain wrote:
> I agree in principle that the solution is to run on a
> server with more memory instead of my local
> development box.  However, I'm not going to be able to
> simply request that additional memory be installed as
> these are "standard" boxes that IT distributes to
> employees.

Too bad that.

>
> Regardless, I'm more curious about whether I was
> overlooking a setting that could reduce the memory
> footprint during a vacuum analyze cycle than about
> getting it reduced.  If it becomes a major pain I'll
> simply run the thing during off hours while I'm not at
> work.  :)

That's what I would do. Depending on how many transactions (inserts/deletes)
you do a day, it might be perfectly fine to just run a vacuum full analyze
every night via cron. You can reduce the per connection settings (max
connections allowed as well as shared buffers), but that may have a rather
significant impact on performance. So it's probably best to leave those
settings alone (when developing you don't want to wait a minute per query)
and rather do the cleanup at night, or maybe an additional one during lunch
(if needed).
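
A sketch of what such a nightly pass might look like, saved to a file and run
from cron via psql -f (the file name is hypothetical; the database name
c1scain comes from the session output later in the thread):

-- nightly_maintenance.sql (hypothetical file name), run against c1scain.
-- Plain VACUUM ANALYZE keeps memory bounded by vacuum_mem; VACUUM FULL
-- ANALYZE reclaims more disk space but, as noted earlier, can use far
-- more memory while it runs.
VACUUM ANALYZE;
-- VACUUM FULL ANALYZE;  -- alternative if reclaiming disk space matters more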

>
> Regards,
>
> Shelby Cain

--
    UC

--
Open Source Solutions 4U, LLC    2570 Fleetwood Drive
Phone:  +1 650 872 2425        San Bruno, CA 94066
Cell:   +1 650 302 2405        United States
Fax:    +1 650 872 2417


Re: Memory usage during vacuum

From
Shelby Cain
Date:
Currently my default is 300 (yes - very large, I know)
but overriding default_statistics_target with a value
of 1 and re-running vacuum analyze on the same large
table results in no change in maximum memory
consumption during the process that I can see.  It
should be noted that I see this behavior only with one
table.  All other tables, even relatively large ones,
only result in a backend that grows to consume about
30 megabytes.

Here is some sample output:

c1scain=# set default_statistics_target=300;
SET
c1scain=# vacuum verbose analyze inventory_txns;
INFO:  vacuuming "public.inventory_txns"
INFO:  index "idx_inventory_txns_txndate" now contains
957655 row versions in 5770 pages
DETAIL:  1686 index pages have been deleted, 1686 are
currently reusable.
CPU 0.51s/0.21u sec elapsed 5.20 sec.
INFO:  index "idx_inventory_txns_locitemnbr" now
contains 957655 row versions in 4752 pages
DETAIL:  0 index pages have been deleted, 0 are
currently reusable.
CPU 0.53s/0.09u sec elapsed 2.73 sec.
INFO:  "inventory_txns": found 0 removable, 957655
nonremovable row versions in 39549 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 15914 unused item pointers.
0 pages are entirely empty.
CPU 2.16s/0.60u sec elapsed 15.76 sec.
INFO:  analyzing "public.inventory_txns"
INFO:  "inventory_txns": 39549 pages, 150000 rows
sampled, 944729 estimated total rows

I notice that postgresql decided to sample 150000
rows.

Now, when I try the following:

c1scain=# set default_statistics_target=1;
SET
c1scain=# vacuum verbose analyze inventory_txns;
INFO:  vacuuming "public.inventory_txns"
INFO:  index "idx_inventory_txns_txndate" now contains
957655 row versions in 5770 pages
DETAIL:  1686 index pages have been deleted, 1686 are
currently reusable.
CPU 0.37s/0.15u sec elapsed 5.29 sec.
INFO:  index "idx_inventory_txns_locitemnbr" now
contains 957655 row versions in 4752 pages
DETAIL:  0 index pages have been deleted, 0 are
currently reusable.
CPU 0.54s/0.09u sec elapsed 2.93 sec.
INFO:  "inventory_txns": found 0 removable, 957655
nonremovable row versions in 39549 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 15914 unused item pointers.
0 pages are entirely empty.
CPU 2.21s/0.51u sec elapsed 16.37 sec.
INFO:  analyzing "public.inventory_txns"
INFO:  "inventory_txns": 39549 pages, 150000 rows
sampled, 944729 estimated total rows
VACUUM

It still decided to sample 150000 rows.  Am I missing
something obvious here?  Shouldn't fewer rows be
sampled when I set the collection target to 1?

Regards,

Shelby Cain



--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> ... the issue is obviously ANALYZE and not VACUUM at all.  What
> statistics targets are you using?  It's hard to believe ANALYZE
> would eat that much space unless it's being asked for a really
> large target.
>



Re: Memory usage during vacuum

From: Tom Lane
Shelby Cain <alyandon@yahoo.com> writes:
> It still decided to sample 150000 rows.  Am I missing
> something obvious here?  Shouldn't fewer rows be
> sampled when I set the collection target to 1?

The sample size is 300 rows times the largest per-column analysis
target, where default_statistics_target is used if the recorded
per-column setting is -1.  I would say that you have set a target of 500
for at least one of the columns of that table, using ALTER TABLE SET
STATISTICS.  Try this to see which:

select attname, attstattarget from pg_attribute
where attrelid = 'table_name_here'::regclass;

            regards, tom lane
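
With one column at a target of 500, the sample size works out to 300 * 500 =
150,000 rows, which matches the output above.  A sketch of finding and
resetting such a column (the table name is from this thread; the column name
is only a guess):

select attname, attstattarget
from pg_attribute
where attrelid = 'inventory_txns'::regclass
  and attnum > 0 and attstattarget <> -1;

-- Reset the offending column so it follows default_statistics_target again
-- (locitemnbr is a guessed column name):
ALTER TABLE inventory_txns ALTER COLUMN locitemnbr SET STATISTICS -1;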

Re: Memory usage during vacuum

From: Shelby Cain
I had thought that I had dropped and reloaded this
table, but apparently I hadn't, and I had set the
statistics target for one column to 500 while
experimenting.  Resetting it to -1 and running with a
default of 300 now gives roughly a 70 meg memory
footprint during the analyze.

Thanks Tom for indulging my curiosity on the matter.
I've learned something that I didn't readily pick up
from reading the documentation.

Regards,

Shelby Cain

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Shelby Cain <alyandon@yahoo.com> writes:
> > It still decided to sample 150000 rows.  Am I missing
> > something obvious here?  Shouldn't fewer rows be
> > sampled when I set the collection target to 1?
>
> The sample size is 300 rows times the largest per-column analysis
> target, where default_statistics_target is used if the recorded
> per-column setting is -1.  I would say that you have set a target of 500
> for at least one of the columns of that table, using ALTER TABLE SET
> STATISTICS.  Try this to see which:
>
> select attname, attstattarget from pg_attribute
> where attrelid = 'table_name_here'::regclass;
>
>             regards, tom lane

