Thread: Why is memory not used? Why is vacuum so slow?

Why is memory not used? Why is vacuum so slow?

From: Hervé Piedvache
Hi,

I have tried to tune a database that I use only for statistical access ... I
mean that I import a dump of my production database each night, but preserve
some aggregate tables and statistics ones (which I calculate after importing
the dump). This database is only used by a few people, but they run some big
queries; the tables range in size from 200,000 rows up to 10,000,000 records.

This server has 2 GB of memory and a 100 GB RAID 5 disk array, runs Debian
woody, and I'm using a self-compiled PostgreSQL v7.3.4.

My postgresql.conf file looks like this:

#
#   Shared Memory Size
#
shared_buffers = 31000      # min max_connections*2 or 16, 8KB each
max_fsm_relations = 1000    # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 10000       # min 1000, fsm is free space map, ~6 bytes
#max_locks_per_transaction = 64 # min 10
wal_buffers = 32            # min 4, typically 8KB each

#
#   Non-shared Memory Sizes
#
sort_mem = 32768        # min 64, size in KB
vacuum_mem = 32768      # min 1024, size in KB

#checkpoint_segments = 3    # in logfile segments, min 1, 16MB each
checkpoint_timeout = 160    # range 30-3600, in seconds
effective_cache_size = 400000   # typically 8KB each
random_page_cost = 1.5          # units are one sequential page fetch cost
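
As a back-of-the-envelope reading of these settings (assuming the default 8 KB
block size the comments refer to):

    shared_buffers        31000 x 8 KB  ~ 242 MB   shared memory, allocated at startup
    wal_buffers              32 x 8 KB  =  256 KB  shared memory
    sort_mem / vacuum_mem    32 MB each            per sort / per vacuum, not shared
    effective_cache_size 400000 x 8 KB  ~ 3.2 GB   planner hint only, nothing allocated
                                                   (note: more than this machine's 2 GB)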

Before, my effective_cache_size was 1000 ... and after reading some tuning
pages with comments like "effective_cache_size: You should adjust this
according to the amount of free memory you have", I grew it to 400000 ...

Then ... first point: I'm only using 5% of my memory (for the whole Linux
system and all software) ... and no swap (a good point) ... why don't I use
more memory?

Second point ... after importing my dump, I run a vacuum full analyze of my
database (at the same time, because of the previous day's calculations for my
aggregate and stats tables, about 200,000 rows are deleted and/or inserted in
each of more than 20 tables) ... but it takes about 5 hours ...
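
As a sketch, the nightly sequence described above is essentially (the dump
filename is made up for illustration):

    psql frstats < production_dump.sql    # reload the nightly dump
    vacuumdb -f -z frstats                # i.e. VACUUM FULL ANALYZE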

Here is an example of a (for me) really slow vacuum ... more than 85 minutes
for a table with only 9,105,740 records ...

INFO:  --Relation public.hebcnt--
INFO:  Pages 175115: Changed 0, reaped 3309, Empty 0, New 0; Tup 9105740: Vac
175330, Keep/VTL 0/0, UnUsed 0, MinLen 148, MaxLen 148; Re-using: Free/Avail.
Space 46265980/26336600; EndEmpty/Avail. Pages 0/3310.
        CPU 6.75s/1.67u sec elapsed 91.41 sec.
INFO:  Index ix_hebcnt_idc: Pages 40446; Tuples 9105740: Deleted 175330.
        CPU 2.94s/6.17u sec elapsed 222.34 sec.
INFO:  Index ix_hebcnt_cweek: Pages 229977; Tuples 9105740: Deleted 175330.
        CPU 9.64s/3.14u sec elapsed 1136.01 sec.
INFO:  Index ix_hebcnt_cpte: Pages 72939; Tuples 9105740: Deleted 175330.
        CPU 4.86s/9.13u sec elapsed 398.73 sec.
INFO:  Index ix_hebcnt_idcweek: Pages 66014; Tuples 9105740: Deleted 175330.
        CPU 3.87s/8.61u sec elapsed 163.26 sec.
INFO:  Rel hebcnt: Pages: 175115 --> 171807; Tuple(s) moved: 175330.
        CPU 16.49s/52.04u sec elapsed 1406.34 sec.
INFO:  Index ix_hebcnt_idc: Pages 40446; Tuples 9105740: Deleted 175330.
        CPU 1.76s/5.65u sec elapsed 124.98 sec.
INFO:  Index ix_hebcnt_cweek: Pages 230690; Tuples 9105740: Deleted 175330.
        CPU 10.07s/2.60u sec elapsed 1095.17 sec.
INFO:  Index ix_hebcnt_cpte: Pages 72940; Tuples 9105740: Deleted 175330.
        CPU 4.51s/8.90u sec elapsed 353.45 sec.
INFO:  Index ix_hebcnt_idcweek: Pages 66015; Tuples 9105740: Deleted 175330.
        CPU 3.96s/8.58u sec elapsed 147.64 sec.
INFO:  --Relation pg_toast.pg_toast_76059978--
INFO:  Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL
0/0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/
Avail. Pages 0/0.
        CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  Index pg_toast_76059978_index: Pages 1; Tuples 0.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Analyzing public.hebcnt

Structure of this table:
frstats=# \d hebcnt
                          Table "public.hebcnt"
      Column      |            Type             |       Modifiers
------------------+-----------------------------+------------------------
 id_c             | integer                     | not null
 contrat          | text                        | not null
 arrete_week      | text                        | not null
 cpte             | text                        | not null
 is_active        | boolean                     | not null
 year             | text                        | not null
 use              | integer                     | not null
 use_priv         | integer                     | not null
 use_ind          | integer                     | not null
 passback         | integer                     | not null
 resa             | integer                     | not null
 noshow           | integer                     | not null
 nbc              | integer                     | not null
 dureecnt         | integer                     | not null
 dureecpt         | integer                     | not null
 anciennete2      | integer                     | not null
 c_week           | text                        | not null
 blacklist        | integer                     | not null
 dcrea            | timestamp without time zone | not null default now()
 dmaj             | timestamp without time zone |
Indexes: ix_hebcnt_cweek btree (c_week),
         ix_hebcnt_cpte btree (cpte),
         ix_hebcnt_idc btree (id_c),
         ix_hebcnt_idcweek btree (id_c, c_week)
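
For reference, the page and tuple counts reported by vacuum can be
cross-checked against the catalog with a query along these lines:

    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname = 'hebcnt' OR relname LIKE 'ix_hebcnt%';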

Any idea?

Regards,
--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902


Tuning Techniques To Avoid?

From: Christopher Browne
Here's a scheme for query optimization that probably needs to be
avoided in that it would run afoul of a patent held by Oracle...


<http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=5761654.WKU.&OS=PN/5761654&RS=PN/5761654>

It looks like what they have patented is pretty much a "greedy search"
heuristic: start by finding the table in a join that has the greatest
selectivity (i.e., where the number of entries selected is cut the most by the
selection criteria), and then search for the "best" approach to joining in the
other tables.
--
output = reverse("gro.mca" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/nonrdbms.html
"If I  could find  a way to  get [Saddam  Hussein] out of  there, even
putting a  contract out on him,  if the CIA  still did that sort  of a
thing, assuming it ever did, I would be for it."  -- Richard M. Nixon

Re: Why is memory not used? Why is vacuum so slow?

From: Tom Lane
Hervé Piedvache <herve@elma.fr> writes:
> Second point ... after importing my dump, I run a vacuum full analyze of my
> database (at the same time, because of the previous day's calculations for my
> aggregate and stats tables, about 200,000 rows are deleted and/or inserted in
> each of more than 20 tables) ... but it takes about 5 hours ...

Don't do vacuum full.  You should not need it in ordinary circumstances,
if you are doing plain vacuums on a reasonable schedule and you have the
FSM parameters set high enough.  (You do not BTW ... with 175000 pages in
this table alone, 10000 FSM pages for the whole database is surely too
low.)
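
Concretely, that might mean something like this (the numbers are illustrative,
sized from this thread rather than anything definitive):

    # postgresql.conf
    max_fsm_pages = 500000    # must cover all pages with free space, cluster-wide

plus a plain, non-FULL vacuum after each nightly load:

    VACUUM ANALYZE;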

            regards, tom lane

Re: Why is memory not used? Why is vacuum so slow?

From: Hervé Piedvache
Hi Tom,

On Friday 2 January 2004 15:42, Tom Lane wrote:
> Hervé Piedvache <herve@elma.fr> writes:
> > Second point ... after importing my dump, I run a vacuum full analyze
> > of my database (at the same time, because of the previous day's
> > calculations for my aggregate and stats tables, about 200,000 rows are
> > deleted and/or inserted in each of more than 20 tables) ... but it
> > takes about 5 hours ...
>
> Don't do vacuum full.  You should not need it in ordinary circumstances,
> if you are doing plain vacuums on a reasonable schedule and you have the
> FSM parameters set high enough.  (You do not BTW ... with 175000 pages in
> this table alone, 10000 FSM pages for the whole database is surely too
> low.)

OK for this ... I have now set max_fsm_pages to 300000 ... but then when I
started the database I got a message about my SHMMAX being too low ... it was
set to:

more /proc/sys/kernel/shmmax
262111232

Then I put 300000000 ... and PostgreSQL agreed to start ... What is the
maximum useful value for this, for the configuration as a whole?
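
For the arithmetic: shared_buffers alone asks for about 31000 x 8192 ~ 254
million bytes, right up against the old 262111232-byte limit, so the buffer
headers, the enlarged FSM, and the WAL buffers on top no longer fit. One way
to raise it (the value below is only an example) is:

    echo 500000000 > /proc/sys/kernel/shmmax    # one-off, lost at reboot
    # or persistently, in /etc/sysctl.conf:
    kernel.shmmax = 500000000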

For example, during the vacuum full this is my memory usage ...
             total       used       free     shared    buffers     cached
Mem:       2069608    2059052      10556          0       8648    1950672
-/+ buffers/cache:      99732    1969876
Swap:      2097136      16080    2081056

So it seems I'm really only using 5% of my memory, no? Or did I miss something?
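
For reference, the -/+ buffers/cache line breaks down as:

    used by processes:     99732 KB   (~5% of the 2069608 KB total)
    kernel page cache:   1950672 KB   (~94%, caching table and index files)

So the rest of the RAM is not idle: it is the filesystem cache, which is
exactly what effective_cache_size is supposed to describe to the planner.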

It's difficult to test again right now ... I will have to wait until tomorrow
morning to see the result ... because I have already vacuumed the database
today ...

But I have run another full vacuum to see if there is a quick visible
difference ... and I also saw that the full vacuum of pg_attribute seems very
slow ... more than 1 minute for 7256 tuples? Is this normal?

INFO:  --Relation pg_catalog.pg_attribute--
INFO:  Pages 119: Changed 0, reaped 1, Empty 0, New 0; Tup 7256: Vac 0, Keep/
VTL 0/0, UnUsed 3, MinLen 128, MaxLen 128; Re-using: Free/Avail. Space
14664/504; EndEmpty/Avail. Pages 0/1.
        CPU 0.00s/0.00u sec elapsed 0.08 sec.
INFO:  Index pg_attribute_relid_attnam_index: Pages 21082; Tuples 7256:
Deleted 0.
        CPU 0.83s/0.13u sec elapsed 59.32 sec.
INFO:  Index pg_attribute_relid_attnum_index: Pages 5147; Tuples 7256: Deleted
0.
        CPU 0.26s/0.03u sec elapsed 8.79 sec.
INFO:  Rel pg_attribute: Pages: 119 --> 119; Tuple(s) moved: 0.
        CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Analyzing pg_catalog.pg_attribute

Thanks for your help ...

Regards,
--
Hervé Piedvache

Elma Ingénierie Informatique
6 rue du Faubourg Saint-Honoré
F-75008 - Paris - France
Pho. 33-144949901
Fax. 33-144949902


Re: Why is memory not used? Why is vacuum so slow?

From: Tom Lane
Hervé Piedvache <herve@elma.fr> writes:
> and I also saw that the full vacuum of pg_attribute seems very slow ...
> more than 1 minute for 7256 tuples? Is this normal?

> INFO:  --Relation pg_catalog.pg_attribute--
> INFO:  Pages 119: Changed 0, reaped 1, Empty 0, New 0; Tup 7256: Vac 0, Keep/
> VTL 0/0, UnUsed 3, MinLen 128, MaxLen 128; Re-using: Free/Avail. Space
> 14664/504; EndEmpty/Avail. Pages 0/1.
>         CPU 0.00s/0.00u sec elapsed 0.08 sec.
> INFO:  Index pg_attribute_relid_attnam_index: Pages 21082; Tuples 7256:
> Deleted 0.
>         CPU 0.83s/0.13u sec elapsed 59.32 sec.
> INFO:  Index pg_attribute_relid_attnum_index: Pages 5147; Tuples 7256: Deleted
> 0.
>         CPU 0.26s/0.03u sec elapsed 8.79 sec.

You're suffering from index bloat (21000 pages in an index for a
119-page table!?).  Updating to 7.4 would probably fix this, but
if that's not practical consider reindexing pg_attribute.
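
The reindex route would look something like this (note that on releases as old
as 7.3, reindexing a system catalog may have to be done from a standalone
backend; check the REINDEX reference page before trying it):

    -- rebuild the bloated indexes; this locks the table while it runs
    REINDEX TABLE pg_attribute;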

            regards, tom lane

Re: Tuning Techniques To Avoid?

From: Greg Stark
Christopher Browne <cbbrowne@acm.org> writes:

> Here's a scheme for query optimization that probably needs to be
> avoided in that it would run afoul of a patent held by Oracle...

What does this have to do with Hervé Piedvache's post "Why is memory not
used?"?

--
greg