Thread: PostgreSQL 'eats' all my data partition

PostgreSQL 'eats' all my data partition

From
Javier Carlos
Date:
============================================================================
                        POSTGRESQL BUG REPORT TEMPLATE
============================================================================


Your name        :    Javier Carlos Rivera
Your email address    :    fjcarlos ( at ) correo ( dot ) insp ( dot ) mx


System Configuration
----------------------
  Architecture (example: Intel Pentium)        : Intel Pentium 4

  Operating System (example: Linux 2.0.26 ELF)       : Debian GNU/Linux 3.0 2.4.21

  RAM                          : 256 MB

  PostgreSQL version (example: PostgreSQL-6.3.2)  : PostgreSQL-7.3.4

  Compiler used (example:  gcc 2.7.2)          : 2.95.4



Please enter a FULL description of your problem:
-------------------------------------------------
    On Thursday Bruce Momjian was in Mexico; I saw him and asked him about
this problem. He told me to write to this e-mail address.

    When I do a simple 'UPDATE', PostgreSQL 'eats' all the space on the
partition that holds my data directory. For example:

***** My data directory is in /var
***** BEFORE I do the UPDATEs I got this from df:
OPORTUNIDADES:~# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda2              2885812     14372   2724848   1% /
/dev/hda1             14421344   1195132  12493652   9% /var
/dev/hda3              7692908    888560   6413568  13% /usr
/dev/hda6             12491804     22704  11834536   1% /javier
/dev/hda7              1494204     23936   1394364   2% /home


***** Then I do the UPDATEs:
**** The updates are of this type:
UPDATE tbl_personas SET "PIDFOLIO"=-2 WHERE "PIDFOLIO" IS NULL;
UPDATE tbl_personas SET "P5_1"=-2 WHERE "P5_1" IS NULL;
UPDATE tbl_personas SET "P4PAQ"=-2 WHERE "P4PAQ" IS NULL;
UPDATE tbl_personas SET "P5_4"=-2 WHERE "P5_4" IS NULL;
UPDATE tbl_personas SET "P5_5"=-2 WHERE "P5_5" IS NULL;
UPDATE tbl_personas SET "P36_4"=-2 WHERE "P36_4" IS NULL;
..
UPDATE table_name SET column_name = -2 WHERE column_name IS NULL;
..


   For this example, I ran 182 UPDATE queries. It took PostgreSQL a long
time to do those updates (more than 30 minutes).
***** After that I got this from df:
OPORTUNIDADES:~# df
Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/hda2              2885812     14368   2724852   1% /
/dev/hda1             14421344   7422004   6266780  55% /var
/dev/hda3              7692908    888536   6413592  13% /usr
/dev/hda6             12491804     22704  11834536   1% /javier
/dev/hda7              1494204     23960   1394340   2% /home

******* The /var partition grew from 9% to 55% full, from 1 GB to 5.9 GB!

*** When I do a 'DROP DATABASE' on the database that table belongs to,
my /var partition returns to its original usage (in this example, 9%).

      If I keep running UPDATEs, the database keeps growing until it fills
my whole partition (from 1 GB up to 15 GB!).

    I don't know if this problem is related to the migration from SPSS to
dBase to SQL Server to PostgreSQL that I had to do. I have also used
StatTransfer to migrate directly from SPSS to PostgreSQL, but I obtained the
same bad results.


        Thanks,

        Javier



Please describe a way to repeat the problem.   Please try to provide a
concise reproducible example, if at all possible:
-----------------------------------------------------------------------
    I have gone through the whole process five times (drop the database,
migrate the data, then run the updates), and the result is always the same.



If you know how this problem might be fixed, list the solution below:
----------------------------------------------------------------------


-------------------------------------------------
http://www.insp.mx

Re: PostgreSQL 'eats' all my data partition

From
Andreas Pflug
Date:
Javier Carlos wrote:

>    When I do a simple 'UPDATE', PostgreSQL 'eats' all the space on the
>partition that holds my data directory. For example:
>
>[...]
>
>******* The /var partition grew from 9% to 55% full, from 1 GB to 5.9 GB!
>
>*** When I do a 'DROP DATABASE' on the database that table belongs to,
>my /var partition returns to its original usage (in this example, 9%).
>
>      If I keep running UPDATEs, the database keeps growing until it fills
>my whole partition (from 1 GB up to 15 GB!).
>

This is by design; you need to VACUUM regularly to keep your db healthy,
as pointed out in the documentation.
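
For instance, a nightly cron entry along these lines (the schedule and
options are just an illustration) keeps the dead space from piling up:

    # plain, non-blocking vacuum of all databases every night at 03:00
    0 3 * * *    vacuumdb --all --analyze --quiet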

Regards,
Andreas

Re: PostgreSQL 'eats' all my data partition

From
Stephan Szabo
Date:
On Thu, 25 Sep 2003, Javier Carlos wrote:

> [...]
>
>     When I do a simple 'UPDATE', PostgreSQL 'eats' all the space on the
> partition that holds my data directory. For example:
>
> [...]
>
> ***** Then I do the UPDATEs:
> **** The updates are of this type:
> UPDATE tbl_personas SET "PIDFOLIO"=-2 WHERE "PIDFOLIO" IS NULL;
> UPDATE tbl_personas SET "P5_1"=-2 WHERE "P5_1" IS NULL;
> UPDATE tbl_personas SET "P4PAQ"=-2 WHERE "P4PAQ" IS NULL;
> ..
> UPDATE table_name SET column_name = -2 WHERE column_name IS NULL;
> ..

If you're not vacuuming, you're going to have a potentially large
number of dead rows.  Does a vacuum between updates or a vacuum full at
the end bring the space usage down to something reasonable?
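
In SQL terms that means, for example (using the table name from your report):

    VACUUM tbl_personas;        -- marks dead row versions reusable; non-blocking
    VACUUM FULL tbl_personas;   -- also compacts the table file on disk, but holds
                                -- an exclusive lock while it runs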

Re: PostgreSQL 'eats' all my data partition

From
Stephan Szabo
Date:
On Thu, 25 Sep 2003, Javier Carlos wrote:

> Quoting Stephan Szabo <sszabo@megazone.bigpanda.com>:
> >
> > [...]
> >
> > If you're not vacuuming, you're going to have a potentially large
> > number of dead rows.  Does a vacuum between updates or a vacuum full at
> > the end bring the space usage down to something reasonable?
> >
>
>    I did a vacuumdb after the updates, and the space usage didn't go down
> to anything reasonable. For example, I had a 250 MB database, then I did
> about 300 update queries, and my partition grew until it filled all 15 GB
> of my data partition. After that I did a vacuumdb and the space only went
> down by 100 MB. Then I DROPPED the database, and all 15 GB came back; it's
> very weird, don't you think?

Did you use -f on the vacuumdb?  If not, it did a normal vacuum (which
isn't likely to help) not a full vacuum.
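
From the shell that is, for example (the database name is a placeholder):

    vacuumdb --full mydb                        # VACUUM FULL the whole database
    vacuumdb --full --table tbl_personas mydb   # or just the one bloated table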

Re: PostgreSQL 'eats' all my data partition

From
Javier Carlos
Date:
Quoting Stephan Szabo <sszabo@megazone.bigpanda.com>:

> On Thu, 25 Sep 2003, Javier Carlos wrote:
>
> > [...]
>
> If you're not vacuuming, you're going to have a potentially large
> number of dead rows.  Does a vacuum between updates or a vacuum full at
> the end bring the space usage down to something reasonable?
>

   I did a vacuumdb after the updates, and the space usage didn't go down
to anything reasonable. For example, I had a 250 MB database, then I did
about 300 update queries, and my partition grew until it filled all 15 GB
of my data partition. After that I did a vacuumdb and the space only went
down by 100 MB. Then I DROPPED the database, and all 15 GB came back; it's
very weird, don't you think?

   Cheers,

   Javier

-------------------------------------------------
http://www.insp.mx

Re: PostgreSQL 'eats' all my data partition

From
Javier Carlos
Date:
Quoting Stephan Szabo <sszabo@megazone.bigpanda.com>:

> On Thu, 25 Sep 2003, Javier Carlos wrote:
>
> > [...]
> >
> >    I did a vacuumdb after the updates, and the space usage didn't go down
> > to anything reasonable. [...]
>
> Did you use -f on the vacuumdb?  If not, it did a normal vacuum (which
> isn't likely to help) not a full vacuum.
>

    Many thanks!

    I used the -f option and it worked!!

    I really appreciate your help.

    Best regards,

    Javier

-------------------------------------------------
http://www.insp.mx

Re: PostgreSQL 'eats' all my data partition

From
Christopher Browne
Date:
fjcarlos@correo.insp.mx (Javier Carlos) writes:
> *** When I do a 'DROP DATABASE' on the database that table belongs to,
> my /var partition returns to its original usage (in this example, 9%).

VACUUM ANALYZE is your friend here.

PostgreSQL uses MVCC, wherein each update results in the creation of a
new tuple which coexists with the old one in the database.  If you
update a particular row in the database 10 times, then there will be
11 copies, 10 of them being obsolete.

If you VACUUM the table or the database, the rows that are no longer
accessible will be cleared out, thus diminishing the amount of space
in use.

Things are somewhat more complicated than that; VACUUM looks at the
data page-by-page, and won't _always_ reclaim all free space.  The
more often you VACUUM, the more likely it is that such reclamation will
be possible.

There is a program called pg_autovacuum that will vacuum automatically
when specific tables look like they need it.

ANALYZE does something different but related; it updates the
statistics used by the query optimizer.  When you make major changes
to the table, it may be necessary to ANALYZE it in order for queries
to be efficient.
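
To make that concrete, here is a tiny scratch-table illustration (table and
column names are made up, not taken from Javier's schema):

    CREATE TABLE mvcc_demo (id integer PRIMARY KEY, val integer);
    INSERT INTO mvcc_demo VALUES (1, 0);
    UPDATE mvcc_demo SET val = val + 1;   -- each run leaves one more dead
    UPDATE mvcc_demo SET val = val + 1;   -- (obsolete) row version behind
    VACUUM VERBOSE mvcc_demo;             -- reports and reclaims the dead versions
    ANALYZE mvcc_demo;                    -- refreshes the planner statistics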
--
"cbbrowne","@","libertyrms.info"
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [sszabo@megazone.bigpanda.com]
>
> Did you use -f on the vacuumdb?  If not, it did a normal vacuum (which
> isn't likely to help) not a full vacuum.

There are scenarios where VACUUM FULL is not an option because
of its resource-hungriness and plain VACUUM just doesn't seem
to help.

We have a production database that happens to receive several
thousand row updates per minute.  We VACUUM ANALYZE every four
hours with max_fsm_pages set to 2100000, and it's no use.

The only way to prevent the system from filling up the data
partition seems to be to regularly schedule downtime to dump
and restore the whole db (the dump is ~150 MiB gzipped).
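
For reference, that regimen amounts to roughly the following (the database
name and the exact schedule are only placeholders):

    # postgresql.conf
    max_fsm_pages = 2100000

    # cron: plain VACUUM ANALYZE every four hours
    0 */4 * * *    vacuumdb --analyze --quiet proddb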

--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Stephan Szabo
Date:
On Fri, 26 Sep 2003, Tomas Szepe wrote:

> > [sszabo@megazone.bigpanda.com]
> >
> > Did you use -f on the vacuumdb?  If not, it did a normal vacuum (which
> > isn't likely to help) not a full vacuum.
>
> There are scenarios where VACUUM FULL is not an option because
> of its resource-hungriness and plain VACUUM just doesn't seem
> to help.
>
> We have a production database that happens to receive several
> thousand row updates per minute.  We VACUUM ANALYZE every four
> hours with max_fsm_pages set to 2100000, and it's no use.

Hmm, what does vacuum verbose say?

One other thing is to find where the space is going.  Some of that might
be ending up in indexes which (unfortunately) on 7.3 and earlier aren't
going to get cleaned up by vacuum and will instead need a reindex.
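
For example, on 7.3 the way to shrink a bloated index is to rebuild it
(the names here are placeholders):

    REINDEX TABLE some_table;    -- rebuilds every index on the table
    REINDEX INDEX some_index;    -- or rebuild one particular index

Both take an exclusive lock on the table while they run.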

Re: PostgreSQL 'eats' all my data partition

From
Tom Lane
Date:
Tomas Szepe <szepe@pinerecords.com> writes:
> We have a production database that happens to receive several
> thousand row updates per minute.  We VACUUM ANALYZE every four
> hours with max_fsm_pages set to 2100000, and it's no use.

Have you spent any time determining exactly where the problem is?
I'm suspicious that it's an index-bloat issue.

            regards, tom lane

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [sszabo@megazone.bigpanda.com]
>
> On Fri, 26 Sep 2003, Tomas Szepe wrote:
>
> > > [sszabo@megazone.bigpanda.com]
> > >
> > > Did you use -f on the vacuumdb?  If not, it did a normal vacuum (which
> > > isn't likely to help) not a full vacuum.
> >
> > There are scenarios where VACUUM FULL is not an option because
> > of its resource-hungriness and plain VACUUM just doesn't seem
> > to help.
> >
> > We have a production database that happens to receive several
> > thousand row updates per minute.  We VACUUM ANALYZE every four
> > hours with max_fsm_pages set to 2100000, and it's no use.
>
> Hmm, what does vacuum verbose say?

(postgres is 7.3.4 on x86 Linux)

INFO:  --Relation pg_catalog.pg_description--
INFO:  Pages 12: Changed 0, Empty 0; Tup 1390: Vac 0, Keep 0, UnUsed 1.
    Total CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  --Relation pg_toast.pg_toast_16416--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING:  Skipping "pg_group" --- only table or database owner can VACUUM it
INFO:  --Relation pg_catalog.pg_proc--
INFO:  Pages 58: Changed 0, Empty 0; Tup 1492: Vac 0, Keep 0, UnUsed 165.
    Total CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  --Relation pg_toast.pg_toast_1255--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_rewrite--
INFO:  Pages 4: Changed 0, Empty 0; Tup 27: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_toast.pg_toast_16410--
INFO:  Pages 4: Changed 0, Empty 0; Tup 16: Vac 0, Keep 0, UnUsed 1.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_type--
INFO:  Pages 4: Changed 0, Empty 0; Tup 178: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_attribute--
INFO:  Pages 16: Changed 0, Empty 0; Tup 914: Vac 0, Keep 0, UnUsed 4.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_class--
INFO:  Pages 4: Changed 0, Empty 0; Tup 138: Vac 0, Keep 0, UnUsed 44.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_inherits--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_index--
INFO:  Pages 3: Changed 0, Empty 0; Tup 69: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_operator--
INFO:  Pages 13: Changed 0, Empty 0; Tup 643: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_opclass--
INFO:  Pages 1: Changed 0, Empty 0; Tup 51: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_am--
INFO:  Pages 1: Changed 0, Empty 0; Tup 4: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_amop--
INFO:  Pages 1: Changed 0, Empty 0; Tup 180: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_amproc--
INFO:  Pages 1: Changed 0, Empty 0; Tup 57: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_language--
INFO:  Pages 1: Changed 0, Empty 0; Tup 3: Vac 0, Keep 0, UnUsed 3.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_largeobject--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_aggregate--
INFO:  Pages 1: Changed 0, Empty 0; Tup 60: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_trigger--
INFO:  Pages 1: Changed 0, Empty 0; Tup 2: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_listener--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_cast--
INFO:  Pages 2: Changed 0, Empty 0; Tup 174: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_namespace--
INFO:  Pages 1: Changed 0, Empty 0; Tup 4: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING:  Skipping "pg_shadow" --- only table or database owner can VACUUM it
INFO:  --Relation pg_catalog.pg_conversion--
INFO:  Pages 2: Changed 0, Empty 0; Tup 114: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_depend--
INFO:  Pages 20: Changed 0, Empty 0; Tup 2834: Vac 0, Keep 0, UnUsed 66.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_attrdef--
INFO:  Pages 1: Changed 0, Empty 0; Tup 4: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_toast.pg_toast_16384--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_catalog.pg_constraint--
INFO:  Pages 1: Changed 0, Empty 0; Tup 5: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation pg_toast.pg_toast_16386--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
WARNING:  Skipping "pg_database" --- only table or database owner can VACUUM it
INFO:  --Relation pg_catalog.pg_statistic--
INFO:  Index pg_statistic_relid_att_index: Pages 4; Tuples 189: Deleted 187.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Removed 187 tuples in 15 pages.
    CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  Pages 18: Changed 9, Empty 0; Tup 189: Vac 187, Keep 0, UnUsed 259.
    Total CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  --Relation pg_toast.pg_toast_16408--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation public.contract_ips--
INFO:  Index contract_ips_pkey: Pages 430; Tuples 743: Deleted 37893.
    CPU 0.03s/0.13u sec elapsed 0.16 sec.
INFO:  Removed 37893 tuples in 609 pages.
    CPU 0.01s/0.07u sec elapsed 0.07 sec.
INFO:  Pages 1113: Changed 24, Empty 0; Tup 743: Vac 37893, Keep 0, UnUsed 36763.
    Total CPU 0.08s/0.20u sec elapsed 0.28 sec.
INFO:  Truncated 1113 --> 110 pages.
    CPU 0.05s/0.00u sec elapsed 0.41 sec.
INFO:  --Relation pg_toast.pg_toast_50107070--
INFO:  Pages 0: Changed 0, Empty 0; Tup 0: Vac 0, Keep 0, UnUsed 0.
    Total CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  --Relation public.authinfo--
INFO:  Index authinfo_pkey: Pages 733; Tuples 930: Deleted 47430.
    CPU 0.04s/0.18u sec elapsed 0.44 sec.
INFO:  Removed 47430 tuples in 717 pages.
    CPU 0.02s/0.07u sec elapsed 0.09 sec.
INFO:  Pages 1380: Changed 29, Empty 0; Tup 930: Vac 47430, Keep 0, UnUsed 45290.
    Total CPU 0.10s/0.27u sec elapsed 0.60 sec.
INFO:  --Relation public.stats_min--
INFO:  Index stats_min_start: Pages 34445; Tuples 1985464: Deleted 404651.
    CPU 1.19s/2.41u sec elapsed 17.86 sec.
INFO:  Index stats_min_pkey: Pages 76501; Tuples 1986938: Deleted 404651.
    CPU 3.98s/5.47u sec elapsed 217.07 sec.
INFO:  Removed 404651 tuples in 6118 pages.
    CPU 0.83s/0.77u sec elapsed 13.52 sec.
INFO:  Pages 25295: Changed 4615, Empty 0; Tup 1985464: Vac 404651, Keep 0, UnUsed 468220.
    Total CPU 7.19s/8.78u sec elapsed 252.67 sec.
INFO:  --Relation public.stats_hr--
INFO:  Index stats_hr_start: Pages 57654; Tuples 10811294: Deleted 348991.
    CPU 3.09s/5.27u sec elapsed 63.67 sec.
INFO:  Index stats_hr_pkey: Pages 78301; Tuples 10814527: Deleted 348991.
    CPU 5.11s/6.39u sec elapsed 152.78 sec.
INFO:  Removed 348991 tuples in 8333 pages.
    CPU 1.09s/0.92u sec elapsed 36.46 sec.
INFO:  Pages 217213: Changed 1362, Empty 0; Tup 10810476: Vac 348991, Keep 0, UnUsed 352822.
    Total CPU 17.09s/13.98u sec elapsed 284.52 sec.
INFO:  --Relation public.stats_hr_old--
INFO:  Pages 60984: Changed 0, Empty 0; Tup 3232113: Vac 0, Keep 0, UnUsed 0.
    Total CPU 2.76s/0.45u sec elapsed 13.90 sec.
VACUUM

> One other thing is to find where the space is going.  Some of that might
> be ending up in indexes which (unfortunately) on 7.3 and earlier aren't
> going to get cleaned up by vacuum and will instead need a reindex.

That's very likely happening in our case, I'm afraid.

Hmm, you seem to suggest that we might expect a change in this regard
as 7.4 ships.  Is that right?

Thanks for your interest in this problem,
--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [tgl@sss.pgh.pa.us]
>
> Tomas Szepe <szepe@pinerecords.com> writes:
> > We have a production database that happens to receive several
> > thousand row updates per minute.  We VACUUM ANALYZE every four
> > hours with max_fsm_pages set to 2100000, and it's no use.
>
> Have you spent any time determining exactly where the problem is?
> I'm suspicious that it's an index-bloat issue.

I'm afraid so.  The weird thing is, we once tried running
VACUUM FULL followed by REINDEX and although the process
took ages to complete, it didn't seem to help either. :(

I'll post whatever debug data I'm asked for, just don't make
me run VACUUM FULL or REINDEX again please.  (A full redump
is 10-20 times faster. :O)

Thanks,
--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Tom Lane
Date:
Tomas Szepe <szepe@pinerecords.com> writes:
> I'm afraid so.  The weird thing is, we once tried running
> VACUUM FULL followed by REINDEX and although the process
> took ages to complete, it didn't seem to help either. :(

> I'll post whatever debug data I'm asked for, just don't make
> me run VACUUM FULL or REINDEX again please.  (A full redump
> is 10-20 times faster. :O)

The output of VACUUM VERBOSE for the problem table(s) would be
interesting.

            regards, tom lane

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [tgl@sss.pgh.pa.us]
>
> Tomas Szepe <szepe@pinerecords.com> writes:
> > I'm afraid so.  The weird thing is, we once tried running
> > VACUUM FULL followed by REINDEX and although the process
> > took ages to complete, it didn't seem to help either. :(
>
> > I'll post whatever debug data I'm asked for, just don't make
> > me run VACUUM FULL or REINDEX again please.  (A full redump
> > is 10-20 times faster. :O)
>
> The output of VACUUM VERBOSE for the problem table(s) would be
> interesting.

Just posted, please see my message to
Stephan Szabo <sszabo@megazone.bigpanda.com>, CC pgsql-bugs@postgresql.org
with Message-ID: <20030926184329.GC26641@louise.pinerecords.com>

Thanks,
--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> > The output of VACUUM VERBOSE for the problem table(s) would be
> > interesting.
>
> Just posted, please see my message to
> Stephan Szabo <sszabo@megazone.bigpanda.com>, CC pgsql-bugs@postgresql.org
> with Message-ID: <20030926184329.GC26641@louise.pinerecords.com>

Oh, and the names of the problematic tables are stats_min and stats_hr.

--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Tom Lane
Date:
Tomas Szepe <szepe@pinerecords.com> writes:
> INFO:  --Relation public.stats_min--
> INFO:  Index stats_min_start: Pages 34445; Tuples 1985464: Deleted 404651.
>     CPU 1.19s/2.41u sec elapsed 17.86 sec.
> INFO:  Index stats_min_pkey: Pages 76501; Tuples 1986938: Deleted 404651.
>     CPU 3.98s/5.47u sec elapsed 217.07 sec.
> INFO:  Removed 404651 tuples in 6118 pages.
>     CPU 0.83s/0.77u sec elapsed 13.52 sec.
> INFO:  Pages 25295: Changed 4615, Empty 0; Tup 1985464: Vac 404651, Keep 0, UnUsed 468220.
>     Total CPU 7.19s/8.78u sec elapsed 252.67 sec.
> INFO:  --Relation public.stats_hr--
> INFO:  Index stats_hr_start: Pages 57654; Tuples 10811294: Deleted 348991.
>     CPU 3.09s/5.27u sec elapsed 63.67 sec.
> INFO:  Index stats_hr_pkey: Pages 78301; Tuples 10814527: Deleted 348991.
>     CPU 5.11s/6.39u sec elapsed 152.78 sec.
> INFO:  Removed 348991 tuples in 8333 pages.
>     CPU 1.09s/0.92u sec elapsed 36.46 sec.
> INFO:  Pages 217213: Changed 1362, Empty 0; Tup 10810476: Vac 348991, Keep 0, UnUsed 352822.
>     Total CPU 17.09s/13.98u sec elapsed 284.52 sec.

You've definitely got an index-bloat problem on stats_min (the indexes
are four times the size of the table :-(), and I suspect the same on
stats_hr, though not quite as bad.  What are the datatypes of the
index columns?
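
For reference, the table-vs-index sizes can also be read straight from the
catalogs (relpages counts 8 kB pages and is refreshed by VACUUM/ANALYZE):

    SELECT relname, relpages
      FROM pg_class
     WHERE relname IN ('stats_min', 'stats_min_pkey', 'stats_min_start',
                       'stats_hr',  'stats_hr_pkey',  'stats_hr_start');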

> Hmm, you seem to suggest that we might expect a change in this regard
> as 7.4 ships.  Is that right?

7.4 should improve matters.

            regards, tom lane

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [tgl@sss.pgh.pa.us]
>
> You've definitely got an index-bloat problem on stats_min (the indexes
> are four times the size of the table :-(), and I suspect the same on
> stats_hr, though not quite as bad.  What are the datatypes of the
> index columns?

indexes:
stats_min_pkey primary key btree (ip, "start")
stats_min_start btree ("start")
stats_hr_pkey primary key btree (ip, "start")
stats_hr_start btree ("start")

ip is of type "inet" in all tables.
start is of type "timestamp without time zone" in all tables.
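
In DDL terms that is roughly the following (the traffic counter columns are
left out for brevity):

    CREATE TABLE stats_min (
        ip      inet,
        "start" timestamp without time zone,
        -- ... per-protocol traffic counter columns ...
        PRIMARY KEY (ip, "start")
    );
    CREATE INDEX stats_min_start ON stats_min ("start");
    -- stats_hr is laid out the same way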

> > Hmm, you seem to suggest that we might expect a change in this regard
> > as 7.4 ships.  Is that right?
>
> 7.4 should improve matters.

Trouble is, I probably can't give it a try on the production machine
unless I'm sure it won't trash the data, and I don't know of a way to
reproduce the workload on a test setup. :(

Thanks,
--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Gaetano Mendola
Date:
Tomas Szepe wrote:
> INFO:  --Relation public.stats_min--
> INFO:  Index stats_min_start: Pages 34445; Tuples 1985464: Deleted 404651.
>     CPU 1.19s/2.41u sec elapsed 17.86 sec.
> INFO:  Index stats_min_pkey: Pages 76501; Tuples 1986938: Deleted 404651.
>     CPU 3.98s/5.47u sec elapsed 217.07 sec.
> INFO:  Removed 404651 tuples in 6118 pages.
>     CPU 0.83s/0.77u sec elapsed 13.52 sec.
> INFO:  Pages 25295: Changed 4615, Empty 0; Tup 1985464: Vac 404651, Keep 0, UnUsed 468220.
>     Total CPU 7.19s/8.78u sec elapsed 252.67 sec.
> INFO:  --Relation public.stats_hr--
> INFO:  Index stats_hr_start: Pages 57654; Tuples 10811294: Deleted 348991.
>     CPU 3.09s/5.27u sec elapsed 63.67 sec.
> INFO:  Index stats_hr_pkey: Pages 78301; Tuples 10814527: Deleted 348991.
>     CPU 5.11s/6.39u sec elapsed 152.78 sec.
> INFO:  Removed 348991 tuples in 8333 pages.
>     CPU 1.09s/0.92u sec elapsed 36.46 sec.
> INFO:  Pages 217213: Changed 1362, Empty 0; Tup 10810476: Vac 348991, Keep 0, UnUsed 352822.
>     Total CPU 17.09s/13.98u sec elapsed 284.52 sec.

Do you regularly reindex these tables?


Regards
Gaetano Mendola

Re: PostgreSQL 'eats' all my data partition

From
Tom Lane
Date:
Tomas Szepe <szepe@pinerecords.com> writes:
> indexes:
> stats_min_pkey primary key btree (ip, "start")
> stats_min_start btree ("start")
> stats_hr_pkey primary key btree (ip, "start")
> stats_hr_start btree ("start")

> ip is of type "inet" in all tables.
> start is of type "timestamp without time zone" in all tables.

Okay, so a pkey index entry will take 32 bytes counting overhead ...
you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
Definitely bad :-(

I expect the bloat is coming from the fact that the interesting range of
"start" changes over time.  7.4 should be able to recycle index space
in that situation, but 7.3 and before can't.
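
The rough arithmetic behind those ratios, assuming the default 8 kB block size:

    stats_min_pkey:  76501 pages * 8 kB          ~ 630 MB on disk
                     1986938 entries * 32 bytes  ~  64 MB of live entries  -> ~10:1
    stats_hr_pkey:   78301 pages * 8 kB          ~ 640 MB on disk
                     10814527 entries * 32 bytes ~ 350 MB of live entries  -> ~2:1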

            regards, tom lane

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [tgl@sss.pgh.pa.us]
>
> > indexes:
> > stats_min_pkey primary key btree (ip, "start")
> > stats_min_start btree ("start")
> > stats_hr_pkey primary key btree (ip, "start")
> > stats_hr_start btree ("start")
>
> > ip is of type "inet" in all tables.
> > start is of type "timestamp without time zone" in all tables.
>
> Okay, so a pkey index entry will take 32 bytes counting overhead ...
> you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
> Definitely bad :-(

The only difference between the way stats_min and stats_hr are updated
stems from the fact that stats_min only holds records for the last 1440
minutes (because of its killer time granularity), whereas stats_hr
holds its data until we decide some of it is obsolete enough and
issue a "delete from" by hand.

Thus, the updates look like (pardon the pseudosql please):

stats_min:
    begin;
    delete from stats_min where \
        [data is current minute but yesterday's, or even older,
        or from the future];
    [repeat for all ips]
        insert into stats_min [data for the current minute];
    [end repeat]
    commit;

stats_hr:
    begin;
    [repeat for all ips]
        update stats_hr set [data += increment] where \
            [data is for current hour];
        [if no_of_rows_updated < 1]
            insert into stats_hr [current hour's first increment];
        [endif]
    [end repeat]
    commit;
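
For one ip, the stats_hr branch therefore comes down to something like this
("traffic" stands in for the real counter columns, and the literal values
are just an illustration):

    UPDATE stats_hr
       SET traffic = traffic + 42
     WHERE ip = '10.0.0.1'
       AND "start" = date_trunc('hour', now());
    -- if no row was updated, insert the hour's first increment instead:
    INSERT INTO stats_hr (ip, "start", traffic)
    VALUES ('10.0.0.1', date_trunc('hour', now()), 42);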

> I expect the bloat is coming from the fact that the interesting range of
> "start" changes over time.  7.4 should be able to recycle index space
> in that situation, but 7.3 and before can't.

OK, I'll definitely try 7.4 once I'm confident with it.

Thanks for your time,
--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Gaetano Mendola
Date:
Tomas Szepe wrote:
>>[tgl@sss.pgh.pa.us]
>>
>>
>>>indexes:
>>>stats_min_pkey primary key btree (ip, "start")
>>>stats_min_start btree ("start")
>>>stats_hr_pkey primary key btree (ip, "start")
>>>stats_hr_start btree ("start")
>>
>>>ip is of type "inet" in all tables.
>>>start is of type "timestamp without time zone" in all tables.
>>
>>Okay, so a pkey index entry will take 32 bytes counting overhead ...
>>you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
>>Definitely bad :-(
>
>
> The only difference between the way stats_min and stats_hr are updated
> stems from the fact that stats_min only holds records for the last 1440
> minutes (because of its killer time granularity), whereas stats_hr
> holds its data until we decide some of it is obsolete enough and
> issue a "delete from" by hand.

Are you sure that all indexes are needed and that a partial index could
not help? What about the statistics on these indexes? Are they really
used?
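
For example, if most queries only ever look at recent rows, a partial index
along these lines (the index name and the cutoff are purely illustrative)
stays much smaller than one covering every "start" value:

    CREATE INDEX stats_hr_start_recent ON stats_hr ("start")
        WHERE "start" >= '2003-09-01';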




Regards
Gaetano Mendola.

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [mendola@bigfoot.com]
>
> Tomas Szepe wrote:
> >>[tgl@sss.pgh.pa.us]
> >>
> >>
> >>>indexes:
> >>>stats_min_pkey primary key btree (ip, "start")
> >>>stats_min_start btree ("start")
> >>>stats_hr_pkey primary key btree (ip, "start")
> >>>stats_hr_start btree ("start")
> >>
> >>>ip is of type "inet" in all tables.
> >>>start is of type "timestamp without time zone" in all tables.
> >>
> >>Okay, so a pkey index entry will take 32 bytes counting overhead ...
> >>you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
> >>Definitely bad :-(
> >
> >
> >The only difference between the way stats_min and stats_hr are updated
> >stems from the fact that stats_min only holds records for the last 1440
> >minutes (because of its killer time granularity), whereas stats_hr
> >holds its data until we decide some of it is obsolete enough and
> >issue a "delete from" by hand.
>
> Are you sure that all indexes are needed and that a partial index could
> not help ? What about the statistics on these indexes ? Are they really
> used ?

Yup, they're all essential. :(

--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Gaetano Mendola
Date:
Tomas Szepe wrote:

>>[mendola@bigfoot.com]
>>
>>Tomas Szepe wrote:
>>
>>>>[tgl@sss.pgh.pa.us]
>>>>
>>>>
>>>>
>>>>>indexes:
>>>>>stats_min_pkey primary key btree (ip, "start")
>>>>>stats_min_start btree ("start")
>>>>>stats_hr_pkey primary key btree (ip, "start")
>>>>>stats_hr_start btree ("start")
>>>>
>>>>>ip is of type "inet" in all tables.
>>>>>start is of type "timestamp without time zone" in all tables.
>>>>
>>>>Okay, so a pkey index entry will take 32 bytes counting overhead ...
>>>>you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
>>>>Definitely bad :-(
>>>
>>>
>>>The only difference between the way stats_min and stats_hr are updated
>>>stems from the fact that stats_min only holds records for the last 1440
>>>minutes (because of its killer time granularity), whereas stats_hr
>>>holds its data until we decide some of it is obsolete enough and
>>>issue a "delete from" by hand.
>>
>>Are you sure that all indexes are needed and that a partial index could
>>not help ? What about the statistics on these indexes ? Are they really
>>used ?
>
>
> Yup, they're all essential. :(

May I see your typical queries where these indexes are involved?


Regards
Gaetano Mendola

Re: PostgreSQL 'eats' all my data partition

From
Tomas Szepe
Date:
> [mendola@bigfoot.com]
>
> >>>>>indexes:
> >>>>>stats_min_pkey primary key btree (ip, "start")
> >>>>>stats_min_start btree ("start")
> >>>>>stats_hr_pkey primary key btree (ip, "start")
> >>>>>stats_hr_start btree ("start")
> >>>>
> >>>>>ip is of type "inet" in all tables.
> >>>>>start is of type "timestamp without time zone" in all tables.
> >>>>
> >>>>Okay, so a pkey index entry will take 32 bytes counting overhead ...
> >>>>you've got about 10:1 bloat on the stats_min indexes and 2:1 in
> >>>>stats_hr.
> >>>>Definitely bad :-(
> >>>
> >>>
> >>>The only difference between the way stats_min and stats_hr are updated
> >>>stems from the fact that stats_min only holds records for the last 1440
> >>>minutes (because of its killer time granularity), whereas stats_hr
> >>>holds its data until we decide some of it is obsolete enough and
> >>>issue a "delete from" by hand.
> >>
> >>Are you sure that all indexes are needed and that a partial index could
> >>not help ? What about the statistics on these indexes ? Are they really
> >>used ?
> >
> >
> >Yup, they're all essential. :(
>
> May I see your typical queries where these indexes are involved?

A very typical query (apart from those I've already posted in my "how the
updates work" mail) would be:

select ip, start::time,
    (in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest) as d_in,
    (out_tcp_web + out_tcp_mail + out_udp_and_icmp
        + out_tcp_rest + out_rest) as d_out,
    (in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest
        + out_tcp_web + out_tcp_mail + out_udp_and_icmp
        + out_tcp_rest + out_rest) as d_sum,
    ((in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest
        + out_tcp_web + out_tcp_mail + out_udp_and_icmp
        + out_tcp_rest + out_rest) / intlen / 128) as rate_sum
    from stats_hr
    where start=(select start from stats_hr order by start desc limit 1)
    order by (in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest + out_tcp_web + out_tcp_mail
        + out_udp_and_icmp + out_tcp_rest + out_rest)
        desc
    limit 20;

->

    QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5152.26..5152.31 rows=20 width=104)
   InitPlan
     ->  Limit  (cost=0.00..1.11 rows=1 width=8)
           ->  Index Scan Backward using stats_hr_start on stats_hr  (cost=0.00..12059890.93 rows=10847279 width=8)
   ->  Sort  (cost=5152.26..5162.55 rows=4115 width=104)
         Sort Key: (((((((((in_tcp_web + in_tcp_mail) + in_udp_and_icmp) + in_tcp_rest) + in_rest) + out_tcp_web) + out_tcp_mail) + out_udp_and_icmp) + out_tcp_rest) + out_rest)
         ->  Index Scan using stats_hr_start on stats_hr  (cost=0.00..4905.22 rows=4115 width=104)
               Index Cond: ("start" = $0)

(done in 0.079s.)

--
Tomas Szepe <szepe@pinerecords.com>

Re: PostgreSQL 'eats' all my data partition

From
Christopher Browne
Date:
fjcarlos@correo.insp.mx (Javier Carlos) writes:
>    I did a vacuumdb after the updates, and the space usage didn't go down
> to anything reasonable. For example, I had a 250 MB database, then I did
> about 300 update queries, and my partition grew until it filled all 15 GB
> of my data partition. After that I did a vacuumdb and the space only went
> down by 100 MB. Then I DROPPED the database, and all 15 GB came back; it's
> very weird, don't you think?

You may not have been expecting it, but it is no great surprise to me.

If you vacuumed after every ~10-20 of those query updates, then the
size shouldn't have gotten nearly that high, and performance would
likely have been quite a bit better.

In effect, every time you update substantially all of the data, you
should _definitely_ do a vacuum, as it will be quite likely to "reap"
a whole table's worth of dead tuples.

A VACUUM FULL will cut the size down absolutely, at the cost of
blocking updates during the vacuum.  If you run "ordinary VACUUMs"
along the way, they aren't as effective at cutting down on the space
used, but are non-blocking.  Probably it's better to regularly do
"ordinary VACUUMs."
--
"cbbrowne","@","libertyrms.info"
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 646 3304 x124 (land)

Success report (WAS: postgresql 'eats' my whole data partition)

From
Tomas Szepe
Date:
On Sep-26 2003, Fri, 16:28 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Okay, so a pkey index entry will take 32 bytes counting overhead ...
> you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
> Definitely bad :-(
>
> I expect the bloat is coming from the fact that the interesting range of
> "start" changes over time.  7.4 should be able to recycle index space
> in that situation, but 7.3 and before can't.

This is how things look with 7.4beta3:

INFO:  vacuuming "public.stats_min"
INFO:  index "stats_min_start" now contains 1963394 tuples in 12086 pages
DETAIL:  736088 index tuples were removed.
5029 index pages have been deleted, 2411 are currently reusable.
CPU 1.07s/2.13u sec elapsed 8.15 sec.
INFO:  index "stats_min_pkey" now contains 1963520 tuples in 27945 pages
DETAIL:  736088 index tuples were removed.
11722 index pages have been deleted, 5603 are currently reusable.
CPU 4.18s/5.55u sec elapsed 113.04 sec.
INFO:  "stats_min": removed 736088 tuples in 9483 pages
DETAIL:  CPU 1.00s/1.12u sec elapsed 13.68 sec.
INFO:  "stats_min": found 736088 removable, 1963394 nonremovable tuples in 23890 pages
DETAIL:  0 dead tuples cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 7.20s/9.16u sec elapsed 146.28 sec.

...

INFO:  vacuuming "public.stats_hr"
INFO:  index "stats_hr_start" now contains 7850557 tuples in 28059 pages
DETAIL:  728706 index tuples were removed.
4337 index pages have been deleted, 2035 are currently reusable.
CPU 2.66s/4.72u sec elapsed 41.18 sec.
INFO:  index "stats_hr_pkey" now contains 7852826 tuples in 57657 pages
DETAIL:  728706 index tuples were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 5.26s/5.66u sec elapsed 123.91 sec.
INFO:  "stats_hr": removed 728706 tuples in 16033 pages
DETAIL:  CPU 1.51s/1.68u sec elapsed 25.43 sec.
INFO:  "stats_hr": found 728706 removable, 7848770 nonremovable tuples in 161840 pages
DETAIL:  0 dead tuples cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 15.71s/12.88u sec elapsed 213.28 sec.

Definitely a tremendous improvement.  The "working base" on-disk
db size seems to have stabilised at around 2.4 GiB and is only
increasing by ~100 MiB a day, which, by my calculations, is roughly
proportional to the rate at which new data is added.  (There is still
some minor overhead all right, but it's far from serious.)

So again it's a big thank-you to the whole postgres development
group for yet another nice step forward, especially to you Tom
for the "allow btree index compaction and empty page reuse"
changeset.

Best regards,
--
Tomas Szepe <szepe@pinerecords.com>