Thread: vacuum full

vacuum full

From

"Henrik Steffen"

Date:

21 November 2002, 11:52:28

hello all,

how often should "vacuum full" usually be run ?

thanks,

--

Mit freundlichem Gruß

Henrik Steffen
Geschäftsführer

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
System-Partner gesucht: http://www.franchise.city-map.de
--------------------------------------------------------
Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
--------------------------------------------------------

stange optimizer results

From

"Peter T. Brown"

Date:

21 November 2002, 12:21:29

Hello--

Attached is a file containing two SQL queries. The first take
prohibitively long to complete because, according to EXPLAIN, it ignore
two very important indexes. The second SQL query seems almost identical
to the first but runs very fast because, according to EXPLAIN, it does
uses all the indexes appropriately.

Can someone please explain to me what the difference is here? Or if
there is something I can do with my indexes to make the first query run
like the second?


Thanks Much

Peter

Attachment

sqlquestion.sql

Re: vacuum full

From

"scott.marlowe"

Date:

21 November 2002, 12:34:41

On Thu, 21 Nov 2002, Henrik Steffen wrote:

>
> hello all,
>
> how often should "vacuum full" usually be run ?

I recommend nightly.  Also, check index size.  If they are growing, you
may want to reindex each night or week as well.

Re: vacuum full

From

"Henrik Steffen"

Date:

21 November 2002, 13:18:13

ok, but how can I measure index sizes?

--

Mit freundlichem Gruß

Henrik Steffen
Geschäftsführer

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
System-Partner gesucht: http://www.franchise.city-map.de
--------------------------------------------------------
Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
--------------------------------------------------------

----- Original Message -----
From: "scott.marlowe" <scott.marlowe@ihs.com>
To: "Henrik Steffen" <steffen@city-map.de>
Cc: <pgsql-performance@postgresql.org>
Sent: Thursday, November 21, 2002 6:32 PM
Subject: Re: [PERFORM] vacuum full


> On Thu, 21 Nov 2002, Henrik Steffen wrote:
>
> >
> > hello all,
> >
> > how often should "vacuum full" usually be run ?
>
> I recommend nightly.  Also, check index size.  If they are growing, you
> may want to reindex each night or week as well.
>

Re: vacuum full

From

"Henrik Steffen"

Date:

21 November 2002, 13:20:44

sorry, didn't notice your message posted to pgsql-general...

but is there any method to see the size in bytes a particular index
for a particular table takes?

--

Mit freundlichem Gruß

Henrik Steffen
Geschäftsführer

top concepts Internetmarketing GmbH
Am Steinkamp 7 - D-21684 Stade - Germany
--------------------------------------------------------
http://www.topconcepts.com          Tel. +49 4141 991230
mail: steffen@topconcepts.com       Fax. +49 4141 991233
--------------------------------------------------------
24h-Support Hotline:  +49 1908 34697 (EUR 1.86/Min,topc)
--------------------------------------------------------
Ihr SMS-Gateway: JETZT NEU unter: http://sms.city-map.de
System-Partner gesucht: http://www.franchise.city-map.de
--------------------------------------------------------
Handelsregister: AG Stade HRB 5811 - UstId: DE 213645563
--------------------------------------------------------

----- Original Message -----
From: "scott.marlowe" <scott.marlowe@ihs.com>
To: "Henrik Steffen" <steffen@city-map.de>
Cc: <pgsql-performance@postgresql.org>
Sent: Thursday, November 21, 2002 6:32 PM
Subject: Re: [PERFORM] vacuum full


> On Thu, 21 Nov 2002, Henrik Steffen wrote:
>
> >
> > hello all,
> >
> > how often should "vacuum full" usually be run ?
>
> I recommend nightly.  Also, check index size.  If they are growing, you
> may want to reindex each night or week as well.
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>

Re: stange optimizer results

From

Stephan Szabo

Date:

21 November 2002, 13:34:17

On 21 Nov 2002, Peter T. Brown wrote:

> Hello--
>
> Attached is a file containing two SQL queries. The first take
> prohibitively long to complete because, according to EXPLAIN, it ignore
> two very important indexes. The second SQL query seems almost identical
> to the first but runs very fast because, according to EXPLAIN, it does
> uses all the indexes appropriately.
>
> Can someone please explain to me what the difference is here? Or if
> there is something I can do with my indexes to make the first query run
> like the second?

It doesn't take into account that in general a=b, b=constant implies
a=constant.

 Perhaps if you used explicit join syntax for visitor joining
visitorextra it might help.  Like doing:
 FROM visitor inner join visitorextra on (...)
  left outer join ...

Re: stange optimizer results

From

Stephan Szabo

Date:

21 November 2002, 13:35:39

On Thu, 21 Nov 2002, Stephan Szabo wrote:

> On 21 Nov 2002, Peter T. Brown wrote:
>
> > Hello--
> >
> > Attached is a file containing two SQL queries. The first take
> > prohibitively long to complete because, according to EXPLAIN, it ignore
> > two very important indexes. The second SQL query seems almost identical
> > to the first but runs very fast because, according to EXPLAIN, it does
> > uses all the indexes appropriately.
> >
> > Can someone please explain to me what the difference is here? Or if
> > there is something I can do with my indexes to make the first query run
> > like the second?
>
> It doesn't take into account that in general a=b, b=constant implies
> a=constant.
>
>  Perhaps if you used explicit join syntax for visitor joining
> visitorextra it might help.  Like doing:
>  FROM visitor inner join visitorextra on (...)
>   left outer join ...

Sent this too quickly.  It probably won't make it use an index on
vistorextra, but it may lower the number of expected rows that it's going
to be left outer joining so that a nested loop and index scan makes sense.

Re: vacuum full

From

george young

Date:

21 November 2002, 13:50:54

On Thu, 21 Nov 2002 19:20:16 +0100
"Henrik Steffen" <steffen@city-map.de> wrote:

> sorry, didn't notice your message posted to pgsql-general...
>
> but is there any method to see the size in bytes a particular index
> for a particular table takes?

For a table foo, do:
psql mydatabase
-- first find the index name:
mydatabase=> \d foo
...
Indexes: foo_pkey unique btree (mykey)
-- now find the unix file name of the index:
mydatabase=> select relfilenode from pg_class where relname='foo_pkey';
 relfilenode
-------------
       18122
-- Thus the file name of the index is "18122".
\q
# now go and look for the file:
unixprompt> su postgres
Password:
postgres> cd /var/lib/pgsql/data/base/????
postgres> ls -l 18122
-rw-------    1 postgres daemon    7471104 Nov 21 12:52 18122

Thus the index for table foo is 7.4 MBytes in size.  What I left
out is the ???? directory name above.  I find it by educated guess.

Does someone know the right way to map from database name to
data directory name?

-- George

> From: "scott.marlowe" <scott.marlowe@ihs.com>
> To: "Henrik Steffen" <steffen@city-map.de>
> Cc: <pgsql-performance@postgresql.org>
> Sent: Thursday, November 21, 2002 6:32 PM
> Subject: Re: [PERFORM] vacuum full
>
>
> > On Thu, 21 Nov 2002, Henrik Steffen wrote:
> > >
> > > how often should "vacuum full" usually be run ?
> >
> > I recommend nightly.  Also, check index size.  If they are growing, you
> > may want to reindex each night or week as well.

--
 I cannot think why the whole bed of the ocean is
 not one solid mass of oysters, so prolific they seem. Ah,
 I am wandering! Strange how the brain controls the brain!
    -- Sherlock Holmes in "The Dying Detective"

Re: vacuum full

From

"scott.marlowe"

Date:

21 November 2002, 14:14:00

On Thu, 21 Nov 2002, Henrik Steffen wrote:

> sorry, didn't notice your message posted to pgsql-general...
>
> but is there any method to see the size in bytes a particular index
> for a particular table takes?

There are some sql queries that can tell you the number of blocks used and
all, but I generally do it with oid2name (you can get it installed by
going into your source tree/contrib/oid2name and doing a make/make install
there.)

oid2name by itself will tell you the oids of your databases.  On my fairly
fresh system it looks like this:

All databases:
---------------------------------
16976  = postgres
1      = template1
16975  = template0

Then,

'oid2name -d postgres' outputs this:

16999  = g
17025  = g_name_dx
16977  = gaff
16988  = test
16986  = test_id_seq
17019  = tester

So, I can do this 'ls -l $PGDATA/base/16976/17025'

to see how big the index g_name_dx is.

Re: vacuum full

From

Tom Lane

Date:

21 November 2002, 14:22:03

george young <gry@ll.mit.edu> writes:
> Does someone know the right way to map from database name to
> data directory name?

pg_database.oid column.

However, that's definitely the hard way.  I'd just look at the relpages
column of pg_class, which should be reasonably accurate if you've done
a VACUUM or ANALYZE recently.  For example:

regression=# select relname, relkind, relpages from pg_class where
regression-# relname like 'tenk1%';
    relname    | relkind | relpages
---------------+---------+----------
 tenk1         | r       |      358
 tenk1_hundred | i       |       30
 tenk1_unique1 | i       |       30
 tenk1_unique2 | i       |       30
(4 rows)

Here we have a table and its three indexes, and the index sizes look
reasonable.  If the index sizes approach or exceed the table size,
you are probably suffering from index bloat --- try a reindex.

            regards, tom lane

Re: stange optimizer results

From

"Peter T. Brown"

Date:

21 November 2002, 15:36:48

trouble is that this SQL is being automatically created by my
object-relational mapping software and I can't easily customize it. Is
there any other way I can force Postgres to do the most efficient thing?



On Thu, 2002-11-21 at 10:35, Stephan Szabo wrote:
>
> On Thu, 21 Nov 2002, Stephan Szabo wrote:
>
> > On 21 Nov 2002, Peter T. Brown wrote:
> >
> > > Hello--
> > >
> > > Attached is a file containing two SQL queries. The first take
> > > prohibitively long to complete because, according to EXPLAIN, it ignore
> > > two very important indexes. The second SQL query seems almost identical
> > > to the first but runs very fast because, according to EXPLAIN, it does
> > > uses all the indexes appropriately.
> > >
> > > Can someone please explain to me what the difference is here? Or if
> > > there is something I can do with my indexes to make the first query run
> > > like the second?
> >
> > It doesn't take into account that in general a=b, b=constant implies
> > a=constant.
> >
> >  Perhaps if you used explicit join syntax for visitor joining
> > visitorextra it might help.  Like doing:
> >  FROM visitor inner join visitorextra on (...)
> >   left outer join ...
>
> Sent this too quickly.  It probably won't make it use an index on
> vistorextra, but it may lower the number of expected rows that it's going
> to be left outer joining so that a nested loop and index scan makes sense.
>
>
--

Peter T. Brown
Director Of Technology
Memetic Systems, Inc.
"Translating Customer Data Into Marketing Action."
206.335.2927
http://www.memeticsystems.com/

Re: stange optimizer results

From

Stephan Szabo

Date:

21 November 2002, 16:26:49

On 21 Nov 2002, Peter T. Brown wrote:

> trouble is that this SQL is being automatically created by my
> object-relational mapping software and I can't easily customize it. Is
> there any other way I can force Postgres to do the most efficient thing?

Given no changes to the query, probably only by using blunt hammers like
seeing if set enable_seqscan=off or set enable_mergejoin=off helps, but
you'd want to wrap the statement with them (setting them globally is
bad) and you might not be able to do that as well if your software won't
let you.