Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers

From Jean-Paul ARGUDO
Subject Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date
Msg-id 20020417101503.GA5010@pastis
Whole thread Raw
In response to Re: Index Scans become Seq Scans after VACUUM ANALYSE  (mlw <markw@mohawksoft.com>)
List pgsql-hackers
> Here is the problem, in a single paragraph.
> 
> If the DBA notices that there is a problem with a query, he adds an index, he
> notices that there is no difference, then he notices that PostgreSQL is not
> using his index. First and foremost he gets mad at PostgreSQL for not using his
> index. If PostgreSQL decided to use an index which increases execution time,
> the DBA would delete the index. If PostgreSQL does not use an index, he has to
> modify the posgresql.conf file, which disallows PostgreSQL from using an index
> when it would be a clear loser.
> 
> My assertion is this: "If a DBA creates an index, he has a basis for his
> actions."

I agree with Mark.

I jump on this thread to ask some questions:

1) When a DBA creates an index, it is mainly to optimize. But when an
index is created, we need to make a vacuum --analyze in order to give PG
optimizer (totally guessed, Tom may correct this affirmation?) knowledge
of it. My 1st question is : wouldn't we create a kind of trigger to make
an automatic vacuum --analyze on the table when a new index is created
on it?

Here an example on a pratical optimisation day: (taken from a
optimisation journal I make every time I need to make an optimisation on
a customer' database):

« Line 962

EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
FROM T12_20011231
WHERE t12_bskid >= 1
ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;

Sort  (cost=1348.70..1348.70 rows=8565 width=16) ->  Seq Scan on t12_20011231  (cost=0.00..789.20 rows=8565 width=16)


dbkslight=# create index t12_bskid_pnb_tck_lne on t12_20011231
(t12_bskid, t12_pnb, t12_tck, t12_lne);
CREATE
dbkslight=# EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
dbkslight-# FROM T12_20011231
dbkslight-# WHERE t12_bskid >= 1
dbkslight-# ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;

NOTICE:  QUERY PLAN:

Sort  (cost=1348.70..1348.70 rows=8565 width=16) ->  Seq Scan on t12_20011231  (cost=0.00..789.20 rows=8565 width=16)

EXPLAIN

dbkslight=# vacuum analyze t12_20011231;
VACUUM

dbkslight=# EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
FROM T12_20011231
WHERE t12_bskid >= 1
ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;dbkslight-# dbkslight-#
dbkslight-# ;
NOTICE:  QUERY PLAN:

Index Scan using t12_bskid_pnb_tck_lne on t12_20011231
(cost=0.00..2232.11 rows=25693 width=16)

;-))

» end of example............

2) We all know that indices on small tables have to be dropped, because
the seq scan is always cheaper. I wonder about middle tables often
accessed: data are mainly in the PG buffers. So seq scan a table whose
data pages are in the buffer is always cheaper too :) 

Then, depending the memory allowed to PG, can we say indices on medium
tables have also to be dropped? I think so, because index maintenace has
a cost too.

3) I have to say that queries sometimes have to be rewrited. It is very
well explained in the "PostgreSQL Developper Handbook" I have at home...
(at work at the moment, will post complete references later, but surely
you can find this book at techdocs.postgresql.org ..).

I experienced myself many times, joints have to be rewrited...
This is really true for outter joins (LEFT/RIGHT join). And it has to be
tested with explain plans.

Hope this helps.

Regards,


-- 
Jean-Paul ARGUDO                                IDEALX S.A.S
Consultant bases de données                     15-17, av. de Ségur
http://www.idealx.com                           F-75007 PARIS


pgsql-hackers by date:

Previous
From: Michael Loftis
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Next
From: tycho@fruru.com
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE