Thread: Trouble with index in 7.1

Trouble with index in 7.1

From
gerry.smit@lombard.ca
Date:
Hi folks. We're having a baffling time with Postgres 7.1

I have 4 databases (central, atlantic, quebec, western) with 2 tables each
(policy_cross_reference  (_d or _b)).

The Central tables have about 800,000 rows each, versus 300,000 each in
Western, 180,000 in Atlantic and 100,000 in Quebec.

The tables are , triple checked, defined with the same indexes .

Here's a snippet

cen_db=> \d pol_xref_d
              Table "pol_xref_d"
 Attribute |         Type          | Modifier
-----------+-----------------------+----------
 sequence  | integer               | not null
 policy_no | integer               |
 module    | character(2)          |
 bd        | character(1)          |
 dir       | character varying(30) |
 policy_id | integer               |
 ua        | character varying(8)  |
 da        | date                  |
 ta        | time                  |
 uwcode    | integer               |
 agency    | integer               |
Indices: pol_xref_d_nkey_1,
         pol_xref_d_pkey


And the indexes :

cen_db=> \d pol_xref_d_nkey_1
Index "pol_xref_d_nkey_1"
 Attribute |     Type
-----------+--------------
 policy_no | integer
 module    | character(2)
btree

cen_db=> \d pol_xref_d_pkey
Index "pol_xref_d_pkey"
 Attribute |  Type
-----------+---------
 sequence  | integer
unique btree (primary key)

cen_db=> explain select * from pol_xref_d where policy_no=1200079;
NOTICE:  QUERY PLAN:

Seq Scan on pol_xref_d  (cost=0.00..22299.22 rows=8325 width=80)


Sequential Scan  !!!!  Good God, why ?


switching to the Atlantic DB:

 \connect atl_db
You are now connected to database atl_db.
atl_db=> explain select * from pol_xref_d where policy_no=1200079;
NOTICE:  QUERY PLAN:

Index Scan using pol_xref_d_nkey_1 on pol_xref_d  (cost=0.00..4627.09
rows=1864 width=80)

EXPLAIN



Uses the index, and is lighning fast.

Weirdly enough, up until a few months ago. Central DB used the index as
well. When it started to go sequential, we added the phrase "and module
like '%' "  to the WHERE clause, and the index came back into use. Now we
get no luck at all.

Given that the index SELECT work in the other 3 databases, I'm at the point
where , at least temporarily, going to have to delete older rows in order
to improve performance. Even if we don't get indexing back, searching
400,000 rows will only take half the time , and 11 second response wil
ldrop to about 6.



Oh, we vacuum EVERY night, in case that has any bearing on this. Further,
the table is updated every day with INSERT of about 3000-5000 new rows.
Otherwise, this is a read only table. WORM, to be specific.



ANY ANY ANY suggestions would be helpful. Otherwise, I'm stuck creating 10
tables pol_xref_d0, pol_xref_d1, etc, and using the last digit of the
policy number as a hashing algorithm on where to store the policy. that
brings the number of rows down to 70,000-100,000 rows per table , and
presumably, would re-enable the index reads.

Gerry Smit

Lombard Insurance
Toronto, Canada.












Re: Trouble with index in 7.1

From
"Adam Erickson"
Date:
> Weirdly enough, up until a few months ago. Central DB used the index as
> well. When it started to go sequential, we added the phrase "and module
> like '%' "  to the WHERE clause, and the index came back into use. Now we
> get no luck at all.
>
> Given that the index SELECT work in the other 3 databases, I'm at
> the point
> where , at least temporarily, going to have to delete older rows in order
> to improve performance. Even if we don't get indexing back, searching
> 400,000 rows will only take half the time , and 11 second response wil
> ldrop to about 6.
>
> Oh, we vacuum EVERY night, in case that has any bearing on this. Further,
> the table is updated every day with INSERT of about 3000-5000 new rows.
> Otherwise, this is a read only table. WORM, to be specific.

Have you tried using analyze <table>?  It's not entirely clear (to me) that
a plain vacuum does this step.

-Adam


Re: Trouble with index in 7.1

From
Tom Lane
Date:
gerry.smit@lombard.ca writes:
> Hi folks. We're having a baffling time with Postgres 7.1

Updating to 7.2 might help.

> cen_db=> explain select * from pol_xref_d where policy_no=1200079;

> Seq Scan on pol_xref_d  (cost=0.00..22299.22 rows=8325 width=80)

> Sequential Scan  !!!!  Good God, why ?

Because it thinks a substantial fraction of your table will be visited
(note the rows=8325 estimate).  My guess is that you have some one value
of policy_no that is extremely common, and that that is fooling the
7.1 planner into thinking the table contains only a few distinct values
of policy_no --- which would make a seqscan an appropriate choice.

7.2 keeps more extensive stats and is more able to deal with scenarios
where there are a few common values and lots of not-so-common values.

            regards, tom lane

Re: Trouble with index in 7.1

From
gerry.smit@lombard.ca
Date:
That wasn't it.  Largest occurance was 96, and a few 95's, in a table of
800,000.   But, as other posts have said, ANALYZE fixed the problem.

Thanks for the hint!

Gerry




        
                      "Tom Lane"
        
                      <tgl@sss.pgh.pa.         To:      gerry.smit@lombard.ca
        
                      us>                      cc:      pgsql-novice@postgresql.org
        
                                               Fax to:
        
                      16/05/2002 12:32         Subject: Re: [NOVICE] Trouble with index in 7.1
        
                      AM
        

        

        




gerry.smit@lombard.ca writes:
> Hi folks. We're having a baffling time with Postgres 7.1

Updating to 7.2 might help.

> cen_db=> explain select * from pol_xref_d where policy_no=1200079;

> Seq Scan on pol_xref_d  (cost=0.00..22299.22 rows=8325 width=80)

> Sequential Scan  !!!!  Good God, why ?

Because it thinks a substantial fraction of your table will be visited
(note the rows=8325 estimate).  My guess is that you have some one value
of policy_no that is extremely common, and that that is fooling the
7.1 planner into thinking the table contains only a few distinct values
of policy_no --- which would make a seqscan an appropriate choice.

7.2 keeps more extensive stats and is more able to deal with scenarios
where there are a few common values and lots of not-so-common values.

   regards, tom lane