Re: Index not used, performance problem - Mailing list pgsql-performance

From Christopher Kings-Lynne
Subject Re: Index not used, performance problem
Date
Msg-id 20030329214139.N4697-100000@houston.familyhealth.com.au
Whole thread Raw
In response to Index not used, performance problem  (Andreas Kostyrka <andreas@mtg.co.at>)
Responses Re: Index not used, performance problem
Re: Index not used, performance problem
Re: Index not used, performance problem
List pgsql-performance
Hi Andreas,

A few points:

PostgreSQL is rarely going to use an index for a boolean column.  The
reason is that since almost by definition true will occupy 50% of the rows
and false will occupy 50% (say).  In this case, a sequential scan is
always faster.  You would say that the 'selectivity' isn't good enough.

As for the LIKE searches, the only ones that PostgreSQL can index are of
the form 'FOO%', which is what you are doing.  However, I believe that
PostgreSQL cannot do this if your database encoding is anything other than
'C'.  So, if you are using an Austrian encoding, it might not be able to
use the index.

Some things to try:

If you are always seeking over all four columns, then drop the 4
individual indexes and create one like this:

create index my_key on patient(nachname, vorname, hvvsnummer);

That would be more efficient, in the C locale.

Also, what is the point of searching for LIKE '%'? Why not just leave that
out?

Chris

On 29 Mar 2003, Andreas Kostyrka wrote:

> Hi!
>
> I've got the following problem:
> PostgreSQL 7.2.1-2 (Debian) on Duron/700MHz, 512MB, IDE hdd (laptop).
>
> I've got a table that has 6400 rows, an index on the deleted, nachname,
> vorname and hvvsnummer attributes, and my O-R wrapper generate queries
> like this:
>
> SELECT patient.id, patient.vorname, patient.nachname, patient.titel,
> patient.geburtsdatum, patient.hvvsnummer, patient.geschlecht,
> patient.adresse_id, patient.beruf, patient.kommentar, patient.cave,
> patient.zusatzversicherung, patient.deleted FROM patient WHERE
> ((((patient.deleted = 'f') AND (patient.nachname LIKE 'K%')) AND
> (patient.vorname LIKE '%')) AND (patient.hvvsnummer LIKE '%'))
>
> This results in a SeqScan von patient. Even more curious is that simpler
> queries like
>
> select * from patient where deleted='f'; OR:
> select * from patient where nachname LIKE 'K%';
>
> all result in SeqScan on patient.
>
> I've "analyzed" and "reindex" the table already multiple times, and
> still PostgreSQL insists upon not using any index.
>
> TIA for any pointers,
>
> Andreas
>
> mpp2=# \d patient
>                  Table "patient"
>        Column       |     Type     |  Modifiers
> --------------------+--------------+-------------
>  id                 | integer      | not null
>  vorname            | text         | not null
>  nachname           | text         | not null
>  titel              | text         |
>  geburtsdatum       | date         |
>  hvvsnummer         | text         |
>  geschlecht         | character(1) |
>  adresse_id         | integer      |
>  beruf              | text         |
>  kommentar          | text         |
>  cave               | text         |
>  zusatzversicherung | text         |
>  deleted            | boolean      | default 'f'
> Indexes: patient_deleted,
>          patient_hvvsnummer,
>          patient_nachname,
>          patient_vorname
> Primary key: patient_pkey
> Check constraints: "patient_geschlecht" (((geschlecht = 'm'::bpchar) OR
> (geschlecht = 'w'::bpchar)) OR (geschlecht = '?'::bpchar))
> Triggers: RI_ConstraintTrigger_352787,
>           RI_ConstraintTrigger_352789,
>           RI_ConstraintTrigger_352801,
>           RI_ConstraintTrigger_352803,
>           RI_ConstraintTrigger_352815
>
> mpp2=# select count(*) from patient;
>  count
> -------
>   6406
> (1 row)
>
> mpp2=# explain SELECT * FROM patient WHERE (patient.nachname LIKE 'K%');
> NOTICE:  QUERY PLAN:
>
> Seq Scan on patient  (cost=0.00..173.07 rows=272 width=70)
>
> EXPLAIN
> mpp2=# explain SELECT * FROM patient WHERE NOT deleted;
> NOTICE:  QUERY PLAN:
>
> Seq Scan on patient  (cost=0.00..157.06 rows=6406 width=70)
>
> EXPLAIN
> mpp2=# explain SELECT * FROM patient WHERE deleted='f';
> NOTICE:  QUERY PLAN:
>
> Seq Scan on patient  (cost=0.00..173.07 rows=6406 width=70)
>
> EXPLAIN
>
>
>


pgsql-performance by date:

Previous
From: Andreas Kostyrka
Date:
Subject: Index not used, performance problem
Next
From: Andrew Sullivan
Date:
Subject: Re: Index not used, performance problem