Re: Index usage with slow query - Mailing list pgsql-general

From Marc Mamin
Subject Re: Index usage with slow query
Date
Msg-id B6F6FD62F2624C4C9916AC0175D56D8828AC2D79@jenmbs01.ad.intershop.net
Whole thread Raw
In response to Re: Index usage with slow query  (David G Johnston <david.g.johnston@gmail.com>)
List pgsql-general


>The presence of absence of the length limiter on a varchar will not impact
>the query plan.  And I'm pretty sure you cannot even store a too long
>varchar in an index.  It will error on the attempt (as opposed to
>truncating).

The max size is almost one block. After that you get an error:
FEHLER:  Indexzeile benötigt 9184 Bytes, Maximalgröße ist 8191

>Looking for some advice regarding a slow query I have and indexing.
>
>I'm using postgresql 9.1 and this is my table that has around 6800000 rows:

It's worth a try to compare the planner choice in Postgres 9.3 ...

>
>CREATE TABLE mytable
>(
>  class character varying,
>  floor character varying,
>  source_id integer,
>  the_geom geometry

You could slightly improve the table definition while placing the integer column in front of the varchar.

More interesting would be to move the varchar in separate reference column and only have integers except for the
geometrytype. 
This will make the new index smaller and faster. I bet that the planner would take that change in account.


>
>INDEX idx_source_id
>    USING btree
>  (source_id);
>
>INDEX idx_the_geom_gist
>  USING gist
>  (the_geom);
>
>
>This table is constantly hit with the below query (not always the same values in the where). The only difference
betweenqueries are the values in the where clause: 
>
>SELECT the_geom,oid from mytable
>WHERE
>the_geom && ST_GeomFromText('POLYGON((529342.334095833 180696.221733333,529342.334095833 181533.44595,530964.336820833
181533.44595,530964.336820833180696.221733333,529342.334095833 180696.221733333))',find_srid('','mytable','the_geom'))  
>AND
>(floor = 'gf' AND source_id = '689' AND class = 'General')
>
>
>As the table has increased in size, this query has become slower, so I made this index:
>
>
>INDEX idx_floor_sourceid_class
>  USING btree
>  (floor, source_id, class);

if your query always uses these 3 columns, you should put the one with the highest cardinality first.


>
>When I run an EXPLAIN and EXPLAIN ANALYZE the query isn't using this new index.
>
>Sometimes it uses just idx_the_geom_gist
>
>other times it uses idx_the_geom_gist and idx_source_id
>
>I don't understand why it's inconsistent in its use of indexes when the query is always the same structure, just
differentwhere clause values, and I don't understand why it's not using the new index either. 

The planner uses statistics on the different columns content to weight the possible query plans.
And it is good at that :)
You can help him while raising the target statistics on these 3 columns.

More difficult for the planner is to compare the advantage of the GIN index to the other one.
If some columns get toasted then the cost of detoasting seems to often be underestimated.

For the case when one of your 3 first columns has a very low cardinality, you may consider adding some partial indexes.
e.g.:
  create INDEX idx_the_geom_gist_general  USING gist (the_geom) where class ='general';
  create INDEX idx_the_geom_gist_special  USING gist (the_geom) where class ='special';

They can of course only get used when your query contains exactly the same clause.


regards,

Marc Mamin

pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: event triggers in 9.3.4
Next
From: Marc Mamin
Date:
Subject: Re: copy/dump database to text/csv files