Home > mailing lists

Re: An "obvious" index not being used - Mailing list pgsql-performance

From	Daniele Varrazzo
Subject	Re: An "obvious" index not being used
Date	July 19, 2008 14:22:23
Msg-id	488222A7.1000202@develer.com Whole thread Raw
In response to	Re: An "obvious" index not being used (Francisco Reyes <lists@stringsutils.com>)
Responses	Re: An "obvious" index not being used
List	pgsql-performance

Tree view

Francisco Reyes writes:
> Daniele Varrazzo writes:
>
>> I suspect the foo.account_id statistical data are not used at all in
>> query: the query planner can only estimate the number of accounts to
>> look for, not
>
> You mentioned you bumped your default_statistics_target.
> What did you increase it to?
> My data sets are so "strange" that anything less than 350 gives many bad
> plans.

Not default_statistics_target: I used "ALTER TABLE SET STATISTICS" to change
the stats only for the tables I was interested in, arriving up to 1000. I
think the result is the same, but it was a red herring anyway: these stats
couldn't be used at all in my query.

In my problem I had 2 tables: a small one (accounts), a large one (foo). The
way the query is written doesn't allow the stats from the large table to be
used at all, unless the records from the small table are fetched. This is
independent from the stats accuracy.

What the planner does is to assume an even distribution in the data in the
joined fields. The assumption is probably better than not having anything, but
in my data set (where there were a bunch of accounts with many foo each,but
many accounts with too little foo) this proved false.

The stats can be used only if at planning time the planner knows what values
to look for in the field: this is the reason for which, if the query is split
in two parts, performances become acceptable. In this case we may fall in your
situation: a data set may be "strange" and thus require an increase in the
stats resolution. I can't remember if the default 10 was too low, but 100 was
definitely enough for me.

It would be nice if the planner could perform the "split query" optimization
automatically, i.e. fetch records from small tables to plan the action on
larger tables. But I suspect this doesn't fit at all in the current PostgreSQL
query pipeline... or does it?

--
Daniele Varrazzo - Develer S.r.l.
http://www.develer.com

pgsql-performance by date:

From: "Luke Lonergan"
Date: 19 July 2008, 10:59:27
Subject: Re: Performance on Sun Fire X4150 x64 (dd, bonnie++, pgbench)

From: Tom Lane
Date: 20 July 2008, 01:51:51
Subject: Re: An "obvious" index not being used

Re: An "obvious" index not being used - Mailing list pgsql-performance

Previous

Next