Re: Logarithmic data frequency distributions and the query planner - Mailing list pgsql-performance

From Tom Lane
Subject Re: Logarithmic data frequency distributions and the query planner
Msg-id 16024.1278537720@sss.pgh.pa.us
In response to Logarithmic data frequency distributions and the query planner  (Jerry Gamache <jerry.gamache@idilia.com>)
List pgsql-performance
Jerry Gamache <jerry.gamache@idilia.com> writes:
> On 8.1, I have a very interesting database where the distribution of
> some values in a multi-million-row table is logarithmic (i.e. the most
> frequent value is an order of magnitude more frequent than the next
> ones). If I analyze the table, the statistics become extremely skewed
> towards the most frequent values and this prevents the planner from
> giving any good results on queries that do not target these entries.

Highly skewed distributions are hardly unusual, and I'm not aware that
the planner is totally incapable of dealing with them.  You do need a
large enough stats target to get down into the tail of the
distribution (the default target for 8.1 is probably too small for you).
It might be that there have been some other relevant improvements since
8.1, too ...
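
To make the point about the stats target concrete, here is a rough sketch
of why a small target hurts on a skewed column. It is a simplified model,
not the planner's actual code: it assumes the target caps how many
most-common values are tracked exactly, that the statistics are perfect (no
sampling error), and that every untracked value gets an equal share of the
remaining rows. The Zipf-like column and both function names are
hypothetical. For reference, default_statistics_target in 8.1 is 10.

```python
def zipf_frequencies(n_distinct):
    """Hypothetical skewed column: the value at rank r has frequency ~ 1/r."""
    weights = [1.0 / rank for rank in range(1, n_distinct + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def estimated_frequency(freqs, rank, target):
    """Estimate the fraction of rows matching the value at `rank` (1-based).

    Simplified model (an assumption, not PostgreSQL source): the `target`
    most common values are tracked exactly; every other value is assumed
    to hold an equal share of the rows that are left over.
    """
    if rank <= target:
        return freqs[rank - 1]
    tracked = sum(freqs[:target])
    return (1.0 - tracked) / (len(freqs) - target)


# Compare the estimate for the 50th most common value at several targets.
freqs = zipf_frequencies(10_000)
for target in (10, 100, 1000):
    est = estimated_frequency(freqs, rank=50, target=target)
    print(f"target={target:4d}  true={freqs[49]:.6f}  estimated={est:.6f}")
```

Under these assumptions the value at rank 50 is lumped in with the whole
tail when the target is 10, and gets its own entry once the target reaches
50 or more. On a real table the per-column target is raised with
ALTER TABLE ... ALTER COLUMN ... SET STATISTICS, followed by a fresh
ANALYZE.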

            regards, tom lane
