
Re: Thoughts on statistics for continuously advancing columns - Mailing list pgsql-hackers

From	Dimitri Fontaine
Subject	Re: Thoughts on statistics for continuously advancing columns
Date	December 31, 2009 14:56:19
Msg-id	m2ljgjht16.fsf@hi-media.com
In response to	Re: Thoughts on statistics for continuously advancing columns (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers


Tom Lane <tgl@sss.pgh.pa.us> writes:
> Actually, in the problematic cases, it's interesting to consider the
> following strategy: when scalarineqsel notices that it's being asked for
> a range estimate that's outside the current histogram bounds, first try
> to obtain the actual current max() or min() of the column value --- this
> is something we can get fairly cheaply if there's a btree index on the
> column.  If we can get it, plug it into the histogram, replacing the
> high or low bin boundary.  Then estimate as we currently do.  This would
> work reasonably well as long as re-analyzes happen at a time scale such
> that the histogram doesn't move much overall, ie, the number of
> insertions between analyzes isn't a lot compared to the number of rows
> per bin.  We'd have some linear-in-the-bin-size estimation error because
> the modified last or first bin actually contains more rows than other
> bins, but it would certainly work a lot better than it does now.

I know very little about statistics in general, but your proposal seems
straightforward enough for me to understand it, and looks good: +1.
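
To make the quoted strategy concrete, here is a rough sketch of how I
read it, in Python rather than C. It is not the scalarineqsel code:
the function names, the equal-rows-per-bin histogram and the
probe_max/probe_min callbacks (standing in for a cheap btree lookup
of the current max() or min()) are all assumptions for illustration.

```python
from bisect import bisect_right
from typing import Callable, List, Optional

# Hypothetical stand-in for a cheap btree probe; returns None if unavailable.
Probe = Callable[[], Optional[float]]


def hist_fraction_below(bounds: List[float], const: float) -> float:
    """Estimated fraction of rows with value < const.

    bounds holds the n+1 boundaries of n bins. Each bin is assumed to
    hold the same number of rows, spread linearly between its bounds.
    """
    nbins = len(bounds) - 1
    if const <= bounds[0]:
        return 0.0
    if const >= bounds[-1]:
        return 1.0
    i = bisect_right(bounds, const) - 1      # bin that contains const
    lo, hi = bounds[i], bounds[i + 1]        # here lo <= const < hi
    return (i + (const - lo) / (hi - lo)) / nbins


def estimate_below(bounds: List[float], const: float,
                   probe_max: Optional[Probe] = None,
                   probe_min: Optional[Probe] = None) -> float:
    """If const lies outside the histogram, replace the high or low
    boundary with the actual max/min, then estimate as usual."""
    adjusted = list(bounds)
    if const > adjusted[-1] and probe_max is not None:
        actual = probe_max()
        if actual is not None and actual > adjusted[-1]:
            adjusted[-1] = actual            # stretch the last bin
    elif const < adjusted[0] and probe_min is not None:
        actual = probe_min()
        if actual is not None and actual < adjusted[0]:
            adjusted[0] = actual             # stretch the first bin
    return hist_fraction_below(adjusted, const)


stale = [0, 10, 20, 30, 40]                  # histogram from the last ANALYZE
without_probe = estimate_below(stale, 45)
with_probe = estimate_below(stale, 45, probe_max=lambda: 50)
print(without_probe, with_probe)             # 1.0 0.9375
```

With the stale histogram alone, "col > 45" is estimated to match no
rows at all; with the last boundary moved to the probed max of 50 it
is estimated at 1 - 0.9375 = 0.0625. The stretched last bin is still
counted as holding one bin's worth of rows, which is where the
linear-in-the-bin-size error mentioned above comes from.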

Regards,
-- 
dim
