Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
Date	June 10, 2008 20:28:40
Msg-id	26472.1213140508@sss.pgh.pa.us
In response to	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics (Gregory Stark <stark@enterprisedb.com>)
Responses	Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics Re: Proposal - improve eqsel estimates by including histogram bucket numdistinct statistics
List	pgsql-hackers

Gregory Stark <stark@enterprisedb.com> writes:
> The screw case I've seen is when you have a large partitioned table where
> constraint_exclusion fails to exclude the irrelevant partitions. You're going
> to get 0 rows from all but the one partition which contains the 1 row you're
> looking for. But since each partition is clamped to 1 you end up with an
> estimate of a few hundred rows coming out of the Append node.

> The natural way to kill this is to allow fractional rows for these scans.

No, the right fix is to fix the constraint-exclusion failure.

> Alternatively we could make Append more clever about estimating the number of
> rows it produces. Somehow it could be informed of some holistic view of the
> quals on its children and how they're inter-dependent. If it's told that only
> one of its children will produce rows then it can use max() instead of sum()
> to calculate its rows estimate.

This gets back to the discussions at PGCon about needing to have a more
explicit representation of partitioning.  Right now, for a
many-partition table we spend huge amounts of time deriving the expected
behavior from first principles, each time we make a plan.  And even then
we can only get it right for limited cases (e.g., no parameterized
queries).  If the planner actually understood that a set of tables
formed a partitioned set then it'd be a lot easier and faster to get the
desired behavior, not only with respect to the rowcount estimates but
also the plan's structure.
        regards, tom lane
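
The arithmetic behind the "screw case" quoted above can be sketched as follows. This is only an illustration, not planner code: the function names are hypothetical, the clamp is a simplified stand-in for a "never estimate fewer than one row" rule, and the figure of 300 partitions is an assumed value standing in for "a few hundred".

```python
def clamp_row_est(nrows: float) -> float:
    # Simplified, hypothetical stand-in for a planner-style clamp:
    # never report fewer than one row, otherwise round to a whole number.
    if nrows <= 1.0:
        return 1.0
    return float(round(nrows))


def append_rows_sum(child_rows, clamp=True):
    # Append estimate as the sum of its children's estimates.
    # With clamp=False the children keep fractional (here: zero) estimates.
    return sum(clamp_row_est(r) if clamp else r for r in child_rows)


def append_rows_max(child_rows):
    # Alternative from the quoted proposal: if only one child can produce
    # rows, take the largest child estimate instead of the sum.
    return max(clamp_row_est(r) for r in child_rows)


# Assumed scenario: 300 partitions, only one of which holds the single matching row.
raw_estimates = [0.0] * 299 + [1.0]

print(append_rows_sum(raw_estimates))               # 300.0 (each child clamped to 1)
print(append_rows_sum(raw_estimates, clamp=False))  # 1.0   (fractional rows allowed)
print(append_rows_max(raw_estimates))               # 1.0   (max() instead of sum())
```

Neither alternative is needed if the irrelevant partitions are excluded before estimation, since the Append node then has a single child to begin with.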
