
Re: Using multiple extended statistics for estimates - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: Using multiple extended statistics for estimates
Date	November 6, 2019 19:58:49
Msg-id	20191106195849.odhhc66xd23hw6hf@development
In response to	Re: Using multiple extended statistics for estimates (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: Using multiple extended statistics for estimates Re: Using multiple extended statistics for estimates
List	pgsql-hackers


On Wed, Nov 06, 2019 at 08:54:40PM +0100, Tomas Vondra wrote:
>On Mon, Oct 28, 2019 at 04:20:48PM +0100, Tomas Vondra wrote:
>>Hi,
>>
>>PostgreSQL 10 introduced extended statistics, allowing us to consider
>>correlation between columns to improve estimates, and PostgreSQL 12
>>added support for MCV statistics. But we still had the limitation that
>>we only allowed using a single extended statistics per relation, i.e.
>>given a table with two extended stats
>>
>>  CREATE TABLE t (a int, b int, c int, d int);
>>  CREATE STATISTICS s1 (mcv) ON a, b FROM t;
>>  CREATE STATISTICS s2 (mcv) ON c, d FROM t;
>>
>>and a query
>>
>>  SELECT * FROM t WHERE a = 1 AND b = 1 AND c = 1 AND d = 1;
>>
>>we only ever used one of the statistics (and the order in which we
>>considered them was not particularly well defined).
>>
>>This patch addresses this by using as many extended stats as possible,
>>by adding a loop to statext_mcv_clauselist_selectivity(). In each step
>>we pick the "best" applicable statistics (in the sense of covering the
>>most attributes) and factor it into the overall estimate.
>>
>>All this happens where we'd originally consider applying a single MCV
>>list, i.e. before even considering the functional dependencies, so
>>roughly like this:
>>
>>  while (/* another applicable MCV list exists */)
>>  {
>>      ... apply another MCV list ...
>>  }
>>
>>  ... apply functional dependencies ...
>>
>>
>>I've considered doing both in the loop, but I think that'd be wrong - the
>>MCV list is expected to contain more information about individual values
>>(compared to functional deps, which are column-level).
>>
>
>Here is a slightly polished v2 of the patch, the main difference being
>that computing clause_attnums was moved to a separate function.
>

This time with the attachment ;-)
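
For readers without the patch at hand, here is a rough, self-contained sketch of the greedy loop described in the quoted message. It is not the code from the attachment: StatSketch, choose_best() and estimate_with_all_stats() are hypothetical names, columns are modeled as bits in a mask, each statistics object carries a made-up fixed selectivity, and multiplying the per-statistics selectivities is an assumption about how an estimate might be "factored in".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one extended statistics object with an MCV list. */
typedef struct
{
    uint32_t attrs;        /* bitmask of the columns the statistics covers */
    double   selectivity;  /* made-up selectivity it yields for the clauses */
} StatSketch;

/* Count the bits set in x. */
static int
popcount32(uint32_t x)
{
    int n = 0;

    while (x)
    {
        x &= x - 1;     /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Return the index of the statistics covering the most not-yet-estimated
 * clause attributes, or -1 if none covers at least two of them.
 */
static int
choose_best(const StatSketch *stats, int nstats, uint32_t remaining)
{
    int best = -1;
    int best_cover = 1;     /* require strictly more than one attribute */

    for (int i = 0; i < nstats; i++)
    {
        int cover = popcount32(stats[i].attrs & remaining);

        if (cover > best_cover)
        {
            best = i;
            best_cover = cover;
        }
    }
    return best;
}

/*
 * Apply as many statistics as possible, best first. Attributes that no
 * statistics could cover are returned in *unestimated, so a later step
 * (e.g. functional dependencies or per-column estimates) can handle them.
 */
double
estimate_with_all_stats(const StatSketch *stats, int nstats,
                        uint32_t clause_attrs, uint32_t *unestimated)
{
    double   sel = 1.0;
    uint32_t remaining = clause_attrs;

    assert(nstats >= 0);

    for (;;)
    {
        int best = choose_best(stats, nstats, remaining);

        if (best < 0)
            break;                          /* nothing applicable is left */

        sel *= stats[best].selectivity;     /* assumption: simple product */
        remaining &= ~stats[best].attrs;    /* these attributes are done */
    }

    *unestimated = remaining;
    return sel;
}
```

With s1 on (a, b) and s2 on (c, d) from the example, clauses on all four columns would use both objects in this sketch, while clauses on a, b and c would use only s1 and leave c for the later steps. The loop terminates because every iteration removes at least two attributes from the remaining set.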


-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

use-multiple-extended-stats-v2.patch
