Re: Using multiple extended statistics for estimates - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Using multiple extended statistics for estimates
Date
Msg-id 20191110173428.w5klequcmb6gawaa@development
Whole thread Raw
In response to Re: Using multiple extended statistics for estimates  (Mark Dilger <hornschnorter@gmail.com>)
Responses Re: Using multiple extended statistics for estimates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Sat, Nov 09, 2019 at 02:32:27PM -0800, Mark Dilger wrote:
>
>
>On 11/9/19 12:33 PM, Mark Dilger wrote:
>>
>>
>>On 11/6/19 11:58 AM, Tomas Vondra wrote:
>>>On Wed, Nov 06, 2019 at 08:54:40PM +0100, Tomas Vondra wrote:
>>>>On Mon, Oct 28, 2019 at 04:20:48PM +0100, Tomas Vondra wrote:
>>>>>Hi,
>>>>>
>>>>>PostgreSQL 10 introduced extended statistics, allowing us to consider
>>>>>correlation between columns to improve estimates, and PostgreSQL 12
>>>>>added support for MCV statistics. But we still had the limitation that
>>>>>we only allowed using a single extended statistics per relation, i.e.
>>>>>given a table with two extended stats
>>>>>
>>>>> CREATE TABLE t (a int, b int, c int, d int);
>>>>> CREATE STATISTICS s1 (mcv) ON a, b FROM t;
>>>>> CREATE STATISTICS s2 (mcv) ON c, d FROM t;
>>>>>
>>>>>and a query
>>>>>
>>>>> SELECT * FROM t WHERE a = 1 AND b = 1 AND c = 1 AND d = 1;
>>>>>
>>>>>we only ever used one of the statistics (and we considered 
>>>>>them in a not
>>>>>particularly well determined order).
>>>>>
>>>>>This patch addresses this by using as many extended stats as possible,
>>>>>by adding a loop to statext_mcv_clauselist_selectivity(). In each step
>>>>>we pick the "best" applicable statistics (in the sense of covering the
>>>>>most attributes) and factor it into the oveall estimate.
>>
>>Tomas,
>>
>>Your patch compiles and passes the regression tests for me on debian 
>>linux under master.
>>
>>Since your patch does not include modified regression tests, I wrote 
>>a test that I expected to improve under this new code, but running 
>>it both before and after applying your patch, there is no change.
>
>Ok, the attached test passes before applying your patch and fails 
>afterward owing to the estimates improving and no longer matching the 
>expected output.  To be clear, this confirms your patch working as 
>expected.
>
>I haven't seen any crashes in several hours of running different 
>tests, so I think it looks good.
>

Yep, thanks for adding the tests. I'll include them into the patch.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Using multiple extended statistics for estimates
Next
From: Magnus Hagander
Date:
Subject: Re: ssl passphrase callback