Thread: Re: extended statistics n-distinct on multiple columns not used when join two tables

(moving to -hackers)

On Tue, 13 Jun 2023 at 21:30, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> On Tue, 13 Jun 2023 at 11:21, James Pang (chaolpan) <chaolpan@cisco.com> wrote:
>>      When joining two tables with equality conditions on multiple columns, the row estimate always uses a selectivity that is the product of the individual columns' distinct-value selectivities. Is it possible to use extended n-distinct statistics on multiple columns?
>>
>>     PG v14.8-1; please check the attached test case for details.
>
> There isn't any support for multi-table statistics.

I think it's probably worth adjusting the docs to mention this. It
seems like it might be something that could surprise someone.

Something like the attached, maybe?

David


On Tue, 13 Jun 2023 at 13:26, David Rowley <dgrowleyml@gmail.com> wrote:
> I think it's probably worth adjusting the docs to mention this. It
> seems like it might be something that could surprise someone.
>
> Something like the attached, maybe?

+1

Pavel

RE: extended statistics n-distinct on multiple columns not used when join two tables

From: "James Pang (chaolpan)"
Thanks for the information. Yes, for an equality join on multiple correlated columns, it looks like extended statistics could significantly improve the row estimates. Hopefully this will be supported in a future version.

James
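
To make the arithmetic behind the misestimate concrete, here is a minimal Python sketch. It assumes the simplified rule that one equality join clause has selectivity 1 / max(ndistinct on either side), ignoring NULL fractions and most-common-value lists, and that the selectivities of several clauses are multiplied together. The function names and the numbers are made up for illustration; the "group ndistinct" variant is hypothetical and only shows what a multi-column n-distinct estimate could look like, not something the planner does.

```python
def eqjoin_selectivity(ndistinct_left, ndistinct_right):
    """Simplified selectivity of one equality join clause: 1 / max(ndistinct).

    Assumption: NULL fractions and most-common-value lists are ignored.
    """
    return 1.0 / max(ndistinct_left, ndistinct_right)


def estimate_independent(rows_left, rows_right, per_column_ndistinct):
    """Multiply per-column selectivities, as if the join columns were independent."""
    selectivity = 1.0
    for nd_left, nd_right in per_column_ndistinct:
        selectivity *= eqjoin_selectivity(nd_left, nd_right)
    return rows_left * rows_right * selectivity


def estimate_with_group_ndistinct(rows_left, rows_right, nd_group_left, nd_group_right):
    """Hypothetical: use a single multi-column ndistinct for the whole column group."""
    return rows_left * rows_right * eqjoin_selectivity(nd_group_left, nd_group_right)


# Made-up numbers: two 10,000-row tables joined on (a, b), where b is fully
# determined by a. Each table has 100 distinct a values, 100 distinct b values,
# and only 100 distinct (a, b) pairs.
rows = 10_000
independent = estimate_independent(rows, rows, [(100, 100), (100, 100)])
grouped = estimate_with_group_ndistinct(rows, rows, 100, 100)
print(f"independent columns: {independent:.0f} rows")
print(f"group ndistinct:     {grouped:.0f} rows")
```

With these made-up inputs the independent-columns estimate is 100 times smaller than the group-based one, which is the kind of underestimate the original question describes for correlated join columns.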

 

From: Pavel Stehule <pavel.stehule@gmail.com>
Sent: Tuesday, June 13, 2023 7:29 PM
To: David Rowley <dgrowleyml@gmail.com>
Cc: PostgreSQL Developers <pgsql-hackers@lists.postgresql.org>; James Pang (chaolpan) <chaolpan@cisco.com>
Subject: Re: extended statistics n-distinct on multiple columns not used when join two tables

On Tue, 13 Jun 2023 at 13:26, David Rowley <dgrowleyml@gmail.com> wrote:
> I think it's probably worth adjusting the docs to mention this. It
> seems like it might be something that could surprise someone.
>
> Something like the attached, maybe?

+1

Pavel

On Tue, 13 Jun 2023 at 23:29, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> I think it's probably worth adjusting the docs to mention this. It
>> seems like it might be something that could surprise someone.
>>
>> Something like the attached, maybe?
>
> +1

Ok, I pushed that patch.  Thanks.

David