
From: Tomas Vondra
Subject: Re: WIP: multivariate statistics / proof of concept
Msg-id: 6a0b84cf-24e3-bf28-c4ac-b3ef33c3b55f@2ndquadrant.com
In response to: Re: WIP: multivariate statistics / proof of concept (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers
On 11/21/2016 11:10 PM, Robert Haas wrote:
> [ reviving an old multivariate statistics thread ]
>
> On Thu, Nov 13, 2014 at 6:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> On 12 October 2014 23:00, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>>> It however seems to be working sufficiently well at this point, enough
>>> to get some useful feedback. So here we go.
>>
>> This looks interesting and useful.
>>
>> What I'd like to check before a detailed review is that this has
>> sufficient applicability to be useful.
>>
>> My understanding is that Q9 and Q18 of TPC-H have poor plans as a
>> result of multi-column stats errors.
>>
>> Could you look at those queries and confirm that this patch can
>> produce better plans for them?
>
> Tomas, did you ever do any testing in this area?  One of my
> colleagues, Rafia Sabih, recently did some testing of TPC-H queries @
> 20 GB.  Q18 actually doesn't complete at all right now because of an
> issue with the new simplehash implementation.  I reported it to Andres
> and he tracked it down, but hasn't posted the patch yet - see
> http://archives.postgresql.org/message-id/20161115192802.jfbec5s6ougxwicp@alap3.anarazel.de
>
> Of the remaining queries, the slowest are Q9 and Q20, and both of them
> have serious estimation errors.  On Q9, things go wrong here:
>
>                                  ->  Merge Join
> (cost=5225092.04..6595105.57 rows=154 width=47) (actual
> time=103592.821..149335.010 rows=6503988 loops=1)
>                                        Merge Cond:
> (partsupp.ps_partkey = lineitem.l_partkey)
>                                        Join Filter:
> (lineitem.l_suppkey = partsupp.ps_suppkey)
>                                        Rows Removed by Join Filter: 19511964
>                                        ->  Index Scan using
> [snip]
>
> Rows Removed by Filter: 756627
>
> The estimate for the index scan on partsupp is essentially perfect,
> and the lineitem-part join is off by about 3x.  However, the merge
> join is off by about 4000x, which is real bad.
>

At this point the patch only deals with statistics on base relations,
not joins. It's meant to be extended in that direction, and the syntax
allows for that, but for now there are no join statistics at all.
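
Just to illustrate what the patch does cover: the statistics are
defined on columns of a single table, roughly like this (a rough
sketch, not necessarily the exact syntax in the current patch; the
statistics name and column list are made up):

    -- multivariate statistics on correlated columns of ONE table
    CREATE STATISTICS lineitem_stats ON l_partkey, l_suppkey FROM lineitem;

Anything crossing two tables, like the partsupp/lineitem join above,
is out of scope for now.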

That being said, this particular estimate should already be improved
in 9.6, provided you create a foreign key between the two tables. In
fact, that patch was motivated exactly by Q9.
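
That is, with a foreign key in place, something like this (a sketch;
the constraint name is made up, columns as in the usual TPC-H schema):

    ALTER TABLE lineitem
      ADD CONSTRAINT lineitem_partsupp_fkey
      FOREIGN KEY (l_partkey, l_suppkey)
      REFERENCES partsupp (ps_partkey, ps_suppkey);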

This is how the join estimate looks on scale 1 without the FK between 
the two tables:
                               QUERY PLAN
-------------------------------------------------------------------------
 Merge Join  (cost=19.19..700980.12 rows=2404 width=261)
   Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND
                (lineitem.l_suppkey = partsupp.ps_suppkey))
   ->  Index Scan using idx_lineitem_part_supp on lineitem
                (cost=0.43..605856.84 rows=6001117 width=117)
   ->  Index Scan using partsupp_pkey on partsupp
                (cost=0.42..61141.76 rows=800000 width=144)
(4 rows)


and with the foreign key:
                               QUERY PLAN
-------------------------------------------------------------------------
 Merge Join  (cost=19.19..700980.12 rows=6001117 width=261)
             (actual rows=6001215 loops=1)
   Merge Cond: ((lineitem.l_partkey = partsupp.ps_partkey) AND
                (lineitem.l_suppkey = partsupp.ps_suppkey))
   ->  Index Scan using idx_lineitem_part_supp on lineitem
                (cost=0.43..605856.84 rows=6001117 width=117)
                (actual rows=6001215 loops=1)
   ->  Index Scan using partsupp_pkey on partsupp
                (cost=0.42..61141.76 rows=800000 width=144)
                (actual rows=6001672 loops=1)
 Planning time: 3.840 ms
 Execution time: 21987.913 ms
(6 rows)
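
(Both plans are just the lineitem/partsupp join at the heart of Q9,
i.e. roughly a query like this - a simplified sketch, not the full Q9;
the exact target list does not matter much for the estimates:

    EXPLAIN (ANALYZE)
    SELECT partsupp.ps_suppkey
      FROM lineitem
      JOIN partsupp ON (lineitem.l_partkey = partsupp.ps_partkey
                    AND lineitem.l_suppkey = partsupp.ps_suppkey);
)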


> On Q20, things go wrong here:
> [snip]
>
> The estimate for the GroupAggregate feeding one side of the merge join
> is quite accurate.  The estimate for the part-partsupp join on the
> other side is off by 8x.  Then things get much worse: the estimate for
> the merge join is off by 400x.
>

Well, most of the estimation error comes from the join, but sadly the
aggregate in between makes it impossible to use the foreign keys, at
least in the current version. I'm not sure whether that can be
improved.
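
Roughly, the problematic shape is a join against a grouped subquery,
something like this (a simplified sketch of the structure you describe,
assuming the Q20 subquery gets rewritten into a join; not the actual
TPC-H query text):

    SELECT partsupp.ps_suppkey
      FROM partsupp
      JOIN (SELECT l_partkey, l_suppkey,
                   0.5 * sum(l_quantity) AS half_qty
              FROM lineitem
             GROUP BY l_partkey, l_suppkey) agg
        ON (agg.l_partkey = partsupp.ps_partkey
        AND agg.l_suppkey = partsupp.ps_suppkey)
     WHERE partsupp.ps_availqty > agg.half_qty;

The FK between lineitem and partsupp is still there, but the lineitem
side enters the join only through the GroupAggregate, so the FK-based
estimation cannot be applied.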

> I'm not really sure whether the multivariate statistics stuff will fix
> this kind of case or not, but if it did it would be awesome.
>

Join statistics are something I'd like to add eventually, but I don't 
see how it could happen in the first version. Also, the patch received 
no reviews this CF, and making it even larger is unlikely to make it 
more attractive.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


