Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Msg-id 20170406.194224.249381919.horiguchi.kyotaro@lab.ntt.co.jp
List pgsql-hackers
At Thu, 6 Apr 2017 21:55:43 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f95tOuSEMfmYWBPj-fGw=SY0MYDbQh5BiRiTtonMpws7Q@mail.gmail.com>
> On 6 April 2017 at 19:50, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g@mail.gmail.com>
> >> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
> >> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com>
> >> >> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
> >> I'm not all that sure why the number of columns in the relation has
> >> relevance to the performance of find_relation_from_clauses(). The
> >> bms_get_singleton_member() is checking which relations are part of the
> >> RestrictInfo, nothing related to columns in relations.
> >> Perhaps you meant clauses in the clauses list? Which does not really
> >> have all that much to do with the number of columns in the relation
> >> either.
> >
> > Sorry, I meant the number of relations, not columns. I'm not sure
> > how many relations we should practically consider, but in any case
> > it is an extra burden on every call to clauselist_selectivity. We
> > should avoid calling find_relation_from_clauses as far as
> > possible, or do the same thing in a simpler way. Although I'm not
> > sure whether a more precise exclusion is possible, I think that the
> > case of jointype != JOIN_INNER can be excluded.
> 
> Well, I imagine queries with >= 32 relations are not planning very
> quickly as of today already. I understand what you mean when you speak
> of attributes, as we could constantly be looking for the 1400th
> attribute, which is many loops into a bms_get_singleton_member() call.
> I can't imagine we'll even flow over the first word in a bitmap set in
> 99% of cases with clause_relids.  In any case, even if there's a giant
> chain of clauses in the 'clauses' list, we'll bail out on the
> first join qual anyway, since it won't be a singleton clause_relid.

Yes, I agree that most cases don't suffer from this. Anyway, since I
have neither enough knowledge to roughly estimate the impact nor a
concrete example where the planning time increases significantly,
I won't press this point any further.
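
For reference, the early exit described in the quoted text can be
sketched as follows. This is only a minimal standalone model, not the
PostgreSQL source: RelidSet, relids_add(), relids_get_singleton() and
find_single_relid() are hypothetical names standing in for Bitmapset,
bms_get_singleton_member() and find_relation_from_clauses(), and the
32-bit word size is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WORD_BITS 32            /* assumed bitmap word width */
#define MAX_WORDS 4             /* enough for 128 relids in this sketch */

/* Hypothetical stand-in for a Bitmapset holding a clause's relids. */
typedef struct
{
    int      nwords;
    uint32_t words[MAX_WORDS];
} RelidSet;

static void
relids_add(RelidSet *s, int relid)
{
    int w = relid / WORD_BITS;

    assert(relid >= 0 && relid < WORD_BITS * MAX_WORDS);
    if (w >= s->nwords)
        s->nwords = w + 1;
    s->words[w] |= (uint32_t) 1 << (relid % WORD_BITS);
}

/* Returns true and sets *member if the set has exactly one member. */
static bool
relids_get_singleton(const RelidSet *s, int *member)
{
    int result = -1;

    for (int w = 0; w < s->nwords; w++)
    {
        uint32_t word = s->words[w];
        int      bit = 0;

        if (word == 0)
            continue;
        /* a second non-empty word, or more than one bit set: not a singleton */
        if (result >= 0 || (word & (word - 1)) != 0)
            return false;
        while ((word & 1) == 0)
        {
            word >>= 1;
            bit++;
        }
        result = w * WORD_BITS + bit;
    }
    if (result < 0)
        return false;           /* empty set */
    *member = result;
    return true;
}

/*
 * Returns the single relid shared by all clauses, or -1.  The loop stops
 * at the first clause that references more than one relation (a join
 * qual) or a different relation from the previous clauses.
 */
static int
find_single_relid(const RelidSet *clauses, int nclauses)
{
    int lastrelid = -1;

    for (int i = 0; i < nclauses; i++)
    {
        int relid;

        if (!relids_get_singleton(&clauses[i], &relid))
            return -1;          /* multiple (or no) relations in clause */
        if (lastrelid >= 0 && relid != lastrelid)
            return -1;          /* relation not the same */
        lastrelid = relid;
    }
    return lastrelid;
}
```

In this model a clause list with fewer than 32 relids never looks
past the first word, and a join qual anywhere in the list ends the
scan at that clause.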

> I'd say if you can come up with a test case where you can measure the
> impact of this, then let's discuss more. Otherwise we're stepping back
> into the territory that Tom warned me about a few emails up....
> Premature Optimisation. I'm not walking down there again, I only just
> got back.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



