Re: [COMMITTERS] pgsql: Collect and use multi-columndependency stats - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [COMMITTERS] pgsql: Collect and use multi-columndependency stats
Date
Msg-id 20170406.165056.135052192.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
Responses Re: [COMMITTERS] pgsql: Collect and use multi-columndependency stats  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g@mail.gmail.com>
> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com>
> >> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I'm not all that sure why the number of columns in the relation has
> relevance to the performance of find_relation_from_clauses(). The
> bms_get_singleton_member() is checking which relations are part of the
> RestrictInfo, nothing related to columns in relations.
> Perhaps you meant clauses in the clauses list? Which does not really
> have all that much to do with the number of columns in the relation
> either.

Sorry, it's number of relations, not columns. I'm not sure up to
how many relations we practically should consider but anyway it
is extra burden to every call to clauselist_selectivity. We
should avoid calling find_relation_from_clauses as far as
possible or do the same in more simple way. However I'm not sure
more precise exclusion is possible or not, I thinks that the case
of jointype != JOIN_INNER can be exluded.

> > At Thu, 6 Apr 2017 13:05:24 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA@mail.gmail.com>
> >> > And you measured the overhead of doing it the other way to be ... ?
> >> > Premature optimization and all that.
> >>
> >> I tested with the attached, and it does not seem to hurt planner
> >> performance executing:
> >
> > Here, bms_singleton_member takes longer time if the relation has
> > many columns and there's a functional dependency covering the
> > columns at the very tail. Maybe only two are not practical for
> > testing.
> 
> Can you explain why you think this? And confirm you're speaking about
> the bms_get_singleton() member in find_relation_from_clauses()

I mentioned dependency_is_compatible_clause here, but I saw that
it has been simplified enough in the committed shape.

> > Even if it doesn't impact performance detectably, if only one
> > attribute is needed, an AttrNumber member in context will be
> > sufficient. No bitmap operation seems required in
> > dependency_compatible_walker and it can bail out by the second
> > attribute.
> 
> Are you looking at an old patch? That function no longer exists.

Yes! Sorry for the noise.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Faster methods for getting SPI results (460% improvement)
Next
From: Emre Hasegeli
Date:
Subject: Re: BRIN cost estimate