Thread: Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From
David Rowley
Date:
On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
> I tested with the attached, and it does not seem to hurt planner
> performance executing:

Here it is again, this time with a comment on the
find_relation_from_clauses() function.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From
Kyotaro HORIGUCHI
Date:
At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com>
> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
> > I tested with the attached, and it does not seem to hurt planner
> > performance executing:
> 
> Here it is again, this time with a comment on the
> find_relation_from_clauses() function.

It seems to work the same as the previous version, with the
additional cost of scanning over the restriction clauses. But the
separate loop over the clauses is additional overhead in every
case, even in cases irrelevant to functional dependencies.  The
more columns there are in the relation, the longer
bms_get_singleton_member takes. Although I'm not sure how much it
hurts performance, and I can't think of a good alternative right
now, I think that the overhead should be avoided anyhow.

At Thu, 6 Apr 2017 13:05:24 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
<CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA@mail.gmail.com>
> > And you measured the overhead of doing it the other way to be ... ?
> > Premature optimization and all that.
> 
> I tested with the attached, and it does not seem to hurt planner
> performance executing:

Here, bms_singleton_member takes longer if the relation has
many columns and there's a functional dependency covering the
columns at the very tail. Maybe a test with only two columns is
not practical.

Even if it doesn't impact performance detectably, if only one
attribute is needed, an AttrNumber member in the context will be
sufficient. No bitmap operation seems to be required in
dependency_compatible_walker, and it can bail out at the second
attribute.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats

From
David Rowley
Date:
On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
> <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com>
>> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> > I tested with the attached, and it does not seem to hurt planner
>> > performance executing:
>>
>> Here it is again, this time with a comment on the
>> find_relation_from_clauses() function.
>
> It seems to work the same as the previous version, with the
> additional cost of scanning over the restriction clauses. But the
> separate loop over the clauses is additional overhead in every
> case, even in cases irrelevant to functional dependencies.  The
> more columns there are in the relation, the longer
> bms_get_singleton_member takes. Although I'm not sure how much it
> hurts performance, and I can't think of a good alternative right
> now, I think that the overhead should be avoided anyhow.

I'm not all that sure why the number of columns in the relation has
relevance to the performance of find_relation_from_clauses(). The
bms_get_singleton_member() is checking which relations are part of the
RestrictInfo, nothing related to columns in relations.

Perhaps you meant clauses in the clauses list? Which does not really
have all that much to do with the number of columns in the relation
either.
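
To illustrate the point above, here is a minimal, self-contained sketch
of a singleton-member check over a set of relation indexes. It is not
the code from the attached patch or from PostgreSQL itself: ToyRelids,
ToyClause, toy_get_singleton_member() and toy_find_single_relation() are
hypothetical names, and the set is simplified to a single 64-bit word
(PostgreSQL's Bitmapset is a multi-word structure). What it is meant to
show is that the work depends on the number of clauses and on the size
of each clause's relation set, and that column counts never enter into
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a relid set: bit i set means that
 * range-table index i is referenced. */
typedef uint64_t ToyRelids;

/* Hypothetical stand-in for RestrictInfo, keeping only the set of
 * relations the clause references (cf. clause_relids). */
typedef struct ToyClause
{
    ToyRelids   clause_relids;
} ToyClause;

/* If 'set' has exactly one member, store its index in *member and
 * return true; otherwise return false and leave *member untouched. */
static bool
toy_get_singleton_member(ToyRelids set, int *member)
{
    int         idx = 0;

    /* Empty set, or more than one bit set: not a singleton. */
    if (set == 0 || (set & (set - 1)) != 0)
        return false;

    while ((set & 1) == 0)
    {
        set >>= 1;
        idx++;
    }
    *member = idx;
    return true;
}

/* Return the one relation index referenced by every clause, or -1 if
 * any clause references zero or several relations, or if the clauses
 * disagree. One pass over the clauses; no per-column work. */
static int
toy_find_single_relation(const ToyClause *clauses, size_t nclauses)
{
    int         found = -1;

    assert(clauses != NULL || nclauses == 0);

    for (size_t i = 0; i < nclauses; i++)
    {
        int         relid;

        if (!toy_get_singleton_member(clauses[i].clause_relids, &relid))
            return -1;
        if (found == -1)
            found = relid;
        else if (found != relid)
            return -1;
    }
    return found;
}
```

In this sketch a relation with two columns and one with two hundred
columns cost the same, because only the relation indexes referenced by
each clause are inspected.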

>
> At Thu, 6 Apr 2017 13:05:24 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
> <CAKJS1f_gB=gyZn8wMw0v8uCKD1nYeWyNYCXKz=+Oo0yR4RRyiA@mail.gmail.com>
>> > And you measured the overhead of doing it the other way to be ... ?
>> > Premature optimization and all that.
>>
>> I tested with the attached, and it does not seem to hurt planner
>> performance executing:
>
> Here, bms_singleton_member takes longer if the relation has
> many columns and there's a functional dependency covering the
> columns at the very tail. Maybe a test with only two columns is
> not practical.

Can you explain why you think this? And can you confirm you're
speaking about the bms_get_singleton_member() call in
find_relation_from_clauses()?

> Even if it doesn't impact performance detectably, if only one
> attribute is needed, an AttrNumber member in the context will be
> sufficient. No bitmap operation seems to be required in
> dependency_compatible_walker, and it can bail out at the second
> attribute.

Are you looking at an old patch? That function no longer exists.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services