Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats - Mailing list pgsql-hackers

From David Rowley
Subject Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats
Date
Msg-id CAKJS1f95tOuSEMfmYWBPj-fGw=SY0MYDbQh5BiRiTtonMpws7Q@mail.gmail.com
In response to Re: [COMMITTERS] pgsql: Collect and use multi-column dependency stats  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On 6 April 2017 at 19:50, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> At Thu, 6 Apr 2017 18:59:35 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
> <CAKJS1f-yrLizV5N_-r1o4vemuZBTJd8EzwPyx2QG=F6891++=g@mail.gmail.com>
>> On 6 April 2017 at 18:03, Kyotaro HORIGUCHI
>> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> > At Thu, 6 Apr 2017 13:10:48 +1200, David Rowley <david.rowley@2ndquadrant.com> wrote in
>> > <CAKJS1f8Um=BvRmgcb3u6ze1q1xL7A1VKTVF9s2R1_UfRqx8q5w@mail.gmail.com>
>> >> On 6 April 2017 at 13:05, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> I'm not all that sure why the number of columns in the relation has
>> relevance to the performance of find_relation_from_clauses(). The
>> bms_get_singleton_member() is checking which relations are part of the
>> RestrictInfo, nothing related to columns in relations.
>> Perhaps you meant clauses in the clauses list? Which does not really
>> have all that much to do with the number of columns in the relation
>> either.
>
> Sorry, it's number of relations, not columns. I'm not sure up to
> how many relations we practically should consider but anyway it
> is extra burden to every call to clauselist_selectivity. We
> should avoid calling find_relation_from_clauses as far as
> possible or do the same in more simple way. However I'm not sure
> more precise exclusion is possible or not, I think that the case
> of jointype != JOIN_INNER can be excluded.

Well, I imagine queries with >= 32 relations are not planning very
quickly today as it is. I understand what you mean when you speak
of attributes, as there we could constantly be looking for the 1400th
attribute, which is many loops into a bms_get_singleton_member() call.
With clause_relids, I can't imagine we'll even go past the first word
of the bitmapset in 99% of cases.  In any case, even if there's a giant
chain of clauses in the 'clauses' list, we'll bail out on the
first join qual anyway, since its clause_relids won't be a singleton.
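
To make the argument above concrete, here is a minimal C sketch of the
two costs involved. It is not the PostgreSQL source: RelidSet,
relidset_get_singleton() and find_single_relation() are hypothetical
stand-ins for a Bitmapset, bms_get_singleton_member() and
find_relation_from_clauses(), and the 32-bit word size and fixed array
length are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_WORD 32        /* assumed word size for this sketch */
#define MAX_WORDS     4         /* arbitrary limit, illustration only */

/* Hypothetical stand-in for a set of relation ids. */
typedef struct RelidSet
{
    int      nwords;            /* number of words in use */
    uint32_t words[MAX_WORDS];
} RelidSet;

/*
 * If the set has exactly one member, store it in *member and return true.
 * The work is one pass over the words, so a set whose members are all
 * below 32 is decided after looking at a single word.
 */
static bool
relidset_get_singleton(const RelidSet *set, int *member)
{
    int result = -1;

    assert(set != NULL && member != NULL);

    for (int w = 0; w < set->nwords; w++)
    {
        uint32_t word = set->words[w];

        if (word == 0)
            continue;

        /* a member was already seen, or this word has several bits set */
        if (result >= 0 || (word & (word - 1)) != 0)
            return false;

        /* locate the single set bit in this word */
        int bit = 0;
        while ((word & 1) == 0)
        {
            word >>= 1;
            bit++;
        }
        result = w * BITS_PER_WORD + bit;
    }

    if (result < 0)
        return false;           /* empty set */

    *member = result;
    return true;
}

/*
 * Return the one relation id that every clause refers to, or -1.
 * The loop stops at the first clause whose relids are not a singleton
 * (i.e. a join qual), however long the rest of the list is.
 */
static int
find_single_relation(const RelidSet *clause_relids, size_t nclauses)
{
    int found = -1;

    for (size_t i = 0; i < nclauses; i++)
    {
        int relid;

        if (!relidset_get_singleton(&clause_relids[i], &relid))
            return -1;          /* multiple relations in clause: bail out */
        if (found >= 0 && relid != found)
            return -1;          /* clauses refer to different relations */
        found = relid;
    }
    return found;
}
```

With a list of three clauses where the second one mentions two
relations, find_single_relation() returns -1 after inspecting only the
first two entries, which is the early bail-out on the first join qual.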

I'd say if you can come up with a test case where you can measure the
impact of this, then let's discuss more. Otherwise we're stepping back
into the territory that Tom warned me about a few emails up....
Premature Optimisation. I'm not walking down there again, I only just
got back.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


