Re: Push down more full joins in postgres_fdw - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Push down more full joins in postgres_fdw
Date
Msg-id CAFjFpRcsm1j8YdKT3PC43Z_F5QufYC4M_mnh_ghy-0iAzh+S9Q@mail.gmail.com
In response to Re: Push down more full joins in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Push down more full joins in postgres_fdw  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
>
>>>> I guess, the arrays need to be
>>>> computed only once for any relation when the query for that relation
>>>> is deparsed the first time.
>
>
>>> Does this algorithm extend to the case where we consider paths for every
>>> join order?
>
>
>> Yes, if we store the information about which of relations need
>> subquery and which don't for every join order.
>
>
> Hmm.  Sorry, I'm not so excited about this proposal.  I think (1) that is
> solving a problem that hasn't been proven to be a problem, (2) that would
> complicate the deparser logic, and (3) the cost of creating this array for
> each relation by the bottom-up method while deparsing a remote query would
> be not small (especially when the query is large), so that might need more
> cycles for deparsing the query than what I proposed when
> use_remote_estimate=false.  So, I'd like to go with what I proposed, at
> least as the first cut.

The arrays will need to be computed only when at least one relation
has to be deparsed as a subquery. Every relation above the relation
which is converted into a subquery requires the array. Each array will
be N * size of pointer long, where N is the largest relid covered by
that relation. N will vary across the RelOptInfo hierarchy. All that
the array holds is targetlist pointers, which are required to be
computed anyway. So, the amount of memory is bounded by N * size of
pointer * (2N - 1), which is far less than what we use in other
places.
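
As a rough illustration of that bound, here is a minimal sketch in
Python. It assumes 8-byte pointers and reads (2N - 1) as the number of
relations (N base relations plus N - 1 joins) in one binary join tree;
both are assumptions made for the example, and the function name is
hypothetical.

```python
POINTER_SIZE = 8  # assumption: 64-bit build, 8-byte pointers


def array_memory_upper_bound(n, pointer_size=POINTER_SIZE):
    """Return the bound N * pointer size * (2N - 1), in bytes.

    n is the largest relid, which also caps the length of each array.
    (2n - 1) is taken here as the number of relations that could each
    carry such an array: n base relations plus n - 1 join relations.
    """
    per_array_bytes = n * pointer_size
    max_relations = 2 * n - 1
    return per_array_bytes * max_relations


if __name__ == "__main__":
    for n in (10, 100):
        print(n, array_memory_upper_bound(n))
```

Even for a 100-relation join this stays in the range of a couple of
hundred kilobytes under these assumptions.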

Your patch calls isSubqueryExpr() recursively for every Var in the
targetlist, and there can be many of those for foreign tables with many
columns. For every such Var it may need to reach up to the relation
which is converted into a subquery, which can be as bad as reaching
every base relation. So, it looks like the number of recursive calls to
isSubqueryExpr() is bounded by V * N, where V is the number of Vars and
N is the worst-case depth of the RelOptInfo tree, which can be quite
costly. If use_remote_estimate is true, we do this for every
intermediate relation and for every path created. That isn't very
efficient.
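
To make the V * N estimate concrete, here is a toy model in Python. It
is not the patch's isSubqueryExpr(); the tree representation and all
function names are hypothetical. It assumes a left-deep join tree and
counts one call per relation visited while walking from the top join
down to the base relation that owns a Var.

```python
def left_deep_tree(n):
    """Build a left-deep join tree over base relids 1..n as nested tuples."""
    tree = 1
    for relid in range(2, n + 1):
        tree = (tree, relid)
    return tree


def relids(rel):
    """Return the set of base relids covered by a relation."""
    if isinstance(rel, int):
        return {rel}
    return relids(rel[0]) | relids(rel[1])


def lookup_calls(rel, var_relid):
    """Count the recursive calls needed to reach the Var's base relation."""
    if isinstance(rel, int):
        return 1
    outer, inner = rel
    child = outer if var_relid in relids(outer) else inner
    return 1 + lookup_calls(child, var_relid)


def total_calls(rel, var_relids):
    """Sum the lookup cost over every Var in a targetlist."""
    return sum(lookup_calls(rel, relid) for relid in var_relids)


if __name__ == "__main__":
    tree = left_deep_tree(4)           # N = 4
    worst_case_vars = [1] * 5          # V = 5 Vars, all on the deepest rel
    print(total_calls(tree, worst_case_vars))
```

In this model the worst case is a targetlist whose Vars all belong to
the deepest base relation: each lookup costs N calls, giving V * N in
total for a single deparse.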

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company