Re: Oddity in EXPLAIN for foreign/custom join pushdown plans - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Oddity in EXPLAIN for foreign/custom join pushdown plans
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80120E5B9@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: Oddity in EXPLAIN for foreign/custom join pushdown plans  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
Responses Re: Oddity in EXPLAIN for foreign/custom join pushdown plans
List pgsql-hackers
> On 2016/07/27 13:51, Kouhei Kaigai wrote:
> > This output is, at least, not incorrect.
> > This ForeignScan actually scan a relation that consists of two joined
> > tables on the remote side. So, not incorrect, but may not convenient for
> > better understanding by users who don't have deep internal knowledge.
>
> Hmm.
>
> > On the other hands, I cannot be happy with your patch because it concludes
> > the role of ForeignScan/CustomScan with scanrelid==0 is always join.
> > However, it does not cover all the case.
> >
> > When postgres_fdw gets support of remote partial aggregate, we can implement
> > the feature using ForeignScan with scanrelid==0. Is it a join? No.
>
> Yeah, I think that that could be implemented in both cases (scanrelid>0
> and scanrelid=0).
>
> > Probably, the core cannot know exact purpose of ForeignScan/CustomScan with
> > scanrelid==0. It can be a remote partial aggregation. It can be an alternative
> > sort logic by GPU. It can be something others.
> > So, I think extension needs to inform the core more proper label to show;
> > including a flag to control whether the relation(s) name shall be attached
> > next to the ForeignScan/CustomScan line.
> >
> > I'd like you to suggest to add two fields below:
> >  - An alternative label extension wants to show instead of the default one
>
> How about adding something like the "Additional Operations" line as
> proposed in a previous email, instead?  As you know, FDWs/Extensions
> could add such information through ExplainForeignScan/ExplainCustomScan.
>
No. What I'm saying is here:


EXPLAIN (COSTS false, VERBOSE)
SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) ORDER BY
t1.c3, t1.c1 OFFSET 100 LIMIT 10;                     QUERY PLAN
-------------------------------------------   Limit     Output: t1.c1, t2.c1, t1.c3     ->  Foreign Scan
^^^^           Output: t1.c1, t2.c1, t1.c3           Relations: (public.ft1 t1) INNER JOIN (public.ft2 t2)
RemoteSQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T   
1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" = r2."C 1")))) ORDER BY
r1.c3 ASC N\
ULLS LAST, r1."C 1" ASC NULLS LAST
(6 rows)


Your concern is that EXPLAIN shows ForeignScan node as "Foreign Scan"
although it actually works as "Foreign Join".

My suggestion makes EXPLAIN print "Foreign %s" according to the label
assigned by the extension. Only FDW driver knows how this ForeignScan
node actually performs on, thus, only FDW driver can set meaningful label.

This label may be "Join", may be "Partial Aggregate + Join", or something
others according to the actual job of this node.

> >  - A flag to turn on/off printing relation(s) name
>
> ISTM that the relation information should be always printed even in the
> case of scanrelid=0 by the core, because that would be essential for
> analyzing EXPLAIN results.
>
We have no information which relations are actually scanned by ForeignScan
and CustomScan. Your patch abuses fs_relids and custom_relids, however,
these fields don't indicate the relations actually scan by these nodes.
It just tracks relations processed by this node or its children.

See the example below. This GpuJoin on CustomScan takes three SeqScan nodes
as inner/outer source of relations joins. GpuJoin never scans the tables
actually. It picks up the records generated by underlying SeqScan nodes.


postgres=# explain select id from t0 natural join t1 natural join t2;                               QUERY PLAN
---------------------------------------------------------------------------Custom Scan (GpuJoin)
(cost=12385.67..291245.35rows=9815465 width=4)  GPU Projection: t0.id  Depth 1: GpuHashJoin, HashKeys: (t0.bid)
 JoinQuals: (t0.bid = t2.bid)           Nrows (in/out: 98.15%), KDS-Hash (size: 13.47MB, nbatches: 1)  Depth 2:
GpuHashJoin,HashKeys: (t0.aid)           JoinQuals: (t0.aid = t1.aid)           Nrows (in/out: 100.00%), KDS-Hash
(size:13.47MB, nbatches: 1)  ->  Seq Scan on t0  (cost=0.00..183333.96 rows=9999996 width=12)  ->  Seq Scan on t2
(cost=0.00..1935.00rows=100000 width=4)  ->  Seq Scan on t1  (cost=0.00..1935.00 rows=100000 width=4) 
(11 rows)


If EXPLAIN command attaches something like "on t0,t1,t2" after the GpuJoin
*with no choice*, it is problematic and misleading.
It shall be controllable by the extension which knows what tables are
actually scanned. (Please note that I never says extension makes the string.
It is a job of core explain. I suggest it needs to be controllable.)


Even though I recommended a boolean flag to turn on/off this print, it may
need to be a bitmap to indicate which underlying relations should be printed
by the explain command. Because we can easily assume ForeignScan/CustomScan
node that scans only a part of child tables. For example, it is available to
implement GpuJoin scans the t1 and t2 by itself but takes SeqScan on t0.

> >  - A flag to turn on/off printing relation(s) name

So, my second suggestion shall be adjusted as follows:
- A bitmap to indicate which relation(s) name shall be printed by the core explain command.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: old_snapshot_threshold allows heap:toast disagreement
Next
From: Craig Ringer
Date:
Subject: Re: A Modest Upgrade Proposal