Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Kouhei Kaigai
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 9A28C8860F777E439AA12E8AEA7694F80110A196@BPXM15GP.gisp.nec.co.jp
Whole thread Raw
In response to Re: Foreign join pushdown vs EvalPlanQual  (Kohei KaiGai <kaigai@kaigai.gr.jp>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
Fujita-san,

> Does it make sense to put the result tuple of remote join on evety
> estate->es_epqTupleSet[] slot represented by this ForeignScan if
> scanrelid==0?
>
Sorry, I misunderstood behavior of the es_epqTupleSet[].

I'd like to suggest a solution that re-construct remote tuple according
to the fdw_scan_tlist on ExecScanFetch, if given scanrelid == 0.
It enables to run local qualifier associated with the ForeignScan node,
and it will also work for the case when tuple in es_epqTupleSet[] was
local heap.

For details:
The es_epqTuple[] is set by EvalPlanQualSetTuple(). It put a tuple
exactly reflects a particular base relation (that has positive rtindex).
Even if it is a foreign-table, ExecLockRows() put a tuple dynamically
constructed via whole-row-reference at EvalPlanQualFetchRowMarks().
So, regardless of copy or reference to heap, we can expect es_epqTuple[]
keeps tuples of the base relations for each.

On the other hands, ForeignScan that replaced local join by remote
join has a valid fdw_scan_tlist list. It contains expression node
to construct individual attribute of the pseudo scan target-list.

So, all we need to do is, (1) if scanrelid == 0 on ExecScanFetch(),
(2) it should be ForeignScan or CustomScan, with *_scan_tlist.
(3) then, we reconstruct a tuple of the pseudo scan based on the
*_scan_tlist, instead of simple reference to es_epqTupleSet[],
(4) and, evaluate local qualifiers of the node.

How about your thought?

BTW, if you try newer version of postgres_fdw foreign join patch,
please provide me to reproduce the problem/

Also, as an aside, postgres_fdw does not implement RefetchForeignRow()
at this moment. Does it make a problem?

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kohei KaiGai
> Sent: Wednesday, June 24, 2015 10:02 PM
> To: Etsuro Fujita
> Cc: PostgreSQL-development
> Subject: Re: [HACKERS] Foreign join pushdown vs EvalPlanQual
> 
> Does it make sense to put the result tuple of remote join on evety
> estate->es_epqTupleSet[] slot represented by this ForeignScan if
> scanrelid==0?
> 
> It allows to recheck qualifier for each LockRow that intends to lock
> base foreign table underlying the remote join.
> ForeignScan->fdw_relids tells us which rtindexes are represented
> by this ForeignScan, so infrastructure side may be able to handle.
> 
> Thanks,
> 
> 
> 2015-06-24 11:40 GMT+09:00 Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>:
> > Hi,
> >
> > While reviewing the foreign join pushdown core patch, I noticed that the
> > patch doesn't perform an EvalPlanQual recheck properly.  The example
> > that crashes the server will be shown below (it uses the postgres_fdw
> > patch [1]).  I think the reason for that is because the ForeignScan node
> > performing the foreign join remotely has scanrelid = 0 while
> > ExecScanFetch assumes that its scan node has scanrelid > 0.
> >
> > I think this is a bug.  I've not figured out how to fix this yet, but I
> > thought we would also need another plan that evaluates the join locally
> > for the test tuples for EvalPlanQual.  Though I'm missing something though.
> >
> > Create an environment:
> >
> > postgres=# create table tab (a int, b int);
> > CREATE TABLE
> > postgres=# create foreign table foo (a int) server myserver options
> > (table_name 'foo');
> > CREATE FOREIGN TABLE
> > postgres=# create foreign table bar (a int) server myserver options
> > (table_name 'bar');
> > CREATE FOREIGN TABLE
> > postgres=# insert into tab values (1, 1);
> > INSERT 0 1
> > postgres=# insert into foo values (1);
> > INSERT 0 1
> > postgres=# insert into bar values (1);
> > INSERT 0 1
> > postgres=# analyze tab;
> > ANALYZE
> > postgres=# analyze foo;
> > ANALYZE
> > postgres=# analyze bar;
> > ANALYZE
> >
> > Run the example:
> >
> > [Terminal 1]
> > postgres=# begin;
> > BEGIN
> > postgres=# update tab set b = b + 1 where a = 1;
> > UPDATE 1
> >
> > [Terminal 2]
> > postgres=# explain verbose select tab.* from tab, foo, bar where tab.a =
> > foo.a and foo.a = bar.a for update;
> >
> >                                                      QUERY PLAN
> >
> >
> ----------------------------------------------------------------------------
> ----------------------------------------------------------------------------
> >
> ----------------------------------------------------------------------------
> --------------------------------
> >  LockRows  (cost=100.00..101.18 rows=4 width=70)
> >    Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
> >    ->  Nested Loop  (cost=100.00..101.14 rows=4 width=70)
> >          Output: tab.a, tab.b, tab.ctid, foo.*, bar.*
> >          Join Filter: (foo.a = tab.a)
> >          ->  Seq Scan on public.tab  (cost=0.00..1.01 rows=1 width=14)
> >                Output: tab.a, tab.b, tab.ctid
> >          ->  Foreign Scan  (cost=100.00..100.08 rows=4 width=64)
> >                Output: foo.*, foo.a, bar.*, bar.a
> >                Relations: (public.foo) INNER JOIN (public.bar)
> >                Remote SQL: SELECT l.a1, l.a2, r.a1, r.a2 FROM (SELECT
> > ROW(l.a9), l.a9 FROM (SELECT a a9 FROM public.foo FOR UPDATE) l) l (a1,
> > a2) INNER
> > JOIN (SELECT ROW(r.a9), r.a9 FROM (SELECT a a9 FROM public.bar FOR
> > UPDATE) r) r (a1, a2) ON ((l.a2 = r.a2))
> > (11 rows)
> >
> > postgres=# select tab.* from tab, foo, bar where tab.a = foo.a and foo.a
> > = bar.a for update;
> >
> > [Terminal 1]
> > postgres=# commit;
> > COMMIT
> >
> > [Terminal 2]
> > (After the commit in Terminal 1, Terminal 2 will show the following.)
> > server closed the connection unexpectedly
> >         This probably means the server terminated abnormally
> >         before or while processing the request.
> > The connection to the server was lost. Attempting reset: Failed.
> > !>
> >
> > Best regards,
> > Etsuro Fujita
> >
> > [1]
> >
> http://www.postgresql.org/message-id/CAEZqfEe9KGy=1_waGh2rgZPg0o4pqgD+iauYaj
> 8wTze+CYJUHg@mail.gmail.com
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers
> 
> 
> 
> --
> KaiGai Kohei <kaigai@kaigai.gr.jp>
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_rewind failure by file deletion in source server
Next
From: Michael Paquier
Date:
Subject: Re: Support for N synchronous standby servers - take 2