Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Foreign join pushdown vs EvalPlanQual
Date
Msg-id 561E12D4.7040403@lab.ntt.co.jp
Whole thread Raw
In response to Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Foreign join pushdown vs EvalPlanQual  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Re: Foreign join pushdown vs EvalPlanQual  (Robert Haas <robertmhaas@gmail.com>)
Re: Foreign join pushdown vs EvalPlanQual  (Etsuro Fujita <fujita.etsuro@lab.ntt.co.jp>)
List pgsql-hackers
On 2015/10/10 10:17, Robert Haas wrote:
> On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>>> The best plan is presumably something like this as you said before:
>>>
>>> LockRows
>>> -> Nested Loop
>>>      -> Seq Scan on verysmall v
>>>      -> Foreign Scan on bigft1 and bigft2
>>>           Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x =
>>> bigft2.x AND bigft1.q = $1 AND bigft2.r = $2
>>>
>>> Consider the EvalPlanQual testing to see if the updated version of a
>>> tuple in v satisfies the query.  If we use the column in the testing, we
>>> would get the wrong results in some cases.

>> More precisely, we would get the wrong result when the value of v.q or v.r
>> in the updated version has changed.

> Interesting test case.  It's worth considering why this works if you
> were to replace the Foreign Scan with an Index Scan; suppose the query
> is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x =
> t.x FOR UPDATE OF v, so that you get:
>
> LockRows
> -> Nested Loop
>    -> Seq Scan on verysmall v
>    -> Foreign Scan on realbiglocaltable t
>      Index Cond: v.x = t.x
>
> In your example, the remote SQL pushes down certain quals to the
> remote server, and so if we just return the same tuple they might no
> longer be satisfied.  In this example, the index qual is essentially a
> filter condition that has been "pushed down" into the index AM.  The
> EvalPlanQual machinery prevents this from generating wrong answers by
> rechecking the index cond - see IndexRecheck.  Even though it's
> normally the AM's job to enforce the index cond, and the executor does
> not need to recheck, in the EvalPlanQual case it does need to recheck.
>
> I think the foreign data wrapper case should be handled the same way.
> Any condition that we initially pushed down to the foreign server
> needs to be locally rechecked if we're inside EPQ.

Agreed.

As KaiGai-san also pointed out before, I think we should address this in
each of the following cases:

1) remote qual (scanrelid>0)
2) remote join (scanrelid==0)

As for #1, I noticed that there is a bug in handling the same kind of
FDW queries, which will be shown below.  As you said, I think this
should be addressed by rechecking the remote quals *locally*.  (I
thought another fix for this kind of bug before, though.)  IIUC, I think
this should be fixed separately from #2, as this is a bug not only in
9.5, but in back branches.  Please find attached a patch.

Create an environment:

mydatabase=# create table t1 (a int primary key, b text);
mydatabase=# insert into t1 select a, 'notsolongtext' from
generate_series(1, 1000000) a;

postgres=# create server myserver foreign data wrapper postgres_fdw
options (dbname 'mydatabase');
postgres=# create user mapping for current_user server myserver;
postgres=# create foreign table ft1 (a int, b text) server myserver
options (table_name 't1');
postgres=# alter foreign table ft1 options (add use_remote_estimate 'true');
postgres=# create table inttab (a int);
postgres=# insert into inttab select a from generate_series(1, 10) a;
postgres=# analyze ft1;
postgres=# analyze inttab;

Run concurrent transactions that produce incorrect result:

[Terminal1]
postgres=# begin;
BEGIN
postgres=# update inttab set a = a + 1 where a = 1;
UPDATE 1

[Terminal2]
postgres=# explain verbose select * from inttab, ft1 where inttab.a =
ft1.a limit 1 for update;
                                            QUERY PLAN
-------------------------------------------------------------------------------------------------
  Limit  (cost=100.43..198.99 rows=1 width=70)
    Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
    ->  LockRows  (cost=100.43..1086.00 rows=10 width=70)
          Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
          ->  Nested Loop  (cost=100.43..1085.90 rows=10 width=70)
                Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
                ->  Seq Scan on public.inttab  (cost=0.00..1.10 rows=10
width=10)
                      Output: inttab.a, inttab.ctid
                ->  Foreign Scan on public.ft1  (cost=100.43..108.47
rows=1 width=18)
                      Output: ft1.a, ft1.b, ft1.*
                      Remote SQL: SELECT a, b FROM public.t1 WHERE
(($1::integer = a)) FOR UPDATE
(11 rows)

postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1 for
update;

[Terminal1]
postgres=# commit;
COMMIT

[Terminal2]
(After the commit in Terminal1, the following result will be shown in
Terminal2.  Note that the values of inttab.a and ft1.a wouldn't satisfy
the remote qual!)
  a | a |       b
---+---+---------------
  2 | 1 | notsolongtext
(1 row)

As for #2, I didn't come up with any solution to locally rechecking
pushed-down join conditions against a joined tuple populated from a
column that we discussed.  Instead, I'd like to revise a
local-join-execution-plan-based approach that we discussed before, by
addressing your comments such as [1].  Would it be the right way to go?

Best regards,
Etsuro Fujita

[1]
http://www.postgresql.org/message-id/CA+TgmoaAzs0dR23R7PTBseQfwOtuVCPNBqDHxeBo9Gi+dMxj8w@mail.gmail.com

Attachment

pgsql-hackers by date:

Previous
From: "Shulgin, Oleksandr"
Date:
Subject: Re: Database schema diff
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Dangling Client Backend Process