Re: Foreign join pushdown vs EvalPlanQual - Mailing list pgsql-hackers
From | Etsuro Fujita
Subject | Re: Foreign join pushdown vs EvalPlanQual
Date |
Msg-id | 561E12D4.7040403@lab.ntt.co.jp
In response to | Re: Foreign join pushdown vs EvalPlanQual (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: Foreign join pushdown vs EvalPlanQual
 | Re: Foreign join pushdown vs EvalPlanQual
 | Re: Foreign join pushdown vs EvalPlanQual
List | pgsql-hackers
On 2015/10/10 10:17, Robert Haas wrote:
> On Thu, Oct 8, 2015 at 11:00 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>>> The best plan is presumably something like this as you said before:
>>>
>>> LockRows
>>> -> Nested Loop
>>>    -> Seq Scan on verysmall v
>>>    -> Foreign Scan on bigft1 and bigft2
>>>         Remote SQL: SELECT * FROM bigft1 JOIN bigft2 ON bigft1.x = bigft2.x
>>>                     AND bigft1.q = $1 AND bigft2.r = $2
>>>
>>> Consider the EvalPlanQual testing to see if the updated version of a
>>> tuple in v satisfies the query. If we use the column in the testing,
>>> we would get the wrong results in some cases.
>>
>> More precisely, we would get the wrong result when the value of v.q or
>> v.r in the updated version has changed.
>
> Interesting test case. It's worth considering why this works if you
> were to replace the Foreign Scan with an Index Scan; suppose the query
> is SELECT * FROM verysmall v LEFT JOIN realbiglocaltable t ON v.x = t.x
> FOR UPDATE OF v, so that you get:
>
> LockRows
> -> Nested Loop
>    -> Seq Scan on verysmall v
>    -> Index Scan on realbiglocaltable t
>         Index Cond: v.x = t.x
>
> In your example, the remote SQL pushes down certain quals to the remote
> server, and so if we just return the same tuple they might no longer be
> satisfied. In this example, the index qual is essentially a filter
> condition that has been "pushed down" into the index AM. The
> EvalPlanQual machinery prevents this from generating wrong answers by
> rechecking the index cond - see IndexRecheck. Even though it's normally
> the AM's job to enforce the index cond, and the executor does not need
> to recheck, in the EvalPlanQual case it does need to recheck.
>
> I think the foreign data wrapper case should be handled the same way.
> Any condition that we initially pushed down to the foreign server
> needs to be locally rechecked if we're inside EPQ.

Agreed. As KaiGai-san also pointed out before, I think we should address
this in each of the following cases:

1) remote qual (scanrelid>0)
2) remote join (scanrelid==0)

As for #1, I noticed that there is a bug in handling the same kind of FDW
queries, which is shown below. As you said, I think this should be
addressed by rechecking the remote quals *locally*. (I had thought of
another fix for this kind of bug before, though.) IIUC, this should be
fixed separately from #2, as this is a bug not only in 9.5 but also in
the back branches. Please find attached a patch.
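To make the idea for #1 concrete, here is a minimal sketch (illustrative
only, not the attached patch) of what such a local recheck could look like
in the executor, using the pre-v10 executor API where ExecQual takes a qual
list. The function name ForeignQualRecheck and the fdw_recheck_quals field,
assumed to hold the locally ExecInitExpr'd expression states for the
pushed-down quals, are assumptions for illustration:

#include "postgres.h"
#include "executor/executor.h"   /* ExecQual, ResetExprContext */
#include "nodes/execnodes.h"     /* ForeignScanState */

/*
 * Sketch: re-evaluate quals that were pushed down to the remote server,
 * locally, against the tuple substituted by EvalPlanQual.
 */
static bool
ForeignQualRecheck(ForeignScanState *node, TupleTableSlot *slot)
{
    ExprContext *econtext = node->ss.ps.ps_ExprContext;

    /* Point the scan tuple at the tuple being rechecked. */
    ResetExprContext(econtext);
    econtext->ecxt_scantuple = slot;

    /* All pushed-down quals must pass; a NULL result counts as failure. */
    return ExecQual(node->fdw_recheck_quals, econtext, false);
}

This mirrors what IndexRecheck does for index conditions: normally the
remote server (like the index AM) has already enforced the quals, but
during an EvalPlanQual recheck the executor has to test them again itself.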
Create an environment:

mydatabase=# create table t1 (a int primary key, b text);
mydatabase=# insert into t1 select a, 'notsolongtext' from generate_series(1, 1000000) a;

postgres=# create server myserver foreign data wrapper postgres_fdw options (dbname 'mydatabase');
postgres=# create user mapping for current_user server myserver;
postgres=# create foreign table ft1 (a int, b text) server myserver options (table_name 't1');
postgres=# alter foreign table ft1 options (add use_remote_estimate 'true');
postgres=# create table inttab (a int);
postgres=# insert into inttab select a from generate_series(1, 10) a;
postgres=# analyze ft1;
postgres=# analyze inttab;

Run concurrent transactions that produce an incorrect result:

[Terminal1]
postgres=# begin;
BEGIN
postgres=# update inttab set a = a + 1 where a = 1;
UPDATE 1

[Terminal2]
postgres=# explain verbose select * from inttab, ft1 where inttab.a = ft1.a limit 1 for update;
                                            QUERY PLAN
-------------------------------------------------------------------------------------------------
 Limit  (cost=100.43..198.99 rows=1 width=70)
   Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
   ->  LockRows  (cost=100.43..1086.00 rows=10 width=70)
         Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
         ->  Nested Loop  (cost=100.43..1085.90 rows=10 width=70)
               Output: inttab.a, ft1.a, ft1.b, inttab.ctid, ft1.*
               ->  Seq Scan on public.inttab  (cost=0.00..1.10 rows=10 width=10)
                     Output: inttab.a, inttab.ctid
               ->  Foreign Scan on public.ft1  (cost=100.43..108.47 rows=1 width=18)
                     Output: ft1.a, ft1.b, ft1.*
                     Remote SQL: SELECT a, b FROM public.t1 WHERE (($1::integer = a)) FOR UPDATE
(11 rows)

postgres=# select * from inttab, ft1 where inttab.a = ft1.a limit 1 for update;

[Terminal1]
postgres=# commit;
COMMIT

[Terminal2]
(After the commit in Terminal1, the following result will be shown in Terminal2. Note that the values of inttab.a and ft1.a do not satisfy the remote qual!)

 a | a |       b
---+---+---------------
 2 | 1 | notsolongtext
(1 row)

As for #2, I have not come up with any solution for locally rechecking
pushed-down join conditions against a joined tuple populated from the
column that we discussed. Instead, I'd like to revise the
local-join-execution-plan-based approach that we discussed before,
addressing your comments such as [1]. Would that be the right way to go?

Best regards,
Etsuro Fujita

[1] http://www.postgresql.org/message-id/CA+TgmoaAzs0dR23R7PTBseQfwOtuVCPNBqDHxeBo9Gi+dMxj8w@mail.gmail.com