Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server - Mailing list pgsql-hackers
From | Etsuro Fujita |
---|---|
Subject | Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server |
Date | |
Msg-id | 5B9BB133.1060107@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Problem while updating a foreign table pointing to apartitioned table on foreign server (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: Problem while updating a foreign table pointing to apartitioned table on foreign server
|
List | pgsql-hackers |
(2018/08/24 16:58), Kyotaro HORIGUCHI wrote: > At Tue, 21 Aug 2018 11:01:32 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI<horiguchi.kyotaro@lab.ntt.co.jp> wrote in<20180821.110132.261184472.horiguchi.kyotaro@lab.ntt.co.jp> >>> You wrote: >>>> Several places seems to be assuming that fdw_scan_tlist may be >>>> used foreign scan on simple relation but I didn't find that >>>> actually happens. >>> >>> Yeah, currently, postgres_fdw and file_fdw don't use that list for >>> simple foreign table scans, but it could be used to improve the >>> efficiency for those scans, as explained in fdwhandler.sgml: > ... >> I'll put more consideration on using fdw_scan_tlist in the >> documented way. > > Done. postgres_fdw now generates full fdw_scan_tlist (as > documented) for foreign relations with junk columns having a > small change in core side. However it is far less invasive than > the previous version and I believe that it dones't harm > maybe-existing use of fdw_scan_tlist on non-join rels (that is, > in the case of a subset of relation columns). Yeah, changes to the core by the new version is really small, which is great, but I'm not sure it's a good idea to modify the catalog info on the target table on the fly: @@ -126,8 +173,18 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId,\ bool inhparent, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), errmsg("cannot access temporary or unlogged relations during r\ ecovery"))); + max_attrnum = RelationGetNumberOfAttributes(relation); + + /* Foreign table may have exanded this relation with junk columns */ + if (root->simple_rte_array[varno]->relkind == RELKIND_FOREIGN_TABLE) + { + AttrNumber maxattno = max_varattno(root->parse->targetList, varno); + if (max_attrnum < maxattno) + max_attrnum = maxattno; + } + rel->min_attr = FirstLowInvalidHeapAttributeNumber + 1; - rel->max_attr = RelationGetNumberOfAttributes(relation); + rel->max_attr = max_attrnum; rel->reltablespace = RelationGetForm(relation)->reltablespace; This breaks the fundamental assumption that rel->max_attr is equal to RelationGetNumberOfAttributes of that table. My concern is: this change would probably be a wart, so it would be bug-prone in future versions. Another thing on the new version: @@ -1575,6 +1632,19 @@ build_physical_tlist(PlannerInfo *root, RelOptInfo *rel) relation = heap_open(rte->relid, NoLock); numattrs = RelationGetNumberOfAttributes(relation); + + /* + * Foreign tables may have expanded with some junk columns. Punt + * in the case. + */ + if (numattrs < rel->max_attr) + { + Assert(root->simple_rte_array[rel->relid]->relkind == + RELKIND_FOREIGN_TABLE); + heap_close(relation, NoLock); + break; + } I think this would disable the optimization on projection in foreign scans, causing performance regression. > One arguable behavior change is about wholrow vars. Currently it > refferes local tuple with all columns but it is explicitly > fetched as ROW() after this patch applied. This could be fixed > but not just now. > > Part of 0004-: > - Output: f1, ''::text, ctid, rem1.* > - Remote SQL: SELECT f1, f2, ctid FROM public.loc1 FOR UPDATE > + Output: f1, ''::text, tableoid, ctid, rem1.* > + Remote SQL: SELECT f1, tableoid, ctid, ROW(f1, f2) FROM public.loc1 FOR UPDATE That would be also performance regression. If we go in this direction, that should be fixed. > Since this uses fdw_scan_tlist so it is theoretically > back-patchable back to 9.6. IIRC, the fdw_scan_tlist stuff was introduced in PG9.5 as part of join pushdown infrastructure, so I think your patch can be back-patched to PG9.5, but I don't think that's enough; IIRC, this issue was introduced in PG9.3, so a solution for this should be back-patch-able to PG9.3, I think. > Please find the attached three files. Thanks for the patches! > 0001-Add-test-for-postgres_fdw-foreign-parition-update.patch > > This should fail for unpatched postgres_fdw. (Just for demonstration) +CREATE TABLE p1 (a int, b int); +CREATE TABLE c1 (LIKE p1) INHERITS (p1); +CREATE TABLE c2 (LIKE p1) INHERITS (p1); +CREATE FOREIGN TABLE fp1 (a int, b int) + SERVER loopback OPTIONS (table_name 'p1'); +INSERT INTO c1 VALUES (0, 1); +INSERT INTO c2 VALUES (1, 1); +SELECT tableoid::int - (SELECT min(tableoid) FROM fp1)::int AS toiddiff, ctid, * FROM fp1; Does it make sense to evaluate toiddiff? I think that should always be 0. > 0003-Fix-of-foreign-update-bug-of-PgFDW.patch > > Fix of postgres_fdw for this problem. Sorry, I have not looked at it closely yet, but before that I'd like to discuss the direction we go in. I'm not convinced that your approach is the right direction, so as promised, I wrote a patch using the Param-based approach, and compared the two approaches. Attached is a WIP patch for that, which includes the 0003 patch. I don't think there would be any warts as discussed above in the Param-based approach for now. (That approach modifies the planner so that the targetrel's tlist would contain Params as well as Vars/PHVs, so actually, it breaks the planner assumption that a rel's tlist would only include Vars/PHVs, but I don't find any issues on that at least for now. Will look into that in more detail.) And I don't think there would be any concern about performance regression, either. Maybe I'm missing something, though. What do you think about that? Note about the attached: I tried to invent a utility for generate_new_param like SS_make_initplan_output_param as mentioned in [1], but since the FDW API doesn't pass PlannerInfo to the FDW, I think the FDW can't call the utility the same way. Instead, I modified the planner so that 1) the FDW adds Params without setting PARAM_EXEC Param IDs using a new function, and then 2) the core fixes the IDs. Sorry for the delay. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/3919.1527775582%40sss.pgh.pa.us
Attachment
pgsql-hackers by date: