Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails |
Date | |
Msg-id | 2201880.1630528969@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails (Etsuro Fujita <etsuro.fujita@gmail.com>) |
Responses |
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails |
List | pgsql-hackers |
Etsuro Fujita <etsuro.fujita@gmail.com> writes: > On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote: >> On Wed, Jul 14, 2021 at 1:41 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Seems to just need an update of the expected-file to account for test >>> cases added recently. (I take no position on whether the new results >>> are desirable; some of these might be breaking the intent of the case. >>> But this should quiet the cfbot anyway.) >> The test case was added by commit "Add support for asynchronous execution" >> "27e1f14563cf982f1f4d71e21ef247866662a052" by Etsuro Fujita. He can comment >> whether the new results are desirable or not. > The new results aren't what I intended. I'll update the patch to > avoid that by modifying the original test cases properly, if there are > no objections. Please follow up on that sometime? In the meantime, here is a rebase over aa769f80e and 2dc53fe2a, to placate the cfbot. The real reason that this hasn't gotten committed is that I remain pretty uncomfortable about whether it's an acceptable solution to the problem. Suddenly asking people to plaster COLLATE clauses on all their textual remote columns seems like a big compatibility gotcha. However, I lack any ideas about a less unpleasant solution. regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index d98bd66681..654b09273e 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -328,7 +328,9 @@ foreign_expr_walker(Node *node, /* * If the Var is from the foreign table, we consider its - * collation (if any) safe to use. If it is from another + * collation (if any) safe to use, *unless* it's + * DEFAULT_COLLATION_OID. We treat that as meaning "we don't + * know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -350,7 +352,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; } else { @@ -941,8 +948,24 @@ foreign_expr_walker(Node *node, /* * Now, merge my collation information into my parent's state. + * + * If one branch of an expression derives a non-default collation safely + * (that is, from a foreign Var) and another one derives the same + * collation unsafely, we can consider the expression safe overall. This + * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the + * same collation the foreign_var has anyway. Note that we will not ship + * any explicit COLLATE clause to the remote, but rely on it to re-derive + * the correct collation based on the foreign_var. */ - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || + (state == FDW_COLLATE_SAFE && + outer_cxt->state == FDW_COLLATE_UNSAFE))) + { + outer_cxt->state = FDW_COLLATE_SAFE; + } + else if (state > outer_cxt->state) { /* Override previous parent state */ outer_cxt->collation = collation; diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index e3ee30f1aa..9e52e09a8b 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -32,29 +32,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); -- Disable autovacuum for these tables to avoid unexpected effects of that @@ -94,16 +94,18 @@ ANALYZE "S 1"."T 3"; ANALYZE "S 1"."T 4"; -- =================================================================== -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- =================================================================== CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft1', + c6 varchar(10) collate "C", + c7 char(10) default 'ft1' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; @@ -111,28 +113,28 @@ CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft2', + c6 varchar(10) collate "C", + c7 char(10) default 'ft2' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft6 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft7 ( c1 int NOT NULL, @@ -288,7 +290,7 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1, t1.tabl Output: c1, c2, c3, c4, c5, c6, c7, c8, tableoid -> Sort Output: c1, c2, c3, c4, c5, c6, c7, c8, tableoid - Sort Key: t1.c3, t1.c1, t1.tableoid + Sort Key: t1.c3 COLLATE "C", t1.c1, t1.tableoid -> Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8, tableoid Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" @@ -902,12 +904,12 @@ EXPLAIN (VERBOSE, COSTS OFF) (6 rows) EXPLAIN (VERBOSE, COSTS OFF) - SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "C"; + SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "POSIX"; QUERY PLAN ------------------------------------------------------------------------------- Sort Output: c1, c2, c3, c4, c5, c6, c7, c8, ((c3)::text) - Sort Key: ft2.c1, ft2.c3 COLLATE "C" + Sort Key: ft2.c1, ft2.c3 COLLATE "POSIX" -> Foreign Scan on public.ft2 Output: c1, c2, c3, c4, c5, c6, c7, c8, c3 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" @@ -1149,13 +1151,12 @@ SELECT * FROM ft1 WHERE CASE c3 WHEN c6 THEN true ELSE c3 < 'bar' END; -- but this is not because of collation EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 WHERE CASE c3 COLLATE "C" WHEN c6 THEN true ELSE c3 < 'bar' END; - QUERY PLAN -------------------------------------------------------------------------------------- + QUERY PLAN +----------------------------------------------------------------------------------------------------------------------------------------- Foreign Scan on public.ft1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Filter: CASE (ft1.c3)::text WHEN ft1.c6 THEN true ELSE (ft1.c3 < 'bar'::text) END - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" -(4 rows) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((CASE c3 WHEN c6 THEN true ELSE (c3 < 'bar'::text)END)) +(3 rows) -- =================================================================== -- JOIN queries @@ -1921,7 +1922,7 @@ WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" =r2."C 1")))) -> Sort Output: t.c1_1, t.c2_1, t.c1_3 - Sort Key: t.c1_3, t.c1_1 + Sort Key: t.c1_3 COLLATE "C", t.c1_1 -> CTE Scan on t Output: t.c1_1, t.c2_1, t.c1_3 (12 rows) @@ -2121,7 +2122,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 Output: t1.c1, t2.c1, t1.c3 -> Sort Output: t1.c1, t2.c1, t1.c3 - Sort Key: t1.c3, t1.c1 + Sort Key: t1.c3 COLLATE "C", t1.c1 -> Hash Right Join Output: t1.c1, t2.c1, t1.c3 Hash Cond: (t2.c1 = t1.c1) @@ -2163,7 +2164,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = t2. Output: t1.c1, t2.c1, t1.c3 -> Sort Output: t1.c1, t2.c1, t1.c3 - Sort Key: t1.c3, t1.c1 + Sort Key: t1.c3 COLLATE "C", t1.c1 -> Foreign Scan Output: t1.c1, t2.c1, t1.c3 Filter: (t1.c8 = t2.c8) @@ -3611,7 +3612,7 @@ select c2, c6, sum(c1) from ft1 where c2 < 3 group by grouping sets(c2, c6) orde ---------------------------------------------------------------------------------- Sort Output: c2, c6, (sum(c1)) - Sort Key: ft1.c2, ft1.c6 + Sort Key: ft1.c2, ft1.c6 COLLATE "C" -> HashAggregate Output: c2, c6, sum(c1) Hash Key: ft1.c2 @@ -3995,13 +3996,13 @@ EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6; PREPARE st7 AS INSERT INTO ft1 (c1,c2,c3) VALUES (1001,101,'foo'); EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st7; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft1 Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) Batch Size: 1 -> Result - Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying, 'ft1 '::character(10), NULL::user_enum + Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying COLLATE "C", 'ft1 '::character(10), NULL::user_enum (5 rows) ALTER TABLE "S 1"."T 1" RENAME TO "T 0"; @@ -4029,13 +4030,13 @@ EXECUTE st6; (9 rows) EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st7; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft1 Remote SQL: INSERT INTO "S 1"."T 0"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) Batch Size: 1 -> Result - Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying, 'ft1 '::character(10), NULL::user_enum + Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying COLLATE "C", 'ft1 '::character(10), NULL::user_enum (5 rows) ALTER TABLE "S 1"."T 0" RENAME TO "T 1"; @@ -4266,36 +4267,14 @@ explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo'; Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f1 = 'foo'::text)) (3 rows) -explain (verbose, costs off) select * from ft3 where f2 = 'foo'; - QUERY PLAN ------------------------------------------------------------------------------- - Foreign Scan on public.ft3 - Output: f1, f2, f3 - Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f2 = 'foo'::text)) -(3 rows) - -explain (verbose, costs off) select * from ft3 where f3 = 'foo'; +explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; QUERY PLAN ------------------------------------------------------------------------------ Foreign Scan on public.ft3 Output: f1, f2, f3 - Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f3 = 'foo'::text)) + Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f1 = 'foo'::text)) (3 rows) -explain (verbose, costs off) select * from ft3 f, loct3 l - where f.f3 = l.f3 and l.f1 = 'foo'; - QUERY PLAN --------------------------------------------------------------------------------------------------- - Nested Loop - Output: f.f1, f.f2, f.f3, l.f1, l.f2, l.f3 - -> Index Scan using loct3_f1_key on public.loct3 l - Output: l.f1, l.f2, l.f3 - Index Cond: (l.f1 = 'foo'::text) - -> Foreign Scan on public.ft3 f - Output: f.f1, f.f2, f.f3 - Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE (($1::character varying(10) = f3)) -(8 rows) - -- can't be sent to remote explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo'; QUERY PLAN @@ -4306,12 +4285,12 @@ explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo'; Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (4 rows) -explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; +explain (verbose, costs off) select * from ft3 where f2 = 'foo'; QUERY PLAN --------------------------------------------------- Foreign Scan on public.ft3 Output: f1, f2, f3 - Filter: (ft3.f1 = 'foo'::text COLLATE "C") + Filter: (ft3.f2 = 'foo'::text) Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (4 rows) @@ -4333,6 +4312,33 @@ explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C"; Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (4 rows) +explain (verbose, costs off) select * from ft3 where f3 = 'foo'; + QUERY PLAN +--------------------------------------------------- + Foreign Scan on public.ft3 + Output: f1, f2, f3 + Filter: ((ft3.f3)::text = 'foo'::text) + Remote SQL: SELECT f1, f2, f3 FROM public.loct3 +(4 rows) + +explain (verbose, costs off) select * from ft3 f, loct3 l + where f.f3 = l.f3 and l.f1 = 'foo'; + QUERY PLAN +------------------------------------------------------------- + Hash Join + Output: f.f1, f.f2, f.f3, l.f1, l.f2, l.f3 + Inner Unique: true + Hash Cond: ((f.f3)::text = (l.f3)::text) + -> Foreign Scan on public.ft3 f + Output: f.f1, f.f2, f.f3 + Remote SQL: SELECT f1, f2, f3 FROM public.loct3 + -> Hash + Output: l.f1, l.f2, l.f3 + -> Index Scan using loct3_f1_key on public.loct3 l + Output: l.f1, l.f2, l.f3 + Index Cond: (l.f1 = 'foo'::text) +(12 rows) + explain (verbose, costs off) select * from ft3 f, loct3 l where f.f3 = l.f3 COLLATE "POSIX" and l.f1 = 'foo'; QUERY PLAN @@ -4356,13 +4362,13 @@ explain (verbose, costs off) select * from ft3 f, loct3 l -- =================================================================== EXPLAIN (verbose, costs off) INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20; - QUERYPLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft2 Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) Batch Size: 1 -> Subquery Scan on "*SELECT*" - Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1", NULL::integer, "*SELECT*"."?column?_2", NULL::timestampwith time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2 '::character(10), NULL::user_enum + Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1", NULL::integer, "*SELECT*"."?column?_2", NULL::timestampwith time zone, NULL::timestamp without time zone, NULL::character varying COLLATE "C", 'ft2 '::character(10),NULL::user_enum -> Foreign Scan on public.ft2 ft2_1 Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3) Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" LIMIT 20::bigint @@ -5472,14 +5478,14 @@ SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1; EXPLAIN (verbose, costs off) INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft2 Output: (ft2.tableoid)::regclass Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) Batch Size: 1 -> Result - Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying, 'ft2 '::character(10), NULL::user_enum + Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying COLLATE "C", 'ft2 '::character(10), NULL::user_enum (6 rows) INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass; @@ -7881,12 +7887,12 @@ NOTICE: drop cascades to foreign table bar2 drop table loct1; drop table loct2; -- Test pushing down UPDATE/DELETE joins to the remote server -create table parent (a int, b text); -create table loct1 (a int, b text); -create table loct2 (a int, b text); -create foreign table remt1 (a int, b text) +create table parent (a int, b text collate "C"); +create table loct1 (a int, b text collate "C"); +create table loct2 (a int, b text collate "C"); +create foreign table remt1 (a int, b text collate "C") server loopback options (table_name 'loct1'); -create foreign table remt2 (a int, b text) +create foreign table remt2 (a int, b text collate "C") server loopback options (table_name 'loct2'); alter foreign table remt1 inherit parent; insert into remt1 values (1, 'foo'); @@ -8095,10 +8101,10 @@ drop table itrtest; drop table loct1; drop table loct2; -- Test update tuple routing -create table utrtest (a int, b text) partition by list (a); -create table loct (a int check (a in (1)), b text); -create foreign table remp (a int check (a in (1)), b text) server loopback options (table_name 'loct'); -create table locp (a int check (a in (2)), b text); +create table utrtest (a int, b text collate "C") partition by list (a); +create table loct (a int check (a in (1)), b text collate "C"); +create foreign table remp (a int check (a in (1)), b text collate "C") server loopback options (table_name 'loct'); +create table locp (a int check (a in (2)), b text collate "C"); alter table utrtest attach partition remp for values in (1); alter table utrtest attach partition locp for values in (2); insert into utrtest values (1, 'foo'); @@ -8260,7 +8266,7 @@ alter table utrtest detach partition remp; drop foreign table remp; alter table loct drop constraint loct_a_check; alter table loct add check (a in (3)); -create foreign table remp (a int check (a in (3)), b text) server loopback options (table_name 'loct'); +create foreign table remp (a int check (a in (3)), b text collate "C") server loopback options (table_name 'loct'); alter table utrtest attach partition remp for values in (3); insert into utrtest values (2, 'qux'); insert into utrtest values (3, 'xyzzy'); @@ -8721,11 +8727,11 @@ tru_rtable_parent,tru_rtable_child, tru_rtable0_child; -- test IMPORT FOREIGN SCHEMA -- =================================================================== CREATE SCHEMA import_source; -CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL); -CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL, c3 text collate "POSIX"); -CREATE TYPE typ1 AS (m1 int, m2 varchar); +CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL COLLATE "C"); +CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL COLLATE "C", c3 text collate "POSIX"); +CREATE TYPE typ1 AS (m1 int, m2 varchar COLLATE "C"); CREATE TABLE import_source.t3 (c1 timestamptz default now(), c2 typ1); -CREATE TABLE import_source."x 4" (c1 float8, "C 2" text, c3 varchar(42)); +CREATE TABLE import_source."x 4" (c1 float8, "C 2" text COLLATE "C", c3 varchar(42) COLLATE "C"); CREATE TABLE import_source."x 5" (c1 float8); ALTER TABLE import_source."x 5" DROP COLUMN c1; CREATE TABLE import_source."x 6" (c1 int, c2 int generated always as (c1 * 2) stored); @@ -8754,7 +8760,7 @@ IMPORT FOREIGN SCHEMA import_source FROM SERVER loopback INTO import_dest1; Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | | (column_name 'c1') - c2 | character varying | | not null | | (column_name 'c2') + c2 | character varying | C | not null | | (column_name 'c2') Server: loopback FDW options: (schema_name 'import_source', table_name 't1') @@ -8762,7 +8768,7 @@ FDW options: (schema_name 'import_source', table_name 't1') Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | | (column_name 'c1') - c2 | character varying | | | | (column_name 'c2') + c2 | character varying | C | | | (column_name 'c2') c3 | text | POSIX | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 't2') @@ -8786,8 +8792,8 @@ FDW options: (schema_name 'import_source', table_name 't4') Column | Type | Collation | Nullable | Default | FDW options --------+-----------------------+-----------+----------+---------+--------------------- c1 | double precision | | | | (column_name 'c1') - C 2 | text | | | | (column_name 'C 2') - c3 | character varying(42) | | | | (column_name 'c3') + C 2 | text | C | | | (column_name 'C 2') + c3 | character varying(42) | C | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 'x 4') @@ -8827,7 +8833,7 @@ IMPORT FOREIGN SCHEMA import_source FROM SERVER loopback INTO import_dest2 Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | | (column_name 'c1') - c2 | character varying | | not null | | (column_name 'c2') + c2 | character varying | C | not null | | (column_name 'c2') Server: loopback FDW options: (schema_name 'import_source', table_name 't1') @@ -8835,7 +8841,7 @@ FDW options: (schema_name 'import_source', table_name 't1') Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | 42 | (column_name 'c1') - c2 | character varying | | | | (column_name 'c2') + c2 | character varying | C | | | (column_name 'c2') c3 | text | POSIX | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 't2') @@ -8859,8 +8865,8 @@ FDW options: (schema_name 'import_source', table_name 't4') Column | Type | Collation | Nullable | Default | FDW options --------+-----------------------+-----------+----------+---------+--------------------- c1 | double precision | | | | (column_name 'c1') - C 2 | text | | | | (column_name 'C 2') - c3 | character varying(42) | | | | (column_name 'c3') + C 2 | text | C | | | (column_name 'C 2') + c3 | character varying(42) | C | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 'x 4') @@ -10620,14 +10626,16 @@ SELECT * FROM insert_tbl ORDER BY a; EXPLAIN (VERBOSE, COSTS OFF) WITH t AS (UPDATE remote_tbl SET c = c || c RETURNING *) INSERT INTO join_tbl SELECT * FROM async_pt LEFT JOIN t ON (async_pt.a = t.a AND async_pt.b = t.b) WHERE async_pt.b ===505; - QUERY PLAN ----------------------------------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------------------------------------------- Insert on public.join_tbl CTE t -> Update on public.remote_tbl Output: remote_tbl.a, remote_tbl.b, remote_tbl.c - -> Foreign Update on public.remote_tbl - Remote SQL: UPDATE public.base_tbl3 SET c = (c || c) RETURNING a, b, c + Remote SQL: UPDATE public.base_tbl3 SET c = $2 WHERE ctid = $1 RETURNING a, b, c + -> Foreign Scan on public.remote_tbl + Output: (remote_tbl.c || remote_tbl.c), remote_tbl.ctid, remote_tbl.* + Remote SQL: SELECT a, b, c, ctid FROM public.base_tbl3 FOR UPDATE -> Nested Loop Left Join Output: async_pt.a, async_pt.b, async_pt.c, t.a, t.b, t.c Join Filter: ((async_pt.a = t.a) AND (async_pt.b = t.b)) @@ -10645,7 +10653,7 @@ INSERT INTO join_tbl SELECT * FROM async_pt LEFT JOIN t ON (async_pt.a = t.a AND Filter: (async_pt_3.b === 505) -> CTE Scan on t Output: t.a, t.b, t.c -(23 rows) +(25 rows) WITH t AS (UPDATE remote_tbl SET c = c || c RETURNING *) INSERT INTO join_tbl SELECT * FROM async_pt LEFT JOIN t ON (async_pt.a = t.a AND async_pt.b = t.b) WHERE async_pt.b ===505; @@ -10673,17 +10681,21 @@ UPDATE async_pt SET c = c || c WHERE b = 0 RETURNING *; Update on public.async_pt Output: async_pt_1.a, async_pt_1.b, async_pt_1.c Foreign Update on public.async_p1 async_pt_1 + Remote SQL: UPDATE public.base_tbl1 SET c = $2 WHERE ctid = $1 RETURNING a, b, c Foreign Update on public.async_p2 async_pt_2 + Remote SQL: UPDATE public.base_tbl2 SET c = $2 WHERE ctid = $1 RETURNING a, b, c Update on public.async_p3 async_pt_3 -> Append - -> Foreign Update on public.async_p1 async_pt_1 - Remote SQL: UPDATE public.base_tbl1 SET c = (c || c) WHERE ((b = 0)) RETURNING a, b, c - -> Foreign Update on public.async_p2 async_pt_2 - Remote SQL: UPDATE public.base_tbl2 SET c = (c || c) WHERE ((b = 0)) RETURNING a, b, c + -> Foreign Scan on public.async_p1 async_pt_1 + Output: (async_pt_1.c || async_pt_1.c), async_pt_1.tableoid, async_pt_1.ctid, async_pt_1.* + Remote SQL: SELECT a, b, c, ctid FROM public.base_tbl1 WHERE ((b = 0)) FOR UPDATE + -> Foreign Scan on public.async_p2 async_pt_2 + Output: (async_pt_2.c || async_pt_2.c), async_pt_2.tableoid, async_pt_2.ctid, async_pt_2.* + Remote SQL: SELECT a, b, c, ctid FROM public.base_tbl2 WHERE ((b = 0)) FOR UPDATE -> Seq Scan on public.async_p3 async_pt_3 Output: (async_pt_3.c || async_pt_3.c), async_pt_3.tableoid, async_pt_3.ctid, NULL::record Filter: (async_pt_3.b = 0) -(13 rows) +(17 rows) UPDATE async_pt SET c = c || c WHERE b = 0 RETURNING *; a | b | c diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 4bdab30a73..cacc80585f 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -18,6 +18,7 @@ #include "access/sysattr.h" #include "access/table.h" #include "catalog/pg_class.h" +#include "catalog/pg_collation.h" #include "commands/defrem.h" #include "commands/explain.h" #include "commands/vacuum.h" @@ -5299,9 +5300,13 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid) " NULL, "); if (import_collate) - appendStringInfoString(&buf, - " collname, " - " collnsp.nspname "); + appendStringInfo(&buf, + " CASE WHEN coll.oid = '%u' THEN" + " defcoll.collname ELSE coll.collname END, " + " CASE WHEN coll.oid = '%u' THEN" + " defcoll.nspname ELSE collnsp.nspname END ", + DEFAULT_COLLATION_OID, + DEFAULT_COLLATION_OID); else appendStringInfoString(&buf, " NULL, NULL "); @@ -5317,11 +5322,28 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid) " adrelid = c.oid AND adnum = attnum "); if (import_collate) + { appendStringInfoString(&buf, " LEFT JOIN pg_collation coll ON " " coll.oid = attcollation " " LEFT JOIN pg_namespace collnsp ON " - " collnsp.oid = collnamespace "); + " collnsp.oid = collnamespace " + " LEFT JOIN (" + " SELECT cd.collname, nd.nspname FROM" + " pg_collation cd, pg_namespace nd, pg_database d" + " WHERE nd.oid = cd.collnamespace AND" + " d.datname = current_database() AND"); + /* collprovider is new as of v10 */ + if (PQserverVersion(conn) >= 100000) + appendStringInfoString(&buf, + " cd.collprovider = 'c' AND"); + appendStringInfoString(&buf, + " cd.collcollate = d.datcollate AND" + " cd.collctype = d.datctype AND" + " cd.collencoding IN (d.encoding, -1)" + " ORDER BY length(cd.collname) LIMIT 1 )" + " defcoll ON TRUE "); + } appendStringInfoString(&buf, "WHERE c.relkind IN (" diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 30b5175da5..a96898ed6b 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -36,29 +36,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); @@ -102,16 +102,18 @@ ANALYZE "S 1"."T 4"; -- =================================================================== -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- =================================================================== CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft1', + c6 varchar(10) collate "C", + c7 char(10) default 'ft1' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; @@ -120,11 +122,11 @@ CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft2', + c6 varchar(10) collate "C", + c7 char(10) default 'ft2' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; @@ -132,19 +134,19 @@ ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft6 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft7 ( @@ -354,7 +356,7 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5)); EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft2 ORDER BY ft2.c1, random(); EXPLAIN (VERBOSE, COSTS OFF) - SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "C"; + SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "POSIX"; -- user-defined operator/function CREATE FUNCTION postgres_fdw_abs(int) RETURNS int AS $$ @@ -1194,15 +1196,15 @@ create foreign table ft3 (f1 text collate "C", f2 text, f3 varchar(10)) -- can be sent to remote explain (verbose, costs off) select * from ft3 where f1 = 'foo'; explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo'; -explain (verbose, costs off) select * from ft3 where f2 = 'foo'; -explain (verbose, costs off) select * from ft3 where f3 = 'foo'; -explain (verbose, costs off) select * from ft3 f, loct3 l - where f.f3 = l.f3 and l.f1 = 'foo'; +explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; -- can't be sent to remote explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo'; -explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; +explain (verbose, costs off) select * from ft3 where f2 = 'foo'; explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo'; explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C"; +explain (verbose, costs off) select * from ft3 where f3 = 'foo'; +explain (verbose, costs off) select * from ft3 f, loct3 l + where f.f3 = l.f3 and l.f1 = 'foo'; explain (verbose, costs off) select * from ft3 f, loct3 l where f.f3 = l.f3 COLLATE "POSIX" and l.f1 = 'foo'; @@ -2065,12 +2067,12 @@ drop table loct1; drop table loct2; -- Test pushing down UPDATE/DELETE joins to the remote server -create table parent (a int, b text); -create table loct1 (a int, b text); -create table loct2 (a int, b text); -create foreign table remt1 (a int, b text) +create table parent (a int, b text collate "C"); +create table loct1 (a int, b text collate "C"); +create table loct2 (a int, b text collate "C"); +create foreign table remt1 (a int, b text collate "C") server loopback options (table_name 'loct1'); -create foreign table remt2 (a int, b text) +create foreign table remt2 (a int, b text collate "C") server loopback options (table_name 'loct2'); alter foreign table remt1 inherit parent; @@ -2163,10 +2165,10 @@ drop table loct1; drop table loct2; -- Test update tuple routing -create table utrtest (a int, b text) partition by list (a); -create table loct (a int check (a in (1)), b text); -create foreign table remp (a int check (a in (1)), b text) server loopback options (table_name 'loct'); -create table locp (a int check (a in (2)), b text); +create table utrtest (a int, b text collate "C") partition by list (a); +create table loct (a int check (a in (1)), b text collate "C"); +create foreign table remp (a int check (a in (1)), b text collate "C") server loopback options (table_name 'loct'); +create table locp (a int check (a in (2)), b text collate "C"); alter table utrtest attach partition remp for values in (1); alter table utrtest attach partition locp for values in (2); @@ -2243,7 +2245,7 @@ alter table utrtest detach partition remp; drop foreign table remp; alter table loct drop constraint loct_a_check; alter table loct add check (a in (3)); -create foreign table remp (a int check (a in (3)), b text) server loopback options (table_name 'loct'); +create foreign table remp (a int check (a in (3)), b text collate "C") server loopback options (table_name 'loct'); alter table utrtest attach partition remp for values in (3); insert into utrtest values (2, 'qux'); insert into utrtest values (3, 'xyzzy'); @@ -2553,11 +2555,11 @@ tru_rtable_parent,tru_rtable_child, tru_rtable0_child; -- =================================================================== CREATE SCHEMA import_source; -CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL); -CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL, c3 text collate "POSIX"); -CREATE TYPE typ1 AS (m1 int, m2 varchar); +CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL COLLATE "C"); +CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL COLLATE "C", c3 text collate "POSIX"); +CREATE TYPE typ1 AS (m1 int, m2 varchar COLLATE "C"); CREATE TABLE import_source.t3 (c1 timestamptz default now(), c2 typ1); -CREATE TABLE import_source."x 4" (c1 float8, "C 2" text, c3 varchar(42)); +CREATE TABLE import_source."x 4" (c1 float8, "C 2" text COLLATE "C", c3 varchar(42) COLLATE "C"); CREATE TABLE import_source."x 5" (c1 float8); ALTER TABLE import_source."x 5" DROP COLUMN c1; CREATE TABLE import_source."x 6" (c1 int, c2 int generated always as (c1 * 2) stored); diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index 0075bc3dbb..fc368db47a 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -97,6 +97,12 @@ need, surprising semantic anomalies may arise when types or collations do not match, due to the remote server interpreting <literal>WHERE</literal> clauses slightly differently from the local server. + Also, while you need not apply any explicit <literal>COLLATE</literal> + clause to a collatable column of a foreign table, not doing so will cause + the planner to assume that the remote column's collation does not match any + locally available collation. This will generally result in not sending + <literal>WHERE</literal> clauses involving such a column to the remote + server, which may greatly degrade query performance. </para> <para> @@ -537,6 +543,10 @@ OPTIONS (ADD password_required 'false'); need to turn this off if the remote server has a different set of collation names than the local server does, which is likely to be the case if it's running on a different operating system. + (Note, however, that simply turning this off may result in + non-optimal plans for queries on the imported tables. It's best to + find a way to label collatable columns with collations that match the + remote server's behavior.) </para> </listitem> </varlistentry>
pgsql-hackers by date: