Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails |
Date | |
Msg-id | 1311836.1597781384@sss.pgh.pa.us Whole thread Raw |
Responses |
Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
|
List | pgsql-hackers |
I wrote: >> So I think what is happening here is that postgres_fdw's version of >> IMPORT FOREIGN SCHEMA translates "COLLATE default" on the remote >> server to "COLLATE default" on the local one, which of course is >> a big fail if the defaults don't match. That allows the local >> planner to believe that remote ORDER BYs on the two foreign tables >> will give compatible results, causing the merge join to not work >> very well at all. Here's a full patch addressing this issue. I decided that the best way to address the test-instability problem is to explicitly give collations to all the foreign-table columns for which it matters in the postgres_fdw test. (For portability's sake, that has to be "C" or "POSIX"; I mostly used "C".) Aside from ensuring that the test still passes with some other prevailing locale, this seems like a good idea since we'll then be testing the case we are encouraging users to use. And indeed, it immediately turned up a new problem: if we explicitly assign a collation to a foreign-table column c, the system won't ship WHERE clauses as simple as "c = 'foo'" to the remote. This surprised me, but the reason turned out to be that what postgres_fdw is actually seeing is something like {OPEXPR :opno 98 :opfuncid 67 :opresulttype 16 :opretset false :opcollid 0 :inputcollid 950 :args ( {VAR :varno 6 :varattno 4 :vartype 25 :vartypmod -1 :varcollid 950 :varlevelsup 0 :varnosyn 6 :varattnosyn 4 :location 171 } {RELABELTYPE :arg {CONST :consttype 25 :consttypmod -1 :constcollid 100 :constlen -1 :constbyval false :constisnull false :location 341 :constvalue 9 [ 36 0 0 0 48 48 48 48 49 ] } :resulttype 25 :resulttypmod -1 :resultcollid 950 :relabelformat 2 :location -1 } ) :location -1 } that is, the constant is being explicitly relabeled with the correct collation, and thus is_foreign_expr() thinks the collation shown by the RelabelType node is an unsafely-introduced collation. What I did about this was to change the recursion rule in foreign_expr_walker() so that merging a safely-derived collation with the same collation unsafely derived is considered safe. I think this is all right, and it allows us to accept some cases that previously were rejected as unsafe. But I might be missing something. (BTW, there's an independent bug here, which is that we're getting a tree of the above shape rather than a simple Const with the appropriate collation; that is, this tree isn't fully const-folded. This is a bug in canonicalize_ec_expression, which I'll go fix separately. But it won't affect the problem at hand.) This seems like a sufficiently large change in postgres_fdw's behavior to require review, so I'll go add this to the next CF. regards, tom lane diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index ad37a74221..f0693a061c 100644 --- a/contrib/postgres_fdw/deparse.c +++ b/contrib/postgres_fdw/deparse.c @@ -320,7 +320,9 @@ foreign_expr_walker(Node *node, /* * If the Var is from the foreign table, we consider its - * collation (if any) safe to use. If it is from another + * collation (if any) safe to use, *unless* it's + * DEFAULT_COLLATION_OID. We treat that as meaning "we don't + * know which collation this is". If it is from another * table, we treat its collation the same way as we would a * Param's collation, ie it's not safe for it to have a * non-default collation. @@ -342,7 +344,12 @@ foreign_expr_walker(Node *node, /* Else check the collation */ collation = var->varcollid; - state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE; + if (collation == InvalidOid) + state = FDW_COLLATE_NONE; + else if (collation == DEFAULT_COLLATION_OID) + state = FDW_COLLATE_UNSAFE; + else + state = FDW_COLLATE_SAFE; } else { @@ -808,8 +815,24 @@ foreign_expr_walker(Node *node, /* * Now, merge my collation information into my parent's state. + * + * If one branch of an expression derives a non-default collation safely + * (that is, from a foreign Var) and another one derives the same + * collation unsafely, we can consider the expression safe overall. This + * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the + * same collation the foreign_var has anyway. Note that we will not ship + * any explicit COLLATE clause to the remote, but rely on it to re-derive + * the correct collation based on the foreign_var. */ - if (state > outer_cxt->state) + if (collation == outer_cxt->collation && + ((state == FDW_COLLATE_UNSAFE && + outer_cxt->state == FDW_COLLATE_SAFE) || + (state == FDW_COLLATE_SAFE && + outer_cxt->state == FDW_COLLATE_UNSAFE))) + { + outer_cxt->state = FDW_COLLATE_SAFE; + } + else if (state > outer_cxt->state) { /* Override previous parent state */ outer_cxt->collation = collation; diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index 90db550b92..4707e75d27 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -27,29 +27,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); -- Disable autovacuum for these tables to avoid unexpected effects of that @@ -89,16 +89,18 @@ ANALYZE "S 1"."T 3"; ANALYZE "S 1"."T 4"; -- =================================================================== -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- =================================================================== CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft1', + c6 varchar(10) collate "C", + c7 char(10) default 'ft1' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; @@ -106,28 +108,28 @@ CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft2', + c6 varchar(10) collate "C", + c7 char(10) default 'ft2' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft6 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4'); -- =================================================================== -- tests for validator @@ -277,7 +279,7 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 ORDER BY t1.c3, t1.c1, t1.tabl Output: c1, c2, c3, c4, c5, c6, c7, c8, tableoid -> Sort Output: c1, c2, c3, c4, c5, c6, c7, c8, tableoid - Sort Key: t1.c3, t1.c1, t1.tableoid + Sort Key: t1.c3 COLLATE "C", t1.c1, t1.tableoid -> Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8, tableoid Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" @@ -881,12 +883,12 @@ EXPLAIN (VERBOSE, COSTS OFF) (6 rows) EXPLAIN (VERBOSE, COSTS OFF) - SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "C"; + SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "POSIX"; QUERY PLAN ------------------------------------------------------------------------------- Sort Output: c1, c2, c3, c4, c5, c6, c7, c8, ((c3)::text) - Sort Key: ft2.c1, ft2.c3 COLLATE "C" + Sort Key: ft2.c1, ft2.c3 COLLATE "POSIX" -> Foreign Scan on public.ft2 Output: c1, c2, c3, c4, c5, c6, c7, c8, c3 Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" @@ -1808,7 +1810,7 @@ WITH t (c1_1, c1_3, c2_1) AS MATERIALIZED (SELECT t1.c1, t1.c3, t2.c1 FROM ft1 t Remote SQL: SELECT r1."C 1", r1.c3, r2."C 1" FROM ("S 1"."T 1" r1 INNER JOIN "S 1"."T 1" r2 ON (((r1."C 1" =r2."C 1")))) -> Sort Output: t.c1_1, t.c2_1, t.c1_3 - Sort Key: t.c1_3, t.c1_1 + Sort Key: t.c1_3 COLLATE "C", t.c1_1 -> CTE Scan on t Output: t.c1_1, t.c2_1, t.c1_3 (12 rows) @@ -2008,7 +2010,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 LEFT JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 Output: t1.c1, t2.c1, t1.c3 -> Sort Output: t1.c1, t2.c1, t1.c3 - Sort Key: t1.c3, t1.c1 + Sort Key: t1.c3 COLLATE "C", t1.c1 -> Hash Right Join Output: t1.c1, t2.c1, t1.c3 Hash Cond: (t2.c1 = t1.c1) @@ -2050,7 +2052,7 @@ SELECT t1.c1, t2.c1 FROM ft1 t1 JOIN ft2 t2 ON (t1.c1 = t2.c1) WHERE t1.c8 = t2. Output: t1.c1, t2.c1, t1.c3 -> Sort Output: t1.c1, t2.c1, t1.c3 - Sort Key: t1.c3, t1.c1 + Sort Key: t1.c3 COLLATE "C", t1.c1 -> Foreign Scan Output: t1.c1, t2.c1, t1.c3 Filter: (t1.c8 = t2.c8) @@ -3495,7 +3497,7 @@ select c2, c6, sum(c1) from ft1 where c2 < 3 group by grouping sets(c2, c6) orde ---------------------------------------------------------------------------------- Sort Output: c2, c6, (sum(c1)) - Sort Key: ft1.c2, ft1.c6 + Sort Key: ft1.c2, ft1.c6 COLLATE "C" -> HashAggregate Output: c2, c6, sum(c1) Hash Key: ft1.c2 @@ -3879,12 +3881,12 @@ EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st6; PREPARE st7 AS INSERT INTO ft1 (c1,c2,c3) VALUES (1001,101,'foo'); EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st7; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft1 Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) -> Result - Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying, 'ft1 '::character(10), NULL::user_enum + Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying COLLATE "C", 'ft1 '::character(10), NULL::user_enum (4 rows) ALTER TABLE "S 1"."T 1" RENAME TO "T 0"; @@ -3912,12 +3914,12 @@ EXECUTE st6; (9 rows) EXPLAIN (VERBOSE, COSTS OFF) EXECUTE st7; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft1 Remote SQL: INSERT INTO "S 1"."T 0"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) -> Result - Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying, 'ft1 '::character(10), NULL::user_enum + Output: NULL::integer, 1001, 101, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying COLLATE "C", 'ft1 '::character(10), NULL::user_enum (4 rows) ALTER TABLE "S 1"."T 0" RENAME TO "T 1"; @@ -4125,36 +4127,14 @@ explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo'; Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f1 = 'foo'::text)) (3 rows) -explain (verbose, costs off) select * from ft3 where f2 = 'foo'; - QUERY PLAN ------------------------------------------------------------------------------- - Foreign Scan on public.ft3 - Output: f1, f2, f3 - Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f2 = 'foo'::text)) -(3 rows) - -explain (verbose, costs off) select * from ft3 where f3 = 'foo'; +explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; QUERY PLAN ------------------------------------------------------------------------------ Foreign Scan on public.ft3 Output: f1, f2, f3 - Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f3 = 'foo'::text)) + Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE ((f1 = 'foo'::text)) (3 rows) -explain (verbose, costs off) select * from ft3 f, loct3 l - where f.f3 = l.f3 and l.f1 = 'foo'; - QUERY PLAN --------------------------------------------------------------------------------------------------- - Nested Loop - Output: f.f1, f.f2, f.f3, l.f1, l.f2, l.f3 - -> Index Scan using loct3_f1_key on public.loct3 l - Output: l.f1, l.f2, l.f3 - Index Cond: (l.f1 = 'foo'::text) - -> Foreign Scan on public.ft3 f - Output: f.f1, f.f2, f.f3 - Remote SQL: SELECT f1, f2, f3 FROM public.loct3 WHERE (($1::character varying(10) = f3)) -(8 rows) - -- can't be sent to remote explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo'; QUERY PLAN @@ -4165,12 +4145,12 @@ explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo'; Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (4 rows) -explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; +explain (verbose, costs off) select * from ft3 where f2 = 'foo'; QUERY PLAN --------------------------------------------------- Foreign Scan on public.ft3 Output: f1, f2, f3 - Filter: (ft3.f1 = 'foo'::text COLLATE "C") + Filter: (ft3.f2 = 'foo'::text) Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (4 rows) @@ -4192,6 +4172,33 @@ explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C"; Remote SQL: SELECT f1, f2, f3 FROM public.loct3 (4 rows) +explain (verbose, costs off) select * from ft3 where f3 = 'foo'; + QUERY PLAN +--------------------------------------------------- + Foreign Scan on public.ft3 + Output: f1, f2, f3 + Filter: ((ft3.f3)::text = 'foo'::text) + Remote SQL: SELECT f1, f2, f3 FROM public.loct3 +(4 rows) + +explain (verbose, costs off) select * from ft3 f, loct3 l + where f.f3 = l.f3 and l.f1 = 'foo'; + QUERY PLAN +------------------------------------------------------------- + Hash Join + Output: f.f1, f.f2, f.f3, l.f1, l.f2, l.f3 + Inner Unique: true + Hash Cond: ((f.f3)::text = (l.f3)::text) + -> Foreign Scan on public.ft3 f + Output: f.f1, f.f2, f.f3 + Remote SQL: SELECT f1, f2, f3 FROM public.loct3 + -> Hash + Output: l.f1, l.f2, l.f3 + -> Index Scan using loct3_f1_key on public.loct3 l + Output: l.f1, l.f2, l.f3 + Index Cond: (l.f1 = 'foo'::text) +(12 rows) + explain (verbose, costs off) select * from ft3 f, loct3 l where f.f3 = l.f3 COLLATE "POSIX" and l.f1 = 'foo'; QUERY PLAN @@ -4215,12 +4222,12 @@ explain (verbose, costs off) select * from ft3 f, loct3 l -- =================================================================== EXPLAIN (verbose, costs off) INSERT INTO ft2 (c1,c2,c3) SELECT c1+1000,c2+100, c3 || c3 FROM ft2 LIMIT 20; - QUERYPLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft2 Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) -> Subquery Scan on "*SELECT*" - Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1", NULL::integer, "*SELECT*"."?column?_2", NULL::timestampwith time zone, NULL::timestamp without time zone, NULL::character varying, 'ft2 '::character(10), NULL::user_enum + Output: "*SELECT*"."?column?", "*SELECT*"."?column?_1", NULL::integer, "*SELECT*"."?column?_2", NULL::timestampwith time zone, NULL::timestamp without time zone, NULL::character varying COLLATE "C", 'ft2 '::character(10),NULL::user_enum -> Foreign Scan on public.ft2 ft2_1 Output: (ft2_1.c1 + 1000), (ft2_1.c2 + 100), (ft2_1.c3 || ft2_1.c3) Remote SQL: SELECT "C 1", c2, c3 FROM "S 1"."T 1" LIMIT 20::bigint @@ -5330,13 +5337,13 @@ SELECT c1,c2,c3,c4 FROM ft2 ORDER BY c1; EXPLAIN (verbose, costs off) INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass; - QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + QUERY PLAN +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Insert on public.ft2 Output: (ft2.tableoid)::regclass Remote SQL: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8) -> Result - Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying, 'ft2 '::character(10), NULL::user_enum + Output: 1200, 999, NULL::integer, 'foo'::text, NULL::timestamp with time zone, NULL::timestamp without time zone,NULL::character varying COLLATE "C", 'ft2 '::character(10), NULL::user_enum (5 rows) INSERT INTO ft2 (c1,c2,c3) VALUES (1200,999,'foo') RETURNING tableoid::regclass; @@ -7551,12 +7558,12 @@ NOTICE: drop cascades to foreign table bar2 drop table loct1; drop table loct2; -- Test pushing down UPDATE/DELETE joins to the remote server -create table parent (a int, b text); -create table loct1 (a int, b text); -create table loct2 (a int, b text); -create foreign table remt1 (a int, b text) +create table parent (a int, b text collate "C"); +create table loct1 (a int, b text collate "C"); +create table loct2 (a int, b text collate "C"); +create foreign table remt1 (a int, b text collate "C") server loopback options (table_name 'loct1'); -create foreign table remt2 (a int, b text) +create foreign table remt2 (a int, b text collate "C") server loopback options (table_name 'loct2'); alter foreign table remt1 inherit parent; insert into remt1 values (1, 'foo'); @@ -7755,10 +7762,10 @@ drop table itrtest; drop table loct1; drop table loct2; -- Test update tuple routing -create table utrtest (a int, b text) partition by list (a); -create table loct (a int check (a in (1)), b text); -create foreign table remp (a int check (a in (1)), b text) server loopback options (table_name 'loct'); -create table locp (a int check (a in (2)), b text); +create table utrtest (a int, b text collate "C") partition by list (a); +create table loct (a int check (a in (1)), b text collate "C"); +create foreign table remp (a int check (a in (1)), b text collate "C") server loopback options (table_name 'loct'); +create table locp (a int check (a in (2)), b text collate "C"); alter table utrtest attach partition remp for values in (1); alter table utrtest attach partition locp for values in (2); insert into utrtest values (1, 'foo'); @@ -7942,7 +7949,7 @@ alter table utrtest detach partition remp; drop foreign table remp; alter table loct drop constraint loct_a_check; alter table loct add check (a in (3)); -create foreign table remp (a int check (a in (3)), b text) server loopback options (table_name 'loct'); +create foreign table remp (a int check (a in (3)), b text collate "C") server loopback options (table_name 'loct'); alter table utrtest attach partition remp for values in (3); insert into utrtest values (2, 'qux'); insert into utrtest values (3, 'xyzzy'); @@ -8203,11 +8210,11 @@ drop table loc3; -- test IMPORT FOREIGN SCHEMA -- =================================================================== CREATE SCHEMA import_source; -CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL); -CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL, c3 text collate "POSIX"); -CREATE TYPE typ1 AS (m1 int, m2 varchar); +CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL COLLATE "C"); +CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL COLLATE "C", c3 text collate "POSIX"); +CREATE TYPE typ1 AS (m1 int, m2 varchar COLLATE "C"); CREATE TABLE import_source.t3 (c1 timestamptz default now(), c2 typ1); -CREATE TABLE import_source."x 4" (c1 float8, "C 2" text, c3 varchar(42)); +CREATE TABLE import_source."x 4" (c1 float8, "C 2" text COLLATE "C", c3 varchar(42) COLLATE "C"); CREATE TABLE import_source."x 5" (c1 float8); ALTER TABLE import_source."x 5" DROP COLUMN c1; CREATE TABLE import_source.t4 (c1 int) PARTITION BY RANGE (c1); @@ -8232,7 +8239,7 @@ IMPORT FOREIGN SCHEMA import_source FROM SERVER loopback INTO import_dest1; Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | | (column_name 'c1') - c2 | character varying | | not null | | (column_name 'c2') + c2 | character varying | C | not null | | (column_name 'c2') Server: loopback FDW options: (schema_name 'import_source', table_name 't1') @@ -8240,7 +8247,7 @@ FDW options: (schema_name 'import_source', table_name 't1') Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | | (column_name 'c1') - c2 | character varying | | | | (column_name 'c2') + c2 | character varying | C | | | (column_name 'c2') c3 | text | POSIX | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 't2') @@ -8264,8 +8271,8 @@ FDW options: (schema_name 'import_source', table_name 't4') Column | Type | Collation | Nullable | Default | FDW options --------+-----------------------+-----------+----------+---------+--------------------- c1 | double precision | | | | (column_name 'c1') - C 2 | text | | | | (column_name 'C 2') - c3 | character varying(42) | | | | (column_name 'c3') + C 2 | text | C | | | (column_name 'C 2') + c3 | character varying(42) | C | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 'x 4') @@ -8296,7 +8303,7 @@ IMPORT FOREIGN SCHEMA import_source FROM SERVER loopback INTO import_dest2 Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | | (column_name 'c1') - c2 | character varying | | not null | | (column_name 'c2') + c2 | character varying | C | not null | | (column_name 'c2') Server: loopback FDW options: (schema_name 'import_source', table_name 't1') @@ -8304,7 +8311,7 @@ FDW options: (schema_name 'import_source', table_name 't1') Column | Type | Collation | Nullable | Default | FDW options --------+-------------------+-----------+----------+---------+-------------------- c1 | integer | | | 42 | (column_name 'c1') - c2 | character varying | | | | (column_name 'c2') + c2 | character varying | C | | | (column_name 'c2') c3 | text | POSIX | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 't2') @@ -8328,8 +8335,8 @@ FDW options: (schema_name 'import_source', table_name 't4') Column | Type | Collation | Nullable | Default | FDW options --------+-----------------------+-----------+----------+---------+--------------------- c1 | double precision | | | | (column_name 'c1') - C 2 | text | | | | (column_name 'C 2') - c3 | character varying(42) | | | | (column_name 'c3') + C 2 | text | C | | | (column_name 'C 2') + c3 | character varying(42) | C | | | (column_name 'c3') Server: loopback FDW options: (schema_name 'import_source', table_name 'x 4') diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index 9fc53cad68..6e1b391808 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -18,6 +18,7 @@ #include "access/sysattr.h" #include "access/table.h" #include "catalog/pg_class.h" +#include "catalog/pg_collation.h" #include "commands/defrem.h" #include "commands/explain.h" #include "commands/vacuum.h" @@ -4802,44 +4803,54 @@ postgresImportForeignSchema(ImportForeignSchemaStmt *stmt, Oid serverOid) * include a schema name for types/functions in other schemas, which * is what we want. */ + appendStringInfoString(&buf, + "SELECT relname, " + " attname, " + " format_type(atttypid, atttypmod), " + " attnotnull, " + " pg_get_expr(adbin, adrelid), "); + if (import_collate) + appendStringInfo(&buf, + " CASE WHEN coll.oid = '%u' THEN" + " defcoll.collname ELSE coll.collname END, " + " CASE WHEN coll.oid = '%u' THEN" + " defcoll.nspname ELSE collnsp.nspname END ", + DEFAULT_COLLATION_OID, + DEFAULT_COLLATION_OID); + else + appendStringInfoString(&buf, " NULL, NULL "); + appendStringInfoString(&buf, + "FROM pg_class c " + " JOIN pg_namespace n ON " + " relnamespace = n.oid " + " LEFT JOIN pg_attribute a ON " + " attrelid = c.oid AND attnum > 0 " + " AND NOT attisdropped " + " LEFT JOIN pg_attrdef ad ON " + " adrelid = c.oid AND adnum = attnum "); if (import_collate) + { appendStringInfoString(&buf, - "SELECT relname, " - " attname, " - " format_type(atttypid, atttypmod), " - " attnotnull, " - " pg_get_expr(adbin, adrelid), " - " collname, " - " collnsp.nspname " - "FROM pg_class c " - " JOIN pg_namespace n ON " - " relnamespace = n.oid " - " LEFT JOIN pg_attribute a ON " - " attrelid = c.oid AND attnum > 0 " - " AND NOT attisdropped " - " LEFT JOIN pg_attrdef ad ON " - " adrelid = c.oid AND adnum = attnum " " LEFT JOIN pg_collation coll ON " " coll.oid = attcollation " " LEFT JOIN pg_namespace collnsp ON " - " collnsp.oid = collnamespace "); - else + " collnsp.oid = collnamespace " + " LEFT JOIN (" + " SELECT cd.collname, nd.nspname FROM" + " pg_collation cd, pg_namespace nd, pg_database d" + " WHERE nd.oid = cd.collnamespace AND" + " d.datname = current_database() AND"); + /* collprovider is new as of v10 */ + if (PQserverVersion(conn) >= 100000) + appendStringInfoString(&buf, + " cd.collprovider = 'c' AND"); appendStringInfoString(&buf, - "SELECT relname, " - " attname, " - " format_type(atttypid, atttypmod), " - " attnotnull, " - " pg_get_expr(adbin, adrelid), " - " NULL, NULL " - "FROM pg_class c " - " JOIN pg_namespace n ON " - " relnamespace = n.oid " - " LEFT JOIN pg_attribute a ON " - " attrelid = c.oid AND attnum > 0 " - " AND NOT attisdropped " - " LEFT JOIN pg_attrdef ad ON " - " adrelid = c.oid AND adnum = attnum "); - + " cd.collcollate = d.datcollate AND" + " cd.collctype = d.datctype AND" + " cd.collencoding IN (d.encoding, -1)" + " ORDER BY length(cd.collname) LIMIT 1 )" + " defcoll ON TRUE "); + } appendStringInfoString(&buf, "WHERE c.relkind IN (" CppAsString2(RELKIND_RELATION) "," diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 83971665e3..8454f7ae27 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -31,29 +31,29 @@ CREATE SCHEMA "S 1"; CREATE TABLE "S 1"."T 1" ( "C 1" int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10), + c6 varchar(10) collate "C", + c7 char(10) collate "C", c8 user_enum, CONSTRAINT t1_pkey PRIMARY KEY ("C 1") ); CREATE TABLE "S 1"."T 2" ( c1 int NOT NULL, - c2 text, + c2 text collate "C", CONSTRAINT t2_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 3" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t3_pkey PRIMARY KEY (c1) ); CREATE TABLE "S 1"."T 4" ( c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", CONSTRAINT t4_pkey PRIMARY KEY (c1) ); @@ -97,16 +97,18 @@ ANALYZE "S 1"."T 4"; -- =================================================================== -- create foreign tables +-- Note: to ensure stable regression results, all collatable columns +-- in these tables must have explicitly-specified collations. -- =================================================================== CREATE FOREIGN TABLE ft1 ( c0 int, c1 int NOT NULL, c2 int NOT NULL, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft1', + c6 varchar(10) collate "C", + c7 char(10) default 'ft1' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft1 DROP COLUMN c0; @@ -115,11 +117,11 @@ CREATE FOREIGN TABLE ft2 ( c1 int NOT NULL, c2 int NOT NULL, cx int, - c3 text, + c3 text collate "C", c4 timestamptz, c5 timestamp, - c6 varchar(10), - c7 char(10) default 'ft2', + c6 varchar(10) collate "C", + c7 char(10) default 'ft2' collate "C", c8 user_enum ) SERVER loopback; ALTER FOREIGN TABLE ft2 DROP COLUMN cx; @@ -127,19 +129,19 @@ ALTER FOREIGN TABLE ft2 DROP COLUMN cx; CREATE FOREIGN TABLE ft4 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 3'); CREATE FOREIGN TABLE ft5 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback OPTIONS (schema_name 'S 1', table_name 'T 4'); CREATE FOREIGN TABLE ft6 ( c1 int NOT NULL, c2 int NOT NULL, - c3 text + c3 text collate "C" ) SERVER loopback2 OPTIONS (schema_name 'S 1', table_name 'T 4'); -- =================================================================== @@ -332,7 +334,7 @@ SELECT * FROM ft2 WHERE c1 = ANY (ARRAY(SELECT c1 FROM ft1 WHERE c1 < 5)); EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft2 ORDER BY ft2.c1, random(); EXPLAIN (VERBOSE, COSTS OFF) - SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "C"; + SELECT * FROM ft2 ORDER BY ft2.c1, ft2.c3 collate "POSIX"; -- user-defined operator/function CREATE FUNCTION postgres_fdw_abs(int) RETURNS int AS $$ @@ -1119,15 +1121,15 @@ create foreign table ft3 (f1 text collate "C", f2 text, f3 varchar(10)) -- can be sent to remote explain (verbose, costs off) select * from ft3 where f1 = 'foo'; explain (verbose, costs off) select * from ft3 where f1 COLLATE "C" = 'foo'; -explain (verbose, costs off) select * from ft3 where f2 = 'foo'; -explain (verbose, costs off) select * from ft3 where f3 = 'foo'; -explain (verbose, costs off) select * from ft3 f, loct3 l - where f.f3 = l.f3 and l.f1 = 'foo'; +explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; -- can't be sent to remote explain (verbose, costs off) select * from ft3 where f1 COLLATE "POSIX" = 'foo'; -explain (verbose, costs off) select * from ft3 where f1 = 'foo' COLLATE "C"; +explain (verbose, costs off) select * from ft3 where f2 = 'foo'; explain (verbose, costs off) select * from ft3 where f2 COLLATE "C" = 'foo'; explain (verbose, costs off) select * from ft3 where f2 = 'foo' COLLATE "C"; +explain (verbose, costs off) select * from ft3 where f3 = 'foo'; +explain (verbose, costs off) select * from ft3 f, loct3 l + where f.f3 = l.f3 and l.f1 = 'foo'; explain (verbose, costs off) select * from ft3 f, loct3 l where f.f3 = l.f3 COLLATE "POSIX" and l.f1 = 'foo'; @@ -1931,12 +1933,12 @@ drop table loct1; drop table loct2; -- Test pushing down UPDATE/DELETE joins to the remote server -create table parent (a int, b text); -create table loct1 (a int, b text); -create table loct2 (a int, b text); -create foreign table remt1 (a int, b text) +create table parent (a int, b text collate "C"); +create table loct1 (a int, b text collate "C"); +create table loct2 (a int, b text collate "C"); +create foreign table remt1 (a int, b text collate "C") server loopback options (table_name 'loct1'); -create foreign table remt2 (a int, b text) +create foreign table remt2 (a int, b text collate "C") server loopback options (table_name 'loct2'); alter foreign table remt1 inherit parent; @@ -2029,10 +2031,10 @@ drop table loct1; drop table loct2; -- Test update tuple routing -create table utrtest (a int, b text) partition by list (a); -create table loct (a int check (a in (1)), b text); -create foreign table remp (a int check (a in (1)), b text) server loopback options (table_name 'loct'); -create table locp (a int check (a in (2)), b text); +create table utrtest (a int, b text collate "C") partition by list (a); +create table loct (a int check (a in (1)), b text collate "C"); +create foreign table remp (a int check (a in (1)), b text collate "C") server loopback options (table_name 'loct'); +create table locp (a int check (a in (2)), b text collate "C"); alter table utrtest attach partition remp for values in (1); alter table utrtest attach partition locp for values in (2); @@ -2109,7 +2111,7 @@ alter table utrtest detach partition remp; drop foreign table remp; alter table loct drop constraint loct_a_check; alter table loct add check (a in (3)); -create foreign table remp (a int check (a in (3)), b text) server loopback options (table_name 'loct'); +create foreign table remp (a int check (a in (3)), b text collate "C") server loopback options (table_name 'loct'); alter table utrtest attach partition remp for values in (3); insert into utrtest values (2, 'qux'); insert into utrtest values (3, 'xyzzy'); @@ -2312,11 +2314,11 @@ drop table loc3; -- =================================================================== CREATE SCHEMA import_source; -CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL); -CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL, c3 text collate "POSIX"); -CREATE TYPE typ1 AS (m1 int, m2 varchar); +CREATE TABLE import_source.t1 (c1 int, c2 varchar NOT NULL COLLATE "C"); +CREATE TABLE import_source.t2 (c1 int default 42, c2 varchar NULL COLLATE "C", c3 text collate "POSIX"); +CREATE TYPE typ1 AS (m1 int, m2 varchar COLLATE "C"); CREATE TABLE import_source.t3 (c1 timestamptz default now(), c2 typ1); -CREATE TABLE import_source."x 4" (c1 float8, "C 2" text, c3 varchar(42)); +CREATE TABLE import_source."x 4" (c1 float8, "C 2" text COLLATE "C", c3 varchar(42) COLLATE "C"); CREATE TABLE import_source."x 5" (c1 float8); ALTER TABLE import_source."x 5" DROP COLUMN c1; CREATE TABLE import_source.t4 (c1 int) PARTITION BY RANGE (c1); diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml index eab2cc9378..50601fb208 100644 --- a/doc/src/sgml/postgres-fdw.sgml +++ b/doc/src/sgml/postgres-fdw.sgml @@ -89,6 +89,12 @@ need, surprising semantic anomalies may arise when types or collations do not match, due to the remote server interpreting <literal>WHERE</literal> clauses slightly differently from the local server. + Also, while you need not apply any explicit <literal>COLLATE</literal> + clause to a collatable column of a foreign table, not doing so will cause + the planner to assume that the remote column's collation does not match any + locally available collation. This will generally result in not sending + <literal>WHERE</literal> clauses involving such a column to the remote + server, which may greatly degrade query performance. </para> <para> @@ -423,6 +429,10 @@ OPTIONS (ADD password_required 'false'); need to turn this off if the remote server has a different set of collation names than the local server does, which is likely to be the case if it's running on a different operating system. + (Note, however, that simply turning this off may result in + non-optimal plans for queries on the imported tables. It's best to + find a way to label collatable columns with collations that match the + remote server's behavior.) </para> </listitem> </varlistentry>
pgsql-hackers by date: