Thread: BUG #15112: Unable to run pg_upgrade with earthdistance extension
The following bug has been logged on the website: Bug reference: 15112 Logged by: Keiko Oda Email address: keiko713@gmail.com PostgreSQL version: 10.3 Operating system: Ubuntu Description: This is similar to the bug report from 2016 (https://www.postgresql.org/message-id/20161015001424.1413.93990@wrigleys.postgresql.org). where earthdistance extension is assuming that it's executed with public schema in search_path. With 10.3 release, Postgres has tighten the search_path during pg_upgrade only to pg_catalog. This is problematic as it's now impossible to run pg_upgrade with earthdistance extension (it fails with create index with ll_to_earth function). Prior to this, we were at least able to workaround by adding public schema in search_path with pg_upgrade. As it's recommended in the release note, earthdistance should adjust these functions to not assume anything about what search path they are invoked under.
On Wed, Mar 14, 2018 at 11:12:26PM +0000, PG Bug reporting form wrote: > The following bug has been logged on the website: > > Bug reference: 15112 > Logged by: Keiko Oda > Email address: keiko713@gmail.com > PostgreSQL version: 10.3 > Operating system: Ubuntu > Description: > > This is similar to the bug report from 2016 > (https://www.postgresql.org/message-id/20161015001424.1413.93990@wrigleys.postgresql.org). > where earthdistance extension is assuming that it's executed with public > schema in search_path. > > With 10.3 release, Postgres has tighten the search_path during pg_upgrade > only to pg_catalog. This is problematic as it's now impossible to run > pg_upgrade with earthdistance extension (it fails with create index with > ll_to_earth function). > Prior to this, we were at least able to workaround by adding public schema > in search_path with pg_upgrade. > > As it's recommended in the release note, earthdistance should adjust these > functions to not assume anything about what search path they are invoked > under. Uh, I can reproduce this failure. :-( I tested it by installing earchdistance (and cube) and an index using an earchdistance function in the old cluster and ran pg_upgrade. I used the instructions in the referenced email: CREATE TABLE zip_code_geo_poly_data ( id serial NOT NULL PRIMARY KEY, zip_code TEXT, latitude NUMERIC, longitude NUMERIC ); CREATE INDEX zip_code_geo_poly_data_ll_to_earth ON zip_code_geo_poly_data USING gist(ll_to_earth(latitude, longitude)); The failure is actually in pg_dump/restore, but pg_upgrade relies on that, so it fails too. The failure output is: LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth ^ QUERY: SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth CONTEXT: SQL function "ll_to_earth" during inlining Command was: -- For binary upgrade, must preserve pg_class oids SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('16498'::pg_catalog.oid); CREATE INDEX "zip_code_geo_poly_data_ll_to_earth" ON "public"."zip_code_geo_poly_data" USING "gist" ("public"."ll_to_earth"(("latitude")::doubleprecision, ("longitude")::double precision)); The function definition of ll_to_earth() is (as defined in earthdistance--1.1.sql): CREATE FUNCTION ll_to_earth(float8, float8) RETURNS earth LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE AS 'SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth'; As you can see, the SQL function is doing a cast to ::earth, but in this test case the earth cast is stored in the public schema, and the restore doesn't reference it. I am not sure we can fix this without requiring people to drop and recreate such indexes. However, I am even at a loss in how to fix the CREATE FUNCTION to reference a cast in the same schema as the function, in this case 'public'. We can rewrite the cast to not use :: and use a function call with schema qualification. e.g. public.earth(), but how do we know what schema that is in, i.e. what if the extension is loaded into a schema other than public? FYI, earthdistance is certainly not the only case of this problem. This is also part of a larger problem with our new schema qualification approach. While we are qualifying where the object is to be created, we are not always qualifying the object's contents, i.e. in this case, the SQL function body. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Fri, Mar 30, 2018 at 04:52:29PM -0400, Bruce Momjian wrote: > On Wed, Mar 14, 2018 at 11:12:26PM +0000, PG Bug reporting form wrote: > > The following bug has been logged on the website: > > > > Bug reference: 15112 > > Logged by: Keiko Oda > > Email address: keiko713@gmail.com > > PostgreSQL version: 10.3 > > Operating system: Ubuntu > > Description: > > > > This is similar to the bug report from 2016 > > (https://www.postgresql.org/message-id/20161015001424.1413.93990@wrigleys.postgresql.org). > > where earthdistance extension is assuming that it's executed with public > > schema in search_path. > > > > With 10.3 release, Postgres has tighten the search_path during pg_upgrade > > only to pg_catalog. This is problematic as it's now impossible to run > > pg_upgrade with earthdistance extension (it fails with create index with > > ll_to_earth function). > > Prior to this, we were at least able to workaround by adding public schema > > in search_path with pg_upgrade. > > > > As it's recommended in the release note, earthdistance should adjust these > > functions to not assume anything about what search path they are invoked > > under. > > Uh, I can reproduce this failure. :-( > > I tested it by installing earchdistance (and cube) and an index using an > earchdistance function in the old cluster and ran pg_upgrade. I used > the instructions in the referenced email: > > CREATE TABLE zip_code_geo_poly_data ( > id serial NOT NULL PRIMARY KEY, > zip_code TEXT, > latitude NUMERIC, > longitude NUMERIC > ); > > CREATE INDEX zip_code_geo_poly_data_ll_to_earth > ON zip_code_geo_poly_data > USING gist(ll_to_earth(latitude, longitude)); > > The failure is actually in pg_dump/restore, but pg_upgrade relies on > that, so it fails too. The failure output is: > > LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth > ^ > QUERY: SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth > CONTEXT: SQL function "ll_to_earth" during inlining > Command was: > -- For binary upgrade, must preserve pg_class oids > SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('16498'::pg_catalog.oid); > > CREATE INDEX "zip_code_geo_poly_data_ll_to_earth" ON "public"."zip_code_geo_poly_data" USING "gist" ("public"."ll_to_earth"(("latitude")::doubleprecision, ("longitude")::double precision)); > > The function definition of ll_to_earth() is (as defined in > earthdistance--1.1.sql): > > CREATE FUNCTION ll_to_earth(float8, float8) > RETURNS earth > LANGUAGE SQL > IMMUTABLE STRICT > PARALLEL SAFE > AS 'SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth'; > > As you can see, the SQL function is doing a cast to ::earth, but in this > test case the earth cast is stored in the public schema, and the restore > doesn't reference it. > > I am not sure we can fix this without requiring people to drop and > recreate such indexes. However, I am even at a loss in how to fix the > CREATE FUNCTION to reference a cast in the same schema as the function, > in this case 'public'. We can rewrite the cast to not use :: and use a > function call with schema qualification. e.g. public.earth(), but how do > we know what schema that is in, i.e. what if the extension is loaded > into a schema other than public? > > FYI, earthdistance is certainly not the only case of this problem. > > This is also part of a larger problem with our new schema qualification > approach. While we are qualifying where the object is to be created, we > are not always qualifying the object's contents, i.e. in this case, the > SQL function body. I have not received a reply to this so I am CCing the hackers list for more input. (If this is not the proper procedure, please let me know.) The only additional thoughts I have are that when we removed public from pg_dump's search_path and decided to instead fully qualify object creation names, I was concerned about function creation failing because of 'public' schema object references in the function body, but someone pointed out that we don't check function bodies during restore because of: SET check_function_bodies = false; However, what I did not consider was creation of indexes based on function bodies with 'public' schema references. I have ran some tests and found out that only SQL functions cause such failures. I believe this is because they are inlined during CREATE INDEX. The attached pg_dump output file recreates the failure, and shows that prefixing it with "public" avoids the failure, and shows that indexing of a PL/pgSQL function doesn't fail (see SQL comments). A more general issue with functions is that if you schema-qualify a call to a function, you will still need all object references in the function to reference the schema where the object is installed if they are not in search_path, i.e. schema qualifying the object doesn't propagate that schema qualification into the function body (functions look in search_path, not necessarily in the schema in which they are defined). I am not sure what to suggest, but I don't think we can ignore this problem. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Attachment
On Fri, Mar 30, 2018 at 04:52:29PM -0400, Bruce Momjian wrote: > On Wed, Mar 14, 2018 at 11:12:26PM +0000, PG Bug reporting form wrote: > > The following bug has been logged on the website: > > > > Bug reference: 15112 > > Logged by: Keiko Oda > > Email address: keiko713@gmail.com > > PostgreSQL version: 10.3 > > Operating system: Ubuntu > > Description: > > > > This is similar to the bug report from 2016 > > (https://www.postgresql.org/message-id/20161015001424.1413.93990@wrigleys.postgresql.org). > > where earthdistance extension is assuming that it's executed with public > > schema in search_path. > > > > With 10.3 release, Postgres has tighten the search_path during pg_upgrade > > only to pg_catalog. This is problematic as it's now impossible to run > > pg_upgrade with earthdistance extension (it fails with create index with > > ll_to_earth function). > > Prior to this, we were at least able to workaround by adding public schema > > in search_path with pg_upgrade. > > > > As it's recommended in the release note, earthdistance should adjust these > > functions to not assume anything about what search path they are invoked > > under. > > Uh, I can reproduce this failure. :-( > > I tested it by installing earchdistance (and cube) and an index using an > earchdistance function in the old cluster and ran pg_upgrade. I used > the instructions in the referenced email: > > CREATE TABLE zip_code_geo_poly_data ( > id serial NOT NULL PRIMARY KEY, > zip_code TEXT, > latitude NUMERIC, > longitude NUMERIC > ); > > CREATE INDEX zip_code_geo_poly_data_ll_to_earth > ON zip_code_geo_poly_data > USING gist(ll_to_earth(latitude, longitude)); > > The failure is actually in pg_dump/restore, but pg_upgrade relies on > that, so it fails too. The failure output is: > > LINE 1: ...ians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth > ^ > QUERY: SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth > CONTEXT: SQL function "ll_to_earth" during inlining > Command was: > -- For binary upgrade, must preserve pg_class oids > SELECT pg_catalog.binary_upgrade_set_next_index_pg_class_oid('16498'::pg_catalog.oid); > > CREATE INDEX "zip_code_geo_poly_data_ll_to_earth" ON "public"."zip_code_geo_poly_data" USING "gist" ("public"."ll_to_earth"(("latitude")::doubleprecision, ("longitude")::double precision)); > > The function definition of ll_to_earth() is (as defined in > earthdistance--1.1.sql): > > CREATE FUNCTION ll_to_earth(float8, float8) > RETURNS earth > LANGUAGE SQL > IMMUTABLE STRICT > PARALLEL SAFE > AS 'SELECT cube(cube(cube(earth()*cos(radians($1))*cos(radians($2))),earth()*cos(radians($1))*sin(radians($2))),earth()*sin(radians($1)))::earth'; > > As you can see, the SQL function is doing a cast to ::earth, but in this > test case the earth cast is stored in the public schema, and the restore > doesn't reference it. > > I am not sure we can fix this without requiring people to drop and > recreate such indexes. However, I am even at a loss in how to fix the > CREATE FUNCTION to reference a cast in the same schema as the function, > in this case 'public'. We can rewrite the cast to not use :: and use a > function call with schema qualification. e.g. public.earth(), but how do > we know what schema that is in, i.e. what if the extension is loaded > into a schema other than public? > > FYI, earthdistance is certainly not the only case of this problem. > > This is also part of a larger problem with our new schema qualification > approach. While we are qualifying where the object is to be created, we > are not always qualifying the object's contents, i.e. in this case, the > SQL function body. I have not received a reply to this so I am CCing the hackers list for more input. (If this is not the proper procedure, please let me know.) The only additional thoughts I have are that when we removed public from pg_dump's search_path and decided to instead fully qualify object creation names, I was concerned about function creation failing because of 'public' schema object references in the function body, but someone pointed out that we don't check function bodies during restore because of: SET check_function_bodies = false; However, what I did not consider was creation of indexes based on function bodies with 'public' schema references. I have ran some tests and found out that only SQL functions cause such failures. I believe this is because they are inlined during CREATE INDEX. The attached pg_dump output file recreates the failure, and shows that prefixing it with "public" avoids the failure, and shows that indexing of a PL/pgSQL function doesn't fail (see SQL comments). A more general issue with functions is that if you schema-qualify a call to a function, you will still need all object references in the function to reference the schema where the object is installed if they are not in search_path, i.e. schema qualifying the object doesn't propagate that schema qualification into the function body (functions look in search_path, not necessarily in the schema in which they are defined). I am not sure what to suggest, but I don't think we can ignore this problem. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Attachment
On Mon, Apr 02, 2018 at 04:24:02PM -0400, Bruce Momjian wrote: > > I am not sure we can fix this without requiring people to drop and > > recreate such indexes. However, I am even at a loss in how to fix the > > CREATE FUNCTION to reference a cast in the same schema as the function, > > in this case 'public'. We can rewrite the cast to not use :: and use a > > function call with schema qualification. e.g. public.earth(), but how do > > we know what schema that is in, i.e. what if the extension is loaded > > into a schema other than public? The task is to convert it to being a non-relocatable extension that uses @extschema@, like here: https://www.postgresql.org/docs/devel/static/extend-extensions.html#EXTEND-EXTENSIONS-EXAMPLE Keiko Oda, you can hack around this by modifying the problematic extension function in each database: CREATE OR REPLACE FUNCTION public.ll_to_earth(float8, float8) RETURNS public.earth LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE AS 'SELECT public.cube(public.cube(public.cube(public.earth()*cos(radians($1))*cos(radians($2))),public.earth()*cos(radians($1))*sin(radians($2))),public.earth()*sin(radians($1)))::public.earth'; There's no need to drop and recreate indexes. However, if the table is small and nothing depends on the index, that is an easy workaround that one can employ right now. (Drop before upgrade and re-create after upgrade.) > > FYI, earthdistance is certainly not the only case of this problem. True; I've been expecting a report like this for contrib/xml2 especially.
On Mon, Apr 02, 2018 at 04:24:02PM -0400, Bruce Momjian wrote: > > I am not sure we can fix this without requiring people to drop and > > recreate such indexes. However, I am even at a loss in how to fix the > > CREATE FUNCTION to reference a cast in the same schema as the function, > > in this case 'public'. We can rewrite the cast to not use :: and use a > > function call with schema qualification. e.g. public.earth(), but how do > > we know what schema that is in, i.e. what if the extension is loaded > > into a schema other than public? The task is to convert it to being a non-relocatable extension that uses @extschema@, like here: https://www.postgresql.org/docs/devel/static/extend-extensions.html#EXTEND-EXTENSIONS-EXAMPLE Keiko Oda, you can hack around this by modifying the problematic extension function in each database: CREATE OR REPLACE FUNCTION public.ll_to_earth(float8, float8) RETURNS public.earth LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE AS 'SELECT public.cube(public.cube(public.cube(public.earth()*cos(radians($1))*cos(radians($2))),public.earth()*cos(radians($1))*sin(radians($2))),public.earth()*sin(radians($1)))::public.earth'; There's no need to drop and recreate indexes. However, if the table is small and nothing depends on the index, that is an easy workaround that one can employ right now. (Drop before upgrade and re-create after upgrade.) > > FYI, earthdistance is certainly not the only case of this problem. True; I've been expecting a report like this for contrib/xml2 especially.
On Wed, Apr 04, 2018 at 11:59:57PM -0700, Noah Misch wrote: > On Mon, Apr 02, 2018 at 04:24:02PM -0400, Bruce Momjian wrote: > > > I am not sure we can fix this without requiring people to drop and > > > recreate such indexes. However, I am even at a loss in how to fix the > > > CREATE FUNCTION to reference a cast in the same schema as the function, > > > in this case 'public'. We can rewrite the cast to not use :: and use a > > > function call with schema qualification. e.g. public.earth(), but how do > > > we know what schema that is in, i.e. what if the extension is loaded > > > into a schema other than public? > > The task is to convert it to being a non-relocatable extension that uses > @extschema@, like here: > https://www.postgresql.org/docs/devel/static/extend-extensions.html#EXTEND-EXTENSIONS-EXAMPLE Extension "earthdistance" creates sql-language functions that call functions of extension "cube", which may appear outside @extschema@ and may relocate at any moment. Hence, this case is not as simple as using @extschema@. While the cube/earthdistance system happens to be revealing this problem, it would arise in many cases of a function referring to an object of a relocatable extension. I see these options: 1. Stop using relocatable=true in core extensions (by adding a new version number and versioned control file). To relocate an extension, drop and recreate it. Deprecate relocatable=true. Sub-options: 1a. Require that "earthdistance" and "cube" appear in the same schema by forcing an error[1] when they don't. 1b. Expand @DEPNAME_schema@ in extension SQL files. Use @cube_schema@ to refer to the right objects. 1c. Use plpgsql to query pg_extension.extnamespace, then EXECUTE a CREATE FUNCTION statement after substituting the right schema names. 2. Like (1), including all sub-options, but warn about the problem without deprecating relocatable=true. Drop relocatable=true from extensions that have cause to do so: cube, earthdistance, pageinspect, pg_freespacemap, xml2. Do likewise for others as needed in the future. 3. Make "earthdistance" dynamically discover the location of "cube" during each function call. This entails rewriting earthdistance sql-language functions in C. (One could use plpgsql, but that would add a substantial performance loss and a runtime dependency.) 4. Re-implement the earthdistance sql functions in C, not calling "cube" functions at all. 5. Create copies in "earthdistance" of the "cube" functions it uses[2]. This violates modularity. It makes \dx+ uglier. 6. Allow an extension to ship SQL commands for re-binding to schemas when it or a dependency relocates. This would allow relocatable=true in extensions that refer to @extschema@. Include (1b) in this project. 7. Augment function system and LANGUAGE sql to offer the ability to parse at CREATE time, storing a Query tree like we do for views/rules. (This would be a complex feature.) (One can simulate this today with a rule[3], but it defeats inline_function().) Overall, I lean toward (2b). It's a self-contained project that doesn't uglify contrib and that sets a reasonable example for non-core extensions. While (7) would solve this and other problems nicely, it's a poor back-patch candidate. I liked (1b) for awhile, but it would be overkill if we ever get (7). Other ideas or preferences? Thanks, nm [1] SELECT 'cube and earthdistance must appear in the same schema', 1 / (count(DISTINCT extnamespace) = 1)::int FROM pg_extension WHERE extname IN ('cube', 'earthdistance'); [2] CREATE FUNCTION _earthdistance_cube(float8) RETURNS cube AS '$libdir/cube', 'cube_f8' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; [3] CREATE TABLE sum_scratch (addend0 int, addend1 int, sum int); CREATE VIEW sum_impl AS SELECT * FROM sum_scratch; CREATE RULE sum_calc AS ON INSERT TO sum_impl DO INSTEAD INSERT INTO sum_scratch VALUES (NEW.addend0, NEW.addend1, NEW.addend0 + NEW.addend1) RETURNING *; CREATE FUNCTION sum(int, int) RETURNS int LANGUAGE sql AS $$INSERT INTO sum_impl VALUES ($1, $2) RETURNING sum$$; SELECT sum(4, 3);
On Wed, Apr 04, 2018 at 11:59:57PM -0700, Noah Misch wrote: > On Mon, Apr 02, 2018 at 04:24:02PM -0400, Bruce Momjian wrote: > > > I am not sure we can fix this without requiring people to drop and > > > recreate such indexes. However, I am even at a loss in how to fix the > > > CREATE FUNCTION to reference a cast in the same schema as the function, > > > in this case 'public'. We can rewrite the cast to not use :: and use a > > > function call with schema qualification. e.g. public.earth(), but how do > > > we know what schema that is in, i.e. what if the extension is loaded > > > into a schema other than public? > > The task is to convert it to being a non-relocatable extension that uses > @extschema@, like here: > https://www.postgresql.org/docs/devel/static/extend-extensions.html#EXTEND-EXTENSIONS-EXAMPLE Extension "earthdistance" creates sql-language functions that call functions of extension "cube", which may appear outside @extschema@ and may relocate at any moment. Hence, this case is not as simple as using @extschema@. While the cube/earthdistance system happens to be revealing this problem, it would arise in many cases of a function referring to an object of a relocatable extension. I see these options: 1. Stop using relocatable=true in core extensions (by adding a new version number and versioned control file). To relocate an extension, drop and recreate it. Deprecate relocatable=true. Sub-options: 1a. Require that "earthdistance" and "cube" appear in the same schema by forcing an error[1] when they don't. 1b. Expand @DEPNAME_schema@ in extension SQL files. Use @cube_schema@ to refer to the right objects. 1c. Use plpgsql to query pg_extension.extnamespace, then EXECUTE a CREATE FUNCTION statement after substituting the right schema names. 2. Like (1), including all sub-options, but warn about the problem without deprecating relocatable=true. Drop relocatable=true from extensions that have cause to do so: cube, earthdistance, pageinspect, pg_freespacemap, xml2. Do likewise for others as needed in the future. 3. Make "earthdistance" dynamically discover the location of "cube" during each function call. This entails rewriting earthdistance sql-language functions in C. (One could use plpgsql, but that would add a substantial performance loss and a runtime dependency.) 4. Re-implement the earthdistance sql functions in C, not calling "cube" functions at all. 5. Create copies in "earthdistance" of the "cube" functions it uses[2]. This violates modularity. It makes \dx+ uglier. 6. Allow an extension to ship SQL commands for re-binding to schemas when it or a dependency relocates. This would allow relocatable=true in extensions that refer to @extschema@. Include (1b) in this project. 7. Augment function system and LANGUAGE sql to offer the ability to parse at CREATE time, storing a Query tree like we do for views/rules. (This would be a complex feature.) (One can simulate this today with a rule[3], but it defeats inline_function().) Overall, I lean toward (2b). It's a self-contained project that doesn't uglify contrib and that sets a reasonable example for non-core extensions. While (7) would solve this and other problems nicely, it's a poor back-patch candidate. I liked (1b) for awhile, but it would be overkill if we ever get (7). Other ideas or preferences? Thanks, nm [1] SELECT 'cube and earthdistance must appear in the same schema', 1 / (count(DISTINCT extnamespace) = 1)::int FROM pg_extension WHERE extname IN ('cube', 'earthdistance'); [2] CREATE FUNCTION _earthdistance_cube(float8) RETURNS cube AS '$libdir/cube', 'cube_f8' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; [3] CREATE TABLE sum_scratch (addend0 int, addend1 int, sum int); CREATE VIEW sum_impl AS SELECT * FROM sum_scratch; CREATE RULE sum_calc AS ON INSERT TO sum_impl DO INSTEAD INSERT INTO sum_scratch VALUES (NEW.addend0, NEW.addend1, NEW.addend0 + NEW.addend1) RETURNING *; CREATE FUNCTION sum(int, int) RETURNS int LANGUAGE sql AS $$INSERT INTO sum_impl VALUES ($1, $2) RETURNING sum$$; SELECT sum(4, 3);
I am writing to resurrect this 3-year-old thread. Attached is a patch to address earthdistance related failures during pg_restore. The proposed patch will: - Create a new version of earthdistance (1.2) and make this new version default - Change earthdistance relocatable from true to false - SET SEARCH_PATH=@extschema@ in earthdistance functions - Create validation to restrict earthdistance to be created only in the same schema as cube - Change cube extension relocatable from true to false to avoid a relocate of cube which will break earthdistance. - Change documentation [1] FROM "It is strongly recommended that earthdistance and cube be installed in the same schema".TO "it's required that earthdistance and cube be installed and kept in the same schema" [1] - https://www.postgresql.org/docs/current/earthdistance.html#EARTHDISTANCE ---------- ## Proposal - pg_restore test: $ psql <<EOF > select version(); > drop extension if exists cube cascade; > create extension earthdistance cascade ; > > drop schema if exists test cascade; > create schema test; > > --set search_path=test,public; > drop table if exists test.addresses cascade; > create table test.addresses ( > latitude float8, > longitude float8 > ); > > CREATE INDEX locations ON test.addresses USING gist(ll_to_earth(latitude, longitude)); > > \dx > EOF version ------------------------------------------------------------------------------------------------------------ PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit (1 row) NOTICE: drop cascades to 2 other objects DETAIL: drop cascades to extension earthdistance drop cascades to index test.locations DROP EXTENSION NOTICE: installing required extension "cube" CREATE EXTENSION NOTICE: drop cascades to table test.addresses DROP SCHEMA CREATE SCHEMA NOTICE: table "addresses" does not exist, skipping DROP TABLE CREATE TABLE CREATE INDEX List of installed extensions Name | Version | Schema | Description ---------------+---------+------------+-------------------------------------------------------------- cube | 1.5 | public | data type for multidimensional cubes earthdistance | 1.2 | public | calculate great-circle distances on the surface of the Earth plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (3 rows) $ ## dump the schema $ rm /tmp/test.dmp $ pg_dump --schema test --format=custom --file /tmp/test.dmp $ $ ## drop the schema $ psql <<EOF > drop schema test cascade ; > EOF NOTICE: drop cascades to table test.addresses DROP SCHEMA $ $ $ ## attempt to restore the schema $ pg_restore -d postgres /tmp/test.dmp $ ## Check restored schema. $ psql <<EOF > \d+ test.addresses > EOF Table "test.addresses" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description -----------+------------------+-----------+----------+---------+---------+-------------+--------------+------------- latitude | double precision | | | | plain | | | longitude | double precision | | | | plain | | | Indexes: "locations" gist (ll_to_earth(latitude, longitude)) Access method: heap ---------- ## Proposal - schema dependency validation test: $ psql <<EOF > select version(); > drop schema if exists test cascade; > create schema test; > drop extension if exists cube cascade; > create extension cube; > create extension earthdistance with schema test; > > \dx > EOF version ------------------------------------------------------------------------------------------------------------ PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit (1 row) DROP SCHEMA CREATE SCHEMA DROP EXTENSION CREATE EXTENSION ERROR: earthdistance extension must be installed in the same schema as the cube extension CONTEXT: PL/pgSQL function inline_code_block line 11 at RAISE List of installed extensions Name | Version | Schema | Description ---------+---------+------------+-------------------------------------- cube | 1.5 | public | data type for multidimensional cubes plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (2 rows) Luiz Verona -----Original Message----- From: Noah Misch <noah@leadboat.com> Sent: Tuesday, July 10, 2018 2:43 AM To: Bruce Momjian <bruce@momjian.us> Cc: keiko713@gmail.com; pgsql-bugs@lists.postgresql.org; PostgreSQL-development <pgsql-hackers@postgresql.org> Subject: [EXTERNAL] Extension relocation vs. schema qualification CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirmthe sender and know the content is safe. On Wed, Apr 04, 2018 at 11:59:57PM -0700, Noah Misch wrote: > On Mon, Apr 02, 2018 at 04:24:02PM -0400, Bruce Momjian wrote: > > > I am not sure we can fix this without requiring people to drop and > > > recreate such indexes. However, I am even at a loss in how to fix > > > the CREATE FUNCTION to reference a cast in the same schema as the > > > function, in this case 'public'. We can rewrite the cast to not > > > use :: and use a function call with schema qualification. e.g. > > > public.earth(), but how do we know what schema that is in, i.e. > > > what if the extension is loaded into a schema other than public? > > The task is to convert it to being a non-relocatable extension that > uses @extschema@, like here: > https://www.postgresql.org/docs/devel/static/extend-extensions.html#EX > TEND-EXTENSIONS-EXAMPLE Extension "earthdistance" creates sql-language functions that call functions of extension "cube", which may appear outside@extschema@ and may relocate at any moment. Hence, this case is not as simple as using @extschema@. While the cube/earthdistancesystem happens to be revealing this problem, it would arise in many cases of a function referring to anobject of a relocatable extension. I see these options: 1. Stop using relocatable=true in core extensions (by adding a new version number and versioned control file). To relocate an extension, drop and recreate it. Deprecate relocatable=true. Sub-options: 1a. Require that "earthdistance" and "cube" appear in the same schema by forcing an error[1] when they don't. 1b. Expand @DEPNAME_schema@ in extension SQL files. Use @cube_schema@ to refer to the right objects. 1c. Use plpgsql to query pg_extension.extnamespace, then EXECUTE a CREATE FUNCTION statement after substituting the right schema names. 2. Like (1), including all sub-options, but warn about the problem without deprecating relocatable=true. Drop relocatable=true from extensions that have cause to do so: cube, earthdistance, pageinspect, pg_freespacemap, xml2. Do likewise for others as needed in the future. 3. Make "earthdistance" dynamically discover the location of "cube" during each function call. This entails rewriting earthdistance sql-language functions in C. (One could use plpgsql, but that would add a substantial performance loss and a runtime dependency.) 4. Re-implement the earthdistance sql functions in C, not calling "cube" functions at all. 5. Create copies in "earthdistance" of the "cube" functions it uses[2]. This violates modularity. It makes \dx+ uglier. 6. Allow an extension to ship SQL commands for re-binding to schemas when it or a dependency relocates. This would allow relocatable=true in extensions that refer to @extschema@. Include (1b) in this project. 7. Augment function system and LANGUAGE sql to offer the ability to parse at CREATE time, storing a Query tree like we do for views/rules. (This would be a complex feature.) (One can simulate this today with a rule[3], but it defeats inline_function().) Overall, I lean toward (2b). It's a self-contained project that doesn't uglify contrib and that sets a reasonable examplefor non-core extensions. While (7) would solve this and other problems nicely, it's a poor back-patch candidate. I liked (1b) for awhile, but itwould be overkill if we ever get (7). Other ideas or preferences? Thanks, nm [1] SELECT 'cube and earthdistance must appear in the same schema', 1 / (count(DISTINCT extnamespace) = 1)::int FROM pg_extension WHERE extname IN ('cube', 'earthdistance'); [2] CREATE FUNCTION _earthdistance_cube(float8) RETURNS cube AS '$libdir/cube', 'cube_f8' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; [3] CREATE TABLE sum_scratch (addend0 int, addend1 int, sum int); CREATE VIEW sum_impl AS SELECT * FROM sum_scratch; CREATE RULE sum_calc AS ON INSERT TO sum_impl DO INSTEAD INSERT INTO sum_scratch VALUES (NEW.addend0, NEW.addend1, NEW.addend0 + NEW.addend1) RETURNING *; CREATE FUNCTION sum(int, int) RETURNS int LANGUAGE sql AS $$INSERT INTO sum_impl VALUES ($1, $2) RETURNING sum$$; SELECT sum(4, 3);
Attachment
I am writing to resurrect this 3-year-old thread. Attached is a patch to address earthdistance related failures during pg_restore. The proposed patch will: - Create a new version of earthdistance (1.2) and make this new version default - Change earthdistance relocatable from true to false - SET SEARCH_PATH=@extschema@ in earthdistance functions - Create validation to restrict earthdistance to be created only in the same schema as cube - Change cube extension relocatable from true to false to avoid a relocate of cube which will break earthdistance. - Change documentation [1] FROM "It is strongly recommended that earthdistance and cube be installed in the same schema".TO "it's required that earthdistance and cube be installed and kept in the same schema" [1] - https://www.postgresql.org/docs/current/earthdistance.html#EARTHDISTANCE ---------- ## Proposal - pg_restore test: $ psql <<EOF > select version(); > drop extension if exists cube cascade; > create extension earthdistance cascade ; > > drop schema if exists test cascade; > create schema test; > > --set search_path=test,public; > drop table if exists test.addresses cascade; > create table test.addresses ( > latitude float8, > longitude float8 > ); > > CREATE INDEX locations ON test.addresses USING gist(ll_to_earth(latitude, longitude)); > > \dx > EOF version ------------------------------------------------------------------------------------------------------------ PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit (1 row) NOTICE: drop cascades to 2 other objects DETAIL: drop cascades to extension earthdistance drop cascades to index test.locations DROP EXTENSION NOTICE: installing required extension "cube" CREATE EXTENSION NOTICE: drop cascades to table test.addresses DROP SCHEMA CREATE SCHEMA NOTICE: table "addresses" does not exist, skipping DROP TABLE CREATE TABLE CREATE INDEX List of installed extensions Name | Version | Schema | Description ---------------+---------+------------+-------------------------------------------------------------- cube | 1.5 | public | data type for multidimensional cubes earthdistance | 1.2 | public | calculate great-circle distances on the surface of the Earth plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (3 rows) $ ## dump the schema $ rm /tmp/test.dmp $ pg_dump --schema test --format=custom --file /tmp/test.dmp $ $ ## drop the schema $ psql <<EOF > drop schema test cascade ; > EOF NOTICE: drop cascades to table test.addresses DROP SCHEMA $ $ $ ## attempt to restore the schema $ pg_restore -d postgres /tmp/test.dmp $ ## Check restored schema. $ psql <<EOF > \d+ test.addresses > EOF Table "test.addresses" Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description -----------+------------------+-----------+----------+---------+---------+-------------+--------------+------------- latitude | double precision | | | | plain | | | longitude | double precision | | | | plain | | | Indexes: "locations" gist (ll_to_earth(latitude, longitude)) Access method: heap ---------- ## Proposal - schema dependency validation test: $ psql <<EOF > select version(); > drop schema if exists test cascade; > create schema test; > drop extension if exists cube cascade; > create extension cube; > create extension earthdistance with schema test; > > \dx > EOF version ------------------------------------------------------------------------------------------------------------ PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit (1 row) DROP SCHEMA CREATE SCHEMA DROP EXTENSION CREATE EXTENSION ERROR: earthdistance extension must be installed in the same schema as the cube extension CONTEXT: PL/pgSQL function inline_code_block line 11 at RAISE List of installed extensions Name | Version | Schema | Description ---------+---------+------------+-------------------------------------- cube | 1.5 | public | data type for multidimensional cubes plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (2 rows) Luiz Verona -----Original Message----- From: Noah Misch <noah@leadboat.com> Sent: Tuesday, July 10, 2018 2:43 AM To: Bruce Momjian <bruce@momjian.us> Cc: keiko713@gmail.com; pgsql-bugs@lists.postgresql.org; PostgreSQL-development <pgsql-hackers@postgresql.org> Subject: [EXTERNAL] Extension relocation vs. schema qualification CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirmthe sender and know the content is safe. On Wed, Apr 04, 2018 at 11:59:57PM -0700, Noah Misch wrote: > On Mon, Apr 02, 2018 at 04:24:02PM -0400, Bruce Momjian wrote: > > > I am not sure we can fix this without requiring people to drop and > > > recreate such indexes. However, I am even at a loss in how to fix > > > the CREATE FUNCTION to reference a cast in the same schema as the > > > function, in this case 'public'. We can rewrite the cast to not > > > use :: and use a function call with schema qualification. e.g. > > > public.earth(), but how do we know what schema that is in, i.e. > > > what if the extension is loaded into a schema other than public? > > The task is to convert it to being a non-relocatable extension that > uses @extschema@, like here: > https://www.postgresql.org/docs/devel/static/extend-extensions.html#EX > TEND-EXTENSIONS-EXAMPLE Extension "earthdistance" creates sql-language functions that call functions of extension "cube", which may appear outside@extschema@ and may relocate at any moment. Hence, this case is not as simple as using @extschema@. While the cube/earthdistancesystem happens to be revealing this problem, it would arise in many cases of a function referring to anobject of a relocatable extension. I see these options: 1. Stop using relocatable=true in core extensions (by adding a new version number and versioned control file). To relocate an extension, drop and recreate it. Deprecate relocatable=true. Sub-options: 1a. Require that "earthdistance" and "cube" appear in the same schema by forcing an error[1] when they don't. 1b. Expand @DEPNAME_schema@ in extension SQL files. Use @cube_schema@ to refer to the right objects. 1c. Use plpgsql to query pg_extension.extnamespace, then EXECUTE a CREATE FUNCTION statement after substituting the right schema names. 2. Like (1), including all sub-options, but warn about the problem without deprecating relocatable=true. Drop relocatable=true from extensions that have cause to do so: cube, earthdistance, pageinspect, pg_freespacemap, xml2. Do likewise for others as needed in the future. 3. Make "earthdistance" dynamically discover the location of "cube" during each function call. This entails rewriting earthdistance sql-language functions in C. (One could use plpgsql, but that would add a substantial performance loss and a runtime dependency.) 4. Re-implement the earthdistance sql functions in C, not calling "cube" functions at all. 5. Create copies in "earthdistance" of the "cube" functions it uses[2]. This violates modularity. It makes \dx+ uglier. 6. Allow an extension to ship SQL commands for re-binding to schemas when it or a dependency relocates. This would allow relocatable=true in extensions that refer to @extschema@. Include (1b) in this project. 7. Augment function system and LANGUAGE sql to offer the ability to parse at CREATE time, storing a Query tree like we do for views/rules. (This would be a complex feature.) (One can simulate this today with a rule[3], but it defeats inline_function().) Overall, I lean toward (2b). It's a self-contained project that doesn't uglify contrib and that sets a reasonable examplefor non-core extensions. While (7) would solve this and other problems nicely, it's a poor back-patch candidate. I liked (1b) for awhile, but itwould be overkill if we ever get (7). Other ideas or preferences? Thanks, nm [1] SELECT 'cube and earthdistance must appear in the same schema', 1 / (count(DISTINCT extnamespace) = 1)::int FROM pg_extension WHERE extname IN ('cube', 'earthdistance'); [2] CREATE FUNCTION _earthdistance_cube(float8) RETURNS cube AS '$libdir/cube', 'cube_f8' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; [3] CREATE TABLE sum_scratch (addend0 int, addend1 int, sum int); CREATE VIEW sum_impl AS SELECT * FROM sum_scratch; CREATE RULE sum_calc AS ON INSERT TO sum_impl DO INSTEAD INSERT INTO sum_scratch VALUES (NEW.addend0, NEW.addend1, NEW.addend0 + NEW.addend1) RETURNING *; CREATE FUNCTION sum(int, int) RETURNS int LANGUAGE sql AS $$INSERT INTO sum_impl VALUES ($1, $2) RETURNING sum$$; SELECT sum(4, 3);