Thread: Restoring only a subset of schemas
Hi all, I have a DB with one schema named "Common" holding data referenced by other schemas. All other schemas have the same structure (tables and fields) and are named "cXXX" where XXX is just an int. Thus the only cross-schema foreign keys are in "cXXX" pointing to "Common", and each "cXXX" is completely independent of other "cXXX" schemas. Now if I want to restore from a full dump of this DB, but with only one "cXXX" and the "Common" schema : - if I pass --create --schema=Common, then the CREATE SCHEMA is missing, i.e. it only emits data inside "Common" and the restore fails. - if I could pass --create --exclude-schema='c*' (fictional notation as patterns are only recognized by pg_dump), then all schemas would be created, with no data inside except for "Common". Creating all schemas is a waste of time, but more importantly would make restoring other schemas more difficult (e.g. rows should be inserted before creating foreign keys). Note : to check the behaviour of pg_restore above, I pass -f- to check the SQL as it is far quicker than to actually restore a DB. Maybe a new --include-create-schema option should be added to emit CREATE SCHEMA in addition to objects inside it ? That way I could : 1. --create --include-create-schema --schema=Common and have a DB with all DB-level properties (DEFAULT PRIVILEGES, COMMENT, SET parameter, etc.) and one schema with all of its data and schema-level properties (DEFAULT PRIVILEGES, COMMENT, GRANT USAGE). 2. then at any point later without the --create, with as many schemas I need : --include-create-schema --schema=cXXX. And if I need to reset a "cXXX" schema, just manually DROP SCHEMA and restore again. Similarly, maybe add --exclude-create-schema to additionally exclude CREATE SCHEMA for schemas targeted by --exclude-schema. IOW --schema and --exclude-schema both target objects inside schemas, these 2 new options would allow to also have control on the schemas themselves (and their properties like DEFAULT PRIVILEGES, COMMENT, etc.) Cheers, Sylvain
On 3/17/25 07:57, Sylvain Cuaz wrote: > Hi all, > > I have a DB with one schema named "Common" holding data referenced > by other schemas. All other schemas have the same structure (tables and > fields) and are named "cXXX" where XXX is just an int. Thus the only > cross-schema foreign keys are in "cXXX" pointing to "Common", and each > "cXXX" is completely independent of other "cXXX" schemas. > Now if I want to restore from a full dump of this DB, but with only > one "cXXX" and the "Common" schema : > - if I pass --create --schema=Common, then the CREATE SCHEMA is missing, > i.e. it only emits data inside "Common" and the restore fails. I am not seeing that. For: pg_dump -d test -U postgres -s --create --schema=other_sch --schema=public In the output I get: [...] CREATE SCHEMA other_sch; ALTER SCHEMA other_sch OWNER TO postgres; -- -- Name: public; Type: SCHEMA; Schema: -; Owner: pg_database_owner -- CREATE SCHEMA public; ALTER SCHEMA public OWNER TO pg_database_owner; [...] What is the complete command you are using for the pg_dump? What Postgres version(s) are you using? > > Cheers, > > Sylvain > > > -- Adrian Klaver adrian.klaver@aklaver.com
Sylvain Cuaz <sylvain@ilm-informatique.fr> writes: > Now if I want to restore from a full dump of this DB, but with only one "cXXX" and the "Common" > schema : > - if I pass --create --schema=Common, then the CREATE SCHEMA is missing, i.e. it only emits data > inside "Common" and the restore fails. > - if I could pass --create --exclude-schema='c*' (fictional notation as patterns are only recognized > by pg_dump), then all schemas would be created, with no data inside except for "Common". Creating > all schemas is a waste of time, but more importantly would make restoring other schemas more > difficult (e.g. rows should be inserted before creating foreign keys). In general, the solution for edge-case restore selection needs is to make a list of the dump's contents with "pg_restore -l", edit out what you don't want using any method you like, then use the edited list with "pg_restore -L". While I'd be in favor of improving pg_restore to accept wild-card patterns, I'm very hesitant to start inventing new kinds of selection switches for it. The interactions between such switches would be a mess. regards, tom lane
Le 17/03/2025 à 16:21, Adrian Klaver a écrit : > On 3/17/25 07:57, Sylvain Cuaz wrote: >> Hi all, >> >> I have a DB with one schema named "Common" holding data referenced by other schemas. All >> other schemas have the same structure (tables and fields) and are named "cXXX" where XXX is just >> an int. Thus the only cross-schema foreign keys are in "cXXX" pointing to "Common", and each >> "cXXX" is completely independent of other "cXXX" schemas. >> Now if I want to restore from a full dump of this DB, but with only one "cXXX" and the >> "Common" schema : >> - if I pass --create --schema=Common, then the CREATE SCHEMA is missing, i.e. it only emits data >> inside "Common" and the restore fails. > > I am not seeing that. > > For: > > pg_dump -d test -U postgres -s --create --schema=other_sch --schema=public > > What is the complete command you are using for the pg_dump? Hi, As I said I'm restoring, not dumping. I make daily full backups and sometimes need to restore a specific day, but the full database is quite big and I would like to only restore one or two schemas. As you said, if one passes --create --schema to pg_dump then a valid SQL is produced with CREATE DATABASE, CREATE SCHEMA, CREATE TABLE and all objects inside the schema. But if one passes --create --schema to pg_restore then an invalid SQL is produced because it contains CREATE DATABASE, CREATE TABLE but it doesn't contain the CREATE SCHEMA needed for the tables. Is there any reason for that discrepancy between dump & restore and for outputting invalid SQL ? My proposed --include-create-schema would just add the CREATE SCHEMA so that pg_restore behaves the same as pg_dump, and would allow to output valid SQL. But ideally this option shouldn't even be needed because pg_restore would just emit CREATE SCHEMA like pg_dump. > > What Postgres version(s) are you using? A lot :-) But for this problem I'm using 13 & 15. Cheers, Sylvain.
Le 17/03/2025 à 16:29, Tom Lane a écrit : > Sylvain Cuaz <sylvain@ilm-informatique.fr> writes: >> Now if I want to restore from a full dump of this DB, but with only one "cXXX" and the "Common" >> schema : >> - if I pass --create --schema=Common, then the CREATE SCHEMA is missing, i.e. it only emits data >> inside "Common" and the restore fails. >> - if I could pass --create --exclude-schema='c*' (fictional notation as patterns are only recognized >> by pg_dump), then all schemas would be created, with no data inside except for "Common". Creating >> all schemas is a waste of time, but more importantly would make restoring other schemas more >> difficult (e.g. rows should be inserted before creating foreign keys). > In general, the solution for edge-case restore selection needs is to > make a list of the dump's contents with "pg_restore -l", edit out what > you don't want using any method you like, then use the edited list with > "pg_restore -L". Hi, I am aware of that feature, but that forces me to know every type of entry that pertains to a schema or database (e.g. DEFAULT ACL, ACL, COMMENT, DATABASE PROPERTIES, etc.) and what about new ones that will be added in the future ? Further, I don't see how it's an edge-case, at the core I just want to restore some but not all the schemas. This is possible for pg_dump, see my response to Adrian Klaver. > While I'd be in favor of improving pg_restore to accept wild-card > patterns, That would definitely be appreciated. > I'm very hesitant to start inventing new kinds of selection > switches for it. The interactions between such switches would be a > mess. Which interactions ? It seems to me that the name of the schema should be used as the namespace to check in _tocEntryRequired() in pg_backup_archiver.c, and then the dependent entries (e.g. ACL, COMMENT) would be handled around line 3050. I've attached a patch with some pseudo-code. In fact, were it not for compatibility, I'd argue that my proposed options should be the default, at least with --create, so as to neither output invalid SQL (for -n) nor extra unwanted ones (for -N) and to behave like pg_dump. Cheers, Sylvain
Attachment
Sylvain Cuaz <sylvain@ilm-informatique.fr> writes: > Further, I don't see how it's an edge-case, at the core I just want to restore some but not all the > schemas. This is possible for pg_dump, see my response to Adrian Klaver. You have a very good point that it's annoying that pg_restore's --schema switch doesn't act like pg_dump's --schema switch. I could support changing that, but only if we also did something about the fact that pg_restore's switch doesn't handle patterns. The pattern aspect is kind of a mess, because pg_dump implements that through a server-side regex, which is a facility pg_restore doesn't have access to. Maybe it'd be good enough to implement the "*" and "?" wildcards and stop there, but I'm not sure. regards, tom lane