Thread: Programmatically duplicating a schema

Programmatically duplicating a schema

From
matt.figg@internode.on.net
Date:
Hi all,

What is a reliable way to programmatically & generically populate an empty schema with all the objects in the public schema as a template?

We are using the multi tenancy ruby gem Apartment ( https://github.com/influitive/apartment ), which was recently broken by the changes made to pg_dump to address CVE-2018-1058 https://nvd.nist.gov/vuln/detail/CVE-2018-1058

Apartment attempts to duplicate the public schema whenever creating a new schema by running:

pg_dump -s -x -0 -n public

to get the SQL statements needed to recreate the public schema & then executes the pg_dump's sql output after creating & switching to the new schema ( via set search_path to <new schema>; )

After the fix to CVE-2018-1058, all table references in pg_dump's output (including within SQL of stored procedures) are prefixed by the public. schema, which means you cannot just reuse this output in a different schema context without first manually changing the sql.
As a temporary fix so we can handle new customers in production, we are using a regex search/replace for public. in the pg_dump output, but clearly this is not a reliable solution for a generic gem such as Apartment.

Is there a different way this gem could now be achieving this?
Without a flag on pg_dump to force the old format, or a CREATE SCHEMA <schema_name> WITH TEMPLATE <other_schema> kind of statement, we can't see any alternative more reliable than the regex search/replace. 

The open issue on the Apartment gem for context: https://github.com/influitive/apartment/issues/532

Thanks in advance.

Re: Programmatically duplicating a schema

From
Adrian Klaver
Date:
On 03/12/2018 08:23 PM, matt.figg@internode.on.net wrote:
> Hi all,
> 
> What is a reliable way to programmatically & generically populate an 
> empty schema with all the objects in the public schema as a template?
> 
> We are using the multi tenancy ruby gem Apartment ( 
> https://github.com/influitive/apartment ), which was recently broken by 
> the changes made to pg_dump to address CVE-2018-1058 
> https://nvd.nist.gov/vuln/detail/CVE-2018-1058
> 
> Apartment attempts to duplicate the public schema whenever creating a 
> new schema by running:
> 
> pg_dump -s -x -0 -n public
> 
> to get the SQL statements needed to recreate the public schema & then 
> executes the pg_dump's sql output after creating & switching to the new 
> schema ( via set search_path to <new schema>; )
> 
> After the fix to CVE-2018-1058, all table references in pg_dump's output 
> (including within SQL of stored procedures) are prefixed by the public. 
> schema, which means you cannot just reuse this output in a different 
> schema context without first manually changing the sql.
> As a temporary fix so we can handle new customers in production, we are 
> using a regex search/replace for public. in the pg_dump output, but 
> clearly this is not a reliable solution for a generic gem such as Apartment.
> 
> Is there a different way this gem could now be achieving this?
> Without a flag on pg_dump to force the old format, or a CREATE SCHEMA 
> <schema_name> WITH TEMPLATE <other_schema> kind of statement, we can't 
> see any alternative more reliable than the regex search/replace.

Wild idea:

1) Roll back to:

https://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;h=c987089c48afba99df0c3c2dc45fd69238b52705;hb=2840d201c6a62beb86d9671a66eeec56183d261b

2) Use that to build pre-10.3 version of pg_dump and use that to dump 
your schema. Basically long way to to force old format. Is fragile 
though as that version will lose contact with changes.


Less wild idea and previously suggested upstream.

1) Check your schema objects into scripts that are checked into version 
control w/o schema qualifications.

2) Use scripts to populate new schema.

> 
> The open issue on the Apartment gem for 
> context: https://github.com/influitive/apartment/issues/532
> 
> Thanks in advance.


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: Programmatically duplicating a schema

From
matt.figg@internode.on.net
Date:

Thanks Adrian,

 

Really appreciate the suggestions.

 

The objective when trying to solve this for the Apartment library itself is to keep it generic (works for any/all database objects - tables, views, stored procedures, 3rd party extension objects, etc.) & to require minimal configuration (preferably not having to tell the library your schema).

 

I believe pg_dump was the only way to satisfy the second point.  The reason we'd want to infer the create statements via pg_dump is, so we don't need to keep database migration files in sync with a 'create new schema' SQL script.  It adds risk that they get out of sync, causing inconsistencies in new schemas created by the library.

 

Assuming there's no other way to infer the create statements from the public schema, Ruby on Rails' structure.sql could probably be used as a starting point for the 'create new schema' SQL file.  It's similar already, however it's also generated via pg_dump (having the same issues as Apartment library). http://guides.rubyonrails.org/v3.2.9/migrations.html#types-of-schema-dumps  This is outside the realms of this mail group though.

 

Cheers,

Matt.

 



----- Original Message -----
From:
"Adrian Klaver" <adrian.klaver@aklaver.com>

To:
<matt.figg@internode.on.net>, <pgsql-general@postgresql.org>
Cc:

Sent:
Mon, 12 Mar 2018 21:22:01 -0700
Subject:
Re: Programmatically duplicating a schema


On 03/12/2018 08:23 PM, matt.figg@internode.on.net wrote:
> Hi all,
>
> What is a reliable way to programmatically & generically populate an
> empty schema with all the objects in the public schema as a template?
>
> We are using the multi tenancy ruby gem Apartment (
> https://github.com/influitive/apartment ), which was recently broken by
> the changes made to pg_dump to address CVE-2018-1058
> https://nvd.nist.gov/vuln/detail/CVE-2018-1058
>
> Apartment attempts to duplicate the public schema whenever creating a
> new schema by running:
>
> pg_dump -s -x -0 -n public
>
> to get the SQL statements needed to recreate the public schema & then
> executes the pg_dump's sql output after creating & switching to the new
> schema ( via set search_path to <new schema>; )
>
> After the fix to CVE-2018-1058, all table references in pg_dump's output
> (including within SQL of stored procedures) are prefixed by the public.
> schema, which means you cannot just reuse this output in a different
> schema context without first manually changing the sql.
> As a temporary fix so we can handle new customers in production, we are
> using a regex search/replace for public. in the pg_dump output, but
> clearly this is not a reliable solution for a generic gem such as Apartment.
>
> Is there a different way this gem could now be achieving this?
> Without a flag on pg_dump to force the old format, or a CREATE SCHEMA
> <schema_name> WITH TEMPLATE <other_schema> kind of statement, we can't
> see any alternative more reliable than the regex search/replace.

Wild idea:

1) Roll back to:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=tree;h=c987089c48afba99df0c3c2dc45fd69238b52705;hb=2840d201c6a62beb86d9671a66eeec56183d261b

2) Use that to build pre-10.3 version of pg_dump and use that to dump
your schema. Basically long way to to force old format. Is fragile
though as that version will lose contact with changes.


Less wild idea and previously suggested upstream.

1) Check your schema objects into scripts that are checked into version
control w/o schema qualifications.

2) Use scripts to populate new schema.

>
> The open issue on the Apartment gem for
> context: https://github.com/influitive/apartment/issues/532
>
> Thanks in advance.


--
Adrian Klaver
adrian.klaver@aklaver.com

Re: Programmatically duplicating a schema

From
Alban Hertroys
Date:
> On 13 Mar 2018, at 4:23, matt.figg@internode.on.net wrote:
>
> Hi all,
>
> What is a reliable way to programmatically & generically populate an empty schema with all the objects in the public
schemaas a template? 
>
> We are using the multi tenancy ruby gem Apartment ( https://github.com/influitive/apartment ), which was recently
brokenby the changes made to pg_dump to address CVE-2018-1058 https://nvd.nist.gov/vuln/detail/CVE-2018-1058 
>
> Apartment attempts to duplicate the public schema whenever creating a new schema by running:
>
> pg_dump -s -x -0 -n public
>
> to get the SQL statements needed to recreate the public schema & then executes the pg_dump's sql output after
creating& switching to the new schema ( via set search_path to <new schema>; ) 
>
> After the fix to CVE-2018-1058, all table references in pg_dump's output (including within SQL of stored procedures)
areprefixed by the public. schema, which means you cannot just reuse this output in a different schema context without
firstmanually changing the sql. 
> As a temporary fix so we can handle new customers in production, we are using a regex search/replace for public. in
thepg_dump output, but clearly this is not a reliable solution for a generic gem such as Apartment. 

In my opinion, it makes sense that if you have the option of dumping the contents of a specific schema, it should be
possibleto restore that dump into a different schema. Unfortunately, looking at pg_restore, there does not appear to be
suchan option (yet). 

I'd even go so far to suggest that every single object type that can be dumped with pg_dump (single database, single
schema,single table, single function, etc) should be restorable under a different name. I realise that this could make
pg_restoreoptions potentially more confusing. 

I suppose people currently manually edit the dumps to this effect, but that risks silent corruption of data when for
examplea data value contains a string such as 'The building is now open to public.'. Regular expressions don't know the
differencebetween data and identifiers in a dump file - pg_restore does. 

Whether psql needs the same treatment? I'd qualify this as "advanced" use and limit it to pg_restore.

But then, I'm just a list-lurker, I currently have but the option of voicing my opinion.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.



Re: Programmatically duplicating a schema

From
Melvin Davidson
Date:
 > What is a reliable way to programmatically & generically populate an empty schema with all the objects in the public schema as a template?

The simplest way is just to load the attached clone_schema function. It was originally created by Emanuel '3manuek', which I enhanced. It
now copies all sequences, tables, indexes, rules, triggers, data(optional), views & functions from any existing schema to a new schema.

Then just call the function.

SAMPLE CALL:
SELECT clone_schema('public', 'new_schema', TRUE);

Use FALSE if you do not want to copy data.


--
Melvin Davidson
Maj. Database & Exploration Specialist

Universe Exploration Command – UXC

Employment by invitation only!
Attachment

Re: Programmatically duplicating a schema

From
"David G. Johnston"
Date:
On Mon, Mar 12, 2018 at 11:05 PM, <matt.figg@internode.on.net> wrote:

The reason we'd want to infer the create statements via pg_dump is, so we don't need to keep database migration files in sync with a 'create new schema' SQL script.  It adds risk that they get out of sync, causing inconsistencies in new schemas created by the library.

​Then don't have a "create new schema" script (aside from the very first one) run your migrations even when creating new installations.

​David J.

Re: Programmatically duplicating a schema

From
Adrian Klaver
Date:
On 03/12/2018 11:05 PM, matt.figg@internode.on.net wrote:
> Thanks Adrian,
> 
> Really appreciate the suggestions.
> 
> The objective when trying to solve this for the Apartment library itself 
> is to keep it generic (works for any/all database objects - tables, 
> views, stored procedures, 3rd party extension objects, etc.) & to 
> require minimal configuration (preferably not having to tell the library 
> your schema).
> 
> I believe pg_dump was the only way to satisfy the second point.  The 
> reason we'd want to infer the create statements via pg_dump is, so we 
> don't need to keep database migration files in sync with a 'create new 
> schema' SQL script.  It adds risk that they get out of sync, causing 
> inconsistencies in new schemas created by the library.

Seems to me your best hope is to make a plea on --hackers for a flag 
that turns off schema qualification of object names.

> 
> Assuming there's no other way to infer the create statements from the 
> public schema, Ruby on Rails' structure.sql could probably be used as a 
> starting point for the 'create new schema' SQL file.  It's similar 
> already, however it's also generated via pg_dump (having the same issues 
> as Apartment library). 
> http://guides.rubyonrails.org/v3.2.9/migrations.html#types-of-schema-dumps  
> This is outside the realms of this mail group though.
> 
> Cheers,
> 
> Matt.
> 
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com