Thread: Isolation of multiple database instances provided by a single Postgres server

I am working on a fairly small application to use for managing a company's
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the data. Based on some recommendations I
am using pg_basebackup to back up the production instance nightly. My
background is primarily Oracle. Looking at the way pg_basebackup works, I
realize that multiple database instances provided by one server are actually
stored in the same set of physical OS files.
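
Roughly, the nightly job does something along these lines (the host, user,
and paths here are placeholders, not my real ones):

# nightly base backup of the whole cluster (illustrative values only)
pg_basebackup -h prod.example.com -U backup_user \
    -D /backups/$(date +%F) -Ft -z -Xs -P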


We have traditionally (in the Postgres world) had a sandbox that we used
for upgrades and for testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

Recognizing that this is a fairly small application, what do wiser folks
than I recommend?

Should I run the sandbox on a different Postgres server, possibly even on a
different machine? Is pg_dump still a good way to move the production
instance to the sandbox, and perhaps even the other way around?

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
                        -- Benjamin Franklin



On 11/20/19 4:03 PM, stan wrote:
I am working on a fairly small application to use for managing a company's
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the data. Based on some recommendations I
am using pg_basebackup to back up the production instance nightly. My
background is primarily Oracle. Looking at the way pg_basebackup works, I
realize that multiple database instances provided by one server are actually
stored in the same set of physical OS files.


We have traditionally (in the Postgres world) had a sandbox that we used
for upgrades and for testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.

What specifically do you mean by "instance"?  (I know what it means in the SQL Server world, and in Postgres all the databases accessible via a single $PGDATA are called a cluster.)

Recognizing that this is a fairly small application, what do wiser folks
than I recommend?

Should I run the sandbox on a different Postgres server, possibly even on a
different machine? Is pg_dump still a good way to move the production
instance to the sandbox, and perhaps even the other way around?

Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution, isolating them from each other.

OTOH, you can initdb multiple clusters on the same host, accessing them via different $PGDATA variables and port numbers.
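
A minimal sketch of that, with an illustrative data directory and port:

# create a second cluster in its own data directory
initdb -D /var/lib/postgresql/sandbox
# give it a port that does not clash with the first cluster
echo "port = 5433" >> /var/lib/postgresql/sandbox/postgresql.conf
# start it against that data directory
pg_ctl -D /var/lib/postgresql/sandbox -l sandbox.log start
# connect to the sandbox cluster by port
psql -p 5433 -d postgres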

--
Angular momentum makes the world go 'round.
On Wed, Nov 20, 2019 at 04:24:40PM -0600, Ron wrote:
> On 11/20/19 4:03 PM, stan wrote:
> > I am working on a fairly small application to use for managing a company's
> > business.
> > 
> > I have a "production" instance hosted by one of the cloud providers, and 2
> > other instances. This is fairly new to me. In the past, I have created
> > applications by keeping a set of scripts that can be used to rebuild the
> > database, and pg_dump to restore the data. Based on some recommendations I
> > am using pg_basebackup to back up the production instance nightly. My
> > background is primarily Oracle. Looking at the way pg_basebackup works, I
> > realize that multiple database instances provided by one server are actually
> > stored in the same set of physical OS files.
> > 
> > 
> > We have traditionally (in the Postgres world) had a sandbox that we used
> > for upgrades and for testing development methodologies, and this seems to be
> > supported pretty well by pg_dump.
> > 
> > Now that I know "too much" I am concerned about hosting the sandbox on the
> > same Postgres instance.
> 
> What specifically do you mean by "instance"?  (I know what it means in the
> SQL Server world, and in Postgres all the databases accessible via a single
> $PGDATA are called a *cluster*.)

Sorry for my incorrect terminology. I am probably confused about the
technology here. Let me try to explain what I think I understand.

It seems to me that I can have one Postgres "server" running, listening on a
single port on a single machine. It appears that the data files for this
"server" are managed internally by the Postgres server instance, and I
have no control of what is stored where in them. In the Oracle world, I can
create tablespaces, which have a relationship to OS files, and I can
explicitly control what objects are stored in which tablespaces (OS files).
Thus, for example, when I do a hot backup, I put a specific tablespace in
backup mode, and can then safely copy its OS file (yes, I have to properly
deal with archive logs). So I would be somewhat comfortable having two
distinct "instances" provided by that one Oracle "server".

It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are databases and schemas. Is this correct? If
so, how does this correspond to physical on-disk storage?
> 
> > Recognizing that this is a fairly small application, what do wiser folks
> > than I recommend?
> > 
> > Should I run the sandbox on a different Postgres server, possibly even on a
> > different machine? Is pg_dump still a good way to move the production
> > instance to the sandbox, and perhaps even the other way around?
> 
> Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
> isolating them from each other.
 Makes sense.
> 
> OTOH, you can initdb multiple clusters on the same host, accessing them via
> different $PGDATA variables and port numbers.

That is consistent with what I thought I understood.

Thanks for taking time to educate me.
-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
                        -- Benjamin Franklin



On 11/21/19 5:55 AM, stan wrote:
On Wed, Nov 20, 2019 at 04:24:40PM -0600, Ron wrote:
On 11/20/19 4:03 PM, stan wrote:
I am working on a fairly small application to use for managing a company's
business.

I have a "production" instance hosted by one of the cloud providers, and 2
other instances. This is fairly new to me. In the past, I have created
applications by keeping a set of scripts that can be used to rebuild the
database, and pg_dump to restore the data. Based on some recommendations I
am using pg_basebackup to back up the production instance nightly. My
background is primarily Oracle. Looking at the way pg_basebackup works, I
realize that multiple database instances provided by one server are actually
stored in the same set of physical OS files.


We have traditionally (in the Postgres world) had a sandbox that we used
for upgrades and for testing development methodologies, and this seems to be
supported pretty well by pg_dump.

Now that I know "too much" I am concerned about hosting the sandbox on the
same Postgres instance.
What specifically do you mean by "instance"?  (I know what it means in the
SQL Server world, and in Postgres all the databases accessible via a single
$PGDATA are called a *cluster*.)
Sorry for my incorrect terminology. I am probably confused about the
technology here. Let me try to explain what I think I understand.

It seems to me that I can have one Postgres "server" running, listening on a
single port on a single machine. It appears that the data files for this
"server" are managed internally by the Postgres server instance, and I
have no control of what is stored where in them. In the Oracle world, I can
create tablespaces, which have a relationship to OS files, and I can
explicitly control what objects are stored in which tablespaces (OS files).

Same in Postgres.

https://www.postgresql.org/docs/9.6/sql-createtablespace.html

CREATE TABLESPACE tablespace_name
    [ OWNER { new_owner | CURRENT_USER | SESSION_USER } ]
    LOCATION 'directory'
    [ WITH ( tablespace_option = value [, ... ] ) ]
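
For example (directory and table names here are just illustrations; the
directory has to exist and be owned by the postgres OS user):

-- create a tablespace on a separate file system
CREATE TABLESPACE fastspace LOCATION '/mnt/ssd/pg_tblspc';
-- place a specific table on it
CREATE TABLE audit_log (id bigint, entry text) TABLESPACE fastspace;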


Thus, for example, when I do a hot backup, I put a specific tablespace in
backup mode, and can then safely copy its OS file (yes, I have to properly
deal with archive logs). So I would be somewhat comfortable having two
distinct "instances" provided by that one Oracle "server".

There's no ability to backup a single tablespace in Postgres, because the purpose behind them is not the same in Oracle and Postgres.


It appears to me that, within this one Postgres "instance", there are 2
levels of "isolation", which are databases and schemas. Is this correct?

Yes, but ... schemas in Postgres are different from Oracle schemas.  In Postgres, the CREATE SCHEMA command creates a schema, whereas CREATE USER creates a schema in Oracle.

If
so, how does this correspond to physical on-disk storage?

It corresponds not at all.

The directory that a table's files go in is solely dependent on the tablespace it lives in (typically "pg_default", whose location is $PGDATA).
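
If you want to see where a particular table's files live, something like
this shows the path relative to $PGDATA (the table name is made up):

SELECT pg_relation_filepath('audit_log');
-- e.g. base/16384/16423 for a table in the default tablespace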

Recognizing that this is a fairly small application, what do wiser folks
than I recommend?

Should I run the sandbox on a different Postgres server, possibly even on a
different machine? Is pg_dump still a good way to move the production
instance to the sandbox, and perhaps even the other way around?
Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
isolating them from each other.
 Makes sense.
OTOH, you can initdb multiple clusters on the same host, accessing them via
different $PGDATA variables and port numbers.
That is consistent with what I thought I understood.

Thanks for taking time to educate me.

--
Angular momentum makes the world go 'round.
On Thu, Nov 21, 2019 at 08:18:21AM -0600, Ron wrote:
> On 11/21/19 5:55 AM, stan wrote:
> > On Wed, Nov 20, 2019 at 04:24:40PM -0600, Ron wrote:
> > > On 11/20/19 4:03 PM, stan wrote:
> > > > I am working on a fairly small application to use for managing a company's
> > > > business.
> > > > 
> > > > I have a "production" instance hosted by one of the cloud providers, and 2
> > > > other instances. This is fairly new to me. In the past, I have created
> > > > applications by keeping a set of scripts that can be used to rebuild the
> > > > database, and pg_dump to restore the data. Based on some recommendations I
> > > > am using pg_basebackup to back up the production instance nightly. My
> > > > background is primarily Oracle. Looking at the way pg_basebackup works, I
> > > > realize that multiple database instances provided by one server are actually
> > > > stored in the same set of physical OS files.
> > > > 
> > > > 
> > > > We have traditionally (in the Postgres world) had a sandbox that we used
> > > > for upgrades and for testing development methodologies, and this seems to be
> > > > supported pretty well by pg_dump.
> > > > 
> > > > Now that I know "too much" I am concerned about hosting the sandbox on the
> > > > same Postgres instance.
> > > What specifically do you mean by "instance"?  (I know what it means in the
> > > SQL Server world, and in Postgres all the databases accessible via a single
> > > $PGDATA are called a *cluster*.)
> > Sorry for my incorrect terminology. I am probably confused about the
> > technology here. Let me try to explain what I think I understand.
> > 
> > It seems to me that I can have one Postgres "server" running, listening on a
> > single port on a single machine. It appears that the data files for this
> > "server" are managed internally by the Postgres server instance, and I
> > have no control of what is stored where in them. In the Oracle world, I can
> > create tablespaces, which have a relationship to OS files, and I can
> > explicitly control what objects are stored in which tablespaces (OS files).
> 
> Same in Postgres.
> 
> https://www.postgresql.org/docs/9.6/sql-createtablespace.html
> 
> CREATE TABLESPACE tablespace_name
>     [ OWNER { new_owner | CURRENT_USER | SESSION_USER } ]
>     LOCATION 'directory'
>     [ WITH ( tablespace_option = value [, ... ] ) ]
> 
OH, that does look familiar, thanks for taking time to educate me.
> 
> 
> > Thus, for example, when I do a hot backup, I put a specific tablespace in
> > backup mode, and can then safely copy its OS file (yes, I have to properly
> > deal with archive logs). So I would be somewhat comfortable having two
> > distinct "instances" provided by that one Oracle "server".
> 
> There's no ability to backup a single tablespace in Postgres, because the
> purpose behind them is not the same in Oracle and Postgres.

Got that.
> 
> 
> > It appears to me that, within this one Postgres "instance", there are 2
> > levels of "isolation", which are databases and schemas. Is this correct?
> 
> Yes, but ... schemas in Postgres are /different/ from Oracle schemas.  In
> Postgres, the CREATE SCHEMA command creates a schema, whereas CREATE USER
> creates a schema in Oracle.

I am still struggling with the schema - role - user relationship in Postgres.
It appears to me there is one more layer than is needed/useful.
> 
> > If
> > so, how does this correspond to physical on-disk storage?
> 
> It corresponds not at all.

Understand that now.
> 
> The directory that a table's files go in is solely dependent on the
> tablespace it lives in (typically "pg_default", whose location is $PGDATA).
> 
> > > > Recognizing that this is a fairly small application, what do wiser folks
> > > > than I recommend?
> > > > 
> > > > Should I run the sandbox on a different Postgres server, possibly even on a
> > > > different machine? Is pg_dump still a good way to move the production
> > > > instance to the sandbox, and perhaps even the other way around?
> > > Running CAT, STG, UAT, DEV, etc on different VMs is certainly one solution,
> > > isolating them from each other.
> >   Makes sense.
> > > OTOH, you can initdb multiple clusters on the same host, accessing them via
> > > different $PGDATA variables and port numbers.
> > That is consistent with what I thought I understood.
> > 
> > Thanks for taking time to educate me.
> 
> -- 
> Angular momentum makes the world go 'round.

Thanks again.

BTW, I do appreciate your taking time to clarify things I am confused about.
Hopefully I can pass on this knowledge to others.

I have decided to isolate my sandbox instance by running it on a physically
different machine, BTW, which was what the original question was about, but I
am learning other things that I knew I did not have a full understanding of,
and I appreciate that.

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
                        -- Benjamin Franklin



Re: Isolation of multiple database instances provided by a single Postgres server

From: Laurenz Albe
On Thu, 2019-11-21 at 06:55 -0500, stan wrote:
> It seems to me that I can have one Postgres "server" running, listening on a
> single port on a single machine. It appears that the data files for this
> "server" are managed internally by the Postgres server instance, and I
> have no control of what is stored where in them. In the Oracle world, I can
> create tablespaces, which have a relationship to OS files, and I can
> explicitly control what objects are stored in which tablespaces (OS files).
> Thus, for example, when I do a hot backup, I put a specific tablespace in
> backup mode, and can then safely copy its OS file (yes, I have to properly
> deal with archive logs). So I would be somewhat comfortable having two
> distinct "instances" provided by that one Oracle "server".
> 
> It appears to me that, within this one Postgres "instance", there are 2
> levels of "isolation", which are databases and schemas. Is this correct? If
> so, how does this correspond to physical on-disk storage?

You can use tablespaces in PostgreSQL, which are directories on a
different file system, to put your data elsewhere.
But that has very limited use-cases, and normally you don't
create a tablespace.

About isolation:
- The different databases in a cluster are physically located in
  the same tablespace, but they are logically strictly separated.
  You cannot connect to one database and access another database from there.

- There can be several schemas in a database.
  You can access a table in a schema if you have the required privilege
  on both the schema and the table.

This is entirely independent of physical storage, which is provided
by tablespaces.  Tables from different databases can be located in the
same tablespace and vice versa.

Think of "database" and "schema" as a logical separation in SQL.

You cannot backup and restore an individual tablespace, only the
whole cluster.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com




On 11/21/19 8:42 AM, stan wrote:
On Thu, Nov 21, 2019 at 08:18:21AM -0600, Ron wrote:
[snip]
Yes, but ... schemas in Postgres are /different/ from Oracle schemas.  In
Postgres, the CREATE SCHEMA command creates a schema, whereas CREATE USER
creates a schema in Oracle.
I am still struggling with the schema - role - user relationship in Postgres.
It appears to me there is one more layer than is needed/useful.

For one thing, there are no users in Postgres anymore.

https://www.postgresql.org/docs/9.6/sql-createuser.html

CREATE USER is now an alias for CREATE ROLE. The only difference is that when the command is spelled CREATE USER, LOGIN is assumed by default, whereas NOLOGIN is assumed when the command is spelled CREATE ROLE.
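
In other words, these two statements are equivalent (the role name is
arbitrary):

CREATE USER alice;             -- LOGIN is implied
CREATE ROLE alice WITH LOGIN;  -- the same thing, spelled out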


--
Angular momentum makes the world go 'round.

Re: Isolation of multiple database instances provided by a single Postgres server

From: Adrian Klaver
On 11/21/19 6:51 AM, Laurenz Albe wrote:
> On Thu, 2019-11-21 at 06:55 -0500, stan wrote:

> You can use tablespaces in PostgreSQL, which are directories on a
> different file system, to put your data elsewhere.
> But that has very limited use-cases, and normally you don't
> create a tablespace.
> 
> About isolation:
> - The different databases in a cluster are physically located in
>    the same tablespace, but they are logically strictly separated.
>    You cannot connect to one database and access another database from there.

dblink(https://www.postgresql.org/docs/11/dblink.html) and 
FDW(https://www.postgresql.org/docs/11/sql-createforeigntable.html) 
would beg to differ.
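
For example, with the dblink extension installed, something along these
lines reads a table in another database of the same cluster (database,
table, and column names are only examples):

CREATE EXTENSION IF NOT EXISTS dblink;
-- run a query in db2 and bring the rows back into the current database
SELECT *
FROM dblink('dbname=db2', 'SELECT id, t FROM public.t3')
     AS remote(id integer, t text);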

> 
> - There can be several schemas in a database.

There are several schemas in a database. In a new database:

test=# \dnS
         List of schemas
         Name        |  Owner
--------------------+----------
  information_schema | postgres
  pg_catalog         | postgres
  pg_temp_1          | postgres
  pg_toast           | postgres
  pg_toast_temp_1    | postgres
  public             | postgres


>    You can access a table in a schema if you have the required privilege
>    on both the schema and the table.
> 
> This is entirely independent of physical storage, which is provided
> by tablespaces.  Tables from different databases can be located in the
> same tablespace and vice versa.
> 
> Think of "database" and "schema" as a logical separation in SQL.
> 
> You cannot backup and restore an individual tablespace, only the
> whole cluster.
> 
> Yours,
> Laurenz Albe
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com



Re: Isolation of multiple database instances provided by a single Postgres server

From: "Peter J. Holzer"
On 2019-11-21 08:18:21 -0600, Ron wrote:
>     It appears to me that, within this one Postgres "instance", there are 2
>     levels of "isolation", which are databases and schemas. Is this correct?
[...]
>     If so, how does this correspond to physical on-disk storage?
>
> It corresponds not at all.

That's not quite true. Each database has a specific directory (per
tablespace) to keep its files in. Schemas on the other hand do not
correspond to anything on the filesystem.
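
You can see the mapping yourself; for the default tablespace, each
database's files live under $PGDATA/base/<oid>:

SELECT oid, datname FROM pg_database;
-- the oids returned match the directory names under $PGDATA/base/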

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: Isolation of multiple database instances provided by a single Postgres server

From: "Peter J. Holzer"
On 2019-11-21 07:21:10 -0800, Adrian Klaver wrote:
> On 11/21/19 6:51 AM, Laurenz Albe wrote:
> > - The different databases in a cluster are physically located in
> >    the same tablespace, but they are logically strictly separated.
> >    You cannot connect to one database and access another database from there.
>
> dblink(https://www.postgresql.org/docs/11/dblink.html) and
> FDW(https://www.postgresql.org/docs/11/sql-createforeigntable.html) would
> beg to differ.

I was thinking of FDW.

But I think Laurenz had the SQL syntax in mind which does include the
database in the fully qualified form of a tablename. Something like

db2=# select * from db2.public.t3;
╔════╤═══════════════════════╗
║ id │           t           ║
╟────┼───────────────────────╢
║  1 │ this is db2.public.t3 ║
╚════╧═══════════════════════╝
(1 row)

works. But when you try to access a table in a different database you
get an error:

db2=# select * from db1.public.t1;
ERROR:  cross-database references are not implemented: "db1.public.t1"
LINE 1: select * from db1.public.t1;

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
