Re: Re: [GENERAL] PostgreSQL vs. MySQL - Mailing list pgsql-hackers

From Mike Mascari
Subject Re: Re: [GENERAL] PostgreSQL vs. MySQL
Date
Msg-id 3969D7CA.8AF9573C@mascari.com
Whole thread Raw
In response to Re: Re: [GENERAL] PostgreSQL vs. MySQL  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Re: [GENERAL] PostgreSQL vs. MySQL  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian wrote:
> 
> > And of course the major problem with *that* is how do you get the
> > connection request to arrive at a backend that's been prestarted in
> > the right database?  If you don't commit to a database then there's
> > not a whole lot of prestarting that can be done.
> >
> > It occurs to me that this'd get a whole lot more feasible if one
> > postmaster == one database, which is something we *could* do if we
> > implemented schemas.  Hiroshi's been arguing that the current hard
> > separation between databases in an installation should be done away
> > with in favor of schemas, and I'm starting to see his point...
> 
> This is interesting.  You believe schema's would allow a pool of
> backends to connect to any database?  That would clearly be a win.

I'm just curious, but did a consensus ever develop on schemas? It
seemed that the schemas/tablespace thread just ran out of steam.
For what its worth, I like the idea of:

1. PostgreSQL installation -> SQL cluster of catalogs
2. PostgreSQL database -> SQL catalog
3. PostgreSQL schema -> SQL schema

This correlates nicely with the current representation of
DATABASE. People can run multiple SQL clusters by running
multiple postmasters on different ports. Today, most people
achieve a logical separation of data by issuing multiple CREATE
DATABASE commands. But under the above, most sites would run with
a single PostgreSQL database (SQL catalog), since:

"Catalogs are named collections of schemas in an SQL-environment"

This would mirror the behavior of Oracle, where most people run
with a single Oracle SID.  The logical separation would be
achieved with SCHEMA's a level under the current DATABASE (a.k.a.
catalog). This eliminates the problem of using softlinks and
creating various subdirectories to mirror *logical* parititioning
of data. It also alleviates the problem people currently
encounter when they've built their data model around multiple
DATABASE's but learn later that they need access to more than one
simultaneously. Instead, they'll model their design around
multiple SCHEMA's which exist within a single DATABASE instance. 

It seems that the discussion of tablespaces shouldn't be mixed
with SCHEMA's except to note that a DATABASE (catalog) should
have a default TABLESPACE whose path matches the current one:

../pgsql/data/base/<mydatabase>

Later, users might be able to create a hierarchy of default
TABLESPACE's where the location of the object is found with logic
like:

1. Is there a object-specified tablespace? (ex: CREATE TABLE payroll IN TABLESPACE...)
2. Is there a user-specified default tablespace? (ex: CREATE USER mike DEFAULT TABLESPACE...)
2. Is there a schema-specified default tablespace? (ex: CREATE SCHEMA accounting DEFAULT TABLESPACE..)
3. Use the catalog-default tablespace (ex: CREATE DATABASE postgres DEFAULT LOCATION '/home/pgsql')

with the last example creating the system tablespace,
'system_tablespace', with '/home/pgsql' as the location.

Anyways, it seems a consensus should be developed on the whole
Cluster/Catalog/Schema scenario.

Mike Mascari


pgsql-hackers by date:

Previous
From: Philip Warner
Date:
Subject: Re: AW: Re: postgres TODO
Next
From: Tom Lane
Date:
Subject: Re: More info