Thread: schemas for organizing tables

schemas for organizing tables

From
Seb
Date:
Hi,

A database I'm handling is becoming a bit large'ish (~ 30 tables), and
I'd like to break them down into their natural units.  Schemas for each
of these natural units seems logical, but are they really meant for
this?  I'm also worried about how this would affect programs like
Libreoffice (the sdbc driver in particular)?

Thanks,

--
Seb

Re: schemas for organizing tables

From
John R Pierce
Date:
On 04/28/11 5:51 PM, Seb wrote:
> Hi,
>
> A database I'm handling is becoming a bit large'ish (~ 30 tables), and
> I'd like to break them down into their natural units.  Schemas for each
> of these natural units seems logical, but are they really meant for
> this?  I'm also worried about how this would affect programs like
> Libreoffice (the sdbc driver in particular)?

30 tables is really not that many.  we have schemas with 200 tables in
databases with 6 or 10 different schemas.

we use schemas to separate out data that is rarely used together,
typically each schema is for different application classes.

as far as liberoffice goes, I have no idea, but you should be able to
refer to schemaname.tablename regardless.



Re: schemas for organizing tables

From
Seb
Date:
On Thu, 28 Apr 2011 18:15:05 -0700,
John R Pierce <pierce@hogranch.com> wrote:

> On 04/28/11 5:51 PM, Seb wrote:
>> Hi,

>> A database I'm handling is becoming a bit large'ish (~ 30 tables),
>> and I'd like to break them down into their natural units.  Schemas
>> for each of these natural units seems logical, but are they really
>> meant for this?  I'm also worried about how this would affect
>> programs like Libreoffice (the sdbc driver in particular)?

> 30 tables is really not that many.  we have schemas with 200 tables in
> databases with 6 or 10 different schemas.

> we use schemas to separate out data that is rarely used together,
> typically each schema is for different application classes.

Thanks, this is exactly what I have in mind.  The last paragraph in
section 5.7.7 of the documentation is somewhat ominous regarding the use
of schemas if at some point one has to work with other DBMS that don't
support schemas.  I suppose this means trouble if at some point one
needs to move the database to a DBMS not supporting schemas or that does
so differently.


> as far as liberoffice goes, I have no idea, but you should be able to
> refer to schemaname.tablename regardless.

I'll look into this.

Thanks,

--
Seb

Re: schemas for organizing tables

From
Darren Duncan
Date:
Seb wrote:
> A database I'm handling is becoming a bit large'ish (~ 30 tables), and
> I'd like to break them down into their natural units.  Schemas for each
> of these natural units seems logical, but are they really meant for
> this?  I'm also worried about how this would affect programs like
> Libreoffice (the sdbc driver in particular)?

I think that 30 tables is too few to be splitting into schemas based just on
their number; you should have other reasons for splitting them.  Also, 30 is
quite small, or at least medium-small; some databases have hundreds, thousands
or tens of thousands of tables.

Think of a schema like a programming namespace.  In a program, you may have
different libraries or classes that each contain functions and such.  You would
logically group functions together at some times and separate them at other
times.  You can let similar concerns organize your schemas, where as schema is
like a library as a function is to a table ... or some other database object
like a stored function.

-- Darren Duncan

Re: schemas for organizing tables

From
Seb
Date:
On Thu, 28 Apr 2011 19:29:11 -0700,
Darren Duncan <darren@darrenduncan.net> wrote:

> Seb wrote:
>> A database I'm handling is becoming a bit large'ish (~ 30 tables),
>> and I'd like to break them down into their natural units.  Schemas
>> for each of these natural units seems logical, but are they really
>> meant for this?  I'm also worried about how this would affect
>> programs like Libreoffice (the sdbc driver in particular)?

> I think that 30 tables is too few to be splitting into schemas based
> just on their number; you should have other reasons for splitting
> them.  Also, 30 is quite small, or at least medium-small; some
> databases have hundreds, thousands or tens of thousands of tables.

> Think of a schema like a programming namespace.  In a program, you may
> have different libraries or classes that each contain functions and
> such.  You would logically group functions together at some times and
> separate them at other times.  You can let similar concerns organize
> your schemas, where as schema is like a library as a function is to a
> table ... or some other database object like a stored function.

Thanks for these thoughts.  Perhaps I can describe a cartoon of this
database to explain what I'm trying to accomplish.

The database stores information related to biological research.  The
bulk of the tables describe things like individual ID, morphometrics,
and behavioural data on all the individuals in several studies.
However, there are a few tables that do not relate to the research
itself (the main use of the DB), but to logistics.  For example, a group
of tables store information on purchases and inventory of material
required for the overall project.  These tables would never (or almost)
be queried together with the others.

So typically we have two types of uses: research and
preparation/logistics for the project.  We wouldn't want to even see the
logistcs tables for research work, whereas we would like to see only
these ones for preparation/planning.  As the project and number of
tables grow, we could see a similar divergence within the group of
tables related to research.  Are these good reasons for separating these
units into schemas?

Cheers,

--
Seb

Re: schemas for organizing tables

From
Darren Duncan
Date:
Seb wrote:
> Thanks for these thoughts.  Perhaps I can describe a cartoon of this
> database to explain what I'm trying to accomplish.
>
> The database stores information related to biological research.  The
> bulk of the tables describe things like individual ID, morphometrics,
> and behavioural data on all the individuals in several studies.
> However, there are a few tables that do not relate to the research
> itself (the main use of the DB), but to logistics.  For example, a group
> of tables store information on purchases and inventory of material
> required for the overall project.  These tables would never (or almost)
> be queried together with the others.
>
> So typically we have two types of uses: research and
> preparation/logistics for the project.  We wouldn't want to even see the
> logistcs tables for research work, whereas we would like to see only
> these ones for preparation/planning.  As the project and number of
> tables grow, we could see a similar divergence within the group of
> tables related to research.  Are these good reasons for separating these
> units into schemas?

If you mean using just 2 schemas for those 2 distinct tasks, then yes, that may
be a reasonable organization strategy.  However, it is important to keep
yourself flexible to a degree such as to change your mind on this decision or
reorganize now and then.  Similar to how you should expect you may refactor or
reorganize your program now and then, though you could design now for how you
reasonably expect to be for awhile.  This all said, I could go either way.  Your
proposal is a more reasonable purpose for using 2 schemas, but at the same time,
because you only have 30 tables for now, it may not hurt to keep them together
in 1 schema anyway.

Okay, bottom line, for simplicity, stick to a single schema for everything until
you can point to a clear benefit for splitting.  If you're on the fence and
could go either way with no clear benefit of one way over the other, then just
use 1 schema for everything by default as that is a simpler approach.

As for the possibility of supporting another DBMS later without schema support,
well ... Postgres is free and is generally superior to all other free DBMSs, and
probably most of the DBMSs you would have to choose from also support schemas
anyway ... or if you need to use the others you could fake it with common
prefixes for your table names.

-- Darren Duncan

Re: schemas for organizing tables

From
Alban Hertroys
Date:
On 29 Apr 2011, at 5:03, Seb wrote:

> So typically we have two types of uses: research and
> preparation/logistics for the project.  We wouldn't want to even see the
> logistcs tables for research work, whereas we would like to see only
> these ones for preparation/planning.  As the project and number of
> tables grow, we could see a similar divergence within the group of
> tables related to research.  Are these good reasons for separating these
> units into schemas?


I suppose the two different schema's are interesting to similarly different users of the database as well? If so, you
couldalso create two different types of users with schema search_paths defaulting to their respective schema. 
That way logistics-type users don't get confused by research tables and researchers don't get confused by logistics
tables.You could even restrict access to the schema the user isn't supposed to access. 

Alban Hertroys

--
Screwing up is an excellent way to attach something to the ceiling.


!DSPAM:737,4dba5f4212121823650944!



Re: schemas for organizing tables

From
fork
Date:
Seb <spluque <at> gmail.com> writes:

> The database stores information related to biological research.  The
> bulk of the tables describe things like individual ID, morphometrics,
> and behavioural data on all the individuals in several studies.
> However, there are a few tables that do not relate to the research
> itself (the main use of the DB), but to logistics.  For example, a group
> of tables store information on purchases and inventory of material
> required for the overall project.  These tables would never (or almost)
> be queried together with the others.
>
> So typically we have two types of uses: research and
> preparation/logistics for the project.  We wouldn't want to even see the
> logistcs tables for research work, whereas we would like to see only
> these ones for preparation/planning.  As the project and number of
> tables grow, we could see a similar divergence within the group of
> tables related to research.  Are these good reasons for separating these
> units into schemas?

One thing that splitting into schemas would give you is the ability to play
games with "SEARCH_PATH".  Users who handle logistics would have their
search_path only set to "LOGISTICS,public" and would need to fully qualify a
research table to even see it (like "select * from research.foo"); any table
creation would also happen in the schema RESEARCH.  Running an unadorned "\d"
from the command line will only show tables in the search path, which might be
useful too.  Vice versa for RESEARCH.

That being said, ..., less complexity is always better, so unless this is really
compelling because people are complaining, I wouldn't bother.  My rule is always
"pull" technology within an organization, never "push"; on your own experimental
box, though, it is well worth hacking for understanding.

>
> Cheers,
>