Thread: Need suggestions

Need suggestions

From
Jack W
Date:
I will create several databases on PostGreSQL. All the databases have the same structure: same number of table/index.
I have two choices:
1. For each database, I create a new tablespace and create a new database in the tablespace.
2. I only create one tablespace. Create all the databases on the same tablespace.

What is the advantage and disadvantage of the two choices? For the first choice, different database locates in different physical directory on the hard drive. For the second choice, all the database locate in the same physical directory.

Another possibility is to create a new "database cluster directory" for each database. Then each database is managed by different database server instance using different connection.

Which way is better? Thanks a lot.


Re: Need suggestions

From
Bill Moran
Date:
In response to Jack W <dbdevelop2000@gmail.com>:

> I will create several databases on PostGreSQL. All the databases have the
> same structure: same number of table/index.
> I have two choices:
> 1. For each database, I create a new tablespace and create a new database in
> the tablespace.
> 2. I only create one tablespace. Create all the databases on the same
> tablespace.
>
> What is the advantage and disadvantage of the two choices? For the first
> choice, different database locates in different physical directory on the
> hard drive. For the second choice, all the database locate in the same
> physical directory.
>
> Another possibility is to create a new "database cluster directory" for each
> database. Then each database is managed by different database server
> instance using different connection.
>
> Which way is better? Thanks a lot.

Depends on what you're trying to accomplish, which you don't state.

The typical reason for tablespaces would be to store different parts of
your database cluster on different physical storage, thus a hard drive
being saturated with writes doesn't slow down other tables that are on
a completely different hard drive.  I can't think of many other reasons
to use tablespaces.

The typical reason for running multiple instances is that the global
settings must change, i.e. the listening port or listening address
must be different, or the roles and server-wide config settings must
be different.

Without knowing what problem you're trying to solve, I can't recommend
one or the other, but hopefully the previous paragraphs will help.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Need suggestions

From
Bill Moran
Date:
In response to Jack W <dbdevelop2000@gmail.com>:

> On Thu, Jun 25, 2009 at 11:37 AM, Bill Moran <wmoran@potentialtech.com>wrote:
>
> > In response to Jack W <dbdevelop2000@gmail.com>:
> >
> > > I will create several databases on PostGreSQL. All the databases have the
> > > same structure: same number of table/index.
> > > I have two choices:
> > > 1. For each database, I create a new tablespace and create a new database
> > in
> > > the tablespace.
> > > 2. I only create one tablespace. Create all the databases on the same
> > > tablespace.
> > >
> > > What is the advantage and disadvantage of the two choices? For the first
> > > choice, different database locates in different physical directory on the
> > > hard drive. For the second choice, all the database locate in the same
> > > physical directory.
> > >
> > > Another possibility is to create a new "database cluster directory" for
> > each
> > > database. Then each database is managed by different database server
> > > instance using different connection.
> > >
> > > Which way is better? Thanks a lot.
> >
> > Depends on what you're trying to accomplish, which you don't state.
> >
> > The typical reason for tablespaces would be to store different parts of
> > your database cluster on different physical storage, thus a hard drive
> > being saturated with writes doesn't slow down other tables that are on
> > a completely different hard drive.  I can't think of many other reasons
> > to use tablespaces.
> >
> > The typical reason for running multiple instances is that the global
> > settings must change, i.e. the listening port or listening address
> > must be different, or the roles and server-wide config settings must
> > be different.
> >
> > Without knowing what problem you're trying to solve, I can't recommend
> > one or the other, but hopefully the previous paragraphs will help.
>
> Thanks for your reply.

Keep the mailing list in your replies.

> The problem I want to solve is a typical web application. I want to use
> several databases on the server side to store information for different
> departments. For example, one database for sales department; one database
> for HR department. And all the databases have the same structure/schema.
>
> If considering performance, which way is better?

Multiple database instances will fragment memory and hurt both Postgres'
and the OS' ability to use memory efficiently.  Do not use multiple
database instances if performance is a major goal.

> If I use one database server instance to manage all the databases, all the
> databases share the same transaction log. When doing backup/recovery, I need
> to back up or recover all the databases together, right?

False.  pg_dump can back up individual databases.  If you plan on doing
PITR, then you are correct.  However, if you data is so important that
you can justify PITR, you'll want to have a separate server for restore
purposes, and once you've restored you can use other methods to transfer
the data to the live system, picking and choosing what you need.

> I can not shut down
> just one database because all the the databases will be shut down, right?

True, but why are you shutting databases down?

> If I use multiple database server instances, I can back up/recover/shut down
> each database separately.

If you expect to need to do that kind of tinkering often, then I would
recommend going one step further and getting each department it's own
physical (or virtual) server.  If your environment is that unpredictable,
you're probably going to come across other issues, like department A runs
a data import that brings the server to its knees and all the other
departments complain.  However, running multiple instances of Postgres
is one way of solving _some_ of those issues (as you describe).  However,
if you need the granularity of PITR in an environment where things are
that unpredictable, you really need to establish multiple independent
environments.

--
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/

Re: Need suggestions

From
Jack W
Date:


On Thu, Jun 25, 2009 at 12:10 PM, Bill Moran <wmoran@potentialtech.com> wrote:
In response to Jack W <dbdevelop2000@gmail.com>:

> On Thu, Jun 25, 2009 at 11:37 AM, Bill Moran <wmoran@potentialtech.com>wrote:
>
> > In response to Jack W <dbdevelop2000@gmail.com>:
> >
> > > I will create several databases on PostGreSQL. All the databases have the
> > > same structure: same number of table/index.
> > > I have two choices:
> > > 1. For each database, I create a new tablespace and create a new database
> > in
> > > the tablespace.
> > > 2. I only create one tablespace. Create all the databases on the same
> > > tablespace.
> > >
> > > What is the advantage and disadvantage of the two choices? For the first
> > > choice, different database locates in different physical directory on the
> > > hard drive. For the second choice, all the database locate in the same
> > > physical directory.
> > >
> > > Another possibility is to create a new "database cluster directory" for
> > each
> > > database. Then each database is managed by different database server
> > > instance using different connection.
> > >
> > > Which way is better? Thanks a lot.
> >
> > Depends on what you're trying to accomplish, which you don't state.
> >
> > The typical reason for tablespaces would be to store different parts of
> > your database cluster on different physical storage, thus a hard drive
> > being saturated with writes doesn't slow down other tables that are on
> > a completely different hard drive.  I can't think of many other reasons
> > to use tablespaces.
> >
> > The typical reason for running multiple instances is that the global
> > settings must change, i.e. the listening port or listening address
> > must be different, or the roles and server-wide config settings must
> > be different.
> >
> > Without knowing what problem you're trying to solve, I can't recommend
> > one or the other, but hopefully the previous paragraphs will help.
>
> Thanks for your reply.

Keep the mailing list in your replies.

> The problem I want to solve is a typical web application. I want to use
> several databases on the server side to store information for different
> departments. For example, one database for sales department; one database
> for HR department. And all the databases have the same structure/schema.
>
> If considering performance, which way is better?

Multiple database instances will fragment memory and hurt both Postgres'
and the OS' ability to use memory efficiently.  Do not use multiple
database instances if performance is a major goal.

> If I use one database server instance to manage all the databases, all the
> databases share the same transaction log. When doing backup/recovery, I need
> to back up or recover all the databases together, right?

False.  pg_dump can back up individual databases.  If you plan on doing
PITR, then you are correct.  However, if you data is so important that
you can justify PITR, you'll want to have a separate server for restore
purposes, and once you've restored you can use other methods to transfer
the data to the live system, picking and choosing what you need.

Thanks for your reply.
Yes. I plan to use PITR.
 

> I can not shut down
> just one database because all the the databases will be shut down, right?

True, but why are you shutting databases down?

Maybe for maintainence purpose or schema change. For example as you mentioned above, if using several tablespaces located on different hard drives. If one hard drive is damaged, the database on that hard drive will not be available. How about other databases managed by the same database server instance? I need to shut down all the databases to do mainatainence, right?   
 

> If I use multiple database server instances, I can back up/recover/shut down
> each database separately.

If you expect to need to do that kind of tinkering often, then I would
recommend going one step further and getting each department it's own
physical (or virtual) server.  If your environment is that unpredictable,
you're probably going to come across other issues, like department A runs
a data import that brings the server to its knees and all the other
departments complain.  However, running multiple instances of Postgres
is one way of solving _some_ of those issues (as you describe).  However,
if you need the granularity of PITR in an environment where things are
that unpredictable, you really need to establish multiple independent
environments.

Re: Need suggestions

From
Filip Rembiałkowski
Date:


2009/6/25 Jack W <dbdevelop2000@gmail.com>

Maybe for maintainence purpose or schema change. For example as you mentioned above, if using several tablespaces located on different hard drives. If one hard drive is damaged, the database on that hard drive will not be available. How about other databases managed by the same database server instance?

interesting questions...

If I were you, I would just verify this by creating separate partition, putting a tablespace there, creating a database in this tablespace, pgbench'ing this database continuosly, and running
umount --force on the partition.

 
I need to shut down all the databases to do mainatainence, right?   

depends on the kind of maintenance.


--
Filip Rembiałkowski
JID,mailto:filip.rembialkowski@gmail.com
http://filip.rembialkowski.net/

Re: Need suggestions

From
Alban Hertroys
Date:
On Jun 25, 2009, at 9:26 PM, Jack W wrote:
>> > I can not shut down
>> > just one database because all the the databases will be shut
>> down, right?
>>
>> True, but why are you shutting databases down?
>
> Maybe for maintainence purpose or schema change. For example as you
> mentioned above, if using several tablespaces located on different
> hard drives. If one hard drive is damaged, the database on that hard
> drive will not be available. How about other databases managed by
> the same database server instance? I need to shut down all the
> databases to do mainatainence, right?

For a schema change the database needs to be running, so that's not a
good reason for wanting to shut down a database.

If your concern is hard drives crashing, then you're probably better
served with a decent RAID array than by using multiple disks with each
their own tablespace. That way, if a disk crashes NO database goes
down. There will probably be a bit of slow-down once you replace the
disk and it's getting filled with the stuff that was on the previous
disk.
For a database server you will probably want to use RAID-10 (RAID-5
doesn't perform very well for databases) on a controller with a
battery-backed cache; Popular brands in the server market are 3Ware
and Areca.

Of course disks aren't the only possible malfunctions in a system, but
they are the most likely to break. Other moving parts are fans, but
most chips that need a fan have built-in fail-safes these days.

Other risks are blackouts & brownouts that could either interrupt your
servers or fry their components, but usually this can be prevented
with a decent UPS. Then there are building cleaners needing a socket
for their vacuum cleaner, fires, floods, etc. Many of these (including
the UPS) can be dealt with by companies who co-locate your servers
plus they usually provide a reliable internet connection for them as
well.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4a45e92b759151647533614!