Re: solutions for new Postgresql application testing - Mailing list pgsql-performance

From: Merlin Moncure
Subject: Re: solutions for new Postgresql application testing
Date:
Msg-id: 6EE64EF3AB31D5448D0007DD34EEB3417DD759@Herge.rcsinc.local
In response to: solutions for new Postgresql application testing (Geoffrey <esoteric@3times25.net>)
Responses: Re: solutions for new Postgresql application testing (Geoffrey <esoteric@3times25.net>)
List: pgsql-performance
Geoffrey wrote:
> We are going live with an application in a few months that is a
> complete rewrite of an existing application.  We are moving from an
> existing proprietary database to Postgresql.  We are looking for some
> insight/suggestions as to how folks test Postgresql in such a
> situation.

Shouldn't you run your tests *before* rewriting your application? :).
You don't have to answer that.

> We're also trying to decide whether a single database with multiple
> schemas or multiple databases is the best solution.  We've done some
> research on this through the archives, and the answer seems to depend
> on the database/application design.  Still, we welcome any generic
> ideas on this issue as well.

I can help a little bit here.  Yes, this decision will be heavily
influenced by application design.  Let's assume you have to keep
multiple identical table sets (for example, multiple companies hosted
on the same server).

Reasons to use schemas:
* If you have a requirement where data must be queried from multiple
data stores at the same time, or between various data stores and a
shared area, this argues for schemas.  It is possible to do this
without schemas via dblink, PostgreSQL's inter-database RPC mechanism,
but performance can be an issue and there is some setup overhead (both
approaches are sketched after this list).

* If you need to swap out data stores on the fly without reconnecting,
this argues strongly in favor of schemas.  With schemas, you choose
which data store you are using simply by manipulating the search_path.
There is one big caveat: your non-dynamic pl/pgsql functions will stick
to the tables they used the first time you ran them, like suction cups,
because their plans are cached.  Worse, your sql functions will stick
to the tables they referred to when compiled, making non-immutable sql
functions a no-no in a multi-schema environment.  However, there is a
clever workaround: force-recompile your pl/pgsql functions (search the
recent archives of this list), or build the query dynamically with
EXECUTE, as in the sketch after this list.

* Finally, since multiple schemas can share a common public area, if
you have to deploy database features that apply to all of your data
stores, you can sometimes get away with sticking them in the public
schema; server-side utility functions are an example of this.
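
Here is a minimal sketch of those points, reusing the hypothetical
company_a/company_b schemas from above.  The EXECUTE form (available
in recent pl/pgsql) is one way to keep a function from sticking to a
single schema's tables; the force-recompile trick from the archives is
the other:

    -- swap data stores on the fly by manipulating the search_path
    SET search_path TO company_a, public;
    SELECT count(*) FROM orders;   -- hits company_a.orders
    SET search_path TO company_b, public;
    SELECT count(*) FROM orders;   -- now hits company_b.orders

    -- querying two data stores at once is trivial with schemas...
    SELECT a.id FROM company_a.orders a
        JOIN company_b.orders b USING (id);

    -- ...but needs dblink across databases
    SELECT * FROM dblink('dbname=company_b',
                         'SELECT id, total FROM orders')
        AS t(id int, total numeric);

    -- a pl/pgsql function that follows the search_path: EXECUTE
    -- re-plans the statement on every call instead of caching it
    CREATE FUNCTION order_count() RETURNS bigint AS $$
    DECLARE n bigint;
    BEGIN
        EXECUTE 'SELECT count(*) FROM orders' INTO n;
        RETURN n;
    END;
    $$ LANGUAGE plpgsql;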

Reasons to use databases:
* Certain third party tools may have trouble with schemas.

* Manipulating the search_path can be error-prone and relatively
tedious.

* Databases are more fully separated.  I run multi-schema, and I make
heavy use of the userlock contrib module.  This means I have to take
special care to avoid inter-schema overlap of my lock identifiers (see
the sketch after this list).  There are other cases where this might
bite you, for example if you wanted one data store to respond to
notifications but not another.  These are solvable problems, but can
be a headache.
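
On the lock-identifier point, here is a hedged sketch of what "no
overlap" means.  Lock keys are plain integers scoped to the whole
database, not to a schema, so two data stores reusing the same ids
will collide; giving each data store its own key space avoids that.
The pg_advisory_lock calls shown are the built-in successor to the
userlock module in current PostgreSQL releases; the original userlock
API differs in names, not in the underlying problem:

    -- use the first int to identify the data store and the second to
    -- identify the resource, so identical resource ids cannot collide
    SELECT pg_advisory_lock(1, 42);    -- company_a, resource 42
    SELECT pg_advisory_lock(2, 42);    -- company_b, resource 42: no clash
    SELECT pg_advisory_unlock(1, 42);
    SELECT pg_advisory_unlock(2, 42);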

In short, there are pros and cons either way.  If it's any help, the
servers I administer, which have *really complex* data interdependency
and isolation requirements, use schemas for the extra flexibility.

Merlin
