Thread: A few questions

A few questions

From

Samantha Atkins

Date:

29 October 2007, 13:53:42

First on prepared statements:

1) If I am using the libpq are prepared statements tied to a
connection?  In other words can I prepare the statement once and use
it on multiple connections?

2) What is the logical scope of prepared statement names?  Can I use
the same name on different tables without conflict or is the scope
database wide or something else?

On indices:

3) same as 2 for index names.  I think they are per table but it is
worth asking.

and last:

4) Is it generally better to have more tables in one database from a
memory and performance point of view or divide into more databases if
there is a logical division.  The reason I ask is that I have a
situation where one app is used by multiple different users each
running their own copy.  The app uses on the order of 30 tables.  In
some ways it would be convenient to have one big database and
specialize the table names per user.   But I am not sure that is most
optimal.  Is there a general answer to such a question?

Thanks very much for any enlightenment on these questions.

- samantha

Re: A few questions

From

David Fetter

Date:

29 October 2007, 14:07:59

On Mon, Oct 29, 2007 at 09:52:55AM -0700, Samantha Atkins wrote:
> First on prepared statements:
>
> 1) If I am using the libpq are prepared statements tied to a
> connection?  In other words can I prepare the statement once and use
> it on multiple connections?

Yes they are, and no, you can't.  Because of MVCC, in general, each
connection could see a completely different database, so there's no
way for plans to cross connections.

> 2) What is the logical scope of prepared statement names?  Can I use
> the same name on different tables without conflict or is the scope
> database wide or something else?

What happened when you tried it?

> On indices:
>
> 3) same as 2 for index names.  I think they are per table but it is
> worth asking.

See above question ;)

> and last:
>
> 4) Is it generally better to have more tables in one database from a
> memory and performance point of view or divide into more databases
> if there is a logical division.

Just generally, get correctness first and improve performance if you
need to by finding bottlenecks empirically and figuring out what to do
once you've identified them.

> The reason I ask is that I have a situation where one app is used by
> multiple different users each running their own copy.  The app uses
> on the order of 30 tables.  In some ways it would be convenient to
> have one big database and specialize the table names per user.
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
That's a loud warning of a design you didn't think through ahead of
time.  As a general rule, a day not spent in design translates into at
least 10 in testing (if you're lucky enough to catch it there) or
(more usually) 100 or more in production.

> But I am not sure that is most  optimal.  Is there a general answer
> to such a question?

See above :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: A few questions

From

Richard Huxton

Date:

29 October 2007, 14:17:42

Samantha Atkins wrote:
> First on prepared statements:
>
> 1) If I am using the libpq are prepared statements tied to a
> connection?  In other words can I prepare the statement once and use it
> on multiple connections?

Per session (connection).

Temporary tables etc. are the same.

> 2) What is the logical scope of prepared statement names?  Can I use the
> same name on different tables without conflict or is the scope database
> wide or something else?

Per session.

> On indices:
>
> 3) same as 2 for index names.  I think they are per table but it is
> worth asking.

Per database (if you count the schema name). We don't have cross-table
indexes, but the global naming allows it.

> and last:
>
> 4) Is it generally better to have more tables in one database from a
> memory and performance point of view or divide into more databases if
> there is a logical division.  The reason I ask is that I have a
> situation where one app is used by multiple different users each running
> their own copy.  The app uses on the order of 30 tables.  In some ways
> it would be convenient to have one big database and specialize the table
> names per user.   But I am not sure that is most optimal.  Is there a
> general answer to such a question?

Not really, but...

1. Do you treat them as separate logical entities?
Do you want to backup and restore them separately?
Is any information shared between them?
What are the consequences of a user seeing other users' data?

2. Are you having performance issues with the most logical design?
Can you solve it by adding some more RAM/Disk?
What are the maintenance issues with not having the most logical design?

--
   Richard Huxton
   Archonet Ltd

Re: A few questions

From

Samantha Atkins

Date:

29 October 2007, 19:56:35

On Oct 29, 2007, at 10:14 AM, Richard Huxton wrote:

> Samantha Atkins wrote:
>> First on prepared statements:
>> 1) If I am using the libpq are prepared statements tied to a
>> connection?  In other words can I prepare the statement once and
>> use it on multiple connections?
>
> Per session (connection).
>
> Temporary tables etc. are the same.
>
>> 2) What is the logical scope of prepared statement names?  Can I
>> use the same name on different tables without conflict or is the
>> scope database wide or something else?
>
> Per session.
>
>> On indices:
>> 3) same as 2 for index names.  I think they are per table but it is
>> worth asking.
>
> Per database (if you count the schema name). We don't have cross-
> table indexes, but the global naming allows it.
>
>> and last:
>> 4) Is it generally better to have more tables in one database from
>> a memory and performance point of view or divide into more
>> databases if there is a logical division.  The reason I ask is that
>> I have a situation where one app is used by multiple different
>> users each running their own copy.  The app uses on the order of 30
>> tables.  In some ways it would be convenient to have one big
>> database and specialize the table names per user.   But I am not
>> sure that is most optimal.  Is there a general answer to such a
>> question?
>
> Not really, but...
>
> 1. Do you treat them as separate logical entities?

A set of tables per a user, yes.  A app process is always for one and
only one user.

>
> Do you want to backup and restore them separately?

Not necessarily.  Although the is a possibility of wanting separate
per-user backups which would pretty much answer the question in this
specific case.

>
> Is any information shared between them?

Possible sharing of some common id numbers for common items.
Although it is not essential the common items have the same serial
number on different databases.

>
> What are the consequences of a user seeing other users' data?
>

Little likelihood unless we expose database username/passwd.   These
are "users" not necessarily represented as postgresql database users.

> 2. Are you having performance issues with the most logical design?

The first prototype has not yet been completed so no, not yet.  :-)

>
> Can you solve it by adding some more RAM/Disk?

???  There is a desire to use as little ram/disk as possible for the
application.   I would be interested in what the overhead is for
opening a second database.

>
> What are the maintenance issues with not having the most logical
> design?
>

What do you consider the most logical, one database per user?

- samantha

Re: A few questions

From

Richard Huxton

Date:

29 October 2007, 20:21:39

Samantha  Atkins wrote:
>
> On Oct 29, 2007, at 10:14 AM, Richard Huxton wrote:
>
>> Samantha Atkins wrote:
>>> First on prepared statements:
>>> 1) If I am using the libpq are prepared statements tied to a
>>> connection?  In other words can I prepare the statement once and use
>>> it on multiple connections?
>>
>> Per session (connection).
>>
>> Temporary tables etc. are the same.
>>
>>> 2) What is the logical scope of prepared statement names?  Can I use
>>> the same name on different tables without conflict or is the scope
>>> database wide or something else?
>>
>> Per session.
>>
>>> On indices:
>>> 3) same as 2 for index names.  I think they are per table but it is
>>> worth asking.
>>
>> Per database (if you count the schema name). We don't have cross-table
>> indexes, but the global naming allows it.
>>
>>> and last:
>>> 4) Is it generally better to have more tables in one database from a
>>> memory and performance point of view or divide into more databases if
>>> there is a logical division.  The reason I ask is that I have a
>>> situation where one app is used by multiple different users each
>>> running their own copy.  The app uses on the order of 30 tables.  In
>>> some ways it would be convenient to have one big database and
>>> specialize the table names per user.   But I am not sure that is most
>>> optimal.  Is there a general answer to such a question?
>>
>> Not really, but...
>>
>> 1. Do you treat them as separate logical entities?
>
> A set of tables per a user, yes.  A app process is always for one and
> only one user.

OK, so no data-sharing.

>> Do you want to backup and restore them separately?
>
> Not necessarily.  Although the is a possibility of wanting separate
> per-user backups which would pretty much answer the question in this
> specific case.

Yep. Or if you want to prevent other users knowing that they share a
database.

>> Is any information shared between them?
>
> Possible sharing of some common id numbers for common items.   Although
> it is not essential the common items have the same serial number on
> different databases.
>
>> What are the consequences of a user seeing other users' data?
>
> Little likelihood unless we expose database username/passwd.   These are
> "users" not necessarily represented as postgresql database users.

Ah, but with separate databases they can (and might as well be) separate
  db users. It's the simplest way to guarantee no data leakage.

If you have only one db, then you'll want to have separate tables for
each user, perhaps in their own schema or a column on each table saying
which row belongs to which user. It's easier to make a mistake here.

>> 2. Are you having performance issues with the most logical design?
>
> The first prototype has not yet been completed so no, not yet.  :-)

Good. In that case, I recommend going away and mocking up both, with
twice as many users as you expect and twice as much data. See how they
operate.

>> Can you solve it by adding some more RAM/Disk?
>
> ???  There is a desire to use as little ram/disk as possible for the
> application.

Don't forget there might well be a trade-off between the two. Caching
results in your application will increase requirements there but lower
them on the DB.

 > I would be interested in what the overhead is for opening
> a second database.

Not much. If you have duplicated data, that can prove wasteful.

Otherwise it's a trade off between a single 100MB index and 100 1MB
index and their overheads. Now, if only 15 of your 100 users log in at
any one time that will make a difference too. It'll all come down to
locality of data - whether your queries need more disk blocks from
separate databases than from larger tables in one database.

>> What are the maintenance issues with not having the most logical design?
>
> What do you consider the most logical, one database per user?

You're the only one who knows enough to say. You're not sharing data
between users, so you don't need one database. On the other hand, you
don't care about backing up separate users, which means you don't need
many DBs.

Here's another question: when you upgrade your application, do you want
to upgrade the db-schema for all users at once, or individually?

Write a list of all these sort of tasks - backups, installations,
upgrades, comparing users, expiring user accounts etc. Mark each for how
  often you'll have to deal with it and then how easy/difficult it is
with each design. Total it up and you'll know whether you want a single
DB or multiple.

Then come back and tell us what you decided, it'll be interesting :-)

--
   Richard Huxton
   Archonet Ltd

Re: A few questions

From

"Gregory Williamson"

Date:

29 October 2007, 20:28:32

Samantha Atkins shaped electrons to ask:
>
> What do you consider the most logical, one database per user?
>
> - samantha

Perhaps a schema per user ? Then you can have the common tables (look up values, whatever) in the public schema. Each user gets a schema that has all of the tables they share in common (accounting or addresses or whatever) plus you can add an specialized tables and not worry about other users seeing them. Of course, all table references have to be qualified (myschema.mytable) or you have to set the search_path.

I'd lean toward making each a real postgres user and then revoke all rights ont heir schema from public and allow them access to the schema and the underlying tables.

HTH,

Greg Williamson
Senior DBA
GlobeXplorer LLC, a DigitalGlobe company

Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

(My corporate masters made me say this.)

Thread: A few questions

A few questions

Re: A few questions

Re: A few questions

Attachment

Re: A few questions

Re: A few questions

Re: A few questions

Re: A few questions