Thread: A few questions
First on prepared statements: 1) If I am using the libpq are prepared statements tied to a connection? In other words can I prepare the statement once and use it on multiple connections? 2) What is the logical scope of prepared statement names? Can I use the same name on different tables without conflict or is the scope database wide or something else? On indices: 3) same as 2 for index names. I think they are per table but it is worth asking. and last: 4) Is it generally better to have more tables in one database from a memory and performance point of view or divide into more databases if there is a logical division. The reason I ask is that I have a situation where one app is used by multiple different users each running their own copy. The app uses on the order of 30 tables. In some ways it would be convenient to have one big database and specialize the table names per user. But I am not sure that is most optimal. Is there a general answer to such a question? Thanks very much for any enlightenment on these questions. - samantha
On Mon, Oct 29, 2007 at 09:52:55AM -0700, Samantha Atkins wrote: > First on prepared statements: > > 1) If I am using the libpq are prepared statements tied to a > connection? In other words can I prepare the statement once and use > it on multiple connections? Yes they are, and no, you can't. Because of MVCC, in general, each connection could see a completely different database, so there's no way for plans to cross connections. > 2) What is the logical scope of prepared statement names? Can I use > the same name on different tables without conflict or is the scope > database wide or something else? What happened when you tried it? > On indices: > > 3) same as 2 for index names. I think they are per table but it is > worth asking. See above question ;) > and last: > > 4) Is it generally better to have more tables in one database from a > memory and performance point of view or divide into more databases > if there is a logical division. Just generally, get correctness first and improve performance if you need to by finding bottlenecks empirically and figuring out what to do once you've identified them. > The reason I ask is that I have a situation where one app is used by > multiple different users each running their own copy. The app uses > on the order of 30 tables. In some ways it would be convenient to > have one big database and specialize the table names per user. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ That's a loud warning of a design you didn't think through ahead of time. As a general rule, a day not spent in design translates into at least 10 in testing (if you're lucky enough to catch it there) or (more usually) 100 or more in production. > But I am not sure that is most optimal. Is there a general answer > to such a question? See above :) Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Mon, 29 Oct 2007 09:52:55 -0700 Samantha Atkins <sjatkins@mac.com> wrote: > First on prepared statements: > > 1) If I am using the libpq are prepared statements tied to a > connection? Yes. > In other words can I prepare the statement once and use > it on multiple connections? No. > > 2) What is the logical scope of prepared statement names? Can I use > the same name on different tables without conflict or is the scope > database wide or something else? Each prepare must be unique within the session. So session 1 can have foo and session 2 can have foo, but session 1 can not have foo that calls to two different objects... > > On indices: > > 3) same as 2 for index names. I think they are per table but it is > worth asking. Indexes are per relation (table) > > and last: > > 4) Is it generally better to have more tables in one database from a > memory and performance point of view or divide into more databases > if there is a logical division. Uhmm this is more of a normalization and relation theory question :). I > The reason I ask is that I have a > situation where one app is used by multiple different users each > running their own copy. Ahh... use namespaces/schemas: http://www.postgresql.org/docs/current/static/ddl-schemas.html > Thanks very much for any enlightenment on these questions. > > - samantha Hope this was helpful. Sincerely, Joshua D. Drake > > > ---------------------------(end of > broadcast)--------------------------- TIP 5: don't forget to increase > your free space map settings > -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 PostgreSQL solutions since 1997 http://www.commandprompt.com/ UNIQUE NOT NULL Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
Attachment
Samantha Atkins wrote: > First on prepared statements: > > 1) If I am using the libpq are prepared statements tied to a > connection? In other words can I prepare the statement once and use it > on multiple connections? Per session (connection). Temporary tables etc. are the same. > 2) What is the logical scope of prepared statement names? Can I use the > same name on different tables without conflict or is the scope database > wide or something else? Per session. > On indices: > > 3) same as 2 for index names. I think they are per table but it is > worth asking. Per database (if you count the schema name). We don't have cross-table indexes, but the global naming allows it. > and last: > > 4) Is it generally better to have more tables in one database from a > memory and performance point of view or divide into more databases if > there is a logical division. The reason I ask is that I have a > situation where one app is used by multiple different users each running > their own copy. The app uses on the order of 30 tables. In some ways > it would be convenient to have one big database and specialize the table > names per user. But I am not sure that is most optimal. Is there a > general answer to such a question? Not really, but... 1. Do you treat them as separate logical entities? Do you want to backup and restore them separately? Is any information shared between them? What are the consequences of a user seeing other users' data? 2. Are you having performance issues with the most logical design? Can you solve it by adding some more RAM/Disk? What are the maintenance issues with not having the most logical design? -- Richard Huxton Archonet Ltd
On Oct 29, 2007, at 10:14 AM, Richard Huxton wrote: > Samantha Atkins wrote: >> First on prepared statements: >> 1) If I am using the libpq are prepared statements tied to a >> connection? In other words can I prepare the statement once and >> use it on multiple connections? > > Per session (connection). > > Temporary tables etc. are the same. > >> 2) What is the logical scope of prepared statement names? Can I >> use the same name on different tables without conflict or is the >> scope database wide or something else? > > Per session. > >> On indices: >> 3) same as 2 for index names. I think they are per table but it is >> worth asking. > > Per database (if you count the schema name). We don't have cross- > table indexes, but the global naming allows it. > >> and last: >> 4) Is it generally better to have more tables in one database from >> a memory and performance point of view or divide into more >> databases if there is a logical division. The reason I ask is that >> I have a situation where one app is used by multiple different >> users each running their own copy. The app uses on the order of 30 >> tables. In some ways it would be convenient to have one big >> database and specialize the table names per user. But I am not >> sure that is most optimal. Is there a general answer to such a >> question? > > Not really, but... > > 1. Do you treat them as separate logical entities? A set of tables per a user, yes. A app process is always for one and only one user. > > Do you want to backup and restore them separately? Not necessarily. Although the is a possibility of wanting separate per-user backups which would pretty much answer the question in this specific case. > > Is any information shared between them? Possible sharing of some common id numbers for common items. Although it is not essential the common items have the same serial number on different databases. > > What are the consequences of a user seeing other users' data? > Little likelihood unless we expose database username/passwd. These are "users" not necessarily represented as postgresql database users. > 2. Are you having performance issues with the most logical design? The first prototype has not yet been completed so no, not yet. :-) > > Can you solve it by adding some more RAM/Disk? ??? There is a desire to use as little ram/disk as possible for the application. I would be interested in what the overhead is for opening a second database. > > What are the maintenance issues with not having the most logical > design? > What do you consider the most logical, one database per user? - samantha
Samantha Atkins wrote: > > On Oct 29, 2007, at 10:14 AM, Richard Huxton wrote: > >> Samantha Atkins wrote: >>> First on prepared statements: >>> 1) If I am using the libpq are prepared statements tied to a >>> connection? In other words can I prepare the statement once and use >>> it on multiple connections? >> >> Per session (connection). >> >> Temporary tables etc. are the same. >> >>> 2) What is the logical scope of prepared statement names? Can I use >>> the same name on different tables without conflict or is the scope >>> database wide or something else? >> >> Per session. >> >>> On indices: >>> 3) same as 2 for index names. I think they are per table but it is >>> worth asking. >> >> Per database (if you count the schema name). We don't have cross-table >> indexes, but the global naming allows it. >> >>> and last: >>> 4) Is it generally better to have more tables in one database from a >>> memory and performance point of view or divide into more databases if >>> there is a logical division. The reason I ask is that I have a >>> situation where one app is used by multiple different users each >>> running their own copy. The app uses on the order of 30 tables. In >>> some ways it would be convenient to have one big database and >>> specialize the table names per user. But I am not sure that is most >>> optimal. Is there a general answer to such a question? >> >> Not really, but... >> >> 1. Do you treat them as separate logical entities? > > A set of tables per a user, yes. A app process is always for one and > only one user. OK, so no data-sharing. >> Do you want to backup and restore them separately? > > Not necessarily. Although the is a possibility of wanting separate > per-user backups which would pretty much answer the question in this > specific case. Yep. Or if you want to prevent other users knowing that they share a database. >> Is any information shared between them? > > Possible sharing of some common id numbers for common items. Although > it is not essential the common items have the same serial number on > different databases. > >> What are the consequences of a user seeing other users' data? > > Little likelihood unless we expose database username/passwd. These are > "users" not necessarily represented as postgresql database users. Ah, but with separate databases they can (and might as well be) separate db users. It's the simplest way to guarantee no data leakage. If you have only one db, then you'll want to have separate tables for each user, perhaps in their own schema or a column on each table saying which row belongs to which user. It's easier to make a mistake here. >> 2. Are you having performance issues with the most logical design? > > The first prototype has not yet been completed so no, not yet. :-) Good. In that case, I recommend going away and mocking up both, with twice as many users as you expect and twice as much data. See how they operate. >> Can you solve it by adding some more RAM/Disk? > > ??? There is a desire to use as little ram/disk as possible for the > application. Don't forget there might well be a trade-off between the two. Caching results in your application will increase requirements there but lower them on the DB. > I would be interested in what the overhead is for opening > a second database. Not much. If you have duplicated data, that can prove wasteful. Otherwise it's a trade off between a single 100MB index and 100 1MB index and their overheads. Now, if only 15 of your 100 users log in at any one time that will make a difference too. It'll all come down to locality of data - whether your queries need more disk blocks from separate databases than from larger tables in one database. >> What are the maintenance issues with not having the most logical design? > > What do you consider the most logical, one database per user? You're the only one who knows enough to say. You're not sharing data between users, so you don't need one database. On the other hand, you don't care about backing up separate users, which means you don't need many DBs. Here's another question: when you upgrade your application, do you want to upgrade the db-schema for all users at once, or individually? Write a list of all these sort of tasks - backups, installations, upgrades, comparing users, expiring user accounts etc. Mark each for how often you'll have to deal with it and then how easy/difficult it is with each design. Total it up and you'll know whether you want a single DB or multiple. Then come back and tell us what you decided, it'll be interesting :-) -- Richard Huxton Archonet Ltd
Samantha Atkins shaped electrons to ask:
>
> What do you consider the most logical, one database per user?
>
> - samantha
Perhaps a schema per user ? Then you can have the common tables (look up values, whatever) in the public schema. Each user gets a schema that has all of the tables they share in common (accounting or addresses or whatever) plus you can add an specialized tables and not worry about other users seeing them. Of course, all table references have to be qualified (myschema.mytable) or you have to set the search_path.
I'd lean toward making each a real postgres user and then revoke all rights ont heir schema from public and allow them access to the schema and the underlying tables.
HTH,
Greg Williamson
Senior DBA
GlobeXplorer LLC, a DigitalGlobe company
Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.
(My corporate masters made me say this.)