Thread: PGSQL or other DB?
Dear PG Users!
I don't want to make a heated debate with this subject, but I wanna ask about your experiences because we need to make a new special site and we wanna know, which DB is the best for this...
This website will supports up to 200 corporations.
The functions are similar, but because of many differents of these corps (datas to keep, way to working them), we wanna make many databases (one DB for one corp).
The site uses one main database that handle all public, and shared services, and store the links to subdatabases.
The website will works with apache/mod_python, and the each of the corp's services are handled with another python module (from another (separated) database). The main structure of the source code already wroted.
But: we need to determine, which database we will use in the future for this project.
The main viewpoints:
- quick (re)connect - because mod_python basically not store the database connections persistently
- fast queries
- easy IDE to use (like pgadmin)
- the db server can handle many databases (webdb_nnn where nnn is an integer)
- I can add/modify a table, or a field to a table without "full lock" on the table (like DBISAM restructure). Like in FireBird, where the "add field" change only the table description. I don't know that PG supports this way of the DB modifying.
- Quick and easy backup/restore system
Another important thing that I don't understand (what as I saw) that the PostGreSQL is store the databases in one, unseparatable file set, in a directory named data.
In another databases, like DBISAM, FireBird, MySQL, the databases are separated to another directories/files.
This "one datadir" is seems to be not too good for us. We used DBISAM in our clients, and many times when we got some filesystem error, we can simply recover the tables - from the files.
When we want to backup or restore one database, we can do it in two way: a.) archive all files b.) make sql dump from database.
If a file get corrupted in a database, then we can restore the datas from files, and this filesystem error causes problems only for this database, not for all.
I very fear from to keep all databases in one place, because if they are corrupted, possible more of them injured (if they are not separated).
I cannot make filesystem based (hard) copy from one db (only SQL dump enabled).
Ok, I saw that pgsql supports "tablespaces", but as I saw, this function can hold only table datas in the another directory, and not the full database can separated with them.
Because I don't used one PGSQL with many databases (up to 200), I don't know, what happening, and which cases possible. But I think you have many experience with it. Please share it with me!
Please help me, because we need to determine which DB to use.
I started the usage of the PG in prev. month, and I liked it except the way of the data storage (one data dir).
I tried the MySQL before I tried PG. InnoDB is seems to be "forcing transaction system on MyISAM". And boolean data type is missing (solved with enum?).
I don't like it all, but it is seems to be fast with little tables, and it is separate the database files to another directories which thing I like. Possible it have many limitations what I don't saw in first time.
Please help me, which DB is good for us, and how to configure, and use PGSQL with these database-set which we need to use.
Thanks for your help:
dd
I don't want to make a heated debate with this subject, but I wanna ask about your experiences because we need to make a new special site and we wanna know, which DB is the best for this...
This website will supports up to 200 corporations.
The functions are similar, but because of many differents of these corps (datas to keep, way to working them), we wanna make many databases (one DB for one corp).
The site uses one main database that handle all public, and shared services, and store the links to subdatabases.
The website will works with apache/mod_python, and the each of the corp's services are handled with another python module (from another (separated) database). The main structure of the source code already wroted.
But: we need to determine, which database we will use in the future for this project.
The main viewpoints:
- quick (re)connect - because mod_python basically not store the database connections persistently
- fast queries
- easy IDE to use (like pgadmin)
- the db server can handle many databases (webdb_nnn where nnn is an integer)
- I can add/modify a table, or a field to a table without "full lock" on the table (like DBISAM restructure). Like in FireBird, where the "add field" change only the table description. I don't know that PG supports this way of the DB modifying.
- Quick and easy backup/restore system
Another important thing that I don't understand (what as I saw) that the PostGreSQL is store the databases in one, unseparatable file set, in a directory named data.
In another databases, like DBISAM, FireBird, MySQL, the databases are separated to another directories/files.
This "one datadir" is seems to be not too good for us. We used DBISAM in our clients, and many times when we got some filesystem error, we can simply recover the tables - from the files.
When we want to backup or restore one database, we can do it in two way: a.) archive all files b.) make sql dump from database.
If a file get corrupted in a database, then we can restore the datas from files, and this filesystem error causes problems only for this database, not for all.
I very fear from to keep all databases in one place, because if they are corrupted, possible more of them injured (if they are not separated).
I cannot make filesystem based (hard) copy from one db (only SQL dump enabled).
Ok, I saw that pgsql supports "tablespaces", but as I saw, this function can hold only table datas in the another directory, and not the full database can separated with them.
Because I don't used one PGSQL with many databases (up to 200), I don't know, what happening, and which cases possible. But I think you have many experience with it. Please share it with me!
Please help me, because we need to determine which DB to use.
I started the usage of the PG in prev. month, and I liked it except the way of the data storage (one data dir).
I tried the MySQL before I tried PG. InnoDB is seems to be "forcing transaction system on MyISAM". And boolean data type is missing (solved with enum?).
I don't like it all, but it is seems to be fast with little tables, and it is separate the database files to another directories which thing I like. Possible it have many limitations what I don't saw in first time.
Please help me, which DB is good for us, and how to configure, and use PGSQL with these database-set which we need to use.
Thanks for your help:
dd
On Fri, Jan 30, 2009 at 08:37:02PM +0100, durumdara wrote: > This website will supports up to 200 corporations. > The functions are similar, but because of many differents of these corps > (datas to keep, way to working them), we wanna make many databases (one > DB for one corp). Databases or schemas? they are different in PG and they have very different behaviors which suit different use cases. > The site uses one main database that handle all public, and shared > services, and store the links to subdatabases. Sounds like you want to use schemas. > The main viewpoints: > - quick (re)connect - because mod_python basically not store the > database connections persistently ?? mod_python can be used to do persistent connections as well as connection pooling. > - fast queries They're as fast as you write them! PG generally does better with higher concurrent loads than MySQL (never used DBISAM or FireBird). > - the db server can handle many databases (webdb_nnn where nnn is an > integer) There's a summary of limits in the about page: http://www.postgresql.org/about/ > - I can add/modify a table, or a field to a table without "full lock" > on the table (like DBISAM restructure). Like in FireBird, where the "add > field" change only the table description. I don't know that PG supports > this way of the DB modifying. Yes; have a read about MVCC: http://www.postgresql.org/docs/current/static/mvcc.html > - Quick and easy backup/restore system pg_dump works well, how quick it is depends on the amount of data you have obviously > Another important thing that I don't understand (what as I saw) that the > PostGreSQL is store the databases in one, unseparatable file set, in a > directory named data. > In another databases, like DBISAM, FireBird, MySQL, the databases are > separated to another directories/files. PG's scheme works well; one file per table doesn't work very well in practice. there will always be links between tables (foreign keys being the obvious one) and hence selectively restoring individual tables on a file by file basis seems like an *amazing* way of killing your database. > This "one datadir" is seems to be not too good for us. We used DBISAM in > our clients, and many times when we got some filesystem error, we can > simply recover the tables - from the files. Use a proper backup system that takes consistent snapshots of your data, anything else will come back and bite you when you really don't want it to > I very fear from to keep all databases in one place, because if they are > corrupted, possible more of them injured (if they are not separated). > I cannot make filesystem based (hard) copy from one db (only SQL dump > enabled). There are much better tools available to solve things than treating tables as files! As far as backups go, there are several options of varying complexity and resilience. pg_dump is simple, you can do filesystem level backups with a bit of fiddling, you can keep another database fed with data from the main one with varying levels of lag. http://www.postgresql.org/docs/current/static/backup.html -- Sam http://samason.me.uk/
> > This "one datadir" is seems to be not too good for us. We used DBISAM > > in our clients, and many times when we got some filesystem error, we > > can simply recover the tables - from the files. > pg_dump files with the custom format can be used to selectively restore tables. It's really easy and a lot more reliable than filesystem level backups - which require snapshotting or shutting down the database or in-database support to get clean backups of. > > I very fear from to keep all databases in one place, because if they > > are corrupted, possible more of them injured (if they are not > > separated). I cannot make filesystem based (hard) copy from one db > > (only SQL dump enabled). If you consistently have hardware problems that cause filesystem corruption, you should probably consider fixing that before worrying too much about which software is on top of it being corrupted. And, honestly, if keeping all the files in once place is a major problem for you, then don't use PostgreSQL. It's not like it's going to be changed in this regard. -- Alan
On Jan 30, 2009, at 2:37 PM, durumdara wrote:
Dear PG Users!
I don't want to make a heated debate with this subject, but I wanna ask about your experiences because we need to make a new special site and we wanna know, which DB is the best for this...
This website will supports up to 200 corporations.
The functions are similar, but because of many differents of these corps (datas to keep, way to working them), we wanna make many databases (one DB for one corp).
The site uses one main database that handle all public, and shared services, and store the links to subdatabases.
The website will works with apache/mod_python, and the each of the corp's services are handled with another python module (from another (separated) database). The main structure of the source code already wroted.
But: we need to determine, which database we will use in the future for this project.
The main viewpoints:
- quick (re)connect - because mod_python basically not store the database connections persistently
I don`t know much about python except the animal but could use a SQL proxy to solve this issue
- fast queries
Define fast, but remember a DB was never designed to be fast. However it sounds like for the coupld of companies you wouldn't have to much problems there.
- easy IDE to use (like pgadmin)
- the db server can handle many databases (webdb_nnn where nnn is an integer)
I don't have experience with PG and hunderds of DB's, but I think it shouldn't be a problem
- I can add/modify a table, or a field to a table without "full lock" on the table (like DBISAM restructure). Like in FireBird, where the "add field" change only the table description. I don't know that PG supports this way of the DB modifying.
How big are these tables and how frequently do they change? If they change really frequently then you have a problem with your design in general.
- Quick and easy backup/restore system
pg_dump/pg_restore are your friends. webmin makes this also easy if you want a point and click interface.
Another important thing that I don't understand (what as I saw) that the PostGreSQL is store the databases in one, unseparatable file set, in a directory named data.
In another databases, like DBISAM, FireBird, MySQL, the databases are separated to another directories/files.
This "one datadir" is seems to be not too good for us. We used DBISAM in our clients, and many times when we got some filesystem error, we can simply recover the tables - from the files.
When we want to backup or restore one database, we can do it in two way: a.) archive all files b.) make sql dump from database.
If a file get corrupted in a database, then we can restore the datas from files, and this filesystem error causes problems only for this database, not for all.
I very fear from to keep all databases in one place, because if they are corrupted, possible more of them injured (if they are not separated).
I cannot make filesystem based (hard) copy from one db (only SQL dump enabled).
Why would just one file get corrupt?? You can store your tables(indexes in different places if you want to but if one get's corrupt you need to change/check/replace hardware anyways. PostgreSQL doesn't behave much like MySQL where all of a sudden tables get corrupted and you need to repair them.
Ok, I saw that pgsql supports "tablespaces", but as I saw, this function can hold only table datas in the another directory, and not the full database can separated with them.
Correct...
Because I don't used one PGSQL with many databases (up to 200), I don't know, what happening, and which cases possible. But I think you have many experience with it. Please share it with me!
Don't expect yourself to just set one table and/or DB back. This is plain wrong and you need to make appropriate backups. If you really want to set a DB back as a table then use sqlite, but that defeats your requirements of 'fast'.
Please help me, because we need to determine which DB to use.
I think PG will do very well in your situation, but you have to set your mind off using files
I started the usage of the PG in prev. month, and I liked it except the way of the data storage (one data dir).
I don't know any DB (except may be some Object DB's and sqlite) where you can do that properly, officially and ACID compliant.
I tried the MySQL before I tried PG. InnoDB is seems to be "forcing transaction system on MyISAM". And boolean data type is missing (solved with enum?).
I don't like it all, but it is seems to be fast with little tables, and it is separate the database files to another directories which thing I like. Possible it have many limitations what I don't saw in first time.
Don't want to debate MySQL but PG is a better option IMHO, it recovers VERY well from crashes where you are worried about (I know, I live in Ecuador where power goes down every week or so and it always recovered perfectly).
Please help me, which DB is good for us, and how to configure, and use PGSQL with these database-set which we need to use.
PostgreSQL is good for you as long as you set your mind away from restoring a DB by replacing a fileset. pg_dump/pg_restore are your friends. If you do care about restoring a DB up to a point in time you can do WAL shipping.
Thanks for your help:
dd
regards, Ries van Twisk
-------------------------------------------------------------------------------------------------
Ries van Twisk
tags: Freelance TYPO3 Glassfish JasperReports JasperETL Flex Blaze-DS WebORB PostgreSQL DB-Architect
email: ries@vantwisk.nl
skype: callto://r.vantwisk
Phone: +1-810-476-4196
SIP: +1-747-690-5133
durumdara wrote: > > Another important thing that I don't understand (what as I saw) that > the PostGreSQL is store the databases in one, unseparatable file set, > in a directory named data. > In another databases, like DBISAM, FireBird, MySQL, the databases are > separated to another directories/files. > > This "one datadir" is seems to be not too good for us. We used DBISAM > in our clients, and many times when we got some filesystem error, we > can simply recover the tables - from the files. > When we want to backup or restore one database, we can do it in two > way: a.) archive all files b.) make sql dump from database. > Postgresql uses a database cluster. In the data directory each database goes into a directory identified by the database OID, and in this directory the database resides in 2 gb chunks(as far as I remember). You can easily backup the cluster by using the file system, you just have to make sure you stop the postmaster and then you can backup the entire data dir or individual database dir. You have to restore this to the same version of PG though. it's the same kind of deal with Firebird, you can't backup a running database with the file system and even Firebird must be stopped in order to do a file system copy of the database file. Firebird also stores all it's transaction data in the same file, so you end up with HUGE files unless you do a backup and restore to shrink them down. You don't have this problem with PostgreSQL as it stores that info in the WAL. PostgreSQL will give the the best solution for your project hands down, just give it a chance and don't worry about what you did with Firebird or MySQL. I have never had a PostgreSQL database or dump file become corrupt, but on Firebird it happens fairly often and it's part of the reason why I eventually dumped it and moved to PostgreSQL. Hope that helps you out some. Tony Caduto AM Software Design Home of Lightning Admin for PostgreSQL http://www.amsoftwaredesign.com
durumdara wrote: > > - quick (re)connect - because mod_python basically not store the > database connections persistently > You should consider using a Python connection pool with something like Pylons (http://pylonshq.com), I use dbutils: http://www.webwareforpython.org/DBUtils/Docs/UsersGuide.html You can scale PostgreSQL and Python web apps really well with that as long as you are not using CGI. I use Pylons combined with mod_wsgi, but you can use it with mod_python as well. Tony Caduto AM Software Design Home of Lightning Admin for PostgreSQL http://www.amsoftwaredesign.com
On Fri, 30 Jan 2009, durumdara wrote: > - quick (re)connect - because mod_python basically not store the database > connections persistently If this turns out to be a problem, you may want to use some sort of connection pooling software in order to reduce this overhead. http://wiki.postgresql.org/wiki/Replication%2C_Clustering%2C_and_Connection_Pooling#Connection_Pooling_and_Acceleration gives a quick intro to this topic with links to some good solutions here. > This "one datadir" is seems to be not too good for us. We used DBISAM in our > clients, and many times when we got some filesystem error, we can simply > recover the tables - from the files. It's quite easy for ISAM solutions to get corrupted. Barring a bad hardware failure, this shouldn't ever happen to a PostgreSQL database. The write-ahead log implementation nevers leaves you in a state where the database can't recover itself after a crash or filesystem error. You should consider this a positive thing--this whole class of problems that you think you need a solution to, needing to fix corrupted database files regularly, shouldn't ever happen in a proper database implementation. The first two sections of http://www.postgresql.org/docs/8.3/static/wal.html give a good introduction to this topic. Rolling back to a backup should only be necessary after a serious hardware disaster--the sort of situation where it's unlikely you'll be able to trust any of your individual tables even if you had the option of repairing them one at a time. > I tried the MySQL before I tried PG. InnoDB is seems to be "forcing > transaction system on MyISAM". And boolean data type is missing (solved with > enum?). > I don't like it all, but it is seems to be fast with little tables, and it is > separate the database files to another directories which thing I like. > Possible it have many limitations what I don't saw in first time. http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007 has a lot of commentary in this area. MySQL+MyISAM will have the same sort of problems you should be trying to escape. MySQL+InnoDB would be much better. In addition to missing features, the other thing you should be careful about is making speed measurements with a low client count--make sure you evaluate any database you're considering with a many clients going at once as you can simulate. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Alan Hodgson wrote: > > And, honestly, if keeping all the files in once place is a major problem for > you, then don't use PostgreSQL. It's not like it's going to be changed in > this regard. > > Um, I think it already did: http://www.postgresql.org/docs/8.3/interactive/sql-createtablespace.html Cheers, Steve
On Fri, Jan 30, 2009 at 1:04 PM, Sam Mason <sam@samason.me.uk> wrote: > On Fri, Jan 30, 2009 at 08:37:02PM +0100, durumdara wrote: > >> - I can add/modify a table, or a field to a table without "full lock" >> on the table (like DBISAM restructure). Like in FireBird, where the "add >> field" change only the table description. I don't know that PG supports >> this way of the DB modifying. > > Yes; have a read about MVCC: > > http://www.postgresql.org/docs/current/static/mvcc.html Just because one uses MVCC doesn't mean it's free to add a new column to a table. Adding a nullable column in pgsql is trivial. Adding an index can be done concurrently. InnoDB supports MVCC, but MySQL's way of doing things results in innodb tables being rewritten from scratch for both of those for varying reasons, resulting in a locked table the whole time. Create index concurrently is a treasure to behold on large production systems. I can't stress how important it is that I can create an index on demand in the middle of the heaviest user activity on my system and nothing hangs or locks from the user side waiting on it. Same for adding a column. >> - Quick and easy backup/restore system > > pg_dump works well, how quick it is depends on the amount of data you > have obviously More important that quick and easy is reliable, predictable, and capable of being run on a live, working database (backup that is). pg_dump definitely is those things. >> This "one datadir" is seems to be not too good for us. We used DBISAM in >> our clients, and many times when we got some filesystem error, we can >> simply recover the tables - from the files. Do you get a lot of filesystem errors? Or corrupted table errors. Having table corruption be a common problem means something somewhere is broken, whether it be your hardware or the db you're running. I've been running pgsql for a long time, going back to the tailend of the 6.x era. I've never once had a corrupt table on a production server. I am not alone. > Use a proper backup system that takes consistent snapshots of your data, > anything else will come back and bite you when you really don't want it > to I feel like I'm bathed with light and getting religious, but yes, this. >> I very fear from to keep all databases in one place, because if they are >> corrupted, possible more of them injured (if they are not separated). >> I cannot make filesystem based (hard) copy from one db (only SQL dump >> enabled). You need to get out of the habit of playing in your databases files. Any database that requires you to routinely do that is broken by design. I get the feeling you've learned how to use a hammer (the file system level manipulation thing) to fix a problem that's very nail like (file corruption in mysql) and are looking for a place to use your hammer in pgsql. Pgsql has no such nails sticking out of it.
On Jan 30, 2009, at 11:37 AM, durumdara wrote: > The main viewpoints: > - quick (re)connect - because mod_python basically not store the > database connections persistently mod_python is not a database adaptor. Put another way, mod_python does not make database connections. If you're using Python then to connect to Postgres you'll most likely be using psycopg2 to connect to and query your database. I'm not sure what your previous database experience is but it sounds like you may be thinking in terms of PHP's persistent connections and on that I'll just say that PHP's persistent connections are not true persistent connections from your top-level application's perspective. If you want connection pooling then there are options available, pgpool-II and pgbouncer are probably the two most commonly used ones today. > - fast queries Stop and think about that one for a moment. Do you really expect to go to a specific database's mailing list(s) with that requirement expecting anything other than "We got you covered!" > - easy IDE to use (like pgadmin) I think you just answered your own question. > - the db server can handle many databases (webdb_nnn where nnn is > an integer) I know people hate hearing this answer but that's the wrong question. What matters isn't how many databases can be run (and, if you ever run into a database server in this day and age where it does, run) but rather how many concurrent connections you expect as well as both the total data set size and how much of that data set will be constantly worked with, i.e. needs to be in memory. Let's say you have 10 5G databases with each averaging 10% of their total size in use (in memory, hopefully) at any given moment. That's entirely different from 10 10G databases with each averaging 10% in memory and both are different from running 200 databases each with 5% in use at any given moment. All of that being said, a properly configured Postgres provisioned for the workload installation built with a little knowledge will scale with the best of 'em. > - I can add/modify a table, or a field to a table without "full > lock" on the table (like DBISAM restructure). Like in FireBird, > where the "add field" change only the table description. I don't > know that PG supports this way of the DB modifying. Nope. PostgreSQL is an all or nothing transactional database. I'd never heard of DBISAM before you mentioned it and have never used Firebird. After doing a little reading it turns out that if you plan to use transactions at all (which is very likely given even just the little you've described about the applications you're building) then you should realize that altering tables is not compatible with transactions and doing so will automatically commit any open transactions on the table. Looking into Firebird I couldn't find how it handles (or doesn't) that at all I but I did see that it will happily let you add a new not null column with no default to a table by writing nulls for the new attribute for any existing columns. That already makes me queasy. > - Quick and easy backup/restore system Well, again, that depends. I believe pgAdmin has some nice menu options to handle simple cases (others can either add to or correct me there) but as to the core Postgres distribution there's still a bit of ambiguity arising from cases of "there's more than one way to do it" even for basic tasks as well as some inconsistent UI aspects to the existing tools (see the pg_dump/pg_restore/psql comments in the recent Pet Peeves thread). > Another important thing that I don't understand (what as I saw) that > the PostGreSQL is store the databases in one, unseparatable file > set, in a directory named data. > In another databases, like DBISAM, FireBird, MySQL, the databases > are separated to another directories/files. > This "one datadir" is seems to be not too good for us. We used > DBISAM in our clients, and many times when we got some filesystem > error, we can simply recover the tables - from the files. Way off. While I didn't look into the storage layouts for DBISAM or Firebird I can tell you that in any MySQL server instance all of the tables will be under (to some depth) the same top level data directory directory with each database in each own directory under that and table files for each database under those. I think you may be able to use symlinks MyISAM table files but InnoDB tables will croak if you try to do that. What's more, by default InnoDB does *not* use separate files per table and instead stores all InnoDB tables in one file with it's own internal file system. Admittedly, I think most people turn on the innodb_file_per_table option such that each table will get it's own data and index files but even then all of a given table's data indexes will reside in the same file. Each of those InnoDB table data+index files is referred to as an InnoDB tablespace (but that's really different from what most people expect a tablespace to be, see next paragraph). One consequence of that is that adding an index to a table requires that the entire table and all of it's indexes be rewritten. PostgreSQL also follows the 1 directory per-database setup. Each database directory is located directly underneath the base/ direcotry in the main Postgres One big difference is that table data and index data are in separate files and multiple indexes for a given table are all in separate files. This can be viewed as the low-level embodiment of the fact that in PostgreSQL indexes are not part of the tables they are for but rather separate, but tightly coupled, database objects. With this scheme adding a new index, which doesn't even require the table to be locked any more for new indexes, only needs to write the new index file, not rewrite the table data and previously existing indexes as well. What's more, Postgres allows you to create real tablespaces so that you can place individual persistent database objects (databases, tables, indexes, and some constraints) on separate storage. > > When we want to backup or restore one database, we can do it in two > way: a.) archive all files b.) make sql dump from database. Both options are available with Postgres although, as has been noted by others, option 'a' requires other tools in order to back up a consistent data set. > If a file get corrupted in a database, then we can restore the datas > from files, and this filesystem error causes problems only for this > database, not for all. > I very fear from to keep all databases in one place, because if they > are corrupted, possible more of them injured (if they are not > separated). > I cannot make filesystem based (hard) copy from one db (only SQL > dump enabled). Again, I'm not sure what you mean. There are different kinds of corruption that can be found ion files. Database files could be considered corrupt if the database screws something up and writes bad data to a file. That would be localized to the files it wrote to in the bad operation (and is extremely rare with Postgres) whereas what I think of as filesystem corruption often doesn't see file boundaries. > Ok, I saw that pgsql supports "tablespaces", but as I saw, this > function can hold only table datas in the another directory, and not > the full database can separated with them. No, you can put entire databases in/on different tablespaces. > Because I don't used one PGSQL with many databases (up to 200), I > don't know, what happening, and which cases possible. But I think > you have many experience with it. Please share it with me! > > Please help me, because we need to determine which DB to use. > > I started the usage of the PG in prev. month, and I liked it except > the way of the data storage (one data dir). > > I tried the MySQL before I tried PG. InnoDB is seems to be "forcing > transaction system on MyISAM". And boolean data type is missing > (solved with enum?). > I don't like it all, but it is seems to be fast with little tables, > and it is separate the database files to another directories which > thing I like. Possible it have many limitations what I don't saw in > first time. Well, if you don't want transaction then stay away from Postgres and realize that Postgres actually gives you more built-in support for placing your data at different locations than MySQL. > Please help me, which DB is good for us, and how to configure, and > use PGSQL with these database-set which we need to use. Just realize that when choosing a complex technology solution like a database there really is no "yes" or "no" answer or "here's the right way to do it" and "here's the wrong way". You need to define your requirements and part of that is asking questions. For example, 'how to configure..' -- if you seriously ask a question like that here you're going to get more questions (what kind of IO subsystem? how many concurrent connections do you need to support? how much memory do you have? how large is our data set? etc...) than you will answers. Erik Jones, Database Administrator Engine Yard Support, Scalability, Reliability 866.518.9273 x 260 Location: US/Pacific IRC: mage2k
On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote: > > On Jan 30, 2009, at 11:37 AM, durumdara wrote: >> - I can add/modify a table, or a field to a table without "full lock" on >> the table (like DBISAM restructure). Like in FireBird, where the "add field" >> change only the table description. I don't know that PG supports this way of >> the DB modifying. > > Nope. PostgreSQL is an all or nothing transactional database. I'd never > heard of DBISAM before you mentioned it and have never used Firebird. After > doing a little reading it turns out that if you plan to use transactions at > all (which is very likely given even just the little you've described about > the applications you're building) then you should realize that altering > tables is not compatible with transactions and doing so will automatically > commit any open transactions on the table. Are talking about pgsql or some other database? Everything in pgsql can be done in a transaction, except create / drop database / tablespace. > Looking into Firebird I couldn't > find how it handles (or doesn't) that at all I but I did see that it will > happily let you add a new not null column with no default to a table by > writing nulls for the new attribute for any existing columns. That already > makes me queasy. That's pretty much what pgsql does. Why does it make you queasy?
Scott Marlowe wrote: > On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote: >> On Jan 30, 2009, at 11:37 AM, durumdara wrote: > >>> - I can add/modify a table, or a field to a table without "full lock" on >>> the table (like DBISAM restructure). Like in FireBird, where the "add field" >>> change only the table description. I don't know that PG supports this way of >>> the DB modifying. >> Nope. PostgreSQL is an all or nothing transactional database. I'd never >> heard of DBISAM before you mentioned it and have never used Firebird. After >> doing a little reading it turns out that if you plan to use transactions at >> all (which is very likely given even just the little you've described about >> the applications you're building) then you should realize that altering >> tables is not compatible with transactions and doing so will automatically >> commit any open transactions on the table. > > Are talking about pgsql or some other database? Everything in pgsql > can be done in a transaction, except create / drop database / > tablespace. > >> Looking into Firebird I couldn't >> find how it handles (or doesn't) that at all I but I did see that it will >> happily let you add a new not null column with no default to a table by >> writing nulls for the new attribute for any existing columns. That already >> makes me queasy. > > That's pretty much what pgsql does. Why does it make you queasy? > I think the key is that the new column is NOT NULL, so defaulting the new column's values to NULL results in immediate data integrity inconsistency. If I remember rightly, PG doesn't allow this: you have to create the column as NULL, UPDATE and then add the NOT NULL constraint, or perhaps (I haven't tried this) create the column with a default and then remove it immediately afterwards.
On Sun, Feb 1, 2009 at 7:33 AM, Russ Brown <pickscrape@gmail.com> wrote: > Scott Marlowe wrote: >> >> On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> wrote: >>> >>> On Jan 30, 2009, at 11:37 AM, durumdara wrote: >>> Looking into Firebird I couldn't >>> find how it handles (or doesn't) that at all I but I did see that it will >>> happily let you add a new not null column with no default to a table by >>> writing nulls for the new attribute for any existing columns. That >>> already >>> makes me queasy. >> >> That's pretty much what pgsql does. Why does it make you queasy? >> > > I think the key is that the new column is NOT NULL, so defaulting the new > column's values to NULL results in immediate data integrity inconsistency. > > If I remember rightly, PG doesn't allow this: you have to create the column > as NULL, UPDATE and then add the NOT NULL constraint, or perhaps (I haven't > tried this) create the column with a default and then remove it immediately > afterwards. OK, I completely misunderstood what the other poster meant. Pgsql does NOT allow creating the not null column with nulls in place. That would make me quesy too. Creating it with a default works in postgresql.
On Jan 31, 2009, at 9:36 AM, Scott Marlowe wrote: > On Sat, Jan 31, 2009 at 2:13 AM, Erik Jones <ejones@engineyard.com> > wrote: >> >> On Jan 30, 2009, at 11:37 AM, durumdara wrote: > >>> - I can add/modify a table, or a field to a table without "full >>> lock" on >>> the table (like DBISAM restructure). Like in FireBird, where the >>> "add field" >>> change only the table description. I don't know that PG supports >>> this way of >>> the DB modifying. >> >> Nope. PostgreSQL is an all or nothing transactional database. I'd >> never >> heard of DBISAM before you mentioned it and have never used >> Firebird. After >> doing a little reading it turns out that if you plan to use >> transactions at >> all (which is very likely given even just the little you've >> described about >> the applications you're building) then you should realize that >> altering >> tables is not compatible with transactions and doing so will >> automatically >> commit any open transactions on the table. > > Are talking about pgsql or some other database? Everything in pgsql > can be done in a transaction, except create / drop database / > tablespace. I was referring to DBISAM there. >> Looking into Firebird I couldn't >> find how it handles (or doesn't) that at all I but I did see that >> it will >> happily let you add a new not null column with no default to a >> table by >> writing nulls for the new attribute for any existing columns. That >> already >> makes me queasy. > > That's pretty much what pgsql does. Why does it make you queasy? Another poster already beat me to answering this one so I'll not repeat what they said. Erik Jones, Database Administrator Engine Yard Support, Scalability, Reliability 866.518.9273 x 260 Location: US/Pacific IRC: mage2k
Hi! 2009.01.31. 10:13 keltezéssel, Erik Jones írta: > > On Jan 30, 2009, at 11:37 AM, durumdara wrote: > >> The main viewpoints: >> - quick (re)connect - because mod_python basically not store the >> database connections persistently > > mod_python is not a database adaptor. Yes, I know. But because of mod_python (apache) commonly configured to "1 interpreter/1 request", I need to reconnect with the adapter. But I tested it, and it is seems to be have enough speed. If not, I must use some proxy server to keep them alive. > rewrite the table data and previously existing indexes as well. > What's more, Postgres allows you to create real tablespaces so that > you can place individual persistent database objects (databases, > tables, indexes, and some constraints) on separate storage. I wanna ask that if I create a database in another tablespace, is the database name already stored in "main" data storage (like global metadata)? And If I makes a table or other object into this database, need I to define the tablespace name, or it is taken from the database "tablespace" name? For example: create db anything tablespace "new2" These sqls are different? create table any_table (a int) or create table any_table (a int) tablespace "new2". So: need I define tablespace instruction in every table/object creation, or not? And: if I store this database in another drive, and it is unmounted (because of any reason - like failure, etc), is it causes any problems with postgresql main databases (for example: service stops, etc). I ask about "after restart pg service", not online usage. So if I remove some tablespace path (directory, drive) in the level of the filesystem, can pg service up to serve main databases the it can access? Or is it completely die in this problem? Thanks for your help: dd
On Feb 2, 2009, at 12:23 AM, durumdara wrote: > Hi! > > 2009.01.31. 10:13 keltezéssel, Erik Jones írta: >> >> On Jan 30, 2009, at 11:37 AM, durumdara wrote: >> >>> The main viewpoints: >>> - quick (re)connect - because mod_python basically not store the >>> database connections persistently >> >> mod_python is not a database adaptor. > Yes, I know. But because of mod_python (apache) commonly configured > to "1 interpreter/1 request", I need to reconnect with the adapter. > But I tested it, and it is seems to be have enough speed. If not, I > must use some proxy server to keep them alive. There are various connection pooling softwares available such as pgpool or pgbouncer for that. >> rewrite the table data and previously existing indexes as well. >> What's more, Postgres allows you to create real tablespaces so that >> you can place individual persistent database objects (databases, >> tables, indexes, and some constraints) on separate storage. > I wanna ask that if I create a database in another tablespace, is > the database name already stored in "main" data storage (like global > metadata)? > And If I makes a table or other object into this database, need I to > define the tablespace name, or it is taken from the database > "tablespace" name? > For example: > create db anything tablespace "new2" > > These sqls are different? > create table any_table (a int) > or > create table any_table (a int) tablespace "new2". > > So: need I define tablespace instruction in every table/object > creation, or not? No, if you create a database as being in a given tablespace then all table created in that database will go in that tablespace unless you specify otherwise. > And: if I store this database in another drive, and it is unmounted > (because of any reason - like failure, etc), is it causes any > problems with postgresql main databases (for example: service stops, > etc). I ask about "after restart pg service", not online usage. > So if I remove some tablespace path (directory, drive) in the level > of the filesystem, can pg service up to serve main databases the it > can access? > Or is it completely die in this problem? IIRC, the database should run fine and will just throw errors if you try to access a database or table in a tablespace that isn't mounted. > Thanks for your help: > dd You're welcome :) Erik Jones, Database Administrator Engine Yard Support, Scalability, Reliability 866.518.9273 x 260 Location: US/Pacific IRC: mage2k