Thread: Notes on converting from MySQL 5.0.x to PostgreSQL 8.1.4
On Converting from MySQL 5.0.x to PostgreSQL 8.1.x -------------------------------------------------- I am in the process of converting a couple of major sites from MySQL 5.0.22 to PostgreSQL 8.1.4, and I thought I would share some of my observations on this process and the two database systems in general. I feel I am in a good position to do this, as I am fairly familiar in-depth with MySQL's modern features, am relatively technology-agnostic, and having done this since the MySQL project has matured substantially over the last couple of years, I feel I can give insight into the features of both systems from a current perspective. I will attempt to do this from an objective standpoint, as I still see benefits and drawbacks to both systems (no software will ever meet the needs of every situation). If you are looking for a document that mocks one system or another, this is not it; there are countless results for "X sucks" on $your_favorite_search_engine. My reasons for making this switch are primarily due to having a bit of spare time, wanting to expand my horizons and familiarize myself with another well-respected open source project, some of the nice in-built procedural language features of PostgreSQL, and basically wanting to form my own opinion of the features of both systems. That being said, I have really been impressed so far with the features I am discovering, and becoming happier each day. It is by no means comprehensive; I have just highlighted some of the more notable and obvious differences as I discovered them. Most of the information in here is pretty basic for those of you very familiar with both systems. I also only highlight the differences in the Unix/POSIX versions of these programs. (I don't run Windows as a server, and I don't care to. My technology agnosticism stops at inadequate systems.) All comments are current as of the time of writing (Spring/Summer 2006). Error corrections are very welcome. Flames can go to /dev/null. Don't care. Oh, and since I wrote this in vim, it is best read with a monospace font. :^) Major differences I have noted: ------------------------------- MySQL 5.0.x: * Multiple storage engines with different features. * Supports multi-insert syntax (INSERT INTO foo VALUES (1,2), (3,4) ...) * A few more access controls on features built-in to the GRANT tables. Many of these are still present, but implemented in other ways in PostgreSQL. * Single AUTO_INCREMENT column allowed per table. * Easy, built-in and extensive replication support. * Single datastore location per server. * ALL Stored Procedures are kept in the mysql system database. PostgreSQL 8.1.x: * Embedded procedures in multiple native languages (stored procedures and functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL) * Extensive and versatile procedural language functionality. * User-definable data types and operators. * Multiple sequence generators allowed per table. * Replication support still rudimentary. * Stored procedures are kept (somewhat more logically, imho) in the corresponding databases. * Multiple datastore locations possible using tablespaces concept. (For the record, MySQL will have tablespaces when 5.1.x is stabilized.) * Most system variables, "built-in" types and features configurable as they are just kept in a system catalog. * Allows deletions and subselects to specify the same table (e.g. DELETE FROM foo WHERE id IN (SELECT id FROM foo WHERE...) ). MySQL does not allow this as of 5.0.22. * Copious documentation on the database internals, for extending the database itself. Pointers, tips, quick facts and gotchas for other people converting: -------------------------------------------------------------------- * Don't bother using an old version, just go for 8.1.4 (or whatever is new at the time of your conversion. This should be common sense.) * Since Pg uses a full transactional storage engine, the speed is roughly comparable to InnoDB, rather than the stock MyISAM format. * PostgreSQL's TCP port is 5432 by default. * The main server process on PostgreSQL is 'postmaster'. * 'postmaster' can be controlled via the 'pg_ctl' command. * The administrative user is called 'postgres' by default. * Like MySQL, Pg uses the system user as default, if no username is specified when connecting. * The command-line client is called 'psql'. * PostgreSQL by default comes configured to disallow network connections. To enable these, you must follow these steps: 1. Edit $DATADIR/pg_hba.conf and add access permissions. 2. Edit $DATADIR/postgresql.conf and uncomment the listen_addresses line, setting it to something reasonable. 3. Restart postmaster. * PostgreSQL relies extensively upon quick aliases for common features within the CLI shell. MySQL offers many similar features, but they aren't used as much from what I have observed. * MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL differentiates the two. While the hierarchy in MySQL is database.table.field, PostgreSQL is roughly: database.schema.table.field. A schema is a 'logically grouped set of tables but still kept within a particular database.' This could allow separate applications to be built that still rely upon the same database, but can be kept somewhat logically separated. The default schema in each database is called 'public', and is the one referred to if no others are specified. This can be modified with 'SET search_path TO ...'. * Pg uses a 'template1' pseudo-database that can be tailored to provide default objects for new database creation, if you should desire. It obviously also offers a 'template0' database that is read-only and offers a barebones database, more equivalent to the empty db created with mysql's CREATE DATABASE statement. * Pg's ROLEs can specify a single user or a group, and be nested to contain multiple users. * Pg's default character set (in 8.1.4) is UTF8. * Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows more than one independent sequence to be specified per table (though the utility of this may be of dubious value). These are closer to Oracle's concept of sequence generators, and they can be manipulated with the currval(), nextval(), setval(), and lastval() functions. * Pg requires its tables and databases be 'vacuumed' regularly to remove completed transaction snapshots and optimize the tables on disk. It is necessary because the way that PostgreSQL implements true MVCC is by writing all temporary transactions to disk and setting a visibility flag for the record. Vacuuming can be performed automatically, and in a deferred manner by using vacuum_cost settings to limit it to low-load periods or based upon numerous other criteria. See the manual for more information. * Kept internally in Pg, there is a concept called the OID, which is a continuously incremented number used to assign unique IDs to system objects. This allows the database to store and refer uniquely to user operators, new databases, basically anything that the system needs to refer to in the 'data directory', regardless of user-defined names. * Most administrative procedures will refuse to run as root, and require you to su to the 'postgres' system user to perform the action. * PgAdminIII gives you a great overview of the hierarchy of system objects throughout the server. Even though you may administrate your server primarily via the CLI, as I do, it is still valuable during the learning process to use this tool to browse around the various objects, to learn the system setup and hierarchy visually. * While MySQL supports transactions with the InnoDB databases, many MySQL users generally do not use them extensively enough. With Pg, due to the behaviour of the server in attempting to ensure data integrity in a variety of situations (client disconnection, network trouble, server crashes, etc.), it is highly advisable to become familiar and utilize transactions a lot more, to ensure your DATA is left in a consistent state before and after every change you wish to make. * There is a conversion utility called 'mysql2pgsql' that will convert dump files from the mysqldump format, to a format that psql can understand. It is available at: - http://gborg.postgresql.org/project/mysql2psql/projdisplay.php * To turn on query time output, similar to the mysql CLI, use the '\timing' command from psql. (Note that the time is displayed in milliseconds, whereas in the mysql client it is displayed in seconds.) Common equivalents: ------------------- MySQL PostgreSQL ----- ----------- mysql database system tables mysql psql mysqld postmaster mysqladmin initdb/dropdb/createuser/dropuser/ createlang/droplang/vacuumdb mysqldump pg_dump/pg_dumpall/pg_restore -nothing- ecpg SHOW DATABASES; \l SHOW GRANTS; \du SHOW TABLES; \dt DESC tblname; \d foo USE dbname; \c dbname ALTER TABLE foo AUTO_INCREMENT = n; SELECT setval('seq_name',n); SHOW PROCESSLIST; SELECT * FROM pg_stat_activity; OPTIMIZE TABLE ... VACUUM ... Final thoughts: --------------- Overall, I have been happy and very impressed with the features offered by PostgreSQL 8.1.4, and believe that I will be using it for the majority of my future projects. There are still some niches where I see utility for the pluggable storage engines and raw speed of MySQL. I will be keeping my eye on that project closely, also, as I want to test out the new Scheduled Events feature from the 5.1 series once it is stabilized. I have also been using Apress' "Beginning Databases with PostgreSQL, 2nd Edition" (2005) as my learning material. I briefly examined Sams' "PostgreSQL - The Comprehensive Guide, 2nd Ed." (2005) and Pearson's "PostgreSQL: Introduction and Concepts" (2001), but found the Apress book the best of the 3 that I had access to. YMMV. More info in the book reviews linked below. Further information: -------------------- PostgreSQL manual: http://www.postgresql.org/docs/manuals/ PgAdmin3: http://www.pgadmin.org/ PostgreSQL book reviews: http://techdocs.postgresql.org/techdocs/bookreviews.php Community Support Channel: irc.freenode.net in #postgresql Varlena Consulting's General Bits archives: http://www.varlena.com/GeneralBits/ --------- Thank you, and I hope that these notes prove helpful to others! Jason McManus
"Jason McManus" <mcmanus.jason@gmail.com> writes: > I am in the process of converting a couple of major sites from MySQL 5.0.22 > to PostgreSQL 8.1.4, and I thought I would share some of my observations on > this process and the two database systems in general. Nice notes! I see only one small error: > * Pg's default character set (in 8.1.4) is UTF8. I don't believe there is any fixed "default character set". Each installation will have a default locale and encoding, but these depend on the locale under which initdb was run. From the above comment I surmise that you initdb'd under some UTF8-using locale ... > Thank you, and I hope that these notes prove helpful to others! Perhaps they should go on the project website somewhere? regards, tom lane
> I don't believe there is any fixed "default character set". Each > installation will have a default locale and encoding, but these depend > on the locale under which initdb was run. From the above comment I > surmise that you initdb'd under some UTF8-using locale ... Ah, great. Thank you for the clarification. I did indeed run initdb on different systems, but they were all under UTF-8, so I guess I had drawn the wrong conclusion. > Perhaps they should go on the project website somewhere? I was thinking so as well, but I'm not sure where, or who to submit to if there is no wiki or other external editing method. Any pointers for who to contact/where to submit? Thanks, Jason
"Jason McManus" <mcmanus.jason@gmail.com> writes: >> Perhaps they should go on the project website somewhere? > I was thinking so as well, but I'm not sure where, or who to submit to > if there is no wiki or other external editing method. Any pointers > for who to contact/where to submit? Not my department, but if none of the project's web people follow up in the next few hours, try inquiring on pgsql-docs or pgsql-www. regards, tom lane
I agree with Tom, nice notes. I noted a few minor issues that seem to derive from a familiarity with MySQL. I'll put my corrections below... On Fri, 2006-06-30 at 08:17, Jason McManus wrote: > On Converting from MySQL 5.0.x to PostgreSQL 8.1.x > -------------------------------------------------- > Major differences I have noted: > ------------------------------- > > MySQL 5.0.x: > * Easy, built-in and extensive replication support. Not sure how extensive it is. It's basically synchronous single master single slave, right? It is quite easy though. > PostgreSQL 8.1.x: > * Embedded procedures in multiple native languages (stored procedures and > functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL) Note that there are a dozen or more other languages as well. Just FYI. Off the top of my head, plPHP, plJ (java there's two different java implementations, I think) and plR (R is the open source equivalent of the S statistics language) > * Replication support still rudimentary. Hmmmm. I think that's an overly simplistic evaluation. The slony replication engine is actually VERY advanced, but the administrative tools consist mostly of "your brain". hehe. That said, once you've learned how to drive it, it's quite amazing. Keep in mind, slony can be applied to a living database while it's running, and can run between different major versions of postgresql. That's a pretty advanced feature. Plus, if the replication daemons die (kill -9ed or whatever) you can restart replication and slony will come right back where it was and catch up. > Pointers, tips, quick facts and gotchas for other people converting: > -------------------------------------------------------------------- > > * MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL > differentiates the two. While the hierarchy in MySQL is > database.table.field, PostgreSQL is roughly: database.schema.table.field. > A schema is a 'logically grouped set of tables but still kept within a > particular database.' This could allow separate applications to be built > that still rely upon the same database, but can be kept somewhat logically > separated. The default schema in each database is called 'public', and is > the one referred to if no others are specified. This can be modified with > 'SET search_path TO ...'. This is a VERY good analysis of the difference between the two databases. > * Pg uses a 'template1' pseudo-database that can be tailored to provide > default objects for new database creation, if you should desire. It > obviously also offers a 'template0' database that is read-only and > offers a barebones database, more equivalent to the empty db created with > mysql's CREATE DATABASE statement. This isn't quite right. template0 is a locked and "pure" copy of the template database. It's there for "break glass in case of emergency" use. :) template1, when you first initdb, is exactly the same as template0, but you can connect to it, and alter it. Both of these are "real" postgresql databases. template1 is the database that gets copied by default when you do "create database". Note that you can also define a different template database when running create database, which lets you easily clone any database on your machine. "create database newdb with template olddb" > * Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows > more than one independent sequence to be specified per table (though the > utility of this may be of dubious value). These are closer to Oracle's > concept of sequence generators, and they can be manipulated with the > currval(), nextval(), setval(), and lastval() functions. Don't forget 64bit bigserials too. > * Pg requires its tables and databases be 'vacuumed' regularly to remove > completed transaction snapshots and optimize the tables on disk. It is > necessary because the way that PostgreSQL implements true MVCC is by > writing all temporary transactions to disk and setting a visibility > flag for the record. Vacuuming can be performed automatically, and in > a deferred manner by using vacuum_cost settings to limit it to low-load > periods or based upon numerous other criteria. See the manual for more > information. Interestingly enough, MySQL's innodb tables do almost the exact same thing, but their vacuum process is wholly automated. Generally, this means fewer issues pop up for the new dba, but when they do, they can be a little harder to deal with. It's about a wash. Of course, as you mentioned earlier, most mysql folks aren't using innodb. > * While MySQL supports transactions with the InnoDB databases, many MySQL > users generally do not use them extensively enough. With Pg, due to the > behaviour of the server in attempting to ensure data integrity in a > variety of situations (client disconnection, network trouble, server > crashes, etc.), it is highly advisable to become familiar and utilize > transactions a lot more, to ensure your DATA is left in a consistent state > before and after every change you wish to make. A point you might want to throw in here is that EVERYTHING in postgresql is a transaction. If you don't issue a begin statement, then postgresql runs each statement you type in inside its own transaction. This means that inserting 10,000 rows without wrapping them inside an explicit transaction results in 10,000 individual transactions. However, the more interesting thing here, is that every statement, including DDL is transactable, except for a couple of big odd ones, like create database. So, in postgresql, you can do: begin; create table xyz... alter table abc... insert into abc select * from iii update iii...; drop table iii; (oops, I messed up something) rollback; and there's no change and no lost data. Quite impressive actually. > Common equivalents: > ------------------- > > MySQL PostgreSQL > ----- ----------- > OPTIMIZE TABLE ... VACUUM ... vacuum and analyze for optimize I think. Also, possibly reindex, although nominally that's the "sledge hammer" of optimization. One last thing I'd mention that I REALLY like about PostgreSQL over any other database I've used is that the psql interface has a complete syntax lookup feature that is WAY cool. \h brings it up, and \h COMMAND where COMMAND is the command you want to look up will bring up the syntax for your command. And, I hate the fact that CTRL-C in the mysql command line tool exits the tool instead of interrupting the current query. In PostgreSQL it interrupts the current query. CTRL-\ will kill the client if you need to. Overall, a great review. Thanks.
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Tom Lane > Sent: 30 June 2006 15:47 > To: Jason McManus > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x > to PostgreSQL 8.1.4 > > "Jason McManus" <mcmanus.jason@gmail.com> writes: > >> Perhaps they should go on the project website somewhere? > > > I was thinking so as well, but I'm not sure where, or who > to submit to > > if there is no wiki or other external editing method. Any pointers > > for who to contact/where to submit? > > Not my department, but if none of the project's web people follow up > in the next few hours, try inquiring on pgsql-docs or pgsql-www. Documentation such as this can be added to the new techdocs area on the main site at http://www.postgresql.org/docs/techdocs under the relevant section (probably http://www.postgresql.org/docs/techdocs.3 in this case). Please note that the editting interface is still new and may still have a quirk or two... Interesting notes BTW Jason - thanks for sharing. Regards, Dave.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Scott Marlowe wrote: [snip] > However, the more interesting thing here, is that every > statement, including DDL is transactable, except for a couple of > big odd ones, like create database. So, in postgresql, you can do: > > begin; > create table xyz... > alter table abc... > insert into abc select * from iii > update iii...; > drop table iii; > (oops, I messed up something) > rollback; But isn't that what it means to be "transactional"? Or am I spoiled by my "big, expensive enterprise database"? - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFEpUGPS9HxQb37XmcRAiJwAJ9/A/N/OgmslveSsX3Xym2QnDQz1gCghPD0 YX882Kv81hzZ4AKjaIVKHg8= =Gsml -----END PGP SIGNATURE-----
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Scott Marlowe > Sent: 30 June 2006 16:12 > To: Jason McManus > Cc: pgsql general > Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x > to PostgreSQL > > > * Replication support still rudimentary. > > Hmmmm. I think that's an overly simplistic evaluation. The slony > replication engine is actually VERY advanced, but the administrative > tools consist mostly of "your brain". hehe. That said, once you've > learned how to drive it, it's quite amazing. I'm not sure that many people necessarily realise it, but you can also drive Slony directly from pgAdmin 1.4+ if slonik scripts give you a headache. Regards, Dave.
Ron Johnson wrote: >>begin; >>create table xyz... >>alter table abc... >>insert into abc select * from iii >>update iii...; >>drop table iii; >>(oops, I messed up something) >>rollback; > > > But isn't that what it means to be "transactional"? Or am I spoiled > by my "big, expensive enterprise database"? Well, according to my colleague here this wasn't possible until now (partially!) in Oracle 10. Meaning it's not common-place even among enterprise db's. -- Alban Hertroys alban@magproductions.nl magproductions b.v. T: ++31(0)534346874 F: ++31(0)534346876 M: I: www.magproductions.nl A: Postbus 416 7500 AK Enschede // Integrate Your World //
Scott Marlowe <smarlowe@g2switchworks.com> writes: > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: >> * Replication support still rudimentary. > Hmmmm. I think that's an overly simplistic evaluation. The slony > replication engine is actually VERY advanced, but the administrative > tools consist mostly of "your brain". hehe. That said, once you've > learned how to drive it, it's quite amazing. Keep in mind, slony can be > applied to a living database while it's running, and can run between > different major versions of postgresql. That's a pretty advanced > feature. Plus, if the replication daemons die (kill -9ed or whatever) > you can restart replication and slony will come right back where it was > and catch up. It might be worth pointing out that mysql's replication falls over if you so much as look at it crosseyed. I have not had to use it for production purposes, but I can tell you that the mysql replication regression tests fail ... irreproducibly of course ... almost one time in two in Red Hat's build environment. I've been able to trace a few of these failures to quirks of the build environment, like trying to build x86 and x86_64 at the same time in different chroots of the same machine (must take care not to use same TCP port numbers for tests), but it still seems flaky as hell. regards, tom lane
On Friday 30. June 2006 17:12, Scott Marlowe wrote: >And, I hate the fact that CTRL-C in the mysql command line tool exits >the tool instead of interrupting the current query. I agree, it's a nuisance. >In PostgreSQL it >interrupts the current query. CTRL-\ will kill the client if you need >to. Or Ctrl-D, which also is a common way of terminating a command-line interface, like the Python interpreter. It's much easier to remember than the Ctrl-\ . -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE
Out of curiosity, which "big, expensive enterprise database" are you spoiled by? Many that I support do not allow DDL within an transaction, or if they allow it, there are many caveats and rules. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Ron Johnson Sent: Friday, June 30, 2006 8:22 AM To: pgsql general Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x to PostgreSQL -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Scott Marlowe wrote: [snip] > However, the more interesting thing here, is that every > statement, including DDL is transactable, except for a couple of > big odd ones, like create database. So, in postgresql, you can do: > > begin; > create table xyz... > alter table abc... > insert into abc select * from iii > update iii...; > drop table iii; > (oops, I messed up something) > rollback; But isn't that what it means to be "transactional"? Or am I spoiled by my "big, expensive enterprise database"? - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFEpUGPS9HxQb37XmcRAiJwAJ9/A/N/OgmslveSsX3Xym2QnDQz1gCghPD0 YX882Kv81hzZ4AKjaIVKHg8= =Gsml -----END PGP SIGNATURE----- ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster
Ron Johnson <ron.l.johnson@cox.net> writes: > Scott Marlowe wrote: >> However, the more interesting thing here, is that every >> statement, including DDL is transactable, except for a couple of >> big odd ones, like create database. So, in postgresql, you can do: > But isn't that what it means to be "transactional"? Or am I spoiled > by my "big, expensive enterprise database"? Being able to roll back DDL (table-schema modifications) isn't that common. Since PG keeps most of its schema information in tables, we have it easier than some other systems supporting DDL rollback, but it's still tricky. As an example, a long time ago we used to name table files after the table and database directories after the database, which made it easy to see what was what under $PGDATA, but prevented a lot of DDL from being transactional. For instance BEGIN; DROP TABLE foo; CREATE TABLE foo (some-new-definition); ROLLBACK; couldn't work because there would need to be two physical files named foo in the interim until you commit or roll back. ALTER TABLE RENAME had some related problems. Now we name all the filesystem objects using OIDs that can be chosen to never collide, even if they belong to database objects with similar names. Last I checked, mysql was still using table names for file names, so they're on the wrong side of this. regards, tom lane
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 paul rivers wrote: > Out of curiosity, which "big, expensive enterprise database" are > you spoiled by? Many that I support do not allow DDL within an > transaction, or if they allow it, there are many caveats and > rules. Oracle Rdb. Built by DEC back in the early 1980s. It's had tablespaces (Storage Areas in Rdb parlance) since the late 80s. Tables (including the system catalog, which itself is a set of tables) all go in a Storage Area. If you don't specify one, it does in the default: RDB$SYSTEM. Creating a table is no more than inserting records into a few system tables, and allocating a few pages in the relevant Storage Area. Thus, rolling back most all DDL is built deep into the engine. http://www.oracle.com/technology/products/rdb/index.html - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFEpVAZS9HxQb37XmcRArUOAKDFNtLVqr9BeYi7k6nhp/GnVI7M6QCfV7hJ wNUUCx2sGUmRklxtwu6hoUA= =CLgH -----END PGP SIGNATURE-----
On Fri, Jun 30, 2006 at 04:22:28PM +0100, Dave Page wrote: > > > > > * Replication support still rudimentary. > > > > Hmmmm. I think that's an overly simplistic evaluation. The slony > > replication engine is actually VERY advanced, but the > > administrative tools consist mostly of "your brain". hehe. That > > said, once you've learned how to drive it, it's quite amazing. > > I'm not sure that many people necessarily realise it, but you can > also drive Slony directly from pgAdmin 1.4+ if slonik scripts give > you a headache. Last I checked, pgAdmin 1.4 doesn't help setting up clusters, which is one of the major headaches of a Slony-I setup. I also noticed that pgAdmin 1.6-to-be has at least some of those hooks. Any ETA on that? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: > > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: > >> * Replication support still rudimentary. > > It might be worth pointing out that mysql's replication falls over > if you so much as look at it crosseyed. I have not had to use it > for production purposes, but I can tell you that the mysql > replication regression tests fail ... irreproducibly of course ... > almost one time in two in Red Hat's build environment. Are those tests, or at least descriptions of them, available? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
David Fetter <david@fetter.org> writes: > On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote: >> It might be worth pointing out that mysql's replication falls over >> if you so much as look at it crosseyed. I have not had to use it >> for production purposes, but I can tell you that the mysql >> replication regression tests fail ... irreproducibly of course ... >> almost one time in two in Red Hat's build environment. > Are those tests, or at least descriptions of them, available? Sure, it's just the standard "make test" sequence in mysql's source. If you want to do exactly what I'm talking about, grab the latest mysql SRPM off the Fedora download server and "rpmbuild --rebuild" it. There's a very long regression test suite (much larger than ours :-() and when it fails, it's invariably in one of the replication-related tests. regards, tom lane
On Fri, Jun 30, 2006 at 01:41:53PM -0400, Tom Lane wrote: > David Fetter <david@fetter.org> writes: > > On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote: > >> It might be worth pointing out that mysql's replication falls > >> over if you so much as look at it crosseyed. I have not had to > >> use it for production purposes, but I can tell you that the mysql > >> replication regression tests fail ... irreproducibly of course > >> ... almost one time in two in Red Hat's build environment. > > > Are those tests, or at least descriptions of them, available? > > Sure, it's just the standard "make test" sequence in mysql's source. Uh oh. I'm a little worried about writing tests based on GPLed code for Slony-I or other replication systems. Might these need to be clean-roomed? Cheers, D -- David Fetter <david@fetter.org> http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote!
David Fetter <david@fetter.org> writes: > On Fri, Jun 30, 2006 at 01:41:53PM -0400, Tom Lane wrote: >> Sure, it's just the standard "make test" sequence in mysql's source. > Uh oh. I'm a little worried about writing tests based on GPLed code > for Slony-I or other replication systems. Might these need to be > clean-roomed? Oh, is that what you wanted 'em for? Probably. My recollection from the few that I've really looked at is that they're pretty mysql-specific anyway (eg, some of them are actual "regression" tests to catch reappearance of old mysql bugs). regards, tom lane
ron.l.johnson@cox.net (Ron Johnson) writes: > Scott Marlowe wrote: > [snip] >> However, the more interesting thing here, is that every >> statement, including DDL is transactable, except for a couple of >> big odd ones, like create database. So, in postgresql, you can do: >> >> begin; >> create table xyz... >> alter table abc... >> insert into abc select * from iii >> update iii...; >> drop table iii; >> (oops, I messed up something) >> rollback; > > But isn't that what it means to be "transactional"? Or am I spoiled > by my "big, expensive enterprise database"? DDL commonly hasn't been "able to be rolled back," even in "big, expensive" databases... -- (format nil "~S@~S" "cbbrowne" "cbbrowne.com") http://www.ntlug.org/~cbbrowne/unix.html Rules of the Evil Overlord #180. "If I ever build a device to transfer the hero's energy into me, I will make sure it cannot operate in reverse." <http://www.eviloverlord.com/>
smarlowe@g2switchworks.com (Scott Marlowe) writes: > I agree with Tom, nice notes. I noted a few minor issues that seem to > derive from a familiarity with MySQL. I'll put my corrections below... > > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: >> On Converting from MySQL 5.0.x to PostgreSQL 8.1.x >> -------------------------------------------------- >> Major differences I have noted: >> ------------------------------- >> >> MySQL 5.0.x: > >> * Easy, built-in and extensive replication support. > > Not sure how extensive it is. It's basically synchronous single master > single slave, right? It is quite easy though. And it's statement-based, is it not? Indications are that MySQL replication is quite non-deterministic, as a result; if you use SYSDATE() in INSERT/UPDATE queries to set timestamps, replicas will get the wrong time. It looks like anything that is dynamically evaluated will be processed incorrectly on replicas, such as timezones. It is possible for the data on the master and slave to become different if a statement is designed in such a way that the data modification is non-deterministic; that is, left to the will of the query optimizer. It's multi-slave, mind you... >> * Replication support still rudimentary. > > Hmmmm. I think that's an overly simplistic evaluation. The slony > replication engine is actually VERY advanced, but the administrative > tools consist mostly of "your brain". hehe. That said, once you've > learned how to drive it, it's quite amazing. Keep in mind, slony > can be applied to a living database while it's running, and can run > between different major versions of postgresql. That's a pretty > advanced feature. Plus, if the replication daemons die (kill -9ed > or whatever) you can restart replication and slony will come right > back where it was and catch up. And you can trust that the data that is replicated will actually be faithfully replicated, even in the presence of timestamps, triggers, and other things that challenge determinism... >> * Pg uses a 'template1' pseudo-database that can be tailored to provide >> default objects for new database creation, if you should desire. It >> obviously also offers a 'template0' database that is read-only and >> offers a barebones database, more equivalent to the empty db created with >> mysql's CREATE DATABASE statement. > > This isn't quite right. > > template0 is a locked and "pure" copy of the template database. It's > there for "break glass in case of emergency" use. :) > > template1, when you first initdb, is exactly the same as template0, but > you can connect to it, and alter it. Both of these are "real" > postgresql databases. template1 is the database that gets copied by > default when you do "create database". Note that you can also define a > different template database when running create database, which lets you > easily clone any database on your machine. "create database newdb with > template olddb" In the last few weeks, we've had fun using "createdb --template=" to create test copies of production databases (well, replicas thereof...). Creating a replica via Slony-I takes several hours, for large databases, as it has to load data into tables, then generate indexes. We've used "createdb" on such databases; the longest it took to set up an "extra duplicate" was something like 8 minutes, and that gave our sysadmins full copies of the production databases that could be used for testing... The speed was *stunning*... -- (format nil "~S@~S" "cbbrowne" "cbbrowne.com") http://cbbrowne.com/info/multiplexor.html "How much more helpful could I be than to provide you with the appropriate e-mail address? I could engrave it on a clue-by-four and deliver it to you in Chicago, I suppose." -- Seen on Slashdot...
On Fri, 2006-06-30 at 16:34, Chris Browne wrote: > smarlowe@g2switchworks.com (Scott Marlowe) writes: > > I agree with Tom, nice notes. I noted a few minor issues that seem to > > derive from a familiarity with MySQL. I'll put my corrections below... > > > > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: > >> On Converting from MySQL 5.0.x to PostgreSQL 8.1.x > >> -------------------------------------------------- > >> Major differences I have noted: > >> ------------------------------- > >> > >> MySQL 5.0.x: > > > >> * Easy, built-in and extensive replication support. > > > > Not sure how extensive it is. It's basically synchronous single master > > single slave, right? It is quite easy though. > > And it's statement-based, is it not? > > Indications are that MySQL replication is quite non-deterministic, as > a result; if you use SYSDATE() in INSERT/UPDATE queries to set > timestamps, replicas will get the wrong time. > > It looks like anything that is dynamically evaluated will be processed > incorrectly on replicas, such as timezones. > > It is possible for the data on the master and slave to become > different if a statement is designed in such a way that the data > modification is non-deterministic; that is, left to the will of the > query optimizer. This is essentially correct. Note that I can use pgpool with postgresql and get about the same behaviour as mysql's replication, with the same basic draw backs, that it's best to copy the database between shutdown machines, and things that are dynamically evaluated can cause issues. With pgpool I get synchronous replication with automatic failover, and it's dead simple to build and install. Which kind of shows off the difference in philosophy between the two development camps. The postgresql folks are very very picky about what gets put into the main package, and let's face it, pgpool, while neat, is not really ready for integration into the backend. Meanwhile, a nearly identical replication system IS integrated into the backend of MySQL, warts and all for the sake of convenience of the users, and possibly marketing. It's not that one way is so much better than the other, it's just indicative of how the two camps operate.
On Fri, Jun 30, 2006 at 11:01:19AM -0700, David Fetter wrote: > On Fri, Jun 30, 2006 at 01:41:53PM -0400, Tom Lane wrote: > > David Fetter <david@fetter.org> writes: > > > On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote: > > >> It might be worth pointing out that mysql's replication falls > > >> over if you so much as look at it crosseyed. I have not had to > > >> use it for production purposes, but I can tell you that the mysql > > >> replication regression tests fail ... irreproducibly of course > > >> ... almost one time in two in Red Hat's build environment. > > > > > Are those tests, or at least descriptions of them, available? > > > > Sure, it's just the standard "make test" sequence in mysql's source. > > Uh oh. I'm a little worried about writing tests based on GPLed code > for Slony-I or other replication systems. Might these need to be > clean-roomed? Is there actually a lack of ideas for our regression tests, or a lack of people/motivation to work on them? -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Fri, Jun 30, 2006 at 11:39:04AM -0400, Tom Lane wrote: > Scott Marlowe <smarlowe@g2switchworks.com> writes: > > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: > >> * Replication support still rudimentary. > > > Hmmmm. I think that's an overly simplistic evaluation. The slony > > replication engine is actually VERY advanced, but the administrative > > tools consist mostly of "your brain". hehe. That said, once you've > > learned how to drive it, it's quite amazing. Keep in mind, slony can be > > applied to a living database while it's running, and can run between > > different major versions of postgresql. That's a pretty advanced > > feature. Plus, if the replication daemons die (kill -9ed or whatever) > > you can restart replication and slony will come right back where it was > > and catch up. > > It might be worth pointing out that mysql's replication falls over > if you so much as look at it crosseyed. I have not had to use it > for production purposes, but I can tell you that the mysql replication > regression tests fail ... irreproducibly of course ... almost one time > in two in Red Hat's build environment. I've been able to trace a few of > these failures to quirks of the build environment, like trying to build > x86 and x86_64 at the same time in different chroots of the same machine > (must take care not to use same TCP port numbers for tests), but it > still seems flaky as hell. I attended a talk about MySQL and High Availability once and was pretty unimpressed. Lots of 'now you take the database down and copy files around' and the like. Nothing remotely close to the abilities of Slony. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Fri, Jun 30, 2006 at 05:16:46PM -0500, Scott Marlowe wrote: > This is essentially correct. Note that I can use pgpool with postgresql > and get about the same behaviour as mysql's replication, with the same > basic draw backs, that it's best to copy the database between shutdown > machines, and things that are dynamically evaluated can cause issues. > With pgpool I get synchronous replication with automatic failover, and > it's dead simple to build and install. Maybe we should be promoting pgpool's replication in that light. "If you're looking something that works like MySQL's replication, use pgpool. If you want a much more sophisticated and complex mechanism, use Slony". > Which kind of shows off the difference in philosophy between the two > development camps. The postgresql folks are very very picky about what > gets put into the main package, and let's face it, pgpool, while neat, > is not really ready for integration into the backend. Meanwhile, a > nearly identical replication system IS integrated into the backend of > MySQL, warts and all for the sake of convenience of the users, and > possibly marketing. > > It's not that one way is so much better than the other, it's just > indicative of how the two camps operate. I'd say google:'mysql gotchas' is a pretty good indicator of that. ;) MySQL tries desperatly hard to make databases 'easy', but the reality is that unless it's a pretty trivial embedded database, databases (both RDBMSes and database design and use) aren't easy; they're probably one of the most complex pieces of IT in commmon use today. IMO, in trying to 'make it simple', a lot of people end up burned. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Chris Browne wrote: > ron.l.johnson@cox.net (Ron Johnson) writes: > >> Scott Marlowe wrote: >> [snip] >>> However, the more interesting thing here, is that every >>> statement, including DDL is transactable, except for a couple of >>> big odd ones, like create database. So, in postgresql, you can do: >>> >>> begin; >>> create table xyz... >>> alter table abc... >>> insert into abc select * from iii >>> update iii...; >>> drop table iii; >>> (oops, I messed up something) >>> rollback; >> But isn't that what it means to be "transactional"? Or am I spoiled >> by my "big, expensive enterprise database"? > > DDL commonly hasn't been "able to be rolled back," even in "big, > expensive" databases... I guess I'm just fortunate... - -- Ron Johnson, Jr. Jefferson LA USA Is "common sense" really valid? For example, it is "common sense" to white-power racists that whites are superior to blacks, and that those with brown skins are mud people. However, that "common sense" is obviously wrong. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.3 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFEpd6BS9HxQb37XmcRAgu8AKCqp6KxNYoa0tIcmbglG8XXSzgXpQCgjDLv vYkFNzwXF1K+b9ZNK6Svr64= =mW9L -----END PGP SIGNATURE-----
"Jim C. Nasby" <jnasby@pervasive.com> writes: > Is there actually a lack of ideas for our regression tests, or a lack of > people/motivation to work on them? Certainly there are plenty of ideas in the archives ... but writing regression tests is so *boring* :-(. This is definitely a weak spot for a mostly-volunteer project --- it's hard to get anyone to do that kind of work. Something that would actually hold some intellectual interest is to improve the testing infrastructure. The current setup is pretty limited as to its ability to deal with varying outputs, and even more limited in its ability to test concurrent behavior. Again, see the archives. regards, tom lane
On Fri, Jun 30, 2006 at 11:26:06PM -0400, Tom Lane wrote: > Something that would actually hold some intellectual interest is to > improve the testing infrastructure. The current setup is pretty limited > as to its ability to deal with varying outputs, and even more limited > in its ability to test concurrent behavior. Again, see the archives. I must admit I was kind of surprised this didn't generate much feedback: http://archives.postgresql.org/pgsql-hackers/2005-08/msg01073.php It's a simple tester that drives multiple postgres backends simultaneously and can test whether various concurrently running transaction correctly block/abort in various situations. I also used it to test whether all the different LOCK types work as documented in the documentation (they do, the program checked all combinations of two locks). It would in theory be possible to generate scripts to test thing like simultaneously firing multiple CREATE INDEX commands in seperate transactions and see how they interact. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Hi Dave, > Documentation such as this can be added to the new techdocs area on the > main site at http://www.postgresql.org/docs/techdocs under the relevant > section (probably http://www.postgresql.org/docs/techdocs.3 in this > case). > > Please note that the editting interface is still new and may still have > a quirk or two... Great, thank you for the pointer to the page. However, I think I have been bitten by one of the two quirks ;) I managed to get through editing and formatting of the document, but upon submission, it fails to accept the document, stating that several properties are invalid (errors at the bottom of this message). I did not add any custom formatting, and only used the features and controls available within the editing interface.. So, it seems at this time, that I'll have to wait to post this until the form is fixed, or possibly submit it in some other fashion.. Thank you to everyone else who offered corrections, also! I knew there would be a few, and I will incorporate them into the revision before final submission. Cheers, -Jason ---- error output upon choosing 'Save' from the techdocs editor: --- Element H2: Invalid attribute "STYLE" Element U: Invalid element Element U: Invalid element Element DIV: Invalid attribute "STYLE" Element DIV: Invalid attribute "STYLE" Element DIV: Invalid attribute "STYLE" Element U: Invalid element Element LI: Invalid attribute "STYLE" Element U: Invalid element Element U: Invalid element Element U: Invalid element Element U: Invalid element Element U: Invalid element
> -----Original Message----- > From: David Fetter [mailto:david@fetter.org] > Sent: 30 June 2006 18:30 > To: Dave Page > Cc: Scott Marlowe; Jason McManus; pgsql general > Subject: Re: [GENERAL] Notes on converting from MySQL 5.0.x > to PostgreSQL > > Last I checked, pgAdmin 1.4 doesn't help setting up clusters, which is > one of the major headaches of a Slony-I setup. I also noticed that > pgAdmin 1.6-to-be has at least some of those hooks. Any ETA on that? It certainly can setup a new cluster - the only feature we don't support is failover because Andreas was never happy with the way it worked and such a critical operation needs to be flawless. That said, don't ask me how to setup a new cluster - I've never actually done it myself (in pgAdmin). Regards, Dave.
On Friday 30 June 2006 09:17, Jason McManus wrote: > On Converting from MySQL 5.0.x to PostgreSQL 8.1.x > * There is a conversion utility called 'mysql2pgsql' that will convert > dump files from the mysqldump format, to a format that psql can > understand. It is available at: > - http://gborg.postgresql.org/project/mysql2psql/projdisplay.php FWIW, I have some decent success using mysqls mysqldump programm with various --postgresql-compataible flags to get postgresql workable dumps. The flag tree of that program is a killer, but if you can figure out the right flags, the output is pretty good. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
jnasby@pervasive.com ("Jim C. Nasby") writes: >> Uh oh. I'm a little worried about writing tests based on GPLed code >> for Slony-I or other replication systems. Might these need to be >> clean-roomed? > > Is there actually a lack of ideas for our regression tests, or a lack of > people/motivation to work on them? Speaking solely for Slony-I, it's pretty difficult to write good tests, and it's somewhat boring, so there's probably some contribution of "all of the above." -- (reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc")) http://www.ntlug.org/~cbbrowne/oses.html We are in fact well and truly doomed. -- Jamie Zawinski http://www.jwz.org/gruntle/nscpdorm.html
On 6/30/2006 11:12 AM, Scott Marlowe wrote: > I agree with Tom, nice notes. I noted a few minor issues that seem to > derive from a familiarity with MySQL. I'll put my corrections below... > > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: >> On Converting from MySQL 5.0.x to PostgreSQL 8.1.x >> -------------------------------------------------- >> Major differences I have noted: >> ------------------------------- >> >> MySQL 5.0.x: > >> * Easy, built-in and extensive replication support. > > Not sure how extensive it is. It's basically synchronous single master > single slave, right? It is quite easy though. Last thing I heard was that MySQL still had only statement based replication and that it doesn't work together with some of the new enterprise features like triggers and stored procedures. Row level replication is on their TODO list and this major feature will probably appear in some minor 5.2.x release. Jan > >> PostgreSQL 8.1.x: >> * Embedded procedures in multiple native languages (stored procedures and >> functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL) > > Note that there are a dozen or more other languages as well. Just FYI. > Off the top of my head, plPHP, plJ (java there's two different java > implementations, I think) and plR (R is the open source equivalent of > the S statistics language) > >> * Replication support still rudimentary. > > Hmmmm. I think that's an overly simplistic evaluation. The slony > replication engine is actually VERY advanced, but the administrative > tools consist mostly of "your brain". hehe. That said, once you've > learned how to drive it, it's quite amazing. Keep in mind, slony can be > applied to a living database while it's running, and can run between > different major versions of postgresql. That's a pretty advanced > feature. Plus, if the replication daemons die (kill -9ed or whatever) > you can restart replication and slony will come right back where it was > and catch up. > >> Pointers, tips, quick facts and gotchas for other people converting: >> -------------------------------------------------------------------- >> >> * MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL >> differentiates the two. While the hierarchy in MySQL is >> database.table.field, PostgreSQL is roughly: database.schema.table.field. >> A schema is a 'logically grouped set of tables but still kept within a >> particular database.' This could allow separate applications to be built >> that still rely upon the same database, but can be kept somewhat logically >> separated. The default schema in each database is called 'public', and is >> the one referred to if no others are specified. This can be modified with >> 'SET search_path TO ...'. > > This is a VERY good analysis of the difference between the two > databases. > >> * Pg uses a 'template1' pseudo-database that can be tailored to provide >> default objects for new database creation, if you should desire. It >> obviously also offers a 'template0' database that is read-only and >> offers a barebones database, more equivalent to the empty db created with >> mysql's CREATE DATABASE statement. > > This isn't quite right. > > template0 is a locked and "pure" copy of the template database. It's > there for "break glass in case of emergency" use. :) > > template1, when you first initdb, is exactly the same as template0, but > you can connect to it, and alter it. Both of these are "real" > postgresql databases. template1 is the database that gets copied by > default when you do "create database". Note that you can also define a > different template database when running create database, which lets you > easily clone any database on your machine. "create database newdb with > template olddb" > >> * Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows >> more than one independent sequence to be specified per table (though the >> utility of this may be of dubious value). These are closer to Oracle's >> concept of sequence generators, and they can be manipulated with the >> currval(), nextval(), setval(), and lastval() functions. > > Don't forget 64bit bigserials too. > >> * Pg requires its tables and databases be 'vacuumed' regularly to remove >> completed transaction snapshots and optimize the tables on disk. It is >> necessary because the way that PostgreSQL implements true MVCC is by >> writing all temporary transactions to disk and setting a visibility >> flag for the record. Vacuuming can be performed automatically, and in >> a deferred manner by using vacuum_cost settings to limit it to low-load >> periods or based upon numerous other criteria. See the manual for more >> information. > > Interestingly enough, MySQL's innodb tables do almost the exact same > thing, but their vacuum process is wholly automated. Generally, this > means fewer issues pop up for the new dba, but when they do, they can be > a little harder to deal with. It's about a wash. Of course, as you > mentioned earlier, most mysql folks aren't using innodb. > >> * While MySQL supports transactions with the InnoDB databases, many MySQL >> users generally do not use them extensively enough. With Pg, due to the >> behaviour of the server in attempting to ensure data integrity in a >> variety of situations (client disconnection, network trouble, server >> crashes, etc.), it is highly advisable to become familiar and utilize >> transactions a lot more, to ensure your DATA is left in a consistent state >> before and after every change you wish to make. > > A point you might want to throw in here is that EVERYTHING in postgresql > is a transaction. If you don't issue a begin statement, then postgresql > runs each statement you type in inside its own transaction. > > This means that inserting 10,000 rows without wrapping them inside an > explicit transaction results in 10,000 individual transactions. > > However, the more interesting thing here, is that every statement, > including DDL is transactable, except for a couple of big odd ones, like > create database. So, in postgresql, you can do: > > begin; > create table xyz... > alter table abc... > insert into abc select * from iii > update iii...; > drop table iii; > (oops, I messed up something) > rollback; > > and there's no change and no lost data. Quite impressive actually. > > >> Common equivalents: >> ------------------- >> >> MySQL PostgreSQL >> ----- ----------- >> OPTIMIZE TABLE ... VACUUM ... > > vacuum and analyze for optimize I think. Also, possibly reindex, > although nominally that's the "sledge hammer" of optimization. > > One last thing I'd mention that I REALLY like about PostgreSQL over any > other database I've used is that the psql interface has a complete > syntax lookup feature that is WAY cool. \h brings it up, and \h COMMAND > where COMMAND is the command you want to look up will bring up the > syntax for your command. > > And, I hate the fact that CTRL-C in the mysql command line tool exits > the tool instead of interrupting the current query. In PostgreSQL it > interrupts the current query. CTRL-\ will kill the client if you need > to. > > Overall, a great review. Thanks. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Have you searched our list archives? > > http://archives.postgresql.org -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
http://dev.mysql.com/doc/refman/5.1/en/replication-row-based.html
5.1
Alex
5.1
Alex
On 7/10/06, Jan Wieck <JanWieck@yahoo.com> wrote:
On 6/30/2006 11:12 AM, Scott Marlowe wrote:
> I agree with Tom, nice notes. I noted a few minor issues that seem to
> derive from a familiarity with MySQL. I'll put my corrections below...
>
> On Fri, 2006-06-30 at 08:17, Jason McManus wrote:
>> On Converting from MySQL 5.0.x to PostgreSQL 8.1.x
>> --------------------------------------------------
>> Major differences I have noted:
>> -------------------------------
>>
>> MySQL 5.0.x:
>
>> * Easy, built-in and extensive replication support.
>
> Not sure how extensive it is. It's basically synchronous single master
> single slave, right? It is quite easy though.
Last thing I heard was that MySQL still had only statement based
replication and that it doesn't work together with some of the new
enterprise features like triggers and stored procedures. Row level
replication is on their TODO list and this major feature will probably
appear in some minor 5.2.x release.
Jan
>
>> PostgreSQL 8.1.x:
>> * Embedded procedures in multiple native languages (stored procedures and
>> functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL)
>
> Note that there are a dozen or more other languages as well. Just FYI.
> Off the top of my head, plPHP, plJ (java there's two different java
> implementations, I think) and plR (R is the open source equivalent of
> the S statistics language)
>
>> * Replication support still rudimentary.
>
> Hmmmm. I think that's an overly simplistic evaluation. The slony
> replication engine is actually VERY advanced, but the administrative
> tools consist mostly of "your brain". hehe. That said, once you've
> learned how to drive it, it's quite amazing. Keep in mind, slony can be
> applied to a living database while it's running, and can run between
> different major versions of postgresql. That's a pretty advanced
> feature. Plus, if the replication daemons die (kill -9ed or whatever)
> you can restart replication and slony will come right back where it was
> and catch up.
>
>> Pointers, tips, quick facts and gotchas for other people converting:
>> --------------------------------------------------------------------
>>
>> * MySQL combines the concepts of 'database' and 'schema' into one. PostgreSQL
>> differentiates the two. While the hierarchy in MySQL is
>> database.table.field, PostgreSQL is roughly: database.schema.table.field.
>> A schema is a 'logically grouped set of tables but still kept within a
>> particular database.' This could allow separate applications to be built
>> that still rely upon the same database, but can be kept somewhat logically
>> separated. The default schema in each database is called 'public', and is
>> the one referred to if no others are specified. This can be modified with
>> 'SET search_path TO ...'.
>
> This is a VERY good analysis of the difference between the two
> databases.
>
>> * Pg uses a 'template1' pseudo-database that can be tailored to provide
>> default objects for new database creation, if you should desire. It
>> obviously also offers a 'template0' database that is read-only and
>> offers a barebones database, more equivalent to the empty db created with
>> mysql's CREATE DATABASE statement.
>
> This isn't quite right.
>
> template0 is a locked and "pure" copy of the template database. It's
> there for "break glass in case of emergency" use. :)
>
> template1, when you first initdb, is exactly the same as template0, but
> you can connect to it, and alter it. Both of these are "real"
> postgresql databases. template1 is the database that gets copied by
> default when you do "create database". Note that you can also define a
> different template database when running create database, which lets you
> easily clone any database on your machine. "create database newdb with
> template olddb"
>
>> * Pg uses the 'serial' column type instead of AUTO_INCREMENT. This allows
>> more than one independent sequence to be specified per table (though the
>> utility of this may be of dubious value). These are closer to Oracle's
>> concept of sequence generators, and they can be manipulated with the
>> currval(), nextval(), setval(), and lastval() functions.
>
> Don't forget 64bit bigserials too.
>
>> * Pg requires its tables and databases be 'vacuumed' regularly to remove
>> completed transaction snapshots and optimize the tables on disk. It is
>> necessary because the way that PostgreSQL implements true MVCC is by
>> writing all temporary transactions to disk and setting a visibility
>> flag for the record. Vacuuming can be performed automatically, and in
>> a deferred manner by using vacuum_cost settings to limit it to low-load
>> periods or based upon numerous other criteria. See the manual for more
>> information.
>
> Interestingly enough, MySQL's innodb tables do almost the exact same
> thing, but their vacuum process is wholly automated. Generally, this
> means fewer issues pop up for the new dba, but when they do, they can be
> a little harder to deal with. It's about a wash. Of course, as you
> mentioned earlier, most mysql folks aren't using innodb.
>
>> * While MySQL supports transactions with the InnoDB databases, many MySQL
>> users generally do not use them extensively enough. With Pg, due to the
>> behaviour of the server in attempting to ensure data integrity in a
>> variety of situations (client disconnection, network trouble, server
>> crashes, etc.), it is highly advisable to become familiar and utilize
>> transactions a lot more, to ensure your DATA is left in a consistent state
>> before and after every change you wish to make.
>
> A point you might want to throw in here is that EVERYTHING in postgresql
> is a transaction. If you don't issue a begin statement, then postgresql
> runs each statement you type in inside its own transaction.
>
> This means that inserting 10,000 rows without wrapping them inside an
> explicit transaction results in 10,000 individual transactions.
>
> However, the more interesting thing here, is that every statement,
> including DDL is transactable, except for a couple of big odd ones, like
> create database. So, in postgresql, you can do:
>
> begin;
> create table xyz...
> alter table abc...
> insert into abc select * from iii
> update iii...;
> drop table iii;
> (oops, I messed up something)
> rollback;
>
> and there's no change and no lost data. Quite impressive actually.
>
>
>> Common equivalents:
>> -------------------
>>
>> MySQL PostgreSQL
>> ----- -----------
>> OPTIMIZE TABLE ... VACUUM ...
>
> vacuum and analyze for optimize I think. Also, possibly reindex,
> although nominally that's the "sledge hammer" of optimization.
>
> One last thing I'd mention that I REALLY like about PostgreSQL over any
> other database I've used is that the psql interface has a complete
> syntax lookup feature that is WAY cool. \h brings it up, and \h COMMAND
> where COMMAND is the command you want to look up will bring up the
> syntax for your command.
>
> And, I hate the fact that CTRL-C in the mysql command line tool exits
> the tool instead of interrupting the current query. In PostgreSQL it
> interrupts the current query. CTRL-\ will kill the client if you need
> to.
>
> Overall, a great review. Thanks.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
> http://archives.postgresql.org
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
On 7/10/2006 10:00 PM, Alex Turner wrote: > http://dev.mysql.com/doc/refman/5.1/en/replication-row-based.html > > 5.1 Ah, thanks. So I guess 5.1.5 and 5.1.8 must be considered major feature minor/bugfix releases. I still don't understand how people can use software in production that has literally zero bugfix upgrade path without the risk of incompatibility due to new features. I consider every IT manager, who makes that choice, simply overpaid. Jan > > Alex > > On 7/10/06, Jan Wieck <JanWieck@yahoo.com> wrote: >> >> On 6/30/2006 11:12 AM, Scott Marlowe wrote: >> > I agree with Tom, nice notes. I noted a few minor issues that seem to >> > derive from a familiarity with MySQL. I'll put my corrections below... >> > >> > On Fri, 2006-06-30 at 08:17, Jason McManus wrote: >> >> On Converting from MySQL 5.0.x to PostgreSQL 8.1.x >> >> -------------------------------------------------- >> >> Major differences I have noted: >> >> ------------------------------- >> >> >> >> MySQL 5.0.x: >> > >> >> * Easy, built-in and extensive replication support. >> > >> > Not sure how extensive it is. It's basically synchronous single master >> > single slave, right? It is quite easy though. >> >> Last thing I heard was that MySQL still had only statement based >> replication and that it doesn't work together with some of the new >> enterprise features like triggers and stored procedures. Row level >> replication is on their TODO list and this major feature will probably >> appear in some minor 5.2.x release. >> >> >> Jan >> >> >> > >> >> PostgreSQL 8.1.x: >> >> * Embedded procedures in multiple native languages (stored procedures >> and >> >> functions can be written in native Perl, Python, Tcl, SQL, PL/PgSQL) >> > >> > Note that there are a dozen or more other languages as well. Just FYI. >> > Off the top of my head, plPHP, plJ (java there's two different java >> > implementations, I think) and plR (R is the open source equivalent of >> > the S statistics language) >> > >> >> * Replication support still rudimentary. >> > >> > Hmmmm. I think that's an overly simplistic evaluation. The slony >> > replication engine is actually VERY advanced, but the administrative >> > tools consist mostly of "your brain". hehe. That said, once you've >> > learned how to drive it, it's quite amazing. Keep in mind, slony can be >> > applied to a living database while it's running, and can run between >> > different major versions of postgresql. That's a pretty advanced >> > feature. Plus, if the replication daemons die (kill -9ed or whatever) >> > you can restart replication and slony will come right back where it was >> > and catch up. >> > >> >> Pointers, tips, quick facts and gotchas for other people converting: >> >> -------------------------------------------------------------------- >> >> >> >> * MySQL combines the concepts of 'database' and 'schema' into >> one. PostgreSQL >> >> differentiates the two. While the hierarchy in MySQL is >> >> database.table.field, PostgreSQL is roughly: >> database.schema.table.field. >> >> A schema is a 'logically grouped set of tables but still kept within >> a >> >> particular database.' This could allow separate applications to be >> built >> >> that still rely upon the same database, but can be kept somewhat >> logically >> >> separated. The default schema in each database is called 'public', >> and is >> >> the one referred to if no others are specified. This can be modified >> with >> >> 'SET search_path TO ...'. >> > >> > This is a VERY good analysis of the difference between the two >> > databases. >> > >> >> * Pg uses a 'template1' pseudo-database that can be tailored to provide >> >> default objects for new database creation, if you should desire. It >> >> obviously also offers a 'template0' database that is read-only and >> >> offers a barebones database, more equivalent to the empty db created >> with >> >> mysql's CREATE DATABASE statement. >> > >> > This isn't quite right. >> > >> > template0 is a locked and "pure" copy of the template database. It's >> > there for "break glass in case of emergency" use. :) >> > >> > template1, when you first initdb, is exactly the same as template0, but >> > you can connect to it, and alter it. Both of these are "real" >> > postgresql databases. template1 is the database that gets copied by >> > default when you do "create database". Note that you can also define a >> > different template database when running create database, which lets you >> > easily clone any database on your machine. "create database newdb with >> > template olddb" >> > >> >> * Pg uses the 'serial' column type instead of AUTO_INCREMENT. This >> allows >> >> more than one independent sequence to be specified per table (though >> the >> >> utility of this may be of dubious value). These are closer to >> Oracle's >> >> concept of sequence generators, and they can be manipulated with the >> >> currval(), nextval(), setval(), and lastval() functions. >> > >> > Don't forget 64bit bigserials too. >> > >> >> * Pg requires its tables and databases be 'vacuumed' regularly to >> remove >> >> completed transaction snapshots and optimize the tables on disk. It >> is >> >> necessary because the way that PostgreSQL implements true MVCC is by >> >> writing all temporary transactions to disk and setting a visibility >> >> flag for the record. Vacuuming can be performed automatically, and >> in >> >> a deferred manner by using vacuum_cost settings to limit it to >> low-load >> >> periods or based upon numerous other criteria. See the manual for >> more >> >> information. >> > >> > Interestingly enough, MySQL's innodb tables do almost the exact same >> > thing, but their vacuum process is wholly automated. Generally, this >> > means fewer issues pop up for the new dba, but when they do, they can be >> > a little harder to deal with. It's about a wash. Of course, as you >> > mentioned earlier, most mysql folks aren't using innodb. >> > >> >> * While MySQL supports transactions with the InnoDB databases, many >> MySQL >> >> users generally do not use them extensively enough. With Pg, due to >> the >> >> behaviour of the server in attempting to ensure data integrity in a >> >> variety of situations (client disconnection, network trouble, server >> >> crashes, etc.), it is highly advisable to become familiar and utilize >> >> transactions a lot more, to ensure your DATA is left in a consistent >> state >> >> before and after every change you wish to make. >> > >> > A point you might want to throw in here is that EVERYTHING in postgresql >> > is a transaction. If you don't issue a begin statement, then postgresql >> > runs each statement you type in inside its own transaction. >> > >> > This means that inserting 10,000 rows without wrapping them inside an >> > explicit transaction results in 10,000 individual transactions. >> > >> > However, the more interesting thing here, is that every statement, >> > including DDL is transactable, except for a couple of big odd ones, like >> > create database. So, in postgresql, you can do: >> > >> > begin; >> > create table xyz... >> > alter table abc... >> > insert into abc select * from iii >> > update iii...; >> > drop table iii; >> > (oops, I messed up something) >> > rollback; >> > >> > and there's no change and no lost data. Quite impressive actually. >> > >> > >> >> Common equivalents: >> >> ------------------- >> >> >> >> MySQL PostgreSQL >> >> ----- ----------- >> >> OPTIMIZE TABLE ... VACUUM ... >> > >> > vacuum and analyze for optimize I think. Also, possibly reindex, >> > although nominally that's the "sledge hammer" of optimization. >> > >> > One last thing I'd mention that I REALLY like about PostgreSQL over any >> > other database I've used is that the psql interface has a complete >> > syntax lookup feature that is WAY cool. \h brings it up, and \h COMMAND >> > where COMMAND is the command you want to look up will bring up the >> > syntax for your command. >> > >> > And, I hate the fact that CTRL-C in the mysql command line tool exits >> > the tool instead of interrupting the current query. In PostgreSQL it >> > interrupts the current query. CTRL-\ will kill the client if you need >> > to. >> > >> > Overall, a great review. Thanks. >> > >> > ---------------------------(end of broadcast)--------------------------- >> > TIP 4: Have you searched our list archives? >> > >> > http://archives.postgresql.org >> >> >> -- >> #======================================================================# >> # It's easier to get forgiveness for being wrong than for being right. # >> # Let's break this rule - forgive me. # >> #================================================== JanWieck@Yahoo.com # >> >> ---------------------------(end of broadcast)--------------------------- >> TIP 9: In versions below 8.0, the planner will ignore your desire to >> choose an index scan if your joining column's datatypes do not >> match >> > -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
On Tue, 2006-07-11 at 10:45, Jan Wieck wrote: > On 7/10/2006 10:00 PM, Alex Turner wrote: > > > http://dev.mysql.com/doc/refman/5.1/en/replication-row-based.html > > > > 5.1 > > Ah, thanks. So I guess 5.1.5 and 5.1.8 must be considered major feature > minor/bugfix releases. I still don't understand how people can use > software in production that has literally zero bugfix upgrade path > without the risk of incompatibility due to new features. I consider > every IT manager, who makes that choice, simply overpaid. Dear god! That page made my eyes bleed. Individual users can choose the method of replication for their sessions? There's a mixed method that switches back and forth? "In addition to switching the logging format manually, a slave server may switch the format automatically." I'm pretty sure this kind of thing would never get into PostgreSQL. It's like reading a map of a minefield drawn in crayon.
On 7/11/2006 11:57 AM, Scott Marlowe wrote: > On Tue, 2006-07-11 at 10:45, Jan Wieck wrote: >> On 7/10/2006 10:00 PM, Alex Turner wrote: >> >> > http://dev.mysql.com/doc/refman/5.1/en/replication-row-based.html >> > >> > 5.1 >> >> Ah, thanks. So I guess 5.1.5 and 5.1.8 must be considered major feature >> minor/bugfix releases. I still don't understand how people can use >> software in production that has literally zero bugfix upgrade path >> without the risk of incompatibility due to new features. I consider >> every IT manager, who makes that choice, simply overpaid. > > Dear god! That page made my eyes bleed. > > Individual users can choose the method of replication for their > sessions? > > There's a mixed method that switches back and forth? It is totally unclear from that page what would make the server decide when to pick one or the other method. It seems to me that this is mainly an optimization for many single inserts in order to get a smaller binlog. Note that according to this page http://dev.mysql.com/doc/internals/en/replication-prepared-statements.html the master currently substitutes the parameters as literals into the query for prepared statements. What also is totally unclear, maybe someone with more MySQL experience can answer this question, is if the binary format actually does solve the problems discussed. Namely timestamps and also autoincrement. What exactly happens if an insert doesn't provide a value for an autoinc or timestamp column? Is the server chosen value placed into the binlog when using row format or not? Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #