Thread: SQL Manager 2007 for PostgreSQL released
From: "EMS Database Management Solutions (SQLManager.net)"
We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!

You can download SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/download

You can purchase SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/buy

What's new in SQL Manager 2007 for PostgreSQL?

1. Support for UTF8 data in SQL Editor, Query Builder, Grid View, and the Data Export/Import Wizard.
2. You can now continue working with SQL Manager while a query is being executed in SQL Editor or Query Builder, or while table or view data are being loaded in the editor. A query taking too long to execute can be aborted when using PostgreSQL 8.0 and higher.
3. A new Database Statistics form has been added; it displays various statistical information on database objects, helps to find weak points in database performance, indicates whether new indices are needed, etc. The corresponding options must be enabled in the server configuration file (postgresql.conf) for statistics to be collected.
4. All wizards, including the Data Export/Import, DB Extract, and Copy DB wizards, can now run in background mode so that you can continue working with the program.
5. A new, improved SQL Editor in the style of Visual Studio 2005, with code folding and Unicode support.
6. The improved Data Import Wizard now imports data up to 10 times faster. New import formats have been added: HTML, XML Document, Open Document Format (OpenOffice), and Open XML Format (MS Word 2007, MS Excel 2007).
7. Support for new PostgreSQL 8.2 features (see the SQL sketch after this announcement): the FILLFACTOR parameter for tables and indices, the CONNECT permission for databases, the USAGE permission for sequences, CREATE INDEX CONCURRENTLY, ALTER TABLE [ NO ] INHERIT, creating domains based on other domains, the IF EXISTS clause in DROP statements generated by the Extract Database Wizard, and reassigning ownership of all objects owned by a role.
8. Object editors now contain a Permissions tab which allows you to grant permissions on the object to users and groups.
9. Common SQL commands can now be generated from DB Explorer (Script to SQL Editor, Script to Clipboard). You can easily generate SQL commands such as CREATE, ALTER, DROP, and SELECT for each database object.
10. You can now save your queries in the Favorite Queries folders in DB Explorer; these queries can be stored either locally in the registry (nobody but you will have access to them) or in a special database table (so that all or some of your database users will have access to them).
11. A Download File wizard has been added, which downloads and uploads files from/to the PostgreSQL server machine using standard PostgreSQL functions and the FTP/SFTP protocols.
12. An improved visual database designer; a new object type - comments - has been added. You can now choose whether only object names or fields should be displayed, and you can select the schemas whose tables should be added to the diagram when performing reverse engineering.
13. The improved Data Export Wizard now works faster. New export formats have been added: XML Document, Open Document Format (OpenOffice), and Open XML Format (MS Word 2007, MS Excel 2007).
14. A Database Properties window has been added, which displays database options and allows you to change some of them.
15. An improved Table Editor. It is now possible to view and change table properties on a new tab.
16. A Tasks menu has been added to the DB Explorer context menu. It contains common operations which can be performed on the selected object. Hosts are now also named in accordance with the selected connection type (SSH or HTTP tunneling).
17. An improved report building system.
18. Function Debugger. The variable that changed its value at the last step is displayed in bold in the Debug Information window. The SQLSTATE and SQLERRM variables are also displayed now.
19. Lots of other improvements and bug fixes.

Hope you will enjoy working with our software.
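As a plain-SQL illustration of the 8.2 features listed in item 7 - a sketch only; every object and role name here is hypothetical, and the roles and database are assumed to already exist:

    -- Storage parameters and concurrent index builds (new in 8.2)
    CREATE TABLE accounts (id serial PRIMARY KEY, owner text) WITH (FILLFACTOR = 70);
    CREATE TABLE accounts_2006 () INHERITS (accounts);
    CREATE INDEX CONCURRENTLY accounts_owner_idx ON accounts (owner);

    -- New permission targets (new in 8.2)
    GRANT CONNECT ON DATABASE appdb TO webuser;
    GRANT USAGE ON SEQUENCE accounts_id_seq TO webuser;

    -- [ NO ] INHERIT, IF EXISTS, domains over domains, REASSIGN OWNED (new in 8.2)
    ALTER TABLE accounts_2006 NO INHERIT accounts;
    DROP TABLE IF EXISTS accounts_old;
    CREATE DOMAIN positive_int AS integer CHECK (VALUE > 0);
    CREATE DOMAIN percentage AS positive_int CHECK (VALUE <= 100);
    REASSIGN OWNED BY olduser TO newuser;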
On May 23, 8:47 am, "EMS Database Management Solutions (SQLManager.net)" <sqlmana...@gmail.com> wrote:
> We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!
>
> You can download SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/download
>
> You can purchase SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/buy
>
> What's new in SQL Manager 2007 for PostgreSQL?
<snip>

Thanks for this, but is there any plan to launch something like this for use on Linux admin servers? Something that I could install on a server and perhaps work with through a web interface? I would love some recommendations.

Many thanks!

LB

--
Conan O'Brien gets it right! http://blogs.pcworld.com/tipsandtweaks/archives/004369.html
On 23/05/2007 11:40, L. Berger wrote:
> Thanks for this, but is there any plan to launch something like this for use on Linux admin servers? Something that I could install on a server and perhaps work with through a web interface? I would love some recommendations.

Have you tried phpPgAdmin? I use it all the time and am very happy with it.

Ray.

---------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
---------------------------------------------------------------
On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions (SQLManager.net) <sqlmanager@gmail.com> wrote:
> We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!

I'd be happy to use EMS products, but I really need Linux versions. Any plans regarding *nix platform support?

--
Best regards,
Nikolay
On May 22, 2007, at 19:47, EMS Database Management Solutions (SQLManager.net) wrote:
> We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!

Congratulations on the release. In the future, refrain from posting product announcements to the pgsql-general list. Thanks!

Michael Glaesemann
grzm seespotcode net
Nikolay Samokhvalov wrote:
> On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions (SQLManager.net) <sqlmanager@gmail.com> wrote:
>> We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!
>
> I'd be happy to use EMS products, but I really need Linux versions. Any plans regarding *nix platform support?

Ditto, when will we see the *nix versions?

--
Kenneth Downs
Secure Data Software, Inc.
www.secdat.com    www.andromeda-project.org
631-689-7200   Fax: 631-689-0527
cell: 631-379-0010
Kenneth Downs wrote:
> Nikolay Samokhvalov wrote:
>> On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions (SQLManager.net) <sqlmanager@gmail.com> wrote:
>>> We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!
>>
>> I'd be happy to use EMS products, but I really need Linux versions. Any plans regarding *nix platform support?
>
> Ditto, when will we see the *nix versions?

I asked them some time ago. They answered me this:

As for SQL Manager for PostgreSQL - we regret to inform you that the development and support of Linux editions of EMS software products has become impossible now that Borland no longer supports Kylix libraries for Delphi, on which all Linux versions of EMS software were based. Hence, EMS has made a decision to discontinue Linux versions of its products. Linux products will not be available for sale or download anymore.

--
Guillaume.
http://abs.traduc.org/
http://lfs.traduc.org/
http://docs.postgresqlfr.org/
Group,

I have to admit, I'm a little disappointed. I'm a HUGE advocate of PostgreSQL (to state for the record) - in fact, I always keep my eyes peeled for opportunities to recommend it in my day-to-day business.

So why am I disappointed, and who really cares?

I'm disappointed because SLONY-II has not been released yet to support multi-master replication! PostgreSQL is going through all of its releases - and that's great - BUT where is the sync-up with the powerhouse of a component that Slony-II would bring to the table? Slony-I is pretty sweet, but if Slony-II were released, I can imagine that this would introduce some major competition in the enterprise world against the commercial dynos.

I've been doing some really cool stuff with the latest versions of OpenLDAP and PostgreSQL on Solaris 10 (in the past, Solaris 9), also using Linux (Fedora Core - go Red Hat).

Anyone have any thoughts/comments to this end?

On Fri, 1 Jun 2007, Guillaume Lelarge wrote:
> Kenneth Downs wrote:
>> Nikolay Samokhvalov wrote:
>>> On 22 May 2007 17:47:46 -0700, EMS Database Management Solutions (SQLManager.net) <sqlmanager@gmail.com> wrote:
>>>> We, here at EMS Database Management Solutions, are pleased to announce SQL Manager 2007 for PostgreSQL - the new major version of the powerful PostgreSQL administration and development tool!
>>>
>>> I'd be happy to use EMS products, but I really need Linux versions. Any plans regarding *nix platform support?
>>
>> Ditto, when will we see the *nix versions?
>
> I asked them some time ago. They answered me this:
>
> As for SQL Manager for PostgreSQL - we regret to inform you that the development and support of Linux editions of EMS software products has become impossible now that Borland no longer supports Kylix libraries for Delphi, on which all Linux versions of EMS software were based. Hence, EMS has made a decision to discontinue Linux versions of its products. Linux products will not be available for sale or download anymore.

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net
gonzales@linuxlouis.net wrote:
> Group,
>
> I have to admit, I'm a little disappointed. I'm a HUGE advocate of PostgreSQL (to state for the record) - in fact, I always keep my eyes peeled for opportunities to recommend it in my day-to-day business.
>
> So why am I disappointed, and who really cares?
>
> I'm disappointed because SLONY-II has not been released yet to support multi-master replication! PostgreSQL is going through all of its releases - and that's great - BUT where is the sync-up with the powerhouse of a component that Slony-II would bring to the table? Slony-I is pretty sweet, but if Slony-II were released, I can imagine that this would introduce some major competition in the enterprise world against the commercial dynos.

Which databases ship with multi-master replication?

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On Fri, 1 Jun 2007, Joshua D. Drake wrote:
> gonzales@linuxlouis.net wrote:
>> Group,
>> I have to admit, I'm a little disappointed. I'm a HUGE advocate of PostgreSQL (to state for the record) - in fact, I always keep my eyes peeled for opportunities to recommend it in my day-to-day business.
>>
>> So why am I disappointed, and who really cares?
>>
>> I'm disappointed because SLONY-II has not been released yet to support multi-master replication! PostgreSQL is going through all of its releases - and that's great - BUT where is the sync-up with the powerhouse of a component that Slony-II would bring to the table? Slony-I is pretty sweet, but if Slony-II were released, I can imagine that this would introduce some major competition in the enterprise world against the commercial dynos.
>
> Which databases ship with multi-master replication?

I dunno, which ones?

Which ones have robust and fully functional multi-master replication? (Oracle, MS SQL, not-PostgreSQL.)

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net
gonzales@linuxlouis.net wrote:
> On Fri, 1 Jun 2007, Joshua D. Drake wrote:
>> Which databases ship with multi-master replication?
>
> I dunno, which ones?
>
> Which ones have robust and fully functional multi-master replication? (Oracle, MS SQL, not-PostgreSQL.)

You consider Oracle RAC fully functional multi-master?

You do realize that Oracle is the second largest software company in the world, right? With Microsoft being number 1?

Tell ya what: write me a check, and we will get right on it ;)

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On 6/1/07, gonzales@linuxlouis.net <gonzales@linuxlouis.net> wrote:
> I'm disappointed because SLONY-II has not been released yet to support multi-master replication!

I wouldn't pin all my hopes on a project still under development. (For me, personally, add the fact that Slony-I still has not solved single-master replication in a way that doesn't burden the developer/DBA with lots of unnecessary extra maintenance; I am not counting on its developers to fix this issue in Slony-II.)

In the meantime, Cybertec (http://www.postgresql.at/, an Austrian company) just announced a commercial synchronous multimaster replication product based on 2-phase commit. It's expensive, and I can't speak for its maturity, and it may or may not scale as well as the projected Slony-II design, but the setup seems dead simple, and from the docs I have found it seems to transparently replicate schema changes, unlike Slony-I. So that's something.

Alexander.
Alexander Staubo wrote:
> On 6/1/07, gonzales@linuxlouis.net <gonzales@linuxlouis.net> wrote:
>> I'm disappointed because SLONY-II has not been released yet to support multi-master replication!
>
> I wouldn't pin all my hopes on a project still under development. (For me, personally, add the fact that Slony-I still has not solved single-master replication in a way that doesn't burden the developer/DBA with lots of unnecessary extra maintenance; I am not counting on its developers to fix this issue in Slony-II.)
>
> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian company) just announced a commercial synchronous multimaster replication product based on 2-phase commit. It's expensive, and I can't speak for its maturity, and it may or may not scale as well as the projected Slony-II design, but the setup seems dead simple, and from the docs I have found it seems to transparently replicate schema changes, unlike Slony-I. So that's something.

I could be completely cranked, but I believe that product is based on PgCluster, which is horrendously slow. To be fair, it is still under heavy development and does show promise.

Joshua D. Drake

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian company) just announced a commercial synchronous multimaster replication product based on 2-phase commit. It's expensive, and I
[snip]
> I could be completely cranked, but I believe that product is based on PgCluster, which is horrendously slow.

Well, dang, that's disappointing. Last I checked, the PGCluster design was fundamentally unscalable.

Alexander.
Alexander Staubo wrote:
> On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>>> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian company) just announced a commercial synchronous multimaster replication product based on 2-phase commit. It's expensive, and I
> [snip]
>> I could be completely cranked, but I believe that product is based on PgCluster, which is horrendously slow.
>
> Well, dang, that's disappointing. Last I checked, the PGCluster design was fundamentally unscalable.

Multimaster replication generally is - that's why Slony-2 will almost certainly never exist in the form that it was originally imagined. Although I'm not (and never have been) an Oracle user, I've heard that RAC has its own issues in this area as well.

Regards, Dave
Dave Page wrote:
> Alexander Staubo wrote:
>> On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>>>> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian company) just announced a commercial synchronous multimaster replication product based on 2-phase commit. It's expensive, and I
>> [snip]
>>> I could be completely cranked, but I believe that product is based on PgCluster, which is horrendously slow.
>> Well, dang, that's disappointing. Last I checked, the PGCluster design was fundamentally unscalable.
>
> Multimaster replication generally is - that's why Slony-2 will almost certainly never exist in the form that it was originally imagined. Although I'm not (and never have been) an Oracle user, I've heard that RAC has its own issues in this area as well.

IMO, the future is application partitioning, not multi-master.

Also, what I find interesting here is that PostgreSQL on modest hardware does exceedingly well.

I have a client right now that is running an 8-core, 16-gig box with only 14 spindles. They are processing 6ktps, and it takes 14 Tomcat servers to bring the database down. Any multi-master solution is going to fall over well before 6ktps.

Sincerely,

Joshua D. Drake

P.S. I should note that we are working toward more than 6ktps and will report back ;)

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
It does so well because it KICKS ICE!

On Fri, 1 Jun 2007, Joshua D. Drake wrote:
> Dave Page wrote:
>> Alexander Staubo wrote:
>>> On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
>>>>> In the meantime, Cybertec (http://www.postgresql.at/, an Austrian company) just announced a commercial synchronous multimaster replication product based on 2-phase commit. It's expensive, and I
>>> [snip]
>>>> I could be completely cranked, but I believe that product is based on PgCluster, which is horrendously slow.
>>> Well, dang, that's disappointing. Last I checked, the PGCluster design was fundamentally unscalable.
>>
>> Multimaster replication generally is - that's why Slony-2 will almost certainly never exist in the form that it was originally imagined. Although I'm not (and never have been) an Oracle user, I've heard that RAC has its own issues in this area as well.
>
> IMO, the future is application partitioning, not multi-master.
>
> Also, what I find interesting here is that PostgreSQL on modest hardware does exceedingly well.
>
> I have a client right now that is running an 8-core, 16-gig box with only 14 spindles. They are processing 6ktps, and it takes 14 Tomcat servers to bring the database down. Any multi-master solution is going to fall over well before 6ktps.
>
> Sincerely,
>
> Joshua D. Drake
>
> P.S. I should note that we are working toward more than 6ktps and will report back ;)

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net
Guillaume Lelarge wrote:
> I asked them some time ago. They answered me this:
>
> As for SQL Manager for PostgreSQL - we regret to inform you that the development and support of Linux editions of EMS software products has become impossible now that Borland no longer supports Kylix libraries for Delphi, on which all Linux versions of EMS software were based. Hence, EMS has made a decision to discontinue Linux versions of its products. Linux products will not be available for sale or download anymore.

Just a FYI: their win32 versions will probably work fine on Linux via WINE. Lightning Admin does anyway (work via WINE), and they program their products with Delphi as well, so I think it would work fine.

Just one word of advice about WINE: make sure you have the core MS TrueType fonts installed, or the win32 apps will look funny, especially when using editors.

Later,

--
Tony Caduto
AM Software Design
http://www.amsoftwaredesign.com
Home of PG Lightning Admin for Postgresql
Your best option for Postgresql Administration
As an aside -- please don't start new topics in old threads.

On Fri, Jun 01, 2007 at 10:42:02AM -0400, gonzales@linuxlouis.net wrote:
>
> I'm disappointed because SLONY-II has not been released yet to support multi-master replication!

Well, I wouldn't hold my breath. Most of the participants in that project moved on, after concluding either that it wasn't going to solve their problems, or concluding that it'd cost too much to develop and support for the likely benefit it would deliver. As near as I can tell, development on the project stopped.

The inspiration for the Slony-II project, Postgres-R, has been ported forward to the 8.x series by Markus Schiltknecht. Last I heard, he was looking for people to underwrite his work on that project. So if you really want those features, the obvious way to do it is to put a programmer on it, and there happens to be a programmer who has a demo as his argument that it can be done, and he can do it.

I think you have to understand, however, that Slony-II or Postgres-R was not in fact the magic carpet you seem to think it was going to be. There are some pretty significant limitations to the async multimaster approach it uses. To begin with, AFAIK nobody has a working, production-grade group communication system available for use by Postgres -- the ones that the prototypes were built on were pretty hacky, and appeared not to be ready for prime time. Second, nobody has come up with any way to make this work with READ COMMITTED mode, which means you pay a really huge price for the replication.

My real question in all this is, "What is the problem you are trying to solve?" Hot failover using combinations of hardware and software, and a disk array that can be mounted across two machines, is actually probably good enough for most cases, assuming it is implemented correctly (see recent discussion on this topic). So the availability piece is mostly solved. What else do you want?

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir? --attr. John Maynard Keynes
Let me clarify - I don't know how or where the idea arose that I need something, per se. I personally like deploying scaled solutions and playing with software that others have written. Can I deploy the right hardware and configurations to meet pretty much anything anyone could ever need? I say yes, I can. BUT it would be so much cooler IF there were a multi-master environment configured, for the sake of doing it.

I'm glad that not everyone has this disposition about 'why do we need it.' The pursuit of knowledge is the facilitator of innovation.

As I mentioned in my initial post, I've got OpenLDAP integrated with PostgreSQL and another application, which my friends and I are going to release soon as a service - we'll see how far that goes. Nonetheless, I would like to have a multi-master PostgreSQL cluster handling my backend content. Why? Because I'd like to. If you're happy not pursuing it, great, that works for you. Sweet! Killer! Rock on!

My takeaway is that now, besides knowing there's no active pursuit of the matter, perhaps others in this huge forum know more of the status, thanks to those of 'you' who are so up to date! In all sincerity, thank you for clarifying that status. There are still advanced users out here who would like to see an open-source product reach that level of stature, such that those who don't know - i.e. 'most' managers and people who have decision-making power - would not be deterred from using PostgreSQL because it doesn't have, quote unquote, 'multi-master' replication.

On Fri, 1 Jun 2007, Andrew Sullivan wrote:
> As an aside -- please don't start new topics in old threads.
>
> On Fri, Jun 01, 2007 at 10:42:02AM -0400, gonzales@linuxlouis.net wrote:
>>
>> I'm disappointed because SLONY-II has not been released yet to support multi-master replication!
>
> Well, I wouldn't hold my breath. Most of the participants in that project moved on, after concluding either that it wasn't going to solve their problems, or concluding that it'd cost too much to develop and support for the likely benefit it would deliver. As near as I can tell, development on the project stopped.
>
> The inspiration for the Slony-II project, Postgres-R, has been ported forward to the 8.x series by Markus Schiltknecht. Last I heard, he was looking for people to underwrite his work on that project. So if you really want those features, the obvious way to do it is to put a programmer on it, and there happens to be a programmer who has a demo as his argument that it can be done, and he can do it.
>
> I think you have to understand, however, that Slony-II or Postgres-R was not in fact the magic carpet you seem to think it was going to be. There are some pretty significant limitations to the async multimaster approach it uses. To begin with, AFAIK nobody has a working, production-grade group communication system available for use by Postgres -- the ones that the prototypes were built on were pretty hacky, and appeared not to be ready for prime time. Second, nobody has come up with any way to make this work with READ COMMITTED mode, which means you pay a really huge price for the replication.
>
> My real question in all this is, "What is the problem you are trying to solve?" Hot failover using combinations of hardware and software, and a disk array that can be mounted across two machines, is actually probably good enough for most cases, assuming it is implemented correctly (see recent discussion on this topic). So the availability piece is mostly solved. What else do you want?
>
> A

--
Louis Gonzales
louis.gonzales@linuxlouis.net
http://www.linuxlouis.net
On Fri, Jun 01, 2007 at 12:39:42PM -0400, gonzales@linuxlouis.net wrote:
> ever need? I say yes, I can. BUT it would be so much cooler IF there were a multi-master environment configured, for the sake of doing it.

Yes, it would be. Unfortunately, when I was in the position where I had to explain to my bosses that we still didn't have any demonstrable progress after a couple of years of effort and expense, they told me they weren't willing to continue funding that work. I presume other people had similar experiences with their bosses. Moreover, because we had something else that solved our _particular_ problem -- i.e. zero-transaction-loss failure recovery within our SLA -- it was hard to justify the continued work.

> I'm glad that not everyone has this disposition about 'why do we need it.' The pursuit of knowledge is the facilitator of innovation.

Others have argued that necessity is the mother of invention. I have no idea what causes better software in general; what I know is that, for companies that are likely to spend money to pay people to work on multimaster replication, a strong need and evidence of progress are two necessary conditions. Bruce mentioned (at PgCon in Ottawa) that one of the UCB people, in handing the project off, said that Postgres needed a few people with a lot of time rather than a lot of people with a little time. Multimaster replication is like that, and therefore finding someone who will work on it (either because they're paid to or because they are incapable of leaving the problem alone) is going to be the requirement to get it done.

> Why? Because I'd like to.

My point was really to ask, "How much do you want it?" If the answer to that is, "Enough to use it when someone else shows up with the goods," then I'm trying to suggest that that's not enough to cause the happy result you desire. Someone needs to do the work, which means that someone has to want the result badly enough to put the time and talent into getting the result.

> would not be deterred from using PostgreSQL because it doesn't have, quote unquote, 'multi-master' replication.

It is certainly true today that if what you really need is RAC, you have exactly one company to go to, and that's Oracle. If you need something else, there might in fact be a Postgres answer for you. Part of the problem here is that people often say "multimaster" without stating what they're asking for. Does MySQL have multimaster replication? Well, yes and no, depending on which set of features you might want to use and what data breakage you're willing to tolerate. What about SQL Server? See above. OK, what about DB2? Well, no, not really, but it scales well and you can do HA with it using hot-standby failover clusters at the OS level, if you believe their literature. There's also some component that sort of hangs on the side and does some data synchronisation between points, allowing something that looks like multimaster but sounds like it has some nasty gotchas (and I get that feeling just from reading the brochures; OTOH, I am not completely rational when it comes to claims made by IBM). You can in fact build a Postgres system that does the same things today, though.

These are all different solutions to different problems, so it's not surprising that they look different. This was the reason I asked, "What is the problem you are trying to solve?"

A

--
Andrew Sullivan | ajs@crankycanuck.ca
Unfortunately reformatting the Internet is a little more painful than reformatting your hard drive when it gets out of whack.
On Fri, 2007-06-01 at 17:00 +0200, Alexander Staubo wrote:
> the projected Slony-II design, but the setup seems dead simple, and from the docs I have found it seems to transparently replicate schema changes, unlike Slony-I. So that's something.

To be fair to Slony-I, the fact that it does not replicate DDL is a feature, not a bug. It's table-based, which is a very flexible design.

Regards,
Jeff Davis
On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> These are all different solutions to different problems, so it's not surprising that they look different. This was the reason I asked, "What is the problem you are trying to solve?"

You mean aside from the obvious one, scalability?

The database is becoming a bottleneck for a lot of so-called "Web 2.0" apps, which use a shared-nothing architecture (such as Rails, Django or PHP) in conjunction with a database. Lots of ad-hoc database queries come not just from web hits but also from somewhat awkwardly fitting an object model onto a relational database. These "new" apps are typically intensely personal and contextual, where every page is personalized for the visiting user, and doing a whole bunch of crazy multijoin queries to fetch the latest posts, the most recent recommendations from your friends, and the most highly rated stuff. In fact, merely doing something seemingly simple like incrementing a row's counter every time a post has been viewed is eventually going to have a negative performance impact on a traditional OLTP-optimized relational database.

I'm sure some people would disagree with the significance of the above (possibly by replying that a relational database is the wrong kind of tool for such apps), or that there is an urgent need to scale beyond the single server, but I would hope that there would, at some point, appear a solution that could enable a database to scale horizontally with minimal impact on the application. In light of this need, I think we could be more productive by rephrasing the question "how/when can we implement multimaster replication?" as "how/when can we implement horizontal scaling?".

As it stands today, horizontally partitioning a database into multiple separate "shards" is incredibly invasive on the application architecture, and typically relies on brittle and non-obvious hacks such as configuring sequence generators with staggered starting numbers, omitting referential integrity constraints, sacrificing transactional semantics, and moving query aggregation into the app level. On top of this, dumb caches such as Memcached are typically layered to avoid hitting the database in the first place.

Still, with MySQL and a bit of glue, guys like eBay, Flickr and MySpace are partitioning their databases relatively successfully using such tricks. These guys are not average database users, but they are not the only ones that have suffered from database bottlenecks and overcome them using clever, if desperate, measures. Cal Henderson (or was it Stewart Butterfield?) of Flickr has famously said he would never again start a project that didn't have partitioning from the start.

I would love to see a discussion about how PostgreSQL could address these issues.

Alexander.
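For concreteness, the "staggered starting numbers" hack mentioned above might look something like this in plain SQL - a minimal sketch assuming a two-shard split, with all object names hypothetical:

    -- On shard 1: hand out odd IDs
    CREATE SEQUENCE users_id_seq START WITH 1 INCREMENT BY 2;

    -- On shard 2: hand out even IDs
    CREATE SEQUENCE users_id_seq START WITH 2 INCREMENT BY 2;

    -- Each shard's table draws from its local sequence, so primary
    -- keys never collide when rows from both shards are combined.
    CREATE TABLE users (
        id   bigint PRIMARY KEY DEFAULT nextval('users_id_seq'),
        name text NOT NULL
    );

Adding a third shard later means renumbering every shard's increment, which is part of why the approach is called brittle.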
Alexander Staubo wrote:
> On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>> These are all different solutions to different problems, so it's not surprising that they look different. This was the reason I asked, "What is the problem you are trying to solve?"
>
> You mean aside from the obvious one, scalability?

Multimaster doesn't give you scalability (at least not like a lot of people think it does).

> The database is becoming a bottleneck for a lot of so-called "Web 2.0" apps, which use a shared-nothing architecture (such as Rails, Django or PHP) in conjunction with a database. Lots of ad-hoc database queries come not just from web hits but also from somewhat awkwardly fitting an object model onto a relational database.

Databases are a bottleneck when you get a bunch of so-called Web 2.0 developers thinking they know an inch about databases. What you are basically saying below is... Web 2.0 developers such as Rails developers have so fundamentally broken the way it is supposed to be done, we should too... Not too convincing.

> As it stands today, horizontally partitioning a database into multiple separate "shards" is incredibly invasive on the application architecture, and typically relies on brittle and non-obvious hacks such as configuring sequence generators with staggered starting numbers, omitting referential integrity constraints, sacrificing transactional semantics, and moving query aggregation into the app level. On top of this, dumb caches such as Memcached are typically layered to avoid hitting the database in the first place.
>
> Still, with MySQL and a bit of glue, guys like eBay, Flickr and MySpace are partitioning their databases relatively successfully using such tricks. These guys are not average database users, but they are not the only ones that have suffered from database bottlenecks and overcome them using clever, if desperate, measures. Cal Henderson (or was it Stewart Butterfield?) of Flickr has famously said he would never again start a project that didn't have partitioning from the start.
>
> I would love to see a discussion about how PostgreSQL could address these issues.

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On 6/1/07, Jeff Davis <pgsql@j-davis.com> wrote:
> On Fri, 2007-06-01 at 17:00 +0200, Alexander Staubo wrote:
>> the projected Slony-II design, but the setup seems dead simple, and from the docs I have found it seems to transparently replicate schema changes, unlike Slony-I. So that's something.
>
> To be fair to Slony-I, the fact that it does not replicate DDL is a feature, not a bug. It's table-based, which is a very flexible design.

I fail to see how that's an excuse not to replicate DDL. If I run "alter table" on the master, there is no reason whatever that this command cannot be executed on all the slaves -- which is what I would expect of a replication system.

To put it differently: a slave's table is a replica of the master's table; if I alter the master table, and the slave is not updated to reflect this change, then the slave table is no longer a true replica, and the system has failed its core purpose, that of *replicating*.

I could be wrong, but I believe Slony fails at this because it is trigger-based and simply cannot detect DDL changes.

Alexander.
Alexander Staubo wrote:
> On 6/1/07, Jeff Davis <pgsql@j-davis.com> wrote:
>> On Fri, 2007-06-01 at 17:00 +0200, Alexander Staubo wrote:
>>> the projected Slony-II design, but the setup seems dead simple, and from the docs I have found it seems to transparently replicate schema changes, unlike Slony-I. So that's something.
>>
>> To be fair to Slony-I, the fact that it does not replicate DDL is a feature, not a bug. It's table-based, which is a very flexible design.
>
> I fail to see how that's an excuse not to replicate DDL. If I run "alter table" on the master, there is no reason whatever that this command cannot be executed on all the slaves -- which is what I would expect of a replication system.

As the owner of a company that is actively developing a replication system, and has been for years... I suggest you start putting your code where your words are. This is not nearly as simple as it seems. There is a reason that Slony attempts to do it in user space instead of in PostgreSQL space.

Joshua D. Drake

> To put it differently: a slave's table is a replica of the master's table; if I alter the master table, and the slave is not updated to reflect this change, then the slave table is no longer a true replica, and the system has failed its core purpose, that of *replicating*.
>
> I could be wrong, but I believe Slony fails at this because it is trigger-based and simply cannot detect DDL changes.

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
On Fri, Jun 01, 2007 at 08:57:36PM +0200, Alexander Staubo wrote:
> I fail to see how that's an excuse not to replicate DDL. If I run "alter table" on the master, there is no reason whatever that this command cannot be executed on all the slaves -- which is what I would expect of a replication system.

There is a way to replicate the DDL. A well-documented way. It's sort of ugly, I think everyone will admit, but it was the path chosen because it allowed the entire thing to fit in user space, which meant it was possible to install it on an unpatched PostgreSQL that was already deployed in the field. That's a non-zero benefit.

> I could be wrong, but I believe Slony fails at this because it is trigger-based and simply cannot detect DDL changes.

No, there were in fact alternatives (like, for instance, patching the back-end code). But that was undesirable for the reason I note above.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-garde will probably become the textbook definition of Postmodernism. --Brad Holland
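For readers unfamiliar with it, the well-documented way referenced here is Slony's EXECUTE SCRIPT command, driven through slonik. A rough sketch for a Slony-I 1.2-era setup - the cluster name, conninfo, set and node numbers, and the DDL file are all hypothetical:

    slonik <<_EOF_
    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=app host=master user=slony';

    # Run the DDL file through the replication system so that every
    # node applies it at the same point in the replication stream.
    execute script (
        set id = 1,
        filename = '/tmp/add_email_column.sql',
        event node = 1
    );
    _EOF_

This is the "sort of ugly" part: DDL must go through slonik rather than being issued directly to the master with plain psql.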
On Fri, Jun 01, 2007 at 11:50:09AM -0700, Joshua D. Drake wrote:
> What you are basically saying below is... Web 2.0 developers such as Rails developers have so fundamentally broken the way it is supposed to be done, we should too...

I don't know that that's all there is to it. After all, we support arrays in the database, which ought to make any relational purist shudder. It seems to me that there _are_ legitimate limits to single-back-end approaches, even for well designed systems, and being able to spread some of that load around would be a big advantage. I just don't want to do it without thinking hard about what the compromises might be, and figuring out which ones to make. "Breaks transactional model", for instance, is right out :)

A

--
Andrew Sullivan | ajs@crankycanuck.ca
I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin
On Fri, Jun 01, 2007 at 08:40:13PM +0200, Alexander Staubo wrote:
> On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>> These are all different solutions to different problems, so it's not surprising that they look different. This was the reason I asked, "What is the problem you are trying to solve?"
>
> You mean aside from the obvious one, scalability?

Why is that the "obvious one"? If that's your problem, say so. I have different problems. I don't need 30 back-end machines to keep my website running. Something obvious in one context is a misfeature of pointless complication in another.

> appear a solution that could enable a database to scale horizontally with minimal impact on the application. In light of this need, I think we could be more productive by rephrasing the question "how/when can we implement multimaster replication?" as "how/when can we implement horizontal scaling?".

Indeed, this may well be a different problem. In fact, if what you want is "to scale horizontally with minimal impact on the application", I encourage you to go out and buy the first database replication system that will actually do that for you. Not the one that _tells_ you they can, the one that actually does. I agree that horizontal scaling is a desirable feature, but I don't think it obvious that multimaster replication, whatever that means, is the thing that will solve that problem.

> I would love to see a discussion about how PostgreSQL could address these issues.

Well, a good start would be to list what exactly you do and do not mean by horizontal scaling: what is the behaviour under various scenarios. That's a good way to list at least what the problem is. (Your mail was a good start, but only a start. Is RI required across nodes? Why not? Why? What is allowed to break? &c.)

A

--
Andrew Sullivan | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
"Alexander Staubo" <alex@purefiction.net> writes: > On 6/1/07, Jeff Davis <pgsql@j-davis.com> wrote: >> To be fair to Slony-I, the fact that it does not replicate DDL is a >> feature, not a bug. It's table-based, which is a very flexible design. > I fail to see how that's an excuse not to replicate DDL. > I could be wrong, but I believe Slony fails at this because it is > trigger-based and simply cannot detect DDL changes. You are wrong. The Slony guys say this is intentional, and they have some good arguments. They may be making a virtue of necessity, but automatic replication of DDL is not nearly as open-and-shut a decision as you paint it. regards, tom lane
Andrew Sullivan wrote:
> On Fri, Jun 01, 2007 at 11:50:09AM -0700, Joshua D. Drake wrote:
>> What you are basically saying below is... Web 2.0 developers such as Rails developers have so fundamentally broken the way it is supposed to be done, we should too...
>
> I don't know that that's all there is to it. After all, we support arrays in the database, which ought to make any relational purist shudder. It seems to me that there _are_ legitimate limits to single-back-end approaches, even for well designed systems, and being able to spread some of that load around would be a big advantage.

I am not arguing against balancing load. I am arguing against the idea that multi-master is some holy grail.

Sincerely,

Joshua D. Drake

> I just don't want to do it without thinking hard about what the compromises might be, and figuring out which ones to make. "Breaks transactional model", for instance, is right out :)
>
> A

--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
gonzales@linuxlouis.net writes:
> I'm disappointed because SLONY-II has not been released yet to support multi-master replication! PostgreSQL is going through all of its releases - and that's great - BUT where is the sync-up with the powerhouse of a component that Slony-II would bring to the table? Slony-I is pretty sweet, but if Slony-II were released, I can imagine that this would introduce some major competition in the enterprise world against the commercial dynos.

There is some effort still going into Postgres-R from a research perspective, but as far as I know, nobody has been working on Slony-II for well over a year now.

Unfortunately, a combination of factors went together to make it "not workable":

- Spread licensing was something of an issue;
- Spread scalability caused quite a few issues (it's apparently tough to make it stable when using it under HEAVY load);
- There was a token-passing bottleneck in Spread limiting its performance;
- New application failure scenarios emerged that wouldn't have taken place in non-MM systems.

The issues built to the point of making it unworthwhile to continue development effort :-(.

If someone completed a suitable reimplementation of the wheel on GCS, and produced something more usable for the purpose, such as the (theorized) Anasazi system, it might be worth proceeding again. http://www.lethargy.org/~jesus/archives/53-Lets-reimplement-the-wheel...-or-at-least-another-GCS..html

But it's fair to say that reality did not live up to the early hopes. Supposing we came up with a Wicked Better GCS, that might merely allow efforts to get a bit further, and then hang up on something else. Don't hold your breath expecting Slony-II to be around any corners...

--
let name="cbbrowne" and tld="acm.org" in name ^ "@" ^ tld;;
http://linuxfinances.info/info/sap.html
"It seems certain that much of the success of Unix follows from the readability, modifiability, and portability of its software." -- Dennis M. Ritchie, September, 1979
alex@purefiction.net ("Alexander Staubo") writes: > On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote: >> These are all different solutions to different problems, so it's not >> surprising that they look different. This was the reason I asked, >> "What is the problem you are trying to solve?" > > You mean aside from the obvious one, scalability? I'd have to call that expectation "obviously WRONG." It was *CERTAIN* that Slony-II, if it had turned out as good as the *most optimistic hopes* were going, would have some substantial losses of performance compared to a single DB instance due to the need to apply locks across all nodes. There would be *some* scalability gains to be had, but the primary reason for looking for multimaster replication is that you need high availability so badly that you are willing to give up performance to get it. > As it stands today, horizontally partitioning a database into multiple > separate "shards" is incredibly invasive on the application > architecture, and typically relies on brittle and non-obvious hacks > such as configuring sequence generators with staggered starting > numbers, omitting referential integrity constraints, sacrificing > transactional semantics, and moving query aggregation into the app > level. On top of this, dumb caches such as Memcached are typically > layered to avoid hitting the database in the first place. Question: In what way would you expect an attempt to do mostly-trying-to-be-transparent multimaster replication to help with these issues you're bringing up? Slony-II was trying to provide answers to various of those "non-obvious hacks"; various of those things point at areas where it would *have to* be somewhat slow. > Still, with MySQL and a bit of glue, guys like eBay, Flickr and > MySpace are partitioning their databases relatively successfully using > such tricks. These guys are not average database users, but not they > are not the only ones that have suffered from database bottlenecks and > overcome them using clever, if desperate, measures. Cal Henderson (or > was it Stewart Butterfield?) of Flickr has famously said he would > never again start a project that didn't have a partitioning from the > start. > > I would love to see a discussion about how PostgreSQL could address > these issues. Partitioning isn't multimaster replication; it's something worthy of having a discussion independent of anything about MMR. -- (reverse (concatenate 'string "ofni.sesabatadxunil" "@" "enworbbc")) http://linuxdatabases.info/info/sap.html "There are almost unlimited ways for making your programs more complicated or bizarre" -- Arthur Norman
On 06/01/07 11:16, Andrew Sullivan wrote:
[snip]
> My real question in all this is, "What is the problem you are trying to solve?" Hot failover using combinations of hardware and software, and a disk array that can be mounted across two machines, is actually probably good enough for most cases, assuming it is implemented correctly (see recent discussion on this topic). So the availability piece is mostly solved. What else do you want?

What you need are disk blocks to be mirrored to a machine at the DR site. Or "continuous PITR" to the DR machine.

How difficult would it be to modify the process (the postmaster?) that writes the xlogs(?) to tee them to a listening process across the cloud on the DR machine, which then applies them to the DR database?

This then begs the question: are CREATE|ALTER TABLESPACE commands stored in the xlogs?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
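Something close to that "tee" already exists via continuous WAL archiving (available since PostgreSQL 8.0), without patching the postmaster. A rough sketch - the host name and paths here are hypothetical, and the retry/standby-loop logic is omitted:

    # postgresql.conf on the primary: ship each completed WAL segment
    # to the DR machine as soon as it is written out
    archive_command = 'rsync -a %p dr-host:/var/lib/pgsql/wal_archive/%f'

    # recovery.conf on the DR machine: replay segments as they arrive;
    # a real warm standby keeps recovery open and waits for the next file
    restore_command = 'cp /var/lib/pgsql/wal_archive/%f %p'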
On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote:
> As the owner of a company that is actively developing a replication system, and has been for years... I suggest you start putting your code where your words are.

That doesn't make any sense. As a database *user* it's my prerogative to criticize the bits that make my life painful. Intentional or not, the Slony design compromises its user-friendliness.

I would love for the answer to have been "sorry, we did not have time or manpower enough to implement fully transparent replication yet, because it's rather complex, you see"; but it's not, and I balk at the idea that you cannot strive for something better.

For example, there is clearly an opportunity to implement the appropriate hooks in PostgreSQL that can be used *if they are available*; otherwise, on unpatched/older systems, require the use of the slonik command.

Alexander.
On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
>> I could be wrong, but I believe Slony fails at this because it is trigger-based and simply cannot detect DDL changes.
>
> No, there were in fact alternatives (like, for instance, patching the back-end code). But that was undesirable for the reason I note above.

Curiously enough, that does not conflict with anything I wrote. I am, clearly, not wrong: a deliberate decision was made not to patch PostgreSQL with the hooks Slony would need to support DDL changes; therefore, since it relies purely on triggers, it cannot detect DDL changes.

Alexander.
On Fri, Jun 01, 2007 at 03:58:01PM -0500, Ron Johnson wrote:
> What you need are disk blocks to be mirrored to a machine at the DR site. Or "continuous PITR" to the DR machine.

I thought you could already do this? (I'm not, but I was pretty sure someone reported doing it already.)

> This then begs the question: are CREATE|ALTER TABLESPACE commands stored in the xlogs?

(I'll spare the rant about begging the question.) Since they're transactional, they must be, no?

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir? --attr. John Maynard Keynes
On Fri, Jun 01, 2007 at 11:08:50PM +0200, Alexander Staubo wrote:
> That doesn't make any sense. As a database *user* it's my prerogative to criticize the bits that make my life painful.

Sure. And as a user of free software, it is your prerogative to propose a way that the software can be modified to make it do what you want, and to do the work to so modify it. Does this mean you are offering?

> For example, there is clearly an opportunity to implement the appropriate hooks in PostgreSQL that can be used *if they are available*; otherwise, on unpatched/older systems, require the use of the slonik command.

I don't know that that is clear at all. To begin with, one would have to get agreement on what those hooks would be. If you look on the pgFoundry site, you'll note that I set up a project there to try to get such a list of hooks defined. It went nowhere: everyone who was working on replication said it was premature and impossible to do this in advance, and such like.

Moreover, what you are suggesting is a _massive_ increase in the complications of the code, because it suggests to me that you want DDL to happen as easily as it does in single-node cases. But it's not that easy, which is another part of the reason DDL is handled specially. If you don't know what the hard parts are, I suggest you go and read the rather detailed original concept document that Jan put together for the community prior to starting work on the system. But just as a teaser: what do you do if your DDL on the local node has succeeded, and you added additional data in the same transaction, but the DDL fails for some reason on a remote node? Note that this one isn't even one of the actually tricky cases.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are against all taxes for raising money to pay it off. --Alexander Hamilton
"Alexander Staubo" <alex@purefiction.net> writes: > I would love for the answer to have been "sorry, we did not have time > or manpower enough to implement fully transparent replication yet, > because it's a rather complex, you see"; Would you still love that if you're one of the people who use replication to move the data to a reporting database which has a modified schema appropriate for the different usage? This improvement would make it useless for that purpose. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
On 6/2/07, Gregory Stark <stark@enterprisedb.com> wrote:
> "Alexander Staubo" <alex@purefiction.net> writes:
>> I would love for the answer to have been "sorry, we did not have time or manpower enough to implement fully transparent replication yet, because it's rather complex, you see";
>
> Would you still love that if you're one of the people who use replication to move the data to a reporting database which has a modified schema appropriate for the different usage? This improvement would make it useless for that purpose.

All you would require is a simple boolean flag to enable or disable automatic DDL propagation, surely. Clearly people use replication for different purposes; the current system favours people who prefer to handle DDL propagation manually, and I am not one of them.

Alexander.
On 06/01/07 16:25, Andrew Sullivan wrote: > On Fri, Jun 01, 2007 at 03:58:01PM -0500, Ron Johnson wrote: >> What you need are disk blocks to be mirrored to a machine at the DR >> site. Or "continuous PITR" to the DR machine. > > I thought you could already do this? (I'm not, but I was pretty sure > someone reported doing it already.) From within PG, or "manually"? >> This then begs the question: are CREATE|ALTER TABLESPACE commands >> stored in the xlogs? > > (I'll spare the rant about begging the question.) Since they're > transactional, they must be, no? I wasn't sure whether they were a special case or not. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > On Fri, Jun 01, 2007 at 11:08:50PM +0200, Alexander Staubo wrote: > > That doesn't make any sense. As a database *user* it's my prerogative > > to criticize the bits that make my life painful. > > Sure. And as a user of free software, it is your prerogative to > propose a way that the software can be modified to make it do what > you want, and to do the work to so modify it. Does this mean you are > offering? It's my prerogative, but not my moral obligation. I have no intention of becoming a Slony developer. I use Slony because I have no choice, and I would not have touched it with a bargepole if I did not need to. (I do contribute to projects that I do enjoy using.) On the other hand, if my current project goes well, I hope that I will be able to persuade my partners to set aside some cash to fund improvements to PostgreSQL and/or Slony. > > For example, there is clearly an opportunity to implement the > > appropriate hooks in PostgreSQL that can be used *if they are > > available*; otherwise, on unpatched/older systems, require the use of > > the slonik command. > > I don't know that that is clear at all. [snip] Perhaps not, but all I wrote was that there is an *opportunity*, technical complexities aside, in an effort to provide *constructive* criticism. I said this based on your previous comment, that "the path [was] chosen because it allowed the entire thing to fit in user space, which meant it was possible to install it on an unpatched PostgreSQL", which implies that patching PostgreSQL is a possibility that may be considered. > To begin with, one would > have to get agreement on what those hooks would be. If you look on > the pgfoundry site, you'll note that I set up a project there to try > to get such a list of hooks defined. It went nowhere: [snip] For the record, I really appreciate the effort you made. I wasn't paying attention to pgsql-hackers at the time, so I missed the somewhat depressing discussion you had with Markus Schiltknecht. > Moreover, what you are suggesting is > a _massive_ increase in the complications of the code That, incidentally, is the kind of thing I want to hear, as opposed to "you are wrong", which is neither helpful nor polite. [snip] > the reason DDL is handled specially. If you don't know what the hard > parts are, I suggest you go and read the rather detailed original > concept document that Jan put together for the community prior to > starting work on the system. But just as a teaser: what do you do if > your DDL on the local node has succeeded, and you added additional > data in the same transaction, but the DDL fails for some reason on a > remote node? Note that this one isn't even one of the actually > tricky cases. Could you not (I ask naively) detect when the first DDL statement is submitted in a transaction on the master, then start a transaction on each slave, then funnel this and all subsequent statements synchronously to every node, then prepare and commit everyone? Mind you, I profess virtual ignorance of the numerous border cases involved, so do go ahead and tell me how wrong I am and how silly I am for having the balls to even suggest something like this. :) This suggestion implies transparent DDL replication would be synchronous, which seems like a decent compromise when you can ensure that all nodes commit atomically, or are at least virtually guaranteed to do so.
That would be an improvement on the current behaviour (section 15 of the Slony manual): "If there is anything broken about the script, or about how it executes on a particular node, this will cause the slon daemon for that node to panic and crash" Alexander.
Alexander Staubo wrote: > As it stands today, horizontally partitioning a database into multiple > separate "shards" is incredibly invasive on the application > architecture, and typically relies on brittle and non-obvious hacks > such as configuring sequence generators with staggered starting > numbers, omitting referential integrity constraints, sacrificing > transactional semantics, and moving query aggregation into the app > level. On top of this, dumb caches such as Memcached are typically > layered to avoid hitting the database in the first place. Have you looked at raidb? http://c-jdbc.objectweb.org. Narrow niche, but if it happens to be the one you are in, then it's an option. I took a quick look at the user's page, and both of them were using PostgreSQL. -- Guy Rouillier
On Sat, Jun 02, 2007 at 12:05:20AM +0200, Alexander Staubo wrote: > All you would require is a simple boolean flag to enable or disable > automatic DDL propagation, surely. You know, it is just possible that some of the responses you are getting in this thread have to do with the glib way you say "just a simple flag", waving away all the corner cases and difficult parts. What do you do, for instance, when your automatic DDL wedges the replication system after data-replicating events have come in? What do you do when there happens to be a schema mismatch that you didn't know about? (How do you even detect such a thing?) That isn't a SMOP: it requires design that is not trivial, and you don't seem to be spending any time thinking about those issues before brushing off the current design as some sort of nasty thoughtless attempt to make your life more difficult on the part of those who have worked on the system. DDL changes require that every node have the new schema before any of the node-affecting data gets there. We have _enough_ problems with DDL failing on target systems without increasing this problem tenfold by doing it automatically. A -- Andrew Sullivan | ajs@crankycanuck.ca The whole tendency of modern prose is away from concreteness. --George Orwell
On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote: > Could you not (I ask naively) detect the first DDL statement is > submitted in a transaction Maybe. > on the master, then start a transaction on > each slave, then funnel this and all subsequent statements > synchronously to every nodes, then prepare and commit everyone? You could if 2PC was ubiquitous, which it certainly wasn't when the code was designed (remember, it was originally compatible all the way back to 7.3). Some people suggested using 2PC "if it's there", but that just seems to me to be asking for really painful problems. It also entails that all DDL has to happen on every node at the same time, which imposes a bottleneck not actually currently in the system. It is probably the case, however, that version 2 of the system will break some of these backwards compatibility attempts in order to depend on some new back end features -- putting this entirely in user space turns out to be awful. It's how we got the monstrous catalog corruption hack. This is getting pretty Slony specific, though, so if we're to continue this thread, I suggest we do it on the Slony list. A -- Andrew Sullivan | ajs@crankycanuck.ca "The year's penultimate month" is not in truth a good way of saying November. --H.W. Fowler
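To make the 2PC idea concrete: since 8.1 PostgreSQL exposes two-phase commit at the SQL level (it requires max_prepared_transactions > 0). A minimal sketch of what "prepare and commit everyone" would look like on each node, with a made-up transaction identifier and hypothetical DDL; the coordinator logic that chooses between the last two commands is precisely the hard part under discussion:

    BEGIN;
    ALTER TABLE accounts ADD COLUMN note text;  -- hypothetical DDL to replicate
    PREPARE TRANSACTION 'ddl-20070602-1';
    -- after *every* node reports a successful prepare:
    COMMIT PREPARED 'ddl-20070602-1';
    -- if any node failed to prepare, instead:
    -- ROLLBACK PREPARED 'ddl-20070602-1';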
On Sat, 2007-06-02 at 00:05 +0200, Alexander Staubo wrote: > On 6/2/07, Gregory Stark <stark@enterprisedb.com> wrote: > > "Alexander Staubo" <alex@purefiction.net> writes: > > > > > I would love for the answer to have been "sorry, we did not have time > > > or manpower enough to implement fully transparent replication yet, > > > because it's a rather complex, you see"; > > > > Would you still love that if you're one of the people who use replication to > > move the data to a reporting database which has a modified schema appropriate > > for the different usage? This improvement would make it useless for that > > purpose. > > All you would require is a simple boolean flag to enable or disable > automatic DDL propagation, surely. Clearly people use replication for > different purposes; the current system favours people who prefer to > handle DDL propagation manually, and I am not one of them. Here is some work going on that looks like what you want: http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php http://momjian.postgresql.org/cgi-bin/pgtodo?pitr You might also seriously consider PgPool-II. There are many, many approaches to replication, high-availability, and related topics. For almost any combination of log-based, query-based, trigger-based and synchronous, asynchronous and master/master, master/slave and shared storage, local storage -- there is some PG person working on it. They all have advantages and disadvantages. The one, specific type of replication that suits you is not necessarily what everyone else wants. Try to understand that some of these options are built around business *need* rather than want. Businesses *need* functionality and flexibility. Administrative simplicity and conveniences (like replicated DDL) are obviously not the only goals of something like Slony-I. Regards, Jeff Davis
On 06/01/07 17:31, Andrew Sullivan wrote: > On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote: >> Could you not (I ask naively) detect the first DDL statement is >> submitted in a transaction > > Maybe. > >> on the master, then start a transaction on >> each slave, then funnel this and all subsequent statements >> synchronously to every nodes, then prepare and commit everyone? > > You could if 2PC was ubiquitous, which is certainly wasn't when the > code was designed (remember, it was originally compatible all the way > back to 7.3). Some people suggested using 2PC "if it's there", but > that just seems to me to be asking for really painful problems. It > also entails that all DDL has to happen on every node at the same > time, which imposes a bottleneck not actually currently in the > system. Since DDL is infrequent, is that bottleneck an acceptable trade-off? > It is probably the case, however, that version 2 of the system will > break some of these backwards compatibility attempts in order to > depend on some new back end features -- putting this entirely in user > space turns out to be awful. It's how we got the monstrous catalog > corruption hack. > > This is getting pretty Slony specific, though, so if we're to > continue this thread, I suggest we do it on the Slony list. -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
On 6/2/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote: > On Sat, Jun 02, 2007 at 12:05:20AM +0200, Alexander Staubo wrote: > > All you would require is a simple boolean flag to enable or disable > > automatic DDL propagation, surely. > > You know, it is just possible that some of the responses you are > getting in this thread have to do with the glib way you say "just a > simple flag", waving away all the corner cases and difficult parts. Or maybe I'm focused on the end-user experience and trying to mentally fit the technology to that idea rather than the other way around. Too much software is designed by developers, and too little software is designed for users. I don't mean to be glib at all -- the person was asking, in effect, "but this hypothetical feature X would break case Y", to which I suggested a flag to turn X on or off. PostgreSQL has plenty of flags that turn features on and off. Whether or not the hypothetical feature is hard to implement, or should be implemented, is an orthogonal concern. > What do you do, for instance, when your automatic DDL wedges the > replication system after data-replicating events have come in? There needs to be a point of synchronization when a DDL transaction appears that blocks further write transactions from running. As far as I can tell, the slaves themselves can continue to receive pending events, but perhaps not. As far as I can tell, table locking on the master ensures that concurrent, not-yet-committed transactions started *before* the DDL transaction will block the DDL itself, but I'm sure there's a gap here that could lead to craziness; it seems to me that you could detect it reliably and throw an error when it happens. I admit I am at a disadvantage here. I never intended to enter into a full-fledged discussion about implementation details, but you teased me, dammit. I clearly don't have enough knowledge about the way Slony works, but that does not disqualify me from suggesting that the current behaviour is unfriendly. Certainly there are technical and historic facts that explain this behaviour, but I see nothing wrong with challenging those notions in the name of helping the situation, at least according to my own aesthetics. Last I checked, nobody was actually terribly *happy* about having to pipe schema changes through slonik. Of course, as far as I can see (especially with regard to this thread), nobody seems to mind all that much, either, but then again people put up with a lot of crap before we had, say, sliced bread or psql tab completion. > [...] before brushing off > the current design as some sort of nasty thoughtless attempt to make > your life more difficult on the part of those who have worked on the > system. I'm not that paranoid. :) Alexander.
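The locking behaviour mentioned here can be seen directly with two psql sessions and a hypothetical table: ALTER TABLE needs an ACCESS EXCLUSIVE lock, so it queues behind any uncommitted writer.

    -- session 1:
    BEGIN;
    INSERT INTO foo VALUES (1);            -- takes a lock on foo until commit
    -- session 2:
    ALTER TABLE foo ADD COLUMN note text;  -- blocks, waiting for ACCESS EXCLUSIVE
    -- session 1:
    COMMIT;                                -- now session 2's ALTER proceeds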
Ron Johnson wrote: > On 06/01/07 17:31, Andrew Sullivan wrote: >> On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote: >>> Could you not (I ask naively) detect the first DDL statement is >>> submitted in a transaction >> >> Maybe. >> >>> on the master, then start a transaction on >>> each slave, then funnel this and all subsequent statements >>> synchronously to every nodes, then prepare and commit everyone? >> >> You could if 2PC was ubiquitous, which is certainly wasn't when the >> code was designed (remember, it was originally compatible all the way >> back to 7.3). Some people suggested using 2PC "if it's there", but >> that just seems to me to be asking for really painful problems. It >> also entails that all DDL has to happen on every node at the same >> time, which imposes a bottleneck not actually currently in the >> system. > > Since DDL is infrequent, is that bottleneck an acceptable trade-off? Define infrequent? I have customers that do it every day in prod. They do it willingly and refuse to change that habit. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
On 6/2/07, Jeff Davis <pgsql@j-davis.com> wrote: > Here is some work going on that looks like what you want: > > http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php I had no idea someone was working on WAL-log-based replication; I saw the TODO entry a while ago, but I missed the thread. I think WAL replication is a beautiful idea, so I'll gladly throw my support behind this. Thanks for the pointer. > You might also seriously consider PgPool-II. pgpool-II seems like a decent idea. I'm not sure if the partitioning can support referential integrity though -- would they have to be declared as CHECK constraints that used dblink()? Also, it doesn't seem capable of planning a query intelligently, which means that a query such as "select * from foo where id = 123" is going to be aggregated across all nodes even though only one node has the partition covering the id range [0, 1000], say. Alexander.
On 06/01/07 18:35, Joshua D. Drake wrote: > Ron Johnson wrote: >> On 06/01/07 17:31, Andrew Sullivan wrote: >>> On Sat, Jun 02, 2007 at 12:23:44AM +0200, Alexander Staubo wrote: >>>> Could you not (I ask naively) detect the first DDL statement is >>>> submitted in a transaction >>> >>> Maybe. >>> >>>> on the master, then start a transaction on >>>> each slave, then funnel this and all subsequent statements >>>> synchronously to every nodes, then prepare and commit everyone? >>> >>> You could if 2PC was ubiquitous, which is certainly wasn't when the >>> code was designed (remember, it was originally compatible all the way >>> back to 7.3). Some people suggested using 2PC "if it's there", but >>> that just seems to me to be asking for really painful problems. It >>> also entails that all DDL has to happen on every node at the same >>> time, which imposes a bottleneck not actually currently in the >>> system. >> >> Since DDL is infrequent, is that bottleneck an acceptable trade-off? > > Define infrequent? I have customers that do it, everyday in prod. They > do it willingly and refuse to change that habit. Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per day is a drop in the bucket compared to the number of I/U/D statements, no? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
Ron Johnson wrote: > On 06/01/07 18:35, Joshua D. Drake wrote: >>> >>> Since DDL is infrequent, is that bottleneck an acceptable trade-off? >> >> Define infrequent? I have customers that do it, everyday in prod. They >> do it willingly and refuse to change that habit. > > Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per > day is a drop in the bucket compared to the number of I/U/D statements, no? True. J -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/
On 6/1/07, Joshua D. Drake <jd@commandprompt.com> wrote: > Alexander Staubo wrote: > > You mean aside from the obvious one, scalability? > > Multimaster doesn't give you scalability (at least not like a lot of > people think it does). That depends on your particular definition of multimaster. > Databases are a bottleneck when you get a bunch of so called web 2.0 > developers thinking they know an inch about databases. That's a fairly bizarre and disrespectful blanket statement that includes some good friends of mine, and probably many of the people on this list. I wonder what you mean by it; regardless of the type of application, a single machine has a very real, very finite number of transactions it can process per second, and a finite number of spindles you can attach to it. No amount of handwaving about the incompetence of others changes that. Note that I am not talking about multimaster replication here, just scalability. Like Andrew Sullivan I think multimaster replication is infeasible by design, at least with a system such as PostgreSQL. > What you are basically saying below is... web 2.0 developers such as > rails developers have so fundamentally broken the way it is supposed to > be done, we should too... I don't know if I said that, but I would love to hear how they have broken it, and what you propose the solution to be. The eBay architecture is one interesting example, demonstrating a situation where no single box could possibly handle an entire database. Their requirements are extreme, but there are much smaller datasets and applications with similar performance characteristics that easily saturate a single box, so at a high level the same principles apply. http://glinden.blogspot.com/2006/12/talk-on-ebay-architecture.html The MySpace scaling story is similarly interesting, if mostly because of the egregious blunders made along the way: http://www.baselinemag.com/article2/0,1540,2082921,00.asp You could argue that they solved the problems in an inelegant, irrelational way, but it seems that they solved it, and this is reflected in their balance sheets, which in the end is probably the most appropriate metric of success. Alexander.
On 6/2/07, Guy Rouillier <guyr-ml1@burntmail.com> wrote: > Have you looked at raidb? http://c-jdbc.objectweb.org. Narrow niche, > but if it happens to be the one you are in, then it's an option. I took > a quick look at the user's page, and both of them were using PostgreSQL. Looked at it briefly. C-JDBC is called Sequoia now. It's probably pretty good, but it's Java, and using JDBC from Ruby is a bit awkward; we're not really ready to use JRuby. Alexander.
On Sat, 2007-06-02 at 01:44 +0200, Alexander Staubo wrote: > On 6/2/07, Jeff Davis <pgsql@j-davis.com> wrote: > > Here is some work going on that looks like what you want: > > > > http://archives.postgresql.org/pgsql-hackers/2007-03/msg00050.php > > I had no idea someone was working on WAL-log-based replication; I saw > the TODO entry a while ago, but I missed the thread. I think WAL > replication is a beautiful idea, so I'll gladly throw my support > behind this. Thanks for the pointer. I think that project would add some great functionality to postgres. > > You might also seriously consider PgPool-II. > > pgpool-II seems like a decent idea. I'm not sure if the partitioning > can support referential integrity though -- would they have to be > declared as CHECK constraints that used dblink()? You shouldn't use a volatile function in a check constraint. Use a trigger instead, but even that is unlikely to work for enforcing constraints correctly. In general, for partitioning, you have to make some sacrifices. It's very challenging (and/or expensive) to ensure uniqueness across partitions. > Also, it doesn't seem capable of planning a query intelligently, which > means that a query such as "select * from foo where id = 123" is going > to be aggregated across all nodes even though only one node has the > partition covering the id range [0, 1000], say. Take a closer look. There is quite a lot of intelligence in there. The query you mention will be rewritten to send simple WHERE clauses to the underlying partitions. That, combined with constraint exclusion (CE), will ensure that the only partitions scanned are those not excluded by the partition's predicate (i.e. it will only scan one partition if it's partitioned based on "id"). Actually, I'm not sure whether it relies on CE or not, but the point is that it won't scan all the partitions. Also, this means it could use an index scan on that underlying partition, which is crucial. Regards, Jeff Davis
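For readers unfamiliar with constraint exclusion, here is a minimal single-server illustration of the mechanism described above (hypothetical table names; pgpool-II's query rewriting is a separate layer on top). The CHECK constraints describe each partition's range, so the planner can prove which children cannot match and skip them:

    CREATE TABLE foo (id integer, payload text);
    CREATE TABLE foo_low  (CHECK (id >= 0    AND id < 1000)) INHERITS (foo);
    CREATE TABLE foo_high (CHECK (id >= 1000 AND id < 2000)) INHERITS (foo);
    CREATE INDEX foo_low_id  ON foo_low (id);
    CREATE INDEX foo_high_id ON foo_high (id);

    SET constraint_exclusion = on;
    -- The plan scans foo and foo_low only; foo_high's CHECK rules out id = 123,
    -- and the surviving child can be reached through its index.
    EXPLAIN SELECT * FROM foo WHERE id = 123;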
On 6/1/07, Chris Browne <cbbrowne@acm.org> wrote: > There would be *some* scalability gains to be had, but the primary > reason for looking for multimaster replication is that you need high > availability so badly that you are willing to give up performance to > get it. ...dependent on some specific definition of "multimaster". I note that the MySQL people happily use "multimaster replication" to mean a master/slave setup where masters are also slaves, and conflicts are supposedly handled by assuming that the user always assigns unique IDs to new rows. That's not necessarily my definition. > > As it stands today, horizontally partitioning a database into multiple > > separate "shards" is incredibly invasive on the application > > architecture, and typically relies on brittle and non-obvious hacks > > such as configuring sequence generators with staggered starting > > numbers, omitting referential integrity constraints, sacrificing > > transactional semantics, and moving query aggregation into the app > > level. On top of this, dumb caches such as Memcached are typically > > layered to avoid hitting the database in the first place. > > Question: In what way would you expect an attempt to do > mostly-trying-to-be-transparent multimaster replication to help with > these issues you're bringing up? I don't expect anything. What I said was: "I think we could be more productive by rephrasing the question 'how/when we can implement multimaster replication?' as 'how/when can we implement horizontal scaling?'". Which means we're in agreement when you say: > Partitioning isn't multimaster replication; it's something worthy of > having a discussion independent of anything about MMR. Alexander.
On 06/01/07 19:17, Joshua D. Drake wrote: > Ron Johnson wrote: >> On 06/01/07 18:35, Joshua D. Drake wrote: >>>> >>>> Since DDL is infrequent, is that bottleneck an acceptable trade-off? >>> >>> Define infrequent? I have customers that do it, everyday in prod. >>> They do it willingly and refuse to change that habit. >> >> Even 2 or 3 ALTER TABLE or CREATE INDEX or CREATE TABLE statements per >> day is a drop in the bucket compared to the number of I/U/D >> statements, no? > > True. So Alexander Staubo's idea of synchronous DDL replication via 2PC has some merit? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
On 06/01/07 19:29, Jeff Davis wrote: [snip] > You shouldn't use a volatile function in a check constraint. Use a > trigger instead, but even that is unlikely to work for enforcing > constraints correctly. > > In general, for partitioning, you have to make some sacrifices. It's > very challenging (and/or expensive) to ensure uniqueness across > partitions. Are partitioned databases the same as federated databases? -- Ron Johnson, Jr. Jefferson LA USA Give a man a fish, and he eats for a day. Hit him with a fish, and he goes away for good!
On Fri, 1 Jun 2007, Ron Johnson wrote: > How difficult would it be to modify the process (the postmaster?) that writes > the xlogs(?) to tee them to a listening process across the cloud on the DR > machine, which then applies them to the DR database? On an 8.2 server, you can practically do this right now by aggressively using the pg_xlogfile_name_offset() function and piping the data over to somewhere else. See section 23.4.4 of http://www.postgresql.org/docs/8.2/static/warm-standby.html It's not quite putting a "tee" in the engine itself, but there's not a lot of practical difference if you're copying over the data only a moment after it appears with the offset function instead; it's just a different set of efficiency trade-offs (more polling the way it's implemented, but the base engine never gets held up waiting for network I/O to the DR box). P.S. it's not "the cloud" anymore, it's "the tubes". > This then begs the question: are CREATE|ALTER TABLESPACE commands stored in > the xlogs? See section 23.3.5 of http://www.postgresql.org/docs/8.2/static/continuous-archiving.html for information about what goes into the WAL when you do that. You'd need to implement a process that made sure to sync changes in the underlying filesystem before modifying a tablespace. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
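A sketch of the polling approach described above, assuming an 8.2 server (both functions are built in; the copying script around them is left to the reader). Each call reports the current WAL segment file and the byte offset within it, so a loop can ship whatever has appeared since its previous call:

    -- Name of the current WAL segment file, and how far into it we are:
    SELECT file_name, file_offset
      FROM pg_xlogfile_name_offset(pg_current_xlog_location());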
> Have you looked at raidb? http://c-jdbc.objectweb.org. Narrow niche, > but if it happens to be the one you are in, then it's an option. I took > a quick look at the user's page, and both of them were using PostgreSQL. I just love those Java guys. The world starts and ends with Java. How do you execute a psql script with that thing ? How do you actually use any non-java stuff with it ? The same features could be implemented in a connection pool like pgpool2...
On Sat, Jun 02, 2007 at 02:27:06AM +0200, Alexander Staubo wrote: > >What you are basically saying below is... web 2.0 developers such as > >rails developers have so fundamentally broken the way it is supposed to > >be done, we should too... > > I don't know if I said that, but I would love to hear how they have > broken it, and what you propose the solution to be. I don't know if it's a general problem, but I've been involved in a project using rails and it appears to have its own way of declaring the database. It presumes to handle referential integrity and uniqueness in the application code (!). Lo and behold, there are now some uniqueness violations and no-one knows why. There seems to be a general unwillingness to let the database check this because it's "slow". My proposal is: let databases do what they're good at: checking uniqueness, referential integrity and constraints, and let the user code deal with the actual work. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
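To make the point concrete, a minimal sketch with hypothetical tables: constraints declared this way are enforced by the server for every writer -- the rails app, a cron job, an ad-hoc psql session -- so there is no code path that can skip the check:

    CREATE TABLE users (
        id    serial PRIMARY KEY,
        email text NOT NULL UNIQUE      -- uniqueness enforced by the database
    );

    CREATE TABLE orders (
        id      serial PRIMARY KEY,
        user_id integer NOT NULL REFERENCES users(id)   -- referential integrity
    );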
Alexander Staubo wrote: > On 6/2/07, Guy Rouillier <guyr-ml1@burntmail.com> wrote: >> Have you looked at raidb? http://c-jdbc.objectweb.org. Narrow niche, >> but if it happens to be the one you are in, then it's an option. I took >> a quick look at the user's page, and both of them were using PostgreSQL. > > Looked at it briefly. C-JDBC is called Sequoia now. It's probably > pretty good, but it's Java, and using JDBC from Ruby is a bit awkward; > we're not really ready to use JRuby. > > Alexander. Check Carob; it includes an ODBC driver to connect to Sequoia.
On 6/2/07, Martijn van Oosterhout <kleptog@svana.org> wrote: > I don't know if it's a general problem, but I've been involved in a > using rails and it appears to have it's own way of declaring the > database. It presumes to handle referential integrity and uniqueness in > the application code (!). I think you've been misled. True, Rails/ActiveRecord does bear the signs of having been designed for MySQL/MyISAM, which has neither transactions nor referential integrity, but this does not mean that Rails does not support these constructs, or that Rails users don't use them. I value my data integrity, so all my relations have RI, unique constraints, null constraints, etc. as in any well-designed schema. Alexander.
On Fri, Jun 01, 2007 at 06:15:40PM -0500, Ron Johnson wrote: > > Since DDL is infrequent, is that bottleneck an acceptable trade-off? I don't know. We'd have to do the analysis. But it could be a problem. Look at it this way: if you have a replica that is, for instance, _always_ 30 minutes behind, as a sort of poor-person's fast-recovery PITR, then you lose that functionality if you have to perform DDL on the replica at the same time as on the origin, because you have to catch up first. A -- Andrew Sullivan | ajs@crankycanuck.ca "The year's penultimate month" is not in truth a good way of saying November. --H.W. Fowler
On Sat, Jun 02, 2007 at 01:30:53AM +0200, Alexander Staubo wrote: > There needs to be a point of synchronization when a DDL transaction > appears that blocks further write transactions from running. As far as > I can tell, the slaves themselves can continue to receive pending > events, but perhaps not. In order to do it automatically, you have to lock everyone, get all the events through, and then perform the DDL, and then come out of lock. Otherwise, what happens when you do DROP COLUMN? If it goes through ahead of data that ought to go into that column, you have just broken your cluster. I suppose you could figure out a way to work around this, but pretty soon you are building an artificial intelligence expert system with event-predicting capabilities. Such systems are not well known for their simplicity and ease of maintenance. > Last I checked, nobody was actually terribly *happy* about having to > pipe schema changes through slonik. Nobody would suggest it's the friendliest arrangement. But this is a field where the details really count, and therefore proposals to make it more friendly have to account for how that friendliness in a lot of cases doesn't lead to complete breakage in others. (I had to be exposed to the multimaster MS SQL stuff, years ago, and I have to say that it was great when it worked; but when things went south, boy did your life suck. Whether it is better now, I don't know.) A -- Andrew Sullivan | ajs@crankycanuck.ca Unfortunately reformatting the Internet is a little more painful than reformatting your hard drive when it gets out of whack. --Scott Morris
PFC wrote: > >> Have you looked at raidb? http://c-jdbc.objectweb.org. Narrow niche, >> but if it happens to be the one you are in, then it's an option. I >> took a quick look at the user's page, and both of them were using >> PostgreSQL. > > I just love those Java guys. > The world starts and ends with Java. > How do you execute a psql script with that thing ? How do you > actually use any non-java stuff with it ? That's why I said "Narrow niche, but if it happens to be the one you're in ...". I do find your response rather peculiar. The objectweb guys saw a need and filled it in the domain they work in - Java. Should the Java folks complain because Perl scripts on CPAN are not accessible from Java? -- Guy Rouillier
On 6/3/07, Ian Harding <harding.ian@gmail.com> wrote: > Rails propaganda explicitly proposes not "repeating yourself" and [...] > The creator of RoR explicitly discourages use of RI, rules, triggers, > etc in the database as unnecessary. His disciples take this as > gospel. The creator of Rails is not the only developer working on Rails. There are Rails developers who disagree with him on these issues, and "his disciples" does not equate to "the community". On the other hand, I admit that this mindset has a negative impact on the development of Rails as a whole. I consider myself a moderate pragmatist, and I think both sides are slightly wrong; the database side not pragmatic enough, and the Rails side pragmatic at the cost of moving too much database logic into the app. For example, part of the point of having validations declared on the model is so that you can raise user-friendly errors (and pipe them through gettext for localization) such as "Your password must be at least 4 characters long and contain only letters and digits". Databases don't support this, and so applications end up having to duplicate data-validation logic in order to find out what kind of user input is invalid. There might be hoops you could jump through to reduce the duplication. You could, perhaps, introspect the schema and see that the password column has a "check" constraint with a certain expression (eg., "password ~ '^\w+$'"). On insertion failure, you use the expression string to generate a select -- eg., "select password ~ '^\w+$' from (select 'foobar'::text as password) as t" -- on every column, to see which failed. I don't think PostgreSQL has full support for ANSI SQLSTATE column context information yet, but even if it had, I think you would not get information about *all* failing columns, only the first one (anyone know?), so you would still need to run the values through the database with a select. At this point you don't have an error message, but you could store these in a separate table (table_name, column_name, message) or perhaps (table_name, constraint_name, message) and look them up on failure. Another option might be to use triggers that call "raise" -- which may be acceptable for "check" constraints, but breaks the idiom for the other types of constraints; at best you will be repeating yourself. Another idea: SQLSTATE (as implemented by PostgreSQL) currently lacks context information about which columns failed a constraint, but if you had this, you could correlate failures directly with the offending columns. None of this is terribly idiomatic, and it involves a bunch of glue between application and database which needs to be incorporated into a database layer. This may be a case for letting constraints have an optional description; this way the schema would also be self-documenting, eg.:

    create table foo (
        id serial,
        name text check (name ~ '^(\w|\s)+$')
            or raise error 'Name must contain letters, digits and spaces only'
    );

This still isn't enough for the app side -- if multiple columns fail to validate, the app needs to know about them all, not just the first one:

    # create table persons (name text check (name != '') not null unique,
                            age int check (age > 0));
    # insert into persons (name, age) values ('', 0);
    ERROR:  new row for relation "persons" violates check constraint "persons_name_check"
> I have used rails for a couple projects and I repeated myself > religiously in database constraints. This revealed a couple > disturbing properties of rails that made me not want to use it, > notably, assuming an empty input box equates to an empty string vice > null for text data types, even if the table is defined as accepting > nulls. An empty string is not null! Null means the value is missing, which is clearly not the case here. I would say Rails is exactly in the right here. When an HTML form is posted, empty input boxes are declared as empty strings, which is what the user entered. The problem is not with Rails/ActiveRecord but with your form handling. If you want empty boxes to become null, add some client-side JavaScript logic that sets the "disabled" attribute on empty input elements before the form is submitted; this will prevent the client from sending the value. Alexander.
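On the "triggers that call raise" option mentioned above, a minimal sketch of how that reads in practice (hypothetical users table with a text password column, and it assumes plpgsql is installed; RAISE EXCEPTION is what carries the friendly string back to the client):

    CREATE FUNCTION check_password() RETURNS trigger AS $$
    BEGIN
        IF NEW.password !~ '^[A-Za-z0-9]+$' THEN
            RAISE EXCEPTION 'Password must contain only letters and digits';
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER users_password_check
        BEFORE INSERT OR UPDATE ON users
        FOR EACH ROW EXECUTE PROCEDURE check_password();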
On 6/2/07, Alexander Staubo <alex@purefiction.net> wrote: > On 6/2/07, Martijn van Oosterhout <kleptog@svana.org> wrote: > > I don't know if it's a general problem, but I've been involved in a > > using rails and it appears to have it's own way of declaring the > > database. It presumes to handle referential integrity and uniqueness in > > the application code (!). > > I think you've been misled. True, Rails/ActiveRecord does bear the > signs of having been designed for MySQL/MyISAM, which has neither > transactions nor referential integrity, but this does not mean that > Rails does not support these constructs, or that Rails users don't use > them. I value my data integrity, so all my relations have RI, unique > constraints, null constraints, etc. as in any well-designed schema. > Rails propaganda explicitly proposes not "repeating yourself" and since the RI constraints are defined in the rails models, and they are enforced in the framework with a graceful feedback mechanism for users, they implicitly denigrate defining constraints in the DB as "Repeating yourself." The creator of RoR explicitly discourages use of RI, rules, triggers, etc in the database as unnecessary. His disciples take this as gospel. I have used rails for a couple projects and I repeated myself religiously in database constraints. This revealed a couple disturbing properties of rails that made me not want to use it, notably, assuming an empty input box equates to an empty string vice null for text data types, even if the table is defined as accepting nulls. - Ian
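To make the empty-string-versus-NULL distinction concrete, and to show a server-side escape hatch for anyone who does want empty input stored as NULL, a quick demonstration (the INSERT is a hypothetical illustration):

    SELECT '' IS NULL AS empty_is_null,        -- f: an empty string is a value
           NULLIF('', '') IS NULL AS coerced;  -- t: NULLIF maps '' to NULL

    -- e.g. coercing a bound form value at insert time:
    -- INSERT INTO people (middle_name) VALUES (NULLIF($1, ''));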
On Sun, Jun 03, 2007 at 07:47:04PM +0200, Alexander Staubo wrote: > The creator of Rails is not the only developer working on Rails. There > are Rails developers who disagree with him on these issues, and "his > disciples" does not equate "the community". On the other hand, I admit > that this mindset has a negative impact on the development of Rails as > a whole. Indeed, it certainly left me with a bad taste in my mouth. > For example, part of the point of having validations declared on the > model is so that you can raise user-friendly errors (and pipe them > through gettext for localization) such as "Your password must be at > least 4 characters long and contain only letters and digits". > Databases don't support this, and so applications end up having to > duplicate data-validation logic in order to find out what kind of user > input is invalid. I think you're confusing validation and integrity constraints. The example you're giving could be implemented in either the DB or the app. Personally I'd do it in the app since it's something that doesn't affect the integrity of the data. If I go in and manually change someone's password to something not following that rule it's not going to affect anything. Integrity constraints are different: if you violate them your data has a serious problem. They are the assumptions of your model upon which everything depends. We're talking referential integrity and uniqueness. These things *cannot* be checked reliably in the app, and you shouldn't try. You assume the constraints are valid and feel confident they are, because the database has checked them for you. > This still isn't enough for the app side -- if multiple columns fail > to validate, the app needs to know about them all, not just the first > one: Basically, validation in this case is completely orthogonal to integrity checks. If the data being validated isn't crucial to the operation of the app, I wouldn't bother putting it in the database. But for fundamental integrity constraints, the database is the only place. I wish you success in your efforts to make rails a little more sane in this area. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
On 6/3/07, Martijn van Oosterhout <kleptog@svana.org> wrote: > > For example, part of the point of having validations declared on the > > model is so that you can raise user-friendly errors (and pipe them > > through gettext for localization) such as "Your password must be at > > least 4 characters long and contain only letters and digits". > > Databases don't support this, and so applications end up having to > > duplicate data-validation logic in order to find out what kind of user > > input is invalid. > > I think you're confusing validation and integrity constraints. The > example you're giving could be implemented in either the DB or the app. No, I was pointing out that Rails supports uniqueness and referential integrity, but that it implemented validations as a general construct in order to (among other things) provide user-friendly messages. But what I said also applies to uniqueness and foreign key constraints. Databases, including PostgreSQL, make it hard for an application to determine what part of the data failed when it did. You get an error for one arbitrary constraint, not all of them; and the error does not (as far as I know) actually identify the column that failed. > Personnaly I'd do it in the app since it's something that doesn't > affect the integrity of the data. If I go in and manually change > someone's password to something not following that rule it's not going > to affect anything. I agree with you and I don't; as it stands now, it's too hard to implement validation in the database alone, for the reasons I stated earlier. But I would love for it to be possible, so that I can be sure that not even plain SQL can screw up the data. Alexander.
L. Berger wrote: > On May 23, 8:47 am, "EMS Database Management Solutions > (SQLManager.net)" <sqlmana...@gmail.com> wrote: >> We, here at EMS Database Management Solutions, are pleased to announce >> SQL Manager 2007 for PostgreSQL - the new major version of the >> powerful PostgreSQL administration and development tool! >> >> You can download SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/download >> >> You can purchase SQL Manager 2007 for PostgreSQL at: http://www.sqlmanager.net/products/postgresql/manager/buy >> >> What's new in SQL Manager 2007 for PostgreSQL? > > <snip> > > > Thanks for this, but is there any plan to launch something like this > for use on Linux admin servers? Something that I could install on a > server, and perhaps work with a web interface? I would love some > recommendations. Maybe I'm a bit old fashioned, but I would never ever consider graphical frontends for a server. I mean, ssh -L and a local GUI client work very well here. Even web-based stuff seems dangerous (if it's not just for educational purposes). Regards Tino
> P.S. it's not the "the cloud" anymore, it's "the tubes". It was always tubes. The cloud was just a convenient simplification for the technically declined ;-) -- Scott Ribe scott_ribe@killerbytes.com http://www.killerbytes.com/ (303) 722-0567 voice
On Sun, 2007-06-03 at 22:54 +0200, Alexander Staubo wrote: > I agree with you and I don't; as it stands now, it's too hard to > implement validation in the database alone, for the reasons I stated > earlier. But I would love for it to be possible, so that I can be sure > that not even plain SQL can screw up the data. You're blurring the line between an RDBMS and an application. Application errors and database errors do not have a one-to-one mapping, although they do usually overlap. There are times when one database error maps onto several possible user-level errors; and when many database errors map onto the same user-level error; and when one database error does not cause any user-level error; and when something that is a user-level error might not have a matching constraint in the database at all. Trying to equate the two concepts is a bad idea. The application has much more information about the user and the context of the error that the database shouldn't have. For instance, the language that the user speaks might affect the error message. Or, there may be two possible user interface actions that result in the same constraint violation. For instance, if you have a two-column unique constraint, perhaps there is one interface to change one column and another to change the other. But you might want to return a different error to the user that makes sense in the context of which value they tried to change. A database error doesn't even always need to be propagated back to the user. If it did, there would be no need for SAVEPOINTs and nobody would use ISOLATION LEVEL SERIALIZABLE (not directly related to constraints, but it can cause an error just the same). Some user errors don't have a corresponding database constraint at all. For instance, how about a "re-type your password here" field? That should cause an error if it doesn't match the "password" field, but the database would have no matching constraint. Regards, Jeff Davis
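A small illustration of the SAVEPOINT point -- the application absorbs a constraint violation without surfacing it to the user or losing the surrounding transaction (reusing the hypothetical users table sketched earlier in the thread):

    BEGIN;
    INSERT INTO users (email) VALUES ('a@example.com');
    SAVEPOINT before_retry;
    INSERT INTO users (email) VALUES ('a@example.com');  -- unique violation
    ROLLBACK TO SAVEPOINT before_retry;  -- error absorbed; transaction still live
    COMMIT;                              -- the first insert persists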
On 6/4/07, Jeff Davis <pgsql@j-davis.com> wrote: > On Sun, 2007-06-03 at 22:54 +0200, Alexander Staubo wrote: > > I agree with you and I don't; as it stands now, it's too hard to > > implement validation in the database alone, for the reasons I stated > > earlier. But I would love for it to be possible, so that I can be sure > > that not even plain SQL can screw up the data. > > You're blurring the line between an RDBMS and an application. > Applications errors and database errors do not have a one-to-one > mapping, although they do usually overlap. True, and when they overlap you tend to want to describe the validation rules in one place, not two -- either the database or the app, not both. Relational databases have traditionally argued that these rules should be in the former, so that there's one layer through which every single change has to go. > There are times when one database error maps onto several possible > user-level errors; and when many database errors map onto the same > user-level error; and when one database error does not cause any > user-level error; and when something that is a user-level error might > not have a matching constraint in the database at all. Trying to equate > the two concepts is a bad idea. I agree. In my experience, however, the best kind of data model is the one that is immediately mappable to user-level concepts -- to human concepts. A "user" relation has attributes like "name", "birth_date", etc. If you manage to keep the model flat and friendly enough, you can map the attributes to forms and translate attribute-level errors directly to form error messages. In the cases where a user-level attribute is represented by a set of columns, or a referenced relation, or similar, you provide simple shims that translate between them. For example, you probably want to store date-time attributes as a single "timestamp with timezone" column, but offer two fields to the user, one for the date and one for the time. With Rails this kind of shim is simple:

    class User < ActiveRecord::Base
      ...
      validates_each :human_birth_date do |record, attr, value|
        record.errors.add(attr, "Bad date") unless MyDateParser.valid?(value)
      end

      def human_birth_date
        birth_datetime.strftime("%Y-%m-%d")
      end

      def human_birth_date=(date)
        year, month, day = MyDateParser.parse(date)
        self.birth_datetime = Time.local(year, month, day,
          birth_datetime.hour, birth_datetime.minute)
      end
    end

With a well-designed, normalized schema, mapping relations and their attributes to user input is very easy. I would argue that if mapping is a problem, your schema is probably to blame. > The application has much more information about the user and the context > of the error that the database shouldn't have. For instance, the > language that the user speaks might affect the error message. Localization is easily accomplished by piping the error message through gettext. > Some user errors don't have a corresponding database constriant at all. > For instance, how about a "re-type your password here" field? That > should cause an error if it doesn't match the "password" field, but the > database would have no matching constraint. That's a user-interface detail, and not a data model detail; a re-typed password has no database counterpart. I am speaking purely about invariant constraints on the data itself. Alexander.
On Mon, 2007-06-04 at 00:51 +0200, Alexander Staubo wrote: > True, and when they overlap you tend to want to describe the > validation errors in one place, not two -- either the database or the > app, not both. Relational databases have traditionally argued that > these rules should be in the former, so that there's one layer through > which every single change has to go. I disagree here. You often _do_ want to describe some types of errors twice. You check the same thing in different ways at different points in the code, and that isolates errors and assures developers that certain assumptions are safe. In the database world, it's particularly important to use declarative constraints. If developer A inserts bad data and developer B uses that bad data, it could compound the problem and yet remain invisible until the problem is almost impossible to debug. Constraints assure the developers that they are starting with some known state. Applications should check for inconsistencies when it makes sense, as well. Every important API that I can think of checks the input, and reports some kind of useful error when the assumptions are violated. Every system call has all sorts of useful error codes. For example, read() can return EBADF. Nobody thinks "Hey, I'll send a random integer for the file descriptor", but I'd be willing to bet that the error condition has been reached by accident before, and probably saved someone a lot of time versus just filling the target buffer with random bytes and returning success. I would argue it's more important in a database, because the error conditions can persist for a longer period of time and cause more damage in the process, but the idea is the same. > I agree. In my experience, however, the best kind of data model is the > one that is immediately mappable to user-level concepts -- to human > concepts. A "user" relation has attributes like "name", "birth_date", > etc. If you manage to keep the model flat and friendly enough, you can > map the attributes to forms and translate attribute-level errors > directly to form error messages. I think you're oversimplifying. What you say works when user input is a separate, contained, isolated chunk of data. In that case, any error is only related to the self-consistency of the input, and can easily be mapped back to a user-level error. However, it breaks down when you have constraints on the interrelationships between pieces of data. These interrelationships are what can be broken from multiple points in the application code, and there is no way to map backwards from the constraint violation to a specific user error. Hence, the application must translate. Try to imagine some of the complexities in a scheduling system, and what kind of constraints that might involve. Then, think about how some of the same constraints might be broken in very different ways. Time conflicts could come about either by participants overscheduling themselves, or by the event itself shifting in time such that some participants are overscheduled. Perhaps someone tries to sign up for an event that's already full, or perhaps the venue moves to a new location with a lower capacity. I can't think of any way to map backwards from the constraint violation to the user level error. You could probably imagine similar problems with an inventory system.
> > The application has much more information about the user and the context > > of the error that the database shouldn't have. For instance, the > > language that the user speaks might affect the error message. > Localization is easily accomplished by piping the error message through gettext. And what about the two-column unique index that can be violated from two different aspects of the UI? You only get one database error, but you really should present two different errors to the user. Any time that a constraint can be violated through two completely different paths, your one-to-one constraint-to-application-error mapping breaks down. The root of the issue is that the database knows that an inconsistent state has been reached; it does not know _how_, nor should it. The how might change as new code is added or perhaps as new bugs are introduced. Constraints in an RDBMS should be declarative, which is very important (you don't need to prove that a column is always in a unique state; you look, and it says it is unique). You can add procedural code to an RDBMS, but you can end up making your database your application that way. User-level errors are heavily dependent on _how_ the error occurred. The database doesn't know this, so the constraint violation message shouldn't presume to know how it happened. I'll use the analogy to read() again: who passes EBADF back to the user directly? Does that mean it's a bad API? No, it just means it had no idea why you passed it a bad file descriptor, but it knows it's bad, and it tells the caller. Similarly with exceptions in java/ruby/python: you shouldn't pass those exceptions back to the user. Regards, Jeff Davis
Alexander Staubo schrieb: > On 6/1/07, Andrew Sullivan <ajs@crankycanuck.ca> wrote: >> These are all different solutions to different problems, so it's not >> surprising that they look different. This was the reason I asked, >> "What is the problem you are trying to solve?" > > You mean aside from the obvious one, scalability? > > The database is becoming a bottleneck for a lot of so-called "Web > 2.0" apps which use a shared-nothing architecture (such as Rails, > Django or PHP) in conjunction with a database. Lots of ad-hoc database > queries that come not just from web hits but also from somewhat > awkwardly fitting an object model onto a relational database. ... > the single server, but I would hope that there would, at some point, > appear a solution that could enable a database to scale horizontally > with minimal impact on the application. In light of this need, I think > we could be more productive by rephrasing the question "how/when we > can implement multimaster replication?" as "how/when can we implement > horizontal scaling?". > > As it stands today, horizontally partitioning a database into multiple > separate "shards" is incredibly invasive on the application > architecture, and typically relies on brittle and non-obvious hacks > such as configuring sequence generators with staggered starting > numbers, omitting referential integrity constraints, sacrificing > transactional semantics, and moving query aggregation into the app > level. On top of this, dumb caches such as Memcached are typically Did you have a look at BizgresMPP? Especially for your shared-nothing approach it seems to be a better solution than just replicating everything. Regards Tino
On 6/4/07, Tino Wildenhain <tino@wildenhain.de> wrote: > Did you have a look at BizgresMPP? > > Especially for your shared-nothing approach it seems to be a better > solution than just replicating everything. I had completely forgotten about that one. Bizgres.org seems down at the moment, but looking at their whitepaper, the architecture looks similar to that of pgpool-II; from what I can see, you connect through a proxy that transparently partitions data across multiple PostgreSQL database instances, then queries them in parallel and merges the query results. Looks promising. Alexander.
On Fri, 2007-06-01 at 22:13 -0500, Ron Johnson wrote: > On 06/01/07 19:29, Jeff Davis wrote: > [snip] > > You shouldn't use a volatile function in a check constraint. Use a > > trigger instead, but even that is unlikely to work for enforcing > > constraints correctly. > > > > In general, for partitioning, you have to make some sacrifices. It's > > very challenging (and/or expensive) to ensure uniqueness across > > partitions. > > Are partitioned databases the same as federated databases? > I think that usually people refer to a table that is split to be partitioned (whether across servers or within a single server). I think federated databases are where various parts of the database are split across servers, but tables may be intact. That's my own understanding of the terminology. Regards, Jeff Davis
On Sun, Jun 03, 2007 at 07:47:04PM +0200, Alexander Staubo wrote: > For example, part of the point of having validations declared on the > model is so that you can raise user-friendly errors (and pipe them > through gettext for localization) such as "Your password must be at > least 4 characters long and contain only letters and digits". If anyone is interested, we have sort of re-implemented gettext in SQL: http://cvs.savannah.gnu.org/viewvc/gnumed/gnumed/server/sql/?root=gnumed (see the gmI18n-*.sql stuff) In essence it enables you to write queries like so:

    select pk, name, _(name) as l10n_name from states;

which will give you a localized name for states.name in l10n_name. Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
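Purely as a toy illustration of the idea (invented table and column names -- not GNUmed's actual schema; see the link above for that), a one-argument _() can be a translation lookup that falls back to the original string:

    CREATE TABLE i18n_translations (
        orig  text NOT NULL,     -- source string
        lang  text NOT NULL,     -- e.g. 'de_DE'
        trans text NOT NULL,     -- translated string
        PRIMARY KEY (orig, lang)
    );

    CREATE TABLE i18n_user_lang (
        db_user text PRIMARY KEY,  -- database role name
        lang    text NOT NULL      -- that user's preferred language
    );

    CREATE FUNCTION _(text) RETURNS text AS $$
        SELECT coalesce(
            (SELECT t.trans
               FROM i18n_translations t
               JOIN i18n_user_lang u ON u.lang = t.lang
              WHERE u.db_user = current_user
                AND t.orig = $1),
            $1)
    $$ LANGUAGE sql STABLE;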