Aberdeen Study on OS RDBMS: exceprts and breakdown. - Mailing list pgsql-advocacy
From | Robert Treat |
---|---|
Subject | Aberdeen Study on OS RDBMS: exceprts and breakdown. |
Date | |
Msg-id | 1080765857.6103.73.camel@camel Whole thread Raw |
Responses |
Re: Aberdeen Study on OS RDBMS: exceprts and breakdown.
Re: Aberdeen Study on OS RDBMS: exceprts and breakdown. Re: Aberdeen Study on OS RDBMS: exceprts and |
List | pgsql-advocacy |
The Aberdeen group recently released a study on open source databases, focusing on BerkleyDB, My$QL, and PostgreSQL. Sleepycat Software (the makers of BerkleyDB) have released a copy of the report available at: ttp://www.sleepycat.com/aberdeenwp (registration required). You can read the press release for yourselves, though I thought I would point out one item I thought was interesting; they refer to BerkleyDB as "the most widely used application-specific data management software in the world with over 200 million deployments". I have no reason to believe this isn't true, but I do notice that it does rub against my$ql's slogan of being "the worlds most popular open source database". Anyway, it is worth noting that the report is slanted to show BerkleyDB in a positive light, but I wanted to included some excerpts of the full report below that I think are relevant to our group: -- Motivations For Choosing Open Source Databases (For ISV's) * Control over maintenance and support * Availability of source code * Cost, such as for licenses and subsequent upgrades (Some cite development cost and TCO as well, but others find that open source database development cost and TCO are greater than for a proprietary alternative.) * Flexibility to add new features as needed (Note that users rarely change open source databases.) * Reliability -- In it's description of my$ql, it describes them as providing the following features (among others) "* MySQL is a full relational database with SQL support. * ACID (atomicity, consistency, isolation, and durability) compliance (That is, MySQL guarantees that some system failure will not corrupt the database and make it transactionally inconsistent.)" These two bullet-points are important for this group because one of the messages that we often use in comparisons between our systems is that my$ql is not a full relational database, and that it is not truly ACID compliant. While I tend to agree with those statements, obviously the general mindset does not agree with that. It mentions my$ql as lacking all the usual suspects like views/triggers etc... but also lists this: "* Incremental and online/parallel backup/restore" which I find a little bit interesting because afaik they do offer this as a proprietary add on solution. -- The following is their general description of PostgreSQL, my comments are surrounded by ## The PostgreSQL Open Source Database PostgreSQL is a full relational database with SQL support. Many users view PostgreSQL as more complex and powerful than MySQL, but much less powerful than an enterprise database. Interestingly, some knowledgeable MySQL users view MySQL as more scalable than PostgreSQL. # I think this links back to my$ql's ability to scale up on read only type database systems, and having some big names associated with this type of use (yahoo for instance). The other half being our legacy of being a "slow" database. # PostgreSQL offers the following key features beyond the relational “basics”: * Tablespace backups, i.e., the ability to back up a set of tables rather than the entire database — with limits * ODBC support * Online backup (with limits) * ACID compliance for transactions * Deadlock detection * Unicode support (non-U.S. character sets) * Views (read-only, i.e., with limits) * Stored procedures and triggers (with limits on programming-language support, including no support for Java) # I thought it was interesting in how many of ours were labeled "with limits" while the other descriptions did not have this so much... however I find this one especially interesting, since we have support for 10+ languages and the ability to add more. I think this stems from not having all of these languages available "out of the box", and potentially from the time frame they were studying in. # * Object table and data-type support PostgreSQL does not offer the following key features typically provided in enterprise databases: * Incremental and parallel backup/restore * Encryption (security) * Deadlock detection # but up above you said we did have it... ? I'd grant that it may not be as robust as some of the commercial guys, but it's there. # * Row-level locking (typically required by large packaged applications) — although PostgreSQL does offer alternative sub-table locking schemes # uh... we do have row level locking, and in fact we have better than row level locking # * Bit-mapped indexing (for large data warehouses) * A single GUI administrative interface # again... nothing packaged with the server itself, though we do have several options in this area # * View update/insert/delete # cough...rules..cough# PostgreSQL users indicate that improvements in some areas are contemplated, but the next version of PostgreSQL may come as late as two years from now, and the community that drives PostgreSQL development is not committing to any particular improvements. # uh... two years out... i couldn't come up with an explanation for that... well, many of these features are planned for in 7.5, and I wonder if it is worth coming up with a "response" to the information above # -- Key Characteristics of Enterprise Databases Scalability OLTP scalability technologies: · SMP support · Clustering support · Multithreading Decision-support scalability technologies: · Query optimization · Replication · Cost optimization · Bit-mapped indexing “Mixed” scalability technologies: · Stored procedures · Distributed database synchronization (2PC) · Running the application “inside the database” · Load balancing Robustness/administrative costs · Online, parallel load and backup/recovery, monitoring, and metadata management · “Zero-administration” tools and automated reorganization · Cross-database tools Flexibility · Standards support - SQL- Java/EJBs - XML· Complex-data-type support - Rich-media support - File support - Object and object-relational technology - Content management technology · Integration with infrastructure software - Application servers - Infrastructure APIs and component libraries Programmer-productivity support · High-level development environment support ·Java programming support ·Frameworks Of the above, the following four were listed as critical to distinguishing between high end and low end DBMS: 1. Stored procedures (and triggers) 2. Replication (and two-phase commit) 3. SMP (symmetric multiprocessing) support 4. Cost-based query optimization # Seems we are pretty darn close... but read on# Many are clearly deficient in at least the first three aforementioned technologies — they do not offer (or offer limited) stored procedures, do not offer two-phase commit, and do not offer exceptional multiprocessing support. However, interviewees’ perception is that in many cases, today’s open source databases are enterprise class in scalability and robustness. Moreover, like databases offered by lower profile proprietary suppliers, such as Pervasive and Progress, open source databases offer exceptional manageability, requiring little maintenance compared with major enterprise databases. -- Key Potential Applications for Open Source Databases * Financial -- Investment, credit card, bank, insurance * Telecommunications -- Telephone companies and PTTs, wireless suppliers * Retail -- Department store chains, restaurants * Manufacturing -- Car makers, such as General Motors and Ford; equipment makers, such as Caterpillar; semiconductor suppliers * Health care -- Hospitals and hospital chains * Oil/gas/aerospace -- Energy suppliers, such as Exxon Mobil; aerospace manufacturers, such as Boeing * Travel/entertainment -- Hotels; luxury cruise lines; television, movie, and music companies * Transportation -- Truck, train, and shipping companies * Government -- NSA, Department of Defense * Education -- Universities # It would be interesting have someone gather a comprehensive list of large businesses (global 5000 maybe?) that use postgresql in each of the major markets listed above # -- "Although some open source suppliers have fallen by the wayside as the Web frenzy subsided and they were forced to identify sources of revenue, it now appears clear that several open source databases — including MySQL, PostgreSQL, and Sleepycat Software’s Berkeley DB — are here for the long haul." I thought this was important to note since we've recently seen several articles that refer back to the demise of Great Bridge... -- In a who is using open source databases section, then mention "Most applications involved a substantial transaction stream of reads, with relatively few — but a significant number of — complex queries and updates." which would be a good indicator of why my$ql and berkelydb are so much more popular. -- when asked why the ISV's chose an OS DB "All interviewees immediately cited “price” (license cost) as the main reason." -- when asked what they would like to see from their current open source vendor... "One interviewee cited full-text indexing, one cited triggers and a “geographic information system (GIS),” one cited rollback, and one cited a GUI “control center.”"... seems they should all be looking at postgresql a little closer :-) Also in response to that question was this telling line.. "Two-phase commit is not available in PostgreSQL. Every PostgreSQL user requested this feature." ... makes you wish you email those folks about the patches that have been posted to -hackers :-\ -- finally, of the respondents in the study whom were using proprietary databases for the past several years, none expressed interest in trying an open source database package. -- Hope this was informative for everyone here, and this will hopefully help give some inspiration to new ideas on how we can push forward postgresql adoption in the future. Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
pgsql-advocacy by date: