Re: Interesting Slashdot article that attracted a lot of comments about pg - Mailing list pgsql-advocacy

From Gavin Flower
Subject Re: Interesting Slashdot article that attracted a lot of comments about pg
Msg-id 4E1E3052.9080606@archidevsys.co.nz
In response to Re: Interesting Slashdot article that attracted a lot of comments about pg  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-advocacy
On 10/07/11 22:47, Simon Riggs wrote:
> On Sun, Jul 10, 2011 at 11:30 AM, Gavin Flower
> <GavinFlower@archidevsys.co.nz>  wrote:
>> On 10/07/11 21:51, Simon Riggs wrote:
>>> On Sun, Jul 10, 2011 at 5:35 AM, Gavin Flower
>>> <GavinFlower@archidevsys.co.nz>    wrote:
>>>
>>>> I would be interested in the developers' comments about the XID problem
>>>> (not
>>>> that it is likely to affect my usage of pg!).
>>>>
>>>>
>>>> http://developers.slashdot.org/story/11/07/09/1256241/Facebook-Trapped-In-MySQL-a-Fate-Worse-Than-Death
>>>> [...]
>>>> Re:Oracle vs Facebook?
>>>> by evilviper (135110) Alter Relationship on 2011-07-10 14:06 (#36708666)
>>>> Journal
>>>>
>>>> I agree that Postgres is vastly superior to MySQL and competitive with
>>>> enterprise-class databases. We're using Postgres for databases in excess
>>>> of
>>>> a terabyte ourselves, for large-scale production purposes, so I know it
>>>> works... HOWEVER, I think the XID issue is significant, not getting any
>>>> developer attention, and is something you're just about guaranteed to
>>>> run-into with terabyte databases.
>>>>
>>>> You see, even if you run Postgres on a 64-bit platform, you're limited to
>>>> XIDs of 2^31, or 2 billion rows. Now, to prevent this causing a problem,
>>>> Postgres' VACUUM process will start running when you get close to 1
>>>> billion,
>>>> killing your performance. Of course there are a dozen possible ways to
>>>> workaround this, but none are trivial or work perfectly and consistently.
>>>> This is postgres' biggest scalability limitation, ahead of even the
>>>> imperfect replication options available.
>>>> [...]
>>>>
>>> I've posted the following response. Thanks for bringing this to the
>>> attention of the list.
>>>
>>>
>>> Thanks for your positive comments about PostgreSQL. I wanted to
>>> respond in detail to the points made about limitations, which aren't
>>> correct.
>>>
>>> You are correct that PostgreSQL uses only 2^31 XIDs or transaction
>>> ids. That means every billion write transactions PostgreSQL needs to
>>> perform a VACUUM operation. You mention that kills performance.
>>> PostgreSQL allows you to run this as a low priority background job,
>>> which continues to allow reads and writes, yet runs at a deliberately
>>> slower pace to avoid impact on system resources. All of that is
>>> automatic, with the VACUUM task even cancelling itself if it really
>>> does get in the way of any user request. The way this works has
>>> received significant improvements in performance and operability in
>>> every annual release for many, many years. The benefit of all of this
>>> is that read queries aren't blocked by write queries - an incredibly
>>> important feature for general purpose computing.
>>>
>>> The number of XIDs has no effect whatsoever on the number of rows in a
>>> database. There is no limit of 2 billion rows - in fact there is no
>>> limit to the number of rows in a table at all. Currently, tables are
>>> limited to 32TB, but we can increase that if people find that a
>>> limiting factor. There *is* an object in PostgreSQL called an OID, that
>>> acted somewhat similar to that described by the poster but any
>>> limitation there was removed more than 7 years ago.  Many PostgreSQL
>>> users have installations above a Terabyte in size.
>>>
>>> Replication has been one of the areas that has received significant
>>> and consistent developer attention across many years. Starting in 9.0
>>> we have native binary replication, with options that make it the full
>>> equivalent of the best options in Oracle Enterprise Edition, including
>>> pay for options of Active Data Guard. Synchronous replication will be
>>> available in PostgreSQL 9.1, out real soon now. PostgreSQL 9.2 will
>>> have cascaded replication, multiple performance options and other
>>> features, many of which are in advance of what is available in pay-for
>>> commercial systems. We think this area so important that there is a
>>> dedicated conference on this aspect of PostgreSQL.
>>>
>> Hi Simon,
>>
>> I like your response.
>>
>> Your post will start off at level 1 if you log in - otherwise, as an
>> anonymous poster, you start at zero, which is a level filtered out by many
>> people.
>>
>> You can see my postings there by searching for 'Nivag064'.
>>
>> There are some negative comments (at least one is clear flame bait!). I
>> quote one below; could someone more knowledgeable please respond?
>>
>> [begin quote]
>> Re:Oracle vs Facebook?
>> by Skal Tura (595728) Alter Relationship<[aleksi] [at] [nucode.fi]>  on
>> 2011-07-10 11:53 (#36708156) Homepage
>>
>> If you are worth your weight as a developer, you've already done model
>> isolation layer where your all queries would be, thus it's not that hard to
>> rewrite the queries. If this was to be expected, you've made it far simpler
>> already.
>>
>> In any case, i don't see the anti-MySQL points. I've tried Postgres once -
>> that was enough, i'm not going back to it. It was weird as shit, required
>> some weird conundrums for permissions and DBs changes, didn't seem to be
>> properly isolating but more like hacked together to support more than 1 DB,
>> with 1 set of perms per server - no, i don't mean it didn't support, that's
>> just the way it felt.
>>
>> If you got 2 choices, otherwise equal for what you need now - always choose
>> the simpler one. Postgres definitely is not the simpler one.
>> [end quote]
>
> LOL. When you've been standing on your head for years, coming the
> right way up is definitely going to feel weird.
>
> We must acknowledge everybody's right to prefer something else. All we
> can do is make sure that rational, fact-based discussions contain
> balanced and accurate facts. People who make arbitrary or too
> personally-based decisions don't tend to be in charge of decision
> making, so we should ignore the flames, as most people do.
>
Yeah - I got a bit carried away!  I've spent more time replying to
postings in that slashdot article than any other slashdot article, by at
least an order of magnitude - ever.  My user name there is 'Nivag064'.

The impression I have from that article, is that most people prefer
PostgreSQL, on the basis of performance, ease of use, and cost -
compared to Oracle, SQL Server, and MySQL.

Though there appears to be no actual evidence that Facebook is seriously
considering changing their RDBMS!
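
For anyone wondering why the figure quoted above is 2^31 rather than 2^32: XIDs are 32-bit counters that wrap around, so they are compared modulo 2^32, and at any moment roughly half of the XID space counts as "the past" and half as "the future". The following is only an illustrative sketch of that circular comparison, not PostgreSQL source code. The function name `xid_precedes` is made up for this example, and the sketch ignores the handful of reserved special XIDs.

```python
XID_BITS = 32
XID_MASK = (1 << XID_BITS) - 1   # XIDs wrap around modulo 2^32
HALF_SPACE = 1 << (XID_BITS - 1)  # 2^31: size of the "past" window


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    Hypothetical helper: takes the difference modulo 2^32 and treats it
    as a signed 32-bit value; a negative result means a is older than b.
    """
    diff = (a - b) & XID_MASK
    return diff >= HALF_SPACE


# Ordinary case: 5 was assigned before 10.
print(xid_precedes(5, 10))
# After wraparound: an XID near the top of the range is still
# older than a small XID assigned after the counter wrapped.
print(xid_precedes(2**32 - 10, 5))
```

Under this scheme a row stamped with an XID more than about 2 billion transactions old would suddenly look as if it came from the future, which is why old rows have to be frozen by VACUUM before that point.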
