Thread: Interesting Slashdot article that attracted a lot of comments about pg

Interesting Slashdot article that attracted a lot of comments about pg

From
Gavin Flower
Date:
Hi,

I think the article and comments may be of interest to pg developers and
other pg users.

Mostly the pg comments are positive and well informed.

I have added a couple of comments myself about pg.

I would be interested in the developers' comments about the XID problem
(not that it is likely to affect my usage of pg!).

http://developers.slashdot.org/story/11/07/09/1256241/Facebook-Trapped-In-MySQL-a-Fate-Worse-Than-Death
[...]
Re:Oracle vs Facebook?
by evilviper (135110) Alter Relationship on 2011-07-10 14:06 (#36708666)
Journal

I agree that Postgres is vastly superior to MySQL and competitive with
enterprise-class databases. We're using Postgres for databases in excess
of a terabyte ourselves, for large-scale production purposes, so I know
it works... HOWEVER, I think the XID issue is significant, not getting
any developer attention, and is something you're just about guaranteed
to run-into with terabyte databases.

You see, even if you run Postgres on a 64-bit platform, you're limited
to XIDs of 2^31, or 2 billion rows. Now, to prevent this causing a
problem, Postgres' VACUUM process will start running when you get close
to 1 billion, killing your performance. Of course there are a dozen
possible ways to workaround this, but none are trivial or work perfectly
and consistently. This is postgres' biggest scalability limitation,
ahead of even the imperfect replication options available.
[...]

On Sun, Jul 10, 2011 at 11:30 AM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:
> On 10/07/11 21:51, Simon Riggs wrote:
>>
>> On Sun, Jul 10, 2011 at 5:35 AM, Gavin Flower
>> <GavinFlower@archidevsys.co.nz>  wrote:
>>
>>> I would be interested in the developers' comments about the XID problem
>>> (not
>>> that it is likely to affect my usage of pg!).
>>>
>>>
>>> http://developers.slashdot.org/story/11/07/09/1256241/Facebook-Trapped-In-MySQL-a-Fate-Worse-Than-Death
>>> [...]
>>> Re:Oracle vs Facebook?
>>> by evilviper (135110) Alter Relationship on 2011-07-10 14:06 (#36708666)
>>> Journal
>>>
>>> I agree that Postgres is vastly superior to MySQL and competitive with
>>> enterprise-class databases. We're using Postgres for databases in excess
>>> of
>>> a terabyte ourselves, for large-scale production purposes, so I know it
>>> works... HOWEVER, I think the XID issue is significant, not getting any
>>> developer attention, and is something you're just about guaranteed to
>>> run-into with terabyte databases.
>>>
>>> You see, even if you run Postgres on a 64-bit platform, you're limited to
>>> XIDs of 2^31, or 2 billion rows. Now, to prevent this causing a problem,
>>> Postgres' VACUUM process will start running when you get close to 1
>>> billion,
>>> killing your performance. Of course there are a dozen possible ways to
>>> workaround this, but none are trivial or work perfectly and consistently.
>>> This is postgres' biggest scalability limitation, ahead of even the
>>> imperfect replication options available.
>>> [...]
>>>
>> I've posted the following response. Thanks for bringing this to the
>> attention of the list.
>>
>>
>> Thanks for your positive comments about PostgreSQL. I wanted to
>> respond in detail to the points made about limitations, which aren't
>> correct.
>>
>> You are correct that PostgreSQL uses only 2^31 XIDs or transaction
>> ids. That means every billion write transactions PostgreSQL needs to
>> perform a VACUUM operation. You mention that kills performance.
>> PostgreSQL allows you to run this as a low priority background job,
>> which continues to allow reads and writes, yet runs at a deliberately
>> slower pace to avoid impact on system resources. All of that is
>> automatic, with the VACUUM task even cancelling itself if it really
>> does get in the way of any user request. The way this works has
>> received significant improvements in performance and operability in
>> every annual release for many, many years. The benefit of all of this
>> is that read queries aren't blocked by write queries - an incredibly
>> important feature for general purpose computing.
>>
>> The number of XIDs has no effect whatsoever on the number of rows in a
>> database. There is no limit of 2 billion rows - in fact there is no
>> limit to the number of rows in a table at all. Currently, tables are
>> limited to 32TB, but we can increase that if people find that a
>> limiting factor. There *is* an object in PostgreSQL called an OID, which
>> acted somewhat similarly to that described by the poster, but any
>> limitation there was removed more than 7 years ago.  Many PostgreSQL
>> users have installations above a Terabyte in size.
>>
>> Replication has been one of the areas that has received significant
>> and consistent developer attention across many years. Starting in 9.0
>> we have native binary replication, with options that make it the full
>> equivalent of the best options in Oracle Enterprise Edition, including
>> pay for options of Active Data Guard. Synchronous replication will be
>> available in PostgreSQL 9.1, out real soon now. PostgreSQL 9.2 will
>> have cascaded replication, multiple performance options and other
>> features, many of which are in advance of what is available in pay-for
>> commercial systems. We think this area so important that there is a
>> dedicated conference on this aspect of PostgreSQL.
>>
>
> Hi Simon,
>
> I like your response.
>
> Your post will start off at level 1 if you log in - otherwise, as an
> anonymous poster, you start at zero, a level filtered out by many people.
>
> You can see my postings there by searching for 'Nivag064'.
>
> There are some negative comments (at least one is clear flame bait!), I
> quote one such below, could someone more knowledgeable please respond.
>
> [begin quote]
> Re:Oracle vs Facebook?
> by Skal Tura (595728) Alter Relationship <[aleksi] [at] [nucode.fi]> on
> 2011-07-10 11:53 (#36708156) Homepage
>
> If you are worth your weight as a developer, you've already done model
> isolation layer where your all queries would be, thus it's not that hard to
> rewrite the queries. If this was to be expected, you've made it far simpler
> already.
>
> In any case, i don't see the anti-MySQL points. I've tried Postgres once -
> that was enough, i'm not going back to it. It was weird as shit, required
> some weird conundrums for permissions and DBs changes, didn't seem to be
> properly isolating but more like hacked together to support more than 1 DB,
> with 1 set of perms per server - no, i don't mean it didn't support, that's
> just the way it felt.
>
> If you got 2 choices, otherwise equal for what you need now - always choose
> the simpler one. Postgres definitely is not the simpler one.
> [end quote]


LOL. When you've been standing on your head for years, coming the
right way up is definitely going to feel weird.

We must acknowledge everybody's right to prefer something else. All we
can do is make sure that rational, fact-based discussions contain
balanced and accurate facts. People who make arbitrary or too
personally-based decisions don't tend to be in charge of decision
making, so we should ignore the flames, as most people do.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

On Sun, Jul 10, 2011 at 5:35 AM, Gavin Flower
<GavinFlower@archidevsys.co.nz> wrote:

> I would be interested in the developers' comments about the XID problem (not
> that it is likely to affect my usage of pg!).
>
> http://developers.slashdot.org/story/11/07/09/1256241/Facebook-Trapped-In-MySQL-a-Fate-Worse-Than-Death
> [...]
> Re:Oracle vs Facebook?
> by evilviper (135110) Alter Relationship on 2011-07-10 14:06 (#36708666)
> Journal
>
> I agree that Postgres is vastly superior to MySQL and competitive with
> enterprise-class databases. We're using Postgres for databases in excess of
> a terabyte ourselves, for large-scale production purposes, so I know it
> works... HOWEVER, I think the XID issue is significant, not getting any
> developer attention, and is something you're just about guaranteed to
> run-into with terabyte databases.
>
> You see, even if you run Postgres on a 64-bit platform, you're limited to
> XIDs of 2^31, or 2 billion rows. Now, to prevent this causing a problem,
> Postgres' VACUUM process will start running when you get close to 1 billion,
> killing your performance. Of course there are a dozen possible ways to
> workaround this, but none are trivial or work perfectly and consistently.
> This is postgres' biggest scalability limitation, ahead of even the
> imperfect replication options available.
> [...]
>

I've posted the following response. Thanks for bringing this to the
attention of the list.


Thanks for your positive comments about PostgreSQL. I wanted to
respond in detail to the points made about limitations, which aren't
correct.

You are correct that PostgreSQL uses only 2^31 XIDs or transaction
ids. That means every billion write transactions PostgreSQL needs to
perform a VACUUM operation. You mention that kills performance.
PostgreSQL allows you to run this as a low priority background job,
which continues to allow reads and writes, yet runs at a deliberately
slower pace to avoid impact on system resources. All of that is
automatic, with the VACUUM task even cancelling itself if it really
does get in the way of any user request. The way this works has
received significant improvements in performance and operability in
every annual release for many, many years. The benefit of all of this
is that read queries aren't blocked by write queries - an incredibly
important feature for general purpose computing.
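
As a rough illustration of why a 32-bit XID counter does not cap the number of rows: XIDs are compared modulo 2^32, so each XID sees about 2^31 XIDs as "older" and about 2^31 as "newer". The Python sketch below is only a model of that circular comparison. The function name `xid_precedes` is made up for this example, and it ignores the handful of reserved special XIDs that the real server treats differently.

```python
XID_BITS = 32
XID_MOD = 1 << XID_BITS          # XIDs live in a 32-bit circular space
HALF = 1 << (XID_BITS - 1)       # 2^31: how far back "the past" reaches


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is logically older than XID b.

    Hypothetical helper: the difference is taken modulo 2^32 and then
    reinterpreted as a signed 32-bit value, so the comparison keeps
    working after the counter wraps around.
    """
    diff = (a - b) % XID_MOD
    if diff >= HALF:             # reinterpret as a negative signed value
        diff -= XID_MOD
    return diff < 0


if __name__ == "__main__":
    print(xid_precedes(100, 200))         # plain case, no wraparound
    print(xid_precedes(4294967000, 50))   # a was issued just before the wrap
```

In this model, rows whose XIDs would drift more than 2^31 behind the current counter would suddenly compare as "newer", which is why old rows have to be frozen by VACUUM before that point. The row count itself never enters the calculation.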

The number of XIDs has no effect whatsoever on the number of rows in a
database. There is no limit of 2 billion rows - in fact there is no
limit to the number of rows in a table at all. Currently, tables are
limited to 32TB, but we can increase that if people find that a
limiting factor. There *is* an object in PostgreSQL called an OID, which
acted somewhat similarly to that described by the poster, but any
limitation there was removed more than 7 years ago.  Many PostgreSQL
users have installations above a Terabyte in size.
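
The 32TB figure can be reproduced with a little arithmetic, assuming it comes from a 32-bit block number combined with the default 8 kB block size (the email itself does not state the derivation). The sketch below is a back-of-the-envelope calculation only; `max_table_tib` is a name invented for this example, and the exact limit is very slightly lower because one block number is reserved as an invalid marker.

```python
TIB = 2 ** 40                    # bytes in one tebibyte
MAX_BLOCKS = 2 ** 32             # assumption: blocks addressed by a 32-bit number


def max_table_tib(block_size_bytes: int = 8192) -> int:
    """Approximate per-table size limit in TiB for a given block size."""
    return (MAX_BLOCKS * block_size_bytes) // TIB


if __name__ == "__main__":
    print(max_table_tib())        # default 8 kB blocks
    print(max_table_tib(32768))   # largest compile-time block size
```

Under these assumptions, building the server with a larger block size raises the limit proportionally, which is one way the limit mentioned above could be increased.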

Replication has been one of the areas that has received significant
and consistent developer attention across many years. Starting in 9.0
we have native binary replication, with options that make it the full
equivalent of the best options in Oracle Enterprise Edition, including
pay for options of Active Data Guard. Synchronous replication will be
available in PostgreSQL 9.1, out real soon now. PostgreSQL 9.2 will
have cascaded replication, multiple performance options and other
features, many of which are in advance of what is available in pay-for
commercial systems. We think this area so important that there is a
dedicated conference on this aspect of PostgreSQL.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Interesting Slashdot article that attracted a lot of comments about pg

From
Gavin Flower
Date:
On 10/07/11 21:51, Simon Riggs wrote:
> On Sun, Jul 10, 2011 at 5:35 AM, Gavin Flower
> <GavinFlower@archidevsys.co.nz>  wrote:
>
>> I would be interested in the developers' comments about the XID problem (not
>> that it is likely to affect my usage of pg!).
>>
>> http://developers.slashdot.org/story/11/07/09/1256241/Facebook-Trapped-In-MySQL-a-Fate-Worse-Than-Death
>> [...]
>> Re:Oracle vs Facebook?
>> by evilviper (135110) Alter Relationship on 2011-07-10 14:06 (#36708666)
>> Journal
>>
>> I agree that Postgres is vastly superior to MySQL and competitive with
>> enterprise-class databases. We're using Postgres for databases in excess of
>> a terabyte ourselves, for large-scale production purposes, so I know it
>> works... HOWEVER, I think the XID issue is significant, not getting any
>> developer attention, and is something you're just about guaranteed to
>> run-into with terabyte databases.
>>
>> You see, even if you run Postgres on a 64-bit platform, you're limited to
>> XIDs of 2^31, or 2 billion rows. Now, to prevent this causing a problem,
>> Postgres' VACUUM process will start running when you get close to 1 billion,
>> killing your performance. Of course there are a dozen possible ways to
>> workaround this, but none are trivial or work perfectly and consistently.
>> This is postgres' biggest scalability limitation, ahead of even the
>> imperfect replication options available.
>> [...]
>>
> I've posted the following response. Thanks for bringing this to the
> attention of the list.
>
>
> Thanks for your positive comments about PostgreSQL. I wanted to
> respond in detail to the points made about limitations, which aren't
> correct.
>
> You are correct that PostgreSQL uses only 2^31 XIDs or transaction
> ids. That means every billion write transactions PostgreSQL needs to
> perform a VACUUM operation. You mention that kills performance.
> PostgreSQL allows you to run this as a low priority background job,
> which continues to allow reads and writes, yet runs at a deliberately
> slower pace to avoid impact on system resources. All of that is
> automatic, with the VACUUM task even cancelling itself if it really
> does get in the way of any user request. The way this works has
> received significant improvements in performance and operability in
> every annual release for many, many years. The benefit of all of this
> is that read queries aren't blocked by write queries - an incredibly
> important feature for general purpose computing.
>
> The number of XIDs has no effect whatsoever on the number of rows in a
> database. There is no limit of 2 billion rows - in fact there is no
> limit to the number of rows in a table at all. Currently, tables are
> limited to 32TB, but we can increase that if people find that a
> limiting factor. There *is* an object in PostgreSQL called an OID, which
> acted somewhat similarly to that described by the poster, but any
> limitation there was removed more than 7 years ago.  Many PostgreSQL
> users have installations above a Terabyte in size.
>
> Replication has been one of the areas that has received significant
> and consistent developer attention across many years. Starting in 9.0
> we have native binary replication, with options that make it the full
> equivalent of the best options in Oracle Enterprise Edition, including
> pay for options of Active Data Guard. Synchronous replication will be
> available in PostgreSQL 9.1, out real soon now. PostgreSQL 9.2 will
> have cascaded replication, multiple performance options and other
> features, many of which are in advance of what is available in pay-for
> commercial systems. We think this area so important that there is a
> dedicated conference on this aspect of PostgreSQL.
>

Hi Simon,

I like your response.

Your post will start off at level 1 if you log in - otherwise, as an
anonymous poster, you start at zero, a level filtered out by many people.

You can see my postings there by searching for 'Nivag064'.

There are some negative comments (at least one is clear flame bait!). I
quote one such below; could someone more knowledgeable please respond?

[begin quote]
Re:Oracle vs Facebook?
by Skal Tura (595728) Alter Relationship <[aleksi] [at] [nucode.fi]> on
2011-07-10 11:53 (#36708156) Homepage

If you are worth your weight as a developer, you've already done model
isolation layer where your all queries would be, thus it's not that hard
to rewrite the queries. If this was to be expected, you've made it far
simpler already.

In any case, i don't see the anti-MySQL points. I've tried Postgres once
- that was enough, i'm not going back to it. It was weird as shit,
required some weird conundrums for permissions and DBs changes, didn't
seem to be properly isolating but more like hacked together to support
more than 1 DB, with 1 set of perms per server - no, i don't mean it
didn't support, that's just the way it felt.

If you got 2 choices, otherwise equal for what you need now - always
choose the simpler one. Postgres definitely is not the simpler one.
[end quote]


Cheers,
Gavin

Re: Interesting Slashdot article that attracted a lot of comments about pg

From
Gavin Flower
Date:
On 10/07/11 22:47, Simon Riggs wrote:
> On Sun, Jul 10, 2011 at 11:30 AM, Gavin Flower
> <GavinFlower@archidevsys.co.nz>  wrote:
>> On 10/07/11 21:51, Simon Riggs wrote:
>>> On Sun, Jul 10, 2011 at 5:35 AM, Gavin Flower
>>> <GavinFlower@archidevsys.co.nz>    wrote:
>>>
>>>> I would be interested in the developers' comments about the XID problem
>>>> (not
>>>> that it is likely to affect my usage of pg!).
>>>>
>>>>
>>>> http://developers.slashdot.org/story/11/07/09/1256241/Facebook-Trapped-In-MySQL-a-Fate-Worse-Than-Death
>>>> [...]
>>>> Re:Oracle vs Facebook?
>>>> by evilviper (135110) Alter Relationship on 2011-07-10 14:06 (#36708666)
>>>> Journal
>>>>
>>>> I agree that Postgres is vastly superior to MySQL and competitive with
>>>> enterprise-class databases. We're using Postgres for databases in excess
>>>> of
>>>> a terabyte ourselves, for large-scale production purposes, so I know it
>>>> works... HOWEVER, I think the XID issue is significant, not getting any
>>>> developer attention, and is something you're just about guaranteed to
>>>> run-into with terabyte databases.
>>>>
>>>> You see, even if you run Postgres on a 64-bit platform, you're limited to
>>>> XIDs of 2^31, or 2 billion rows. Now, to prevent this causing a problem,
>>>> Postgres' VACUUM process will start running when you get close to 1
>>>> billion,
>>>> killing your performance. Of course there are a dozen possible ways to
>>>> workaround this, but none are trivial or work perfectly and consistently.
>>>> This is postgres' biggest scalability limitation, ahead of even the
>>>> imperfect replication options available.
>>>> [...]
>>>>
>>> I've posted the following response. Thanks for bringing this to the
>>> attention of the list.
>>>
>>>
>>> Thanks for your positive comments about PostgreSQL. I wanted to
>>> respond in detail to the points made about limitations, which aren't
>>> correct.
>>>
>>> You are correct that PostgreSQL uses only 2^31 XIDs or transaction
>>> ids. That means every billion write transactions PostgreSQL needs to
>>> perform a VACUUM operation. You mention that kills performance.
>>> PostgreSQL allows you to run this as a low priority background job,
>>> which continues to allow reads and writes, yet runs at a deliberately
>>> slower pace to avoid impact on system resources. All of that is
>>> automatic, with the VACUUM task even cancelling itself if it really
>>> does get in the way of any user request. The way this works has
>>> received significant improvements in performance and operability in
>>> every annual release for many, many years. The benefit of all of this
>>> is that read queries aren't blocked by write queries - an incredibly
>>> important feature for general purpose computing.
>>>
>>> The number of XIDs has no effect whatsoever on the number of rows in a
>>> database. There is no limit of 2 billion rows - in fact there is no
>>> limit to the number of rows in a table at all. Currently, tables are
>>> limited to 32TB, but we can increase that if people find that a
>>> limiting factor. There *is* an object in PostgreSQL called an OID, which
>>> acted somewhat similarly to that described by the poster, but any
>>> limitation there was removed more than 7 years ago.  Many PostgreSQL
>>> users have installations above a Terabyte in size.
>>>
>>> Replication has been one of the areas that has received significant
>>> and consistent developer attention across many years. Starting in 9.0
>>> we have native binary replication, with options that make it the full
>>> equivalent of the best options in Oracle Enterprise Edition, including
>>> pay for options of Active Data Guard. Synchronous replication will be
>>> available in PostgreSQL 9.1, out real soon now. PostgreSQL 9.2 will
>>> have cascaded replication, multiple performance options and other
>>> features, many of which are in advance of what is available in pay-for
>>> commercial systems. We think this area so important that there is a
>>> dedicated conference on this aspect of PostgreSQL.
>>>
>> Hi Simon,
>>
>> I like your response.
>>
>> Your post will start off at level 1 if you log in - otherwise, as an
>> anonymous poster, you start at zero, a level filtered out by many people.
>>
>> You can see my postings there by searching for 'Nivag064'.
>>
>> There are some negative comments (at least one is clear flame bait!), I
>> quote one such below, could someone more knowledgeable please respond.
>>
>> [begin quote]
>> Re:Oracle vs Facebook?
>> by Skal Tura (595728) Alter Relationship<[aleksi] [at] [nucode.fi]>  on
>> 2011-07-10 11:53 (#36708156) Homepage
>>
>> If you are worth your weight as a developer, you've already done model
>> isolation layer where your all queries would be, thus it's not that hard to
>> rewrite the queries. If this was to be expected, you've made it far simpler
>> already.
>>
>> In any case, i don't see the anti-MySQL points. I've tried Postgres once -
>> that was enough, i'm not going back to it. It was weird as shit, required
>> some weird conundrums for permissions and DBs changes, didn't seem to be
>> properly isolating but more like hacked together to support more than 1 DB,
>> with 1 set of perms per server - no, i don't mean it didn't support, that's
>> just the way it felt.
>>
>> If you got 2 choices, otherwise equal for what you need now - always choose
>> the simpler one. Postgres definitely is not the simpler one.
>> [end quote]
>
> LOL. When you've been standing on your head for years, coming the
> right way up is definitely going to feel weird.
>
> We must acknowledge everybody's right to prefer something else. All we
> can do is make sure that rational, fact-based discussions contain
> balanced and accurate facts. People who make arbitrary or too
> personally-based decisions don't tend to be in charge of decision
> making, so we should ignore the flames, as most people do.
>
Yeah - I got a bit carried away!  I've spent more time replying to
postings in that Slashdot article than to any other Slashdot article, by at
least an order of magnitude - ever.  My user name there is 'Nivag064'.

The impression I have from that article, is that most people prefer
PostgreSQL, on the basis of performance, ease of use, and cost -
compared to Oracle, SQL Server, and MySQL.

Though there appears to be no actual evidence that Facebook is seriously
considering changing their RDBMS!