Thread: RE: Is this still accurate?

RE: Is this still accurate?

From
"Moser, Glen G"
Date:

Can someone confirm the accuracy of the information found at https://www.postgresql.org/about/?

 

Specifically the maximum data values in the screen shot below…it seems as though this documentation might be out of date.

 

[Attached screenshot: image001.jpg]

 

Glen Moser

Director, Reporting & Analytics, NOC

Charter Communications

636.387.6888 (O) | 314.308.5680 (M)

 

From: Vianello, Daniel A
Sent: Friday, January 05, 2018 11:16 AM
To: Moser, Glen G <Glen.Moser@charter.com>; Coleman, Cynthia A <Cynthia.Coleman@charter.com>; Boyce, Sherwyn <Sherwyn.Boyce@charter.com>
Subject: RE: Is this still accurate?

 

It can’t be accurate. Feel free to email pgsql-docs@postgresql.org to let the development group know that they should fix it.

 

Dan

 

From: Moser, Glen G
Sent: Friday, January 05, 2018 10:58 AM
To: Vianello, Daniel A <Daniel.Vianello@charter.com>; Coleman, Cynthia A <Cynthia.Coleman@charter.com>; Boyce, Sherwyn <Sherwyn.Boyce@charter.com>
Subject: Is this still accurate?

 

Is this still accurate information?

 

https://www.postgresql.org/about/

 

[Attached screenshot: image001.jpg]

 

Glen Moser

Director, Reporting & Analytics, NOC

Charter Communications

636.387.6888 (O) | 314.308.5680 (M)

 

The contents of this e-mail message and
any attachments are intended solely for the
addressee(s) and may contain confidential
and/or legally privileged information. If you
are not the intended recipient of this message
or if this message has been addressed to you
in error, please immediately alert the sender
by reply e-mail and then delete this message
and any attachments. If you are not the
intended recipient, you are notified that
any use, dissemination, distribution, copying,
or storage of this message or any attachment
is strictly prohibited.

Re: Is this still accurate?

From
Scott Marlowe
Date:
On Fri, Jan 5, 2018 at 10:34 AM, Moser, Glen G <Glen.Moser@charter.com> wrote:
>
> Can someone confirm the accuracy of the information found at https://www.postgresql.org/about/?
>
> Specifically the maximum data values in the screen shot below…it seems as though this documentation might be out of
> date.

What numbers specifically do you think are no longer accurate?


Re: Is this still accurate?

From
Stephen Frost
Date:
Greetings,

* Moser, Glen G (Glen.Moser@charter.com) wrote:
> Can someone confirm the accuracy of the information found at https://www.postgresql.org/about/?
>
> Specifically the maximum data values in the screen shot below...it seems as though this documentation might be out of
> date.

The part you highlighted was:

"There are active PostgreSQL systems in production environments that
manage in excess of 4 terabytes of data."

Which is pretty accurate, I know of some myself that are larger than
4TB.  That 4TB number isn't a limit of any kind and the sentence says
"in excess of" meaning that there are databases larger than that.
There's actually some which are quite a bit larger than that, in fact.

We could bump the number up there or remove the sentence, but I don't
think there's anything inaccurate about the statement.

Thanks!

Stephen


RE: Is this still accurate?

From
"Moser, Glen G"
Date:
That's really the gist of the concern from a team member of mine.  Not that the 4TB number is wrong but that it could
be misleading to assume that 4TB is some sort of upper bound.

That's how this concern was relayed to me and I am just following up.

Glen Moser
Director, Reporting & Analytics, NOC
Charter Communications
636.387.6888 (O) | 314.308.5680 (M)

-----Original Message-----
From: Stephen Frost [mailto:sfrost@snowman.net]
Sent: Friday, January 05, 2018 11:55 AM
To: Moser, Glen G <Glen.Moser@charter.com>
Cc: pgsql-docs@postgresql.org
Subject: Re: Is this still accurate?

Greetings,

* Moser, Glen G (Glen.Moser@charter.com) wrote:
> Can someone confirm the accuracy of the information found at https://www.postgresql.org/about/?
>
> Specifically the maximum data values in the screen shot below...it seems as though this documentation might be out of
> date.

The part you highlighted was:

"There are active PostgreSQL systems in production environments that manage in excess of 4 terabytes of data."

Which is pretty accurate, I know of some myself that are larger than 4TB.  That 4TB number isn't a limit of any kind
and the sentence says "in excess of" meaning that there are databases larger than that.
There's actually some which are quite a bit larger than that, in fact.

We could bump the number up there or remove the sentence, but I don't think there's anything inaccurate about the
statement.

Thanks!

Stephen



Re: Is this still accurate?

From
Stephen Frost
Date:
Greetings,

* Moser, Glen G (Glen.Moser@charter.com) wrote:
> That's really the gist of the concern from a team member of mine.  Not that the 4TB number is wrong but that it could
> be misleading to assume that 4TB is some sort of upper bound.
>
> That's how this concern was relayed to me and I am just following up.

Well, saying 'in excess of' is pretty clear, but I don't think the
sentence is really adding much either, so perhaps we should just remove
it.

Thanks!

Stephen


Re: Is this still accurate?

From
Steve Atkins
Date:
> On Jan 5, 2018, at 10:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>
> Greetings,
>
> * Moser, Glen G (Glen.Moser@charter.com) wrote:
>> That's really the gist of the concern from a team member of mine.  Not that the 4TB number is wrong but that it
>> could be misleading to assume that 4TB is some sort of upper bound.
>>
>> That's how this concern was relayed to me and I am just following up.
>
> Well, saying 'in excess of' is pretty clear, but I don't think the
> sentence is really adding much either, so perhaps we should just remove
> it.

It's been useful a few times to reassure people that we can handle "large"
databases operationally, rather than just having large theoretical limits.

Updating it would be great, or wrapping a little more verbiage around the
4TB number, but a mild -1 on removing it altogether.

Cheers,
  Steve

Re: Is this still accurate?

From
Alvaro Herrera
Date:
Steve Atkins wrote:

> It's been useful a few times to reassure people that we can handle "large"
> databases operationally, rather than just having large theoretical limits.
> 
> Updating it would be great, or wrapping a little more verbiage around the
> 4TB number, but a mild -1 on removing it altogether.

I'd just add a 0 to make it "40TB" and be done with it.  We have larger
databases but this is a decent enough number.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Is this still accurate?

From
Magnus Hagander
Date:


On Fri, Jan 5, 2018 at 8:09 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> Hi,
>
> On Jan 5, 2018, at 1:33 PM, Steve Atkins <steve@blighty.com> wrote:
>>
>> On Jan 5, 2018, at 10:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>>>
>>> Greetings,
>>>
>>> * Moser, Glen G (Glen.Moser@charter.com) wrote:
>>>> That's really the gist of the concern from a team member of mine.  Not that the 4TB number is wrong but that it could be misleading to assume that 4TB is some sort of upper bound.
>>>>
>>>> That's how this concern was relayed to me and I am just following up.
>>>
>>> Well, saying 'in excess of' is pretty clear, but I don't think the
>>> sentence is really adding much either, so perhaps we should just remove
>>> it.
>>
>> It's been useful a few times to reassure people that we can handle "large"
>> databases operationally, rather than just having large theoretical limits.
>>
>> Updating it would be great, or wrapping a little more verbiage around the
>> 4TB number, but a mild -1 on removing it altogether.
>
> Here is a proposed patch that updates the wording:
>
> "There are active PostgreSQL instances in production environments that manage many terabytes of data, as well as clusters managing petabytes."
>
> The idea is that it gives a sense of scope for how big instances/clusters can run without fixing people on a number.  People can draw their own conclusions from the hard limits further down the page.

+1.

Re: Is this still accurate?

From
"Jonathan S. Katz"
Date:
Hi,

On Jan 6, 2018, at 9:45 AM, Magnus Hagander <magnus@hagander.net> wrote:
>
> On Fri, Jan 5, 2018 at 8:09 PM, Jonathan S. Katz <jkatz@postgresql.org> wrote:
>> Hi,
>>
>> On Jan 5, 2018, at 1:33 PM, Steve Atkins <steve@blighty.com> wrote:
>>>
>>> On Jan 5, 2018, at 10:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>>>>
>>>> Greetings,
>>>>
>>>> * Moser, Glen G (Glen.Moser@charter.com) wrote:
>>>>> That's really the gist of the concern from a team member of mine.  Not that the 4TB number is wrong but that it could be misleading to assume that 4TB is some sort of upper bound.
>>>>>
>>>>> That's how this concern was relayed to me and I am just following up.
>>>>
>>>> Well, saying 'in excess of' is pretty clear, but I don't think the
>>>> sentence is really adding much either, so perhaps we should just remove
>>>> it.
>>>
>>> It's been useful a few times to reassure people that we can handle "large"
>>> databases operationally, rather than just having large theoretical limits.
>>>
>>> Updating it would be great, or wrapping a little more verbiage around the
>>> 4TB number, but a mild -1 on removing it altogether.
>>
>> Here is a proposed patch that updates the wording:
>>
>> "There are active PostgreSQL instances in production environments that manage many terabytes of data, as well as clusters managing petabytes."
>>
>> The idea is that it gives a sense of scope for how big instances/clusters can run without fixing people on a number.  People can draw their own conclusions from the hard limits further down the page.
>
> +1.

Changes pushed.

Jonathan

Re: Is this still accurate?

From
Simon Riggs
Date:
On 6 January 2018 at 16:35, Jonathan S. Katz <jkatz@postgresql.org> wrote:
> Hi,
>
> On Jan 6, 2018, at 9:45 AM, Magnus Hagander <magnus@hagander.net> wrote:
>
>
>
> On Fri, Jan 5, 2018 at 8:09 PM, Jonathan S. Katz <jkatz@postgresql.org>
> wrote:
>>
>> Hi,
>>
>> On Jan 5, 2018, at 1:33 PM, Steve Atkins <steve@blighty.com> wrote:
>>
>>
>> On Jan 5, 2018, at 10:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>>
>> Greetings,
>>
>> * Moser, Glen G (Glen.Moser@charter.com) wrote:
>>
>> That's really the gist of the concern from a team member of mine.  Not
>> that the 4TB number is wrong but that it could be misleading to assume that
>> 4TB is some sort of upper bound.
>>
>> That's how this concern was relayed to me and I am just following up.
>>
>>
>> Well, saying 'in excess of' is pretty clear, but I don't think the
>> sentence is really adding much either, so perhaps we should just remove
>> it.
>>
>>
>> It's been useful a few times to reassure people that we can handle "large"
>> databases operationally, rather than just having large theoretical limits.
>>
>> Updating it would be great, or wrapping a little more verbiage around the
>> 4TB number, but a mild -1 on removing it altogether.
>>
>>
>> Here is a proposed patch that updates the wording:
>>
>> "There are active PostgreSQL instances in production environments that
>> manage many terabytes of data, as well as clusters managing petabytes."
>>
>> The idea is that it gives a sense of scope for how big instances/clusters
>> can run without fixing people on a number.  People can draw their own
>> conclusions from the hard limits further down the page.
>>
> +1.

I don't think that's as useful, so -1 for removing the stated limit.

People always ask "how big can it go?" and having a specific number
there is important. We have publicly documented cases above 50TB, so I
think we should say that.

Clusters in Petabyte range? We need to be able to substantiate that
with publicly documented cases. They also need to be pure PostgreSQL,
not "with added tech", no?


Also, I can't see that the 1.6 TB per row is accurate, because that
would mean 1600 toast pointers at 20 bytes each = 32000 bytes, which
is above what we can normally support with the 8kB block size that we
normally ship.

Lastly, the "per table limit" should really say "32 TB per table, 128
PB for a partitioned table (4000 partitions)"
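
As a rough check of the arithmetic above, the following sketch recomputes the
three figures. The roughly 1 GB per-field size (which would give 1600 fields x
1 GB = 1.6 TB) and the decimal conversion 1 PB = 1000 TB are assumptions, not
stated in this message, and the constant and function names are illustrative
only.

```python
# Back-of-the-envelope check of the figures quoted in the message.
# Values marked "assumption" do not come from the message itself.

MAX_COLUMNS = 1600            # "1600 toast pointers", one per column
MAX_FIELD_BYTES = 10**9       # assumption: ~1 GB per field
TOAST_POINTER_BYTES = 20      # per-pointer size quoted in the message
BLOCK_SIZE_BYTES = 8 * 1024   # 8kB block size

TABLE_LIMIT_TB = 32           # per-table limit quoted in the message
PARTITIONS = 4000             # partition count quoted in the message
TB_PER_PB = 1000              # assumption: decimal units


def row_pointer_bytes(columns=MAX_COLUMNS, pointer=TOAST_POINTER_BYTES):
    """Bytes needed in the main row if every column holds a toast pointer."""
    return columns * pointer


def max_row_bytes(columns=MAX_COLUMNS, field=MAX_FIELD_BYTES):
    """Theoretical row size if every column held a maximum-size field."""
    return columns * field


def partitioned_table_limit_pb(table_tb=TABLE_LIMIT_TB, partitions=PARTITIONS):
    """Aggregate size of a partitioned table, in PB."""
    return table_tb * partitions / TB_PER_PB


if __name__ == "__main__":
    pointers = row_pointer_bytes()
    print("theoretical row size (TB):", max_row_bytes() / 10**12)
    print("pointer bytes per row:", pointers)
    print("fits in one 8kB block:", pointers <= BLOCK_SIZE_BYTES)
    print("partitioned table limit (PB):", partitioned_table_limit_pb())
```

Under these assumptions the pointer total (32000 bytes) is larger than a single
8kB block, which is the objection raised above, and 32 TB x 4000 partitions
comes to 128 PB.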

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services