Thread: Examples of Large Datasets on Postgres?

Examples of Large Datasets on Postgres?

From
Don Parris
Date:
Hi all,

I believe this relates primarily to advocacy, so am posting here.  I would like to find examples of organizations running large datasets on PostgreSQL.  The last survey on this (that I can find) is from 2004 (http://www.postgresql.org/community/survey/26-how-large-is-your-postgresql-database/).

I have also seen this example from 2008: http://it.toolbox.com/blogs/oracle-guide/worlds-largest-database-runs-on-postgres-24979

Given these limited examples, I wonder if someone would be willing to discuss their large PostgreSQL DB?  I would like to write an article for my blog.

The driving factor for me is that, in my local community college, where I am taking database classes, the (admittedly biased) Oracle pros who are teaching the classes keep referring to PostgreSQL as "light".  Even when I bring up the Yahoo example, they respond with, "I've never seen that".  While I can truly appreciate their bias (I am surely biased in quite the opposite direction), I think they need to be willing to acknowledge that their understanding of Postgres is limited.

I am here in Charlotte, North Carolina, home to the SouthEast LinuxFest, and am a member of the Charlotte Linux User Group.  I also did a presentation at SELF on the Ltree extension module.  In a prior life, I was the Editor-in-Chief of LXer.com.

Again, I would love to write an article about Postgres running large datasets (TB & PB range), if anyone would be willing to discuss such an animal.  I would like to focus on how you are using Postgres and what hardware you're running on, in a fashion similar to the Yahoo article.

Can anyone point me in the right direction?  I realize I'm preaching among evangelists on this list, but maybe some of the folks here know someone who might be willing to talk?

Thanks,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Facility Management Coordinator, Free Software Advocate
GPG Key ID: F5E179BE

Re: Examples of Large Datasets on Postgres?

From
Josh Berkus
Date:
Don,

> Can anyone point me in the right direction?  I realize I'm preaching among
> evangelists on this list, but maybe some of the folks here know someone who
> might be willing to talk?

A few off the top of my head:

Instagram had over 20TB of data in PostgreSQL at acquisition time per
their presentation; no doubt they have more, now.

Comptel's cell call tracking database for the EU had up to 75TB per city
for 20+ cities.

There's a marketing company in Australia which had over 200TB of data in
PostgreSQL -- I can't recall the name right now.

The Mormon Tabernacle's entire genealogy database is in PostgreSQL; not
sure how big that is, but it covers over 200m individuals.

I don't know that anyone has petabytes on vanilla PostgreSQL; we don't
do well at that scale.  GreenPlum, Aster, etc. yes, but not mainstream
Postgres.  However, nobody has petabytes on mainstream Oracle either.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Examples of Large Datasets on Postgres?

From
Szymon Guz
Date:
On 2 October 2013 19:32, Josh Berkus <josh@agliodbs.com> wrote:
> Don,
>
>> Can anyone point me in the right direction?  I realize I'm preaching among
>> evangelists on this list, but maybe some of the folks here know someone who
>> might be willing to talk?
>
> A few off the top of my head:
>
> Instagram had over 20TB of data in PostgreSQL at acquisition time per
> their presentation; no doubt they have more, now.
>
> Comptel's cell call tracking database for the EU had up to 75TB per city
> for 20+ cities.
>
> There's a marketing company in Australia which had over 200TB of data in
> PostgreSQL -- I can't recall the name right now.
>
> The Mormon Tabernacle's entire genealogy database is in PostgreSQL; not
> sure how big that is, but it covers over 200m individuals.
>
> I don't know that anyone has petabytes on vanilla PostgreSQL; we don't
> do well at that scale.  GreenPlum, Aster, etc. yes, but not mainstream
> Postgres.  However, nobody has petabytes on mainstream Oracle either.


Hi Josh,
I think it should be promoted on the website. Many managers in huge companies don't know that and are afraid of using Postgres. For years I used to hear, "I'm afraid of switching to Postgres, as nobody uses that, and if they use it, there are just some toy projects".

Szymon

Re: Examples of Large Datasets on Postgres?

From
Don Parris
Date:
On Wed, Oct 2, 2013 at 1:32 PM, Josh Berkus <josh@agliodbs.com> wrote:
> Don,
>
>> Can anyone point me in the right direction?  I realize I'm preaching among
>> evangelists on this list, but maybe some of the folks here know someone who
>> might be willing to talk?
>
> A few off the top of my head:
>
> <snip>
>
> I don't know that anyone has petabytes on vanilla PostgreSQL; we don't
> do well at that scale.  GreenPlum, Aster, etc. yes, but not mainstream
> Postgres.  However, nobody has petabytes on mainstream Oracle either.
>
> --

Josh,

Thanks for the listing.  That gives me something to be able to discuss in class, at least.  I don't suppose you could help me get in contact with any of these companies?  Or maybe if I just drop a line on the general list, someone might take interest?

I saw Szymon's reply also, and agree it might be a good thing to promote what PostgreSQL is really capable of, as there does seem to be quite a bit of doubt (which could be a competitive advantage in the eyes of some folks).

Regards,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Facility Services Coordinator, Free Software Advocate
GPG Key ID: F5E179BE

Re: Examples of Large Datasets on Postgres?

From
Josh Berkus
Date:
Szymon,

> I think it should be promoted on the website. Many managers in huge
> companies don't know that and are afraid of using Postgres, I used to hear
> for years that "I'm afraid of switching to Postgres, as nobody uses that,
> and if they use it, there are just some toy projects".

Oh, no question.  It just requires a bunch of legwork by someone.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Examples of Large Datasets on Postgres?

From
Josh Berkus
Date:
> Thanks for the listing.  That gives me something to be able to discuss in
> class, at least.  I don't suppose you could help me get in contact with any
> of these companies?  Or maybe if I just drop a line on the general list,
> someone might take interest?

Hmmm.  I don't know that I can reach anyone at those organizations,
currently. One of the drawbacks of not having a formal sales
organization is that you don't have the ability to entice ongoing
testimonials via marketing discounts.

You have looked at the Quotes page also, yes?

Oh, and there's a great Case Study from the French Social Security
Administration:
http://www.wcm.bull.com/internet/pr/rend.jsp?DocId=594915&lang=en

(hey, why isn't that case study linked from the main site?)


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


Re: Examples of Large Datasets on Postgres?

From
Ian Lawrence Barwick
Date:
2013/10/3 Don Parris <parrisdc@gmail.com>:
> Hi all,
>
> I believe this relates primarily to advocacy, so am posting here.  I would
> like to find examples of organizations running large datasets on PostgreSQL.
> The last survey on this (that I can find) is from 2004
> (http://www.postgresql.org/community/survey/26-how-large-is-your-postgresql-database/).
>
> I have also seen this example from 2008:
> http://it.toolbox.com/blogs/oracle-guide/worlds-largest-database-runs-on-postgres-24979

I can't provide details right now, but I'll be giving a talk at the
Japan PostgreSQL conference in November [1] about my employer's recent
Oracle -> PostgreSQL migration and will be making the material
available in English after that. The dataset is "only" a few TB
(albeit growing).

[1] http://www.postgresql.jp/events/jpug-pgcon2013/#A1

Regards

Ian Barwick


Re: Examples of Large Datasets on Postgres?

From
Don Parris
Date:
On Thu, Oct 3, 2013 at 1:09 PM, Josh Berkus <josh@agliodbs.com> wrote:

>> Thanks for the listing.  That gives me something to be able to discuss in
>> class, at least.  I don't suppose you could help me get in contact with any
>> of these companies?  Or maybe if I just drop a line on the general list,
>> someone might take interest?
>
> Hmmm.  I don't know that I can reach anyone at those organizations,
> currently. One of the drawbacks of not having a formal sales
> organization is that you don't have the ability to entice ongoing
> testimonials via marketing discounts.
>
> You have looked at the Quotes page also, yes?

I had missed the quotes page - or just haven't gotten that far yet.

> Oh, and there's a great Case Study from the French Social Security
> Administration:
> http://www.wcm.bull.com/internet/pr/rend.jsp?DocId=594915&lang=en
>
> (hey, why isn't that case study linked from the main site?)


This case study is a great example, for sure.  I know there are serious examples of PostgreSQL handling large datasets in mission critical roles - just finding them can be a little challenging.


--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE