Thread: High traffic websites...

High traffic websites...

From
Robert Treat
Date:
I'm sure that a lot of you saw the article on /. a couple days ago about
"PostgreSQL on big sites?", where someone asked for a list of high
traffic websites that are using PostgreSQL on the backend.  Of course
there were a bunch of the standard replies about Afilias Inc and
pointing to the case studies on the PostgreSQL website, but out of all
of the replies I saw only one seemed like it really fit the bill of a
high traffic website (Whitepages.com, if you work for this company
please drop me an email). Now I know of some high traffic (I think they
have high traffic) sites that use PostgreSQL (mobygames, cdbaby), and  I
know that some of the sites using popular PostgreSQL based CMS systems
have good traffic (http://openacs.org/community/sites/,
http://www.bricolage.cc/about/sites/), but this got me wondering and
thinking so I looked up a list of the top 100 high traffic websites
(English only: see the list I used at
http://www.alexa.com/site/ds/top_sites?ts_mode=lang&lang=en) and started
going through the list: Yahoo...oracle/my$ql, MSNBC...M$,
Google...homegrown, Passport.net....m$, EBay...oracle, M$....M$,
Amazon...Oracle, Fastclick...unknown, AOLAnywhere...Oracle,
Google.uk...homegrown... wow... pretty depressing, although we do come
out on par with DB2 (unless Fastclick uses them... oh my). Scrolling down
through the list I really didn't see any sites that I knew that use
PostgreSQL... so I wondered if anyone else could vouch for any that do,
and/or also any other really high traffic sites (let's say more than 100
million page views a day?)  It would be nice to get a list of these
companies into the known world... right now we seem on the short end of
this segment.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: High traffic websites...

From
"Joshua D. Drake"
Date:
music.com
hi5.com
radioparadise.com
vacationhomes.com



--
Command Prompt, Inc., Your PostgreSQL solutions company. 503-667-4564
Custom programming, 24x7 support, managed services, and hosting
Open Source Authors: plPHP, pgManage, Co-Authors: plPerlNG
Reliable replication, Mammoth Replicator - http://www.commandprompt.com/


Re: High traffic websites...

From
"Joshua D. Drake"
Date:
Oh and Salon? Don't they use Bricolage and thus PostgreSQL?


--
Command Prompt, Inc., Your PostgreSQL solutions company. 503-667-4564
Custom programming, 24x7 support, managed services, and hosting
Open Source Authors: plPHP, pgManage, Co-Authors: plPerlNG
Reliable replication, Mammoth Replicator - http://www.commandprompt.com/


Re: High traffic websites...

From
Robert Treat
Date:
I thought there was a connection there, but they're not listed on the
bricolage or kineticode websites.

Robert Treat

On Thu, 2005-03-31 at 15:32, Joshua D. Drake wrote:
> Oh and Salon? Don't they use Bricolage and thus PostgreSQL?
--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


Re: High traffic websites...

From
Simon Riggs
Date:
On Thu, 2005-03-31 at 15:30 -0500, Robert Treat wrote:
> I'm sure that a lot of you saw the article on /. a couple days ago about
> "PostgreSQL on big sites?", where someone asked for a list of high
> traffic websites that are using PostgreSQL on the backend.

My penny contribution...

Show me a list of high traffic websites that use only one
server/subdomain for all of the connected pages. All of them I know of
use many subdomains and almost all use many different systems on each,
so it's a strange question, designed mostly to attack. All multi-sites
have a range of traffic levels on various applications that make up
their sites. Many of these are RDBMS connected, many are not. Google
sure as hell doesn't use any RDBMS.

No wish to start a flamewar, but I am content in the thought that
PostgreSQL can't do the top slice of performance requirements that
exist. How big is that slice? That's the point for debate, for me. There
isn't any market anywhere with more than one player in it where the cheapest
is as good as the most expensive; that's economics.

You'll never please the people who want to see "Big", "More" etc
references and proof. I am interested in talking to people who want
"Enough", "Sufficient" and "Cost/Effective"; that is sufficient for
me...

Best Regards, Simon Riggs




Re: High traffic websites...

From
Christopher Kings-Lynne
Date:
calorieking.com

Joshua D. Drake wrote:
> music.com
> hi5.com
> radioparadise.com
> vacationhomes.com

Re: High traffic websites...

From
Robert Treat
Date:
On Thursday 31 March 2005 17:57, Simon Riggs wrote:
> On Thu, 2005-03-31 at 15:30 -0500, Robert Treat wrote:
> > I'm sure that a lot of you saw the article on /. a couple days ago about
> > "PostgreSQL on big sites?", where someone asked for a list of high
> > traffic websites that are using PostgreSQL on the backend.
>
> My penny contribution...
>
> Show me a list of high traffic websites that use only one
> server/subdomain for all of the connected pages. All of them I know of
> use many subdomains and almost all use many different systems on each,
> so it's a strange question, designed mostly to attack. All multi-sites
> have a range of traffic levels on various applications that make up
> their sites. Many of these are RDBMS connected, many are not. Google
> sure as hell doesn't use any RDBMS.
>

Hey, I'd be happy with a site that employed several PostgreSQL databases to
handle its various subdomains.  Even better would be one that used Slony to
handle extremely high read-only traffic... nothing wrong with that.

> No wish to start a flamewar, but I am content in the thought that
> PostgreSQL can't do the top slice of performance requirements that
> exist. How big is that slice? That's the point for debate, for me. There
> isn't any market anywhere with more than one player in it where the cheapest
> is as good as the most expensive; that's economics.
>

I think that's an arguable position... look at apache. I'd be willing to say
it's the cheapest and is *better* than the most expensive.

> You'll never please the people who want to see "Big", "More" etc
> references and proof. I am interested in talking to people who want
> "Enough", "Sufficient" and "Cost/Effective"; that is sufficient for
> me...
>

I think you're wrong on that... people want to know if X brand database can
handle high traffic websites... if you can say "we power amazon" or "we power
yahoo" then I think that satisfies *a lot* of people. Maybe not all but
certainly a good number, so I don't think there is anything wrong with trying
to find some shining examples that we can point to.

--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

Re: High traffic websites...

From
Jeff
Date:
On Mar 31, 2005, at 3:30 PM, Robert Treat wrote:

> of the replies I saw only one seemed like it really fit the bill of a
> high traffic website (Whitepages.com, if you work for this company
> please drop me an email).

Got the OK to mention this.

Raging Bull (http://ragingbull.lycos.com/) has been using PG for years
(going back to 7.0) to power various systems.   Although PG isn't our
"main" db yet, it is now just as important as the current Informix
engine.  We're currently migrating off of Informix to PG and hopefully
in a couple months we'll be completely off of it.

The site is pushing a few million page views/day.

PG is doing an average of 700 updates / minute and about 8000 queries /
minute (these spike up to 1100 and 12000 respectively).   The box
running PG is nothing special - dual xeon with shoddy disks.
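
For a sense of scale, the per-minute figures above can be converted to per-second rates. This is only a minimal sketch: the variable names are made up for illustration, and the 100 million page views/day number is the threshold suggested at the start of the thread, not a measurement from this site. Page views and queries are also not directly comparable, since one page can issue several queries.

```python
# Per-minute figures quoted in the message above.
AVG_UPDATES_PER_MIN = 700
AVG_QUERIES_PER_MIN = 8000
PEAK_UPDATES_PER_MIN = 1100
PEAK_QUERIES_PER_MIN = 12000

# Threshold proposed at the start of the thread (page views per day).
THRESHOLD_PAGE_VIEWS_PER_DAY = 100_000_000

SECONDS_PER_MINUTE = 60
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(count, seconds):
    """Convert a count observed over `seconds` into a per-second rate."""
    return count / seconds


rates = {
    "avg updates": per_second(AVG_UPDATES_PER_MIN, SECONDS_PER_MINUTE),
    "avg queries": per_second(AVG_QUERIES_PER_MIN, SECONDS_PER_MINUTE),
    "peak updates": per_second(PEAK_UPDATES_PER_MIN, SECONDS_PER_MINUTE),
    "peak queries": per_second(PEAK_QUERIES_PER_MIN, SECONDS_PER_MINUTE),
}
threshold_views_per_sec = per_second(THRESHOLD_PAGE_VIEWS_PER_DAY, SECONDS_PER_DAY)

for label, value in rates.items():
    print(f"{label}: {value:.1f}/s")
print(f"100M page views/day threshold: {threshold_views_per_sec:.1f} views/s")
```

So the box described here averages roughly a dozen updates and a bit over a hundred queries per second, with peaks around 200 queries per second.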

We also replicate that db via Slony onto a hot spare. That works
wonderfully.

I plan on writing up some more info about it when the new system is
launched.

I also got the go-ahead to release some of my nifty tools I've written
for PG over the years.

I welcome various questions you may have about it.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Re: High traffic websites...

From
Sam Hahn
Date:
Jeff - Thanks for your mention below. I'm curious how PG was selected in
the first place, and whether management was conscious of it at the
time... How long does it normally take for you to do the hot-swap (if
ever)? Have you explicitly decided what PG-specific features /
extensions you will adopt? or not? Were there Informix-specific features
that you had taken advantage of? Thx - Sam




Re: High traffic websites...

From
Jeff
Date:
On Apr 1, 2005, at 10:32 AM, Sam Hahn wrote:

> Jeff - Thanks for your mention below. I'm curious how PG was selected
> in the first place, and whether management was conscious of it at the
> time... How long does it normally take for you to do the hot-swap (if
> ever)? Have you explicitly decided what PG-specific features /
> extensions you will adopt? or not? Were there Informix-specific
> features that you had taken advantage of? Thx - Sam
>

We fear Informix here. We're stuck on an ancient version that is fairly
broken and there is nothing we can do about it.  So we had some
products come up that needed some db support but we didn't want to add
anything to Informix so I advocated PG because (mostly) of stored
procedures and I had been playing with it recently.   I did get some
resistance (They wanted me to use flat files) but I won.  You know
what? It works like a champ.  It was the DB people would forget about
because it would keep plugging away.  The only big "failures" it has
had were when the power supply blew.

As for the hot swap, we've never actually had to do it, but it will
just be a matter of changing pgpool's config to point to the spare and
restarting it (and of course, the Slony side of it).

I think I forgot to mention we use pgpool - that thing is the
bestestestestest thing ever (thanks Tatsuo!).  I wish we had one for
Oracle (We use Oracle on some other Lycos products)

I use triggers & stored procs extensively in PG.  Also I have some
clever stuff using LISTEN/NOTIFY.  One of the niftier tools I wrote
takes a stored proc in PG and will generate glue code in either C or
Perl so you can call it natively.  (The C one will actually build out
the structs and perform the data mangling so you can call those stored
procs like a regular function.  That is the basis of the new arch for
RB).
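
To illustrate the general idea of such glue code (this is not the tool described above, which emits C or Perl), here is a rough Python sketch that turns a stored-procedure name and argument list into an ordinary callable. Everything in it is hypothetical: `make_proc_wrapper`, the `get_quote` procedure, and its arguments are invented for the example, and `FakeCursor` stands in for a real DB-API cursor so the sketch runs without a database. It assumes a driver that uses `%s` placeholders.

```python
def make_proc_wrapper(proc_name, arg_names):
    """Build a function that calls the stored procedure `proc_name`.

    `proc_name` is interpolated into the SQL text, so it must come from
    trusted input (e.g. the database catalog), never from end users.
    """
    placeholders = ", ".join(["%s"] * len(arg_names))
    sql = f"SELECT * FROM {proc_name}({placeholders})"

    def call(cursor, *args):
        # Check arity up front so mistakes fail like a normal function call.
        if len(args) != len(arg_names):
            raise TypeError(
                f"{proc_name}() takes {len(arg_names)} arguments, got {len(args)}"
            )
        cursor.execute(sql, args)
        return cursor.fetchall()

    call.__name__ = proc_name
    call.sql = sql  # exposed for inspection
    return call


class FakeCursor:
    """Stand-in for a DB-API cursor; records calls instead of running SQL."""

    def __init__(self):
        self.executed = []

    def execute(self, sql, params=()):
        self.executed.append((sql, tuple(params)))

    def fetchall(self):
        return [("ok",)]


# Hypothetical stored procedure with two arguments.
get_quote = make_proc_wrapper("get_quote", ["symbol", "day"])
cur = FakeCursor()
rows = get_quote(cur, "LCOS", "2005-04-01")
print(get_quote.sql)
print(rows)
```

With a real driver you would pass a live cursor instead of `FakeCursor`; the caller then sees a plain function rather than hand-written SQL.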

The only thing we really used on Informix were stored procs.  Every
other nifty Informix feature we've tried has been broken - table
partitioning didn't work (We ended up getting invalid results when we
added an index - Informix told us it was a bug and we're in trouble)...
we hit the 21.7M pages of data/table "limit".  That was painful.  The
error you get when you hit that limit is "No more extents" so you
figure "ok, I'll redo the table with bigger extents" [12 hours later]
"Hmm.. I got that error again. WTF?" [more googling] "ARRRG! 21.7M
pages of data! Grumble".

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Re: High traffic websites...

From
Josh Berkus
Date:
Jeff,

> Raging Bull (http://ragingbull.lycos.com/) has been using PG for years
> (going back to 7.0) to power various systems.  

All right!  I look forward to more details.

--
Josh Berkus
Aglio Database Solutions
San Francisco