Re: Partitioning / Clustering - Mailing list pgsql-performance

From David Roussel
Subject Re: Partitioning / Clustering
Msg-id 1115798277.29223.233869472@webmail.messagingengine.com
In response to Partitioning / Clustering  (Alex Stapleton <alexs@advfn.com>)
Responses Re: Partitioning / Clustering
List pgsql-performance
For an interesting look at scalability, clustering, caching, etc. for a
large site, have a look at how LiveJournal did it.
http://www.danga.com/words/2004_lisa/lisa04.pdf

They have 2.6 million active users posting 200 new blog entries per
minute, plus many comments and countless page views.

Although this system is quite different from the type I work on, it's
interesting to see how they've made it scale.

They use MySQL on Dell hardware, and found that single-master
replication did not scale.  There's a section on multi-master
replication, though I'm not sure whether they use it.  The main approach
they use is to partition users into specific database clusters.  Caching
is done using memcached at the application level to avoid hitting the db
for rendered page views.
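
To make those two ideas concrete, here is a rough sketch of per-user
partitioning combined with application-level caching.  It is not taken
from the slides: the class names, the modulo rule for assigning users to
clusters, and the invalidate-on-write policy are all assumptions made
for illustration, and plain dicts stand in for the database clusters and
for memcached.

```python
# Hypothetical sketch; none of these names come from the LiveJournal slides.

class UserDirectory:
    """Global lookup table mapping each user to the cluster holding their data."""

    def __init__(self, cluster_count):
        self.cluster_count = cluster_count
        self.user_to_cluster = {}

    def cluster_for(self, user_id):
        # Assumed policy: assign a cluster the first time a user is seen,
        # then always return the same one so the user's rows stay together.
        if user_id not in self.user_to_cluster:
            self.user_to_cluster[user_id] = user_id % self.cluster_count
        return self.user_to_cluster[user_id]


class PageService:
    def __init__(self, directory):
        self.directory = directory
        # One dict per cluster stands in for a real database cluster.
        self.clusters = [dict() for _ in range(directory.cluster_count)]
        # A plain dict stands in for memcached.
        self.cache = {}
        # Counts how often a render had to go to a "database".
        self.db_reads = 0

    def post_entry(self, user_id, text):
        cluster = self.clusters[self.directory.cluster_for(user_id)]
        cluster.setdefault(user_id, []).append(text)
        # Drop the cached page so the next view is rendered afresh.
        self.cache.pop(("page", user_id), None)

    def render_page(self, user_id):
        key = ("page", user_id)
        if key in self.cache:
            # Cache hit: no database cluster is touched.
            return self.cache[key]
        cluster = self.clusters[self.directory.cluster_for(user_id)]
        self.db_reads += 1
        page = "\n".join(cluster.get(user_id, []))
        self.cache[key] = page
        return page
```

With this layout a page view only reaches a database on a cache miss,
and even then it touches just the one cluster that owns the user, which
is the property that lets reads and writes spread across machines.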

It's interesting that the solution LiveJournal has arrived at is, in
some ways, quite similar to the way Google is set up.

David
