Re: Partitioning / Clustering - Mailing list pgsql-performance

From PFC
Subject Re: Partitioning / Clustering
Msg-id op.sqnqh8pyth1vuj@localhost
In response to Re: Partitioning / Clustering  ("David Roussel" <pgsql-performance@diroussel.xsmail.com>)
List pgsql-performance
> machines. Which has its own set of issues entirely. I am not entirely
> sure that memcached actually does serialize data when it's committed into

    I think it does, i.e. it's a simple mapping of [string key] => [string
value], so anything that is not a string has to be serialized first.
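    If the store really is a plain string => string mapping, as guessed
above, the client has to turn each object into a string before storing it
and parse it again after fetching it. A minimal sketch of that round trip
follows. StringStore is a toy stand-in written for this example, not the
memcached API, and the "session:" key prefix and the use of JSON are
assumptions as well.

```python
import json


class StringStore:
    """Toy string-to-string cache used for illustration only.

    This is NOT the memcached client API; it only mimics the idea of a
    store that accepts string keys and string values.
    """

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        # Reject anything that is not already a string.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("keys and values must be strings")
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is absent.
        return self._data.get(key)


def save_session(store, session_id, session):
    # Serialize the session dict to a JSON string before storing it.
    store.set("session:" + session_id, json.dumps(session))


def load_session(store, session_id):
    # Fetch the string and parse it back into a dict, or return None.
    raw = store.get("session:" + session_id)
    return None if raw is None else json.loads(raw)
```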

> memcached either, although I could be wrong, I have not looked at the
> source. Certainly if you can ensure that a client always goes back to
> the same machine you can simplify the whole thing hugely. It's generally
> not that easy though, you need a proxy server of some description
> capable of understanding the HTTP traffic and maintaining a central

    Yes...
    You could implement it by mapping the hash of the user session id to a
server.
    Statistically, each server would get roughly the same number of
sessions, but you have to trust statistics...
    It does eliminate the lookup table, though.
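    The hash-to-server mapping could look roughly like the sketch below.
The server names, the choice of MD5 and the function name are all
assumptions made for illustration; any stable hash that every front end
computes the same way would do.

```python
import hashlib

# Hypothetical backend names, for illustration only.
SERVERS = ["app1", "app2", "app3"]


def server_for_session(session_id, servers=SERVERS):
    """Pick a backend from the hash of the session id.

    The same session id always maps to the same server, so no central
    lookup table is needed.
    """
    digest = hashlib.md5(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then take it
    # modulo the number of servers.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

    Note that with a plain modulo, adding or removing a server changes the
mapping for most existing sessions.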

> idea, I would like to hear a way of implementing them cheaply (and on
> PHP) as well. I may have to give that some thought in fact. Oh yeah, and
> load balancers software often sucks in annoying (if not always
> important) ways.

    You can use lighttpd as a load balancer. I believe it has a sticky
sessions plugin (or you could code one in; it's open source, after all). It
definitely supports simple round-robin load balancing, acting as a proxy to
any number of independent servers.


>> matter, it's pretty impressive. The google filesystem has nothing to do
>> with databases though, it's more a massive data store / streaming
>> storage.
>
> Since when did Massive Data stores have nothing to do with DBs? Isn't
> Oracle Cluster entirely based on forming an enormous scalable disk array
> to store your DB on?

    Um, well, the Google Filesystem is (as its name implies) a filesystem
designed to store huge files in a distributed and redundant manner. Files
are structured as a stream of records (which are themselves large), and
it's designed to support appending records to these stream files
efficiently and without worrying about locking.

    It has no querying features, however, which is why I said it was not a
database.

    I wish I could find the whitepaper. I think the URL was posted on this
list at some point; maybe it's on Google's site?
