Thread: PostgreSQL, clusters and load-balance

PostgreSQL, clusters and load-balance

From
"Bill Wordsworth"
Date:
When traffic goes up, my webserver creates multiple instances of postgresql.exe. At some basic level, aren't they similar to Oracle's RAC "clusters", except that they are not aware of each other? Does this mean that if I were to create copies of postgresql.exe beforehand and somehow split traffic to them, traffic could be handled better?
 
Also, if I install postgresql on multiple boxes, how can I load-balance- configure traffic to be split depending on load, at application or webserver level?
 
All this for mostly read-only data.
 
Cheers, Bill

Re: PostgreSQL, clusters and load-balance

From
"Joshua D. Drake"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 25 Mar 2008 14:16:37 -0400
"Bill Wordsworth" <bill.wordsworth@gmail.com> wrote:

> When traffic goes up, my webserver creates multiple instances of
> postgresql.exe. At some basic level, aren't they similar to Oracle's
> RAC "clusters", except that they are not aware of each other? Does
> this mean that if I were to create copies of postgresql.exe
> beforehand and somehow split traffic to them, traffic could be
> handled better?

Ehh no :).

Each of the postgresql.exe is a forked connection to a specific
database within PostgreSQL. They may be one database (if you only have
one) or many.

You could benefit from connection pooling but I have no idea if we have
a connection pooler that runs on Windows.

> 
> Also, if I install postgresql on multiple boxes, how can I
> load-balance- configure traffic to be split depending on load, at
> application or webserver level?

That is a *long* conversation. It entirely depends on your business
requirements.

> 
> All this for mostly read-only data.
> 
> http://www.postgresql.org/docs/8.3/interactive/high-availability.html

That is correct, you can not have multiple write nodes.

Sincerely,

Joshua D. Drake
- -- 
The PostgreSQL Company since 1997: http://www.commandprompt.com/ 
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH6UMGATb/zqfZUUQRAoT0AJ44/gXekfNVfcjl6FhiO03GnqOj9gCgqYRy
cmWK72yX/YHFJMqt/6RIZNY=
=kM04
-----END PGP SIGNATURE-----

Re: PostgreSQL, clusters and load-balance

From
Thomas Kellerer
Date:
Bill Wordsworth wrote on 25.03.2008 19:16:
> When traffic goes up, my webserver creates multiple instances of
> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
> "clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single postgres process
which is spawned by the postmaster upon connection.

It has nothing to do with "instances" or high-availibility

Thomas

Re: PostgreSQL, clusters and load-balance

From
"Bill Wordsworth"
Date:
On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
Bill Wordsworth wrote on 25.03.2008 19:16:
> When traffic goes up, my webserver creates multiple instances of
> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
> "clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single postgres process
which is spawned by the postmaster upon connection.
 
Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is this spawning being done by postmaster or webserver or both? If postmaster, does an application-level persistent connection request communicate itself directly to the postmaster, and can the postmaster keep track of its spawning?
 
Also, at some crude level, if I were to direct every alternate connection to a different install box of postgresql, won't that help with *some* load-balance?
Cheers, Bill

Re: PostgreSQL, clusters and load-balance

From
Rodrigo Gonzalez
Date:
Bill Wordsworth escribió:
> On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net
> <mailto:spam_eater@gmx.net>> wrote:
>
>     Bill Wordsworth wrote on 25.03.2008 19:16:
>     > When traffic goes up, my webserver creates multiple instances of
>     > postgresql.exe. At some basic level, aren't they similar to
>     Oracle's RAC
>     > "clusters", except that they are not aware of each other?
>
>     No, absolutely not. Each client request is handled by a single
>     postgres process
>     which is spawned by the postmaster upon connection.
>
>
> Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway,
> is this spawning being done by postmaster or webserver or both? If
> postmaster, does an application-level persistent connection request
> communicate itself directly to the postmaster, and can the
> postmaster keep track of its spawning?
>
> Also, at some crude level, if I were to direct every alternate
> connection to a different install box of postgresql, won't that help
> with *some* load-balance?
> Cheers, Bill
I dont know the first answer, but maybe you are needing pgpool, check
it, I think that is what you are needing...


Attachment

Re: PostgreSQL, clusters and load-balance

From
Thomas Kellerer
Date:
Bill Wordsworth wrote on 25.03.2008 21:04:
> On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
>
>> Bill Wordsworth wrote on 25.03.2008 19:16:
>>> When traffic goes up, my webserver creates multiple instances of
>>> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
>>> "clusters", except that they are not aware of each other?
>> No, absolutely not. Each client request is handled by a single postgres
>> process
>> which is spawned by the postmaster upon connection.
>
> Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is
> this spawning being done by postmaster or webserver or both?

By the postmaster - for every connection initiated by the "client" (connection
pool in the app server, richt client, ...)

Thomas


Re: PostgreSQL, clusters and load-balance

From
Shane Ambler
Date:
Bill Wordsworth wrote:
> On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
>
>> Bill Wordsworth wrote on 25.03.2008 19:16:
>>> When traffic goes up, my webserver creates multiple instances of
>>> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
>>> "clusters", except that they are not aware of each other?
>> No, absolutely not. Each client request is handled by a single postgres
>> process
>> which is spawned by the postmaster upon connection.
>
>
> Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is
> this spawning being done by postmaster or webserver or both? If postmaster,
> does an application-level persistent connection request communicate itself
> directly to the postmaster, and can the postmaster keep track of its
> spawning?

In simplified terms - you have one backend postgres process that handles
the data storage and caching etc. Then you have one postgres process
running for each client connected to the server at any given time. This
client process handles all requests to and from the client and talks to
the backend process to get the data required for the request. You will
have one postgres client connection running for each concurrent db
connection required by the web server.

With the scripting used for building your web pages - each time you open
a connection you start a postgres client process running as you have
seen happen. Then when you close the connection the client process for
that will finish.

If you are using persistant connections - then when you close a
connection the web server will keep the client process running and use
it again for the next new connection saving time in starting the process up.


> Also, at some crude level, if I were to direct every alternate connection to
> a different install box of postgresql, won't that help with *some*
> load-balance?
> Cheers, Bill
>

All of these postgres processes will be running on the one machine -
this may be the same machine as the web server or a separate one. You
can use replication to store the same data on more than one server and
use all of them for responding to selects for the web server.

Most replication options go for only using one of these servers for
updates and the others for selects only. You can then use pooling
options such as pgpool (or code it into your scripting if you wish) to
distribute your connection requests between these replicated servers.



--

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz