Thread: PostgreSQL, clusters and load-balance
When traffic goes up, my webserver creates multiple instances of postgresql.exe. At some basic level, aren't they similar to Oracle's RAC "clusters", except that they are not aware of each other? Does this mean that if I were to create copies of postgresql.exe beforehand and somehow split traffic to them, traffic could be handled better?
Also, if I install postgresql on multiple boxes, how can I load-balance- configure traffic to be split depending on load, at application or webserver level?
All this for mostly read-only data.
Cheers, Bill
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue, 25 Mar 2008 14:16:37 -0400 "Bill Wordsworth" <bill.wordsworth@gmail.com> wrote: > When traffic goes up, my webserver creates multiple instances of > postgresql.exe. At some basic level, aren't they similar to Oracle's > RAC "clusters", except that they are not aware of each other? Does > this mean that if I were to create copies of postgresql.exe > beforehand and somehow split traffic to them, traffic could be > handled better? Ehh no :). Each of the postgresql.exe is a forked connection to a specific database within PostgreSQL. They may be one database (if you only have one) or many. You could benefit from connection pooling but I have no idea if we have a connection pooler that runs on Windows. > > Also, if I install postgresql on multiple boxes, how can I > load-balance- configure traffic to be split depending on load, at > application or webserver level? That is a *long* conversation. It entirely depends on your business requirements. > > All this for mostly read-only data. > > http://www.postgresql.org/docs/8.3/interactive/high-availability.html That is correct, you can not have multiple write nodes. Sincerely, Joshua D. Drake - -- The PostgreSQL Company since 1997: http://www.commandprompt.com/ PostgreSQL Community Conference: http://www.postgresqlconference.org/ United States PostgreSQL Association: http://www.postgresql.us/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFH6UMGATb/zqfZUUQRAoT0AJ44/gXekfNVfcjl6FhiO03GnqOj9gCgqYRy cmWK72yX/YHFJMqt/6RIZNY= =kM04 -----END PGP SIGNATURE-----
Bill Wordsworth wrote on 25.03.2008 19:16: > When traffic goes up, my webserver creates multiple instances of > postgresql.exe. At some basic level, aren't they similar to Oracle's RAC > "clusters", except that they are not aware of each other? No, absolutely not. Each client request is handled by a single postgres process which is spawned by the postmaster upon connection. It has nothing to do with "instances" or high-availibility Thomas
On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:
Bill Wordsworth wrote on 25.03.2008 19:16:> When traffic goes up, my webserver creates multiple instances ofNo, absolutely not. Each client request is handled by a single postgres process
> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
> "clusters", except that they are not aware of each other?
which is spawned by the postmaster upon connection.
Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is this spawning being done by postmaster or webserver or both? If postmaster, does an application-level persistent connection request communicate itself directly to the postmaster, and can the postmaster keep track of its spawning?
Also, at some crude level, if I were to direct every alternate connection to a different install box of postgresql, won't that help with *some* load-balance?
Cheers, Bill
Bill Wordsworth escribió: > On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net > <mailto:spam_eater@gmx.net>> wrote: > > Bill Wordsworth wrote on 25.03.2008 19:16: > > When traffic goes up, my webserver creates multiple instances of > > postgresql.exe. At some basic level, aren't they similar to > Oracle's RAC > > "clusters", except that they are not aware of each other? > > No, absolutely not. Each client request is handled by a single > postgres process > which is spawned by the postmaster upon connection. > > > Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, > is this spawning being done by postmaster or webserver or both? If > postmaster, does an application-level persistent connection request > communicate itself directly to the postmaster, and can the > postmaster keep track of its spawning? > > Also, at some crude level, if I were to direct every alternate > connection to a different install box of postgresql, won't that help > with *some* load-balance? > Cheers, Bill I dont know the first answer, but maybe you are needing pgpool, check it, I think that is what you are needing...
Attachment
Bill Wordsworth wrote on 25.03.2008 21:04: > On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote: > >> Bill Wordsworth wrote on 25.03.2008 19:16: >>> When traffic goes up, my webserver creates multiple instances of >>> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC >>> "clusters", except that they are not aware of each other? >> No, absolutely not. Each client request is handled by a single postgres >> process >> which is spawned by the postmaster upon connection. > > Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is > this spawning being done by postmaster or webserver or both? By the postmaster - for every connection initiated by the "client" (connection pool in the app server, richt client, ...) Thomas
Bill Wordsworth wrote: > On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote: > >> Bill Wordsworth wrote on 25.03.2008 19:16: >>> When traffic goes up, my webserver creates multiple instances of >>> postgresql.exe. At some basic level, aren't they similar to Oracle's RAC >>> "clusters", except that they are not aware of each other? >> No, absolutely not. Each client request is handled by a single postgres >> process >> which is spawned by the postmaster upon connection. > > > Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is > this spawning being done by postmaster or webserver or both? If postmaster, > does an application-level persistent connection request communicate itself > directly to the postmaster, and can the postmaster keep track of its > spawning? In simplified terms - you have one backend postgres process that handles the data storage and caching etc. Then you have one postgres process running for each client connected to the server at any given time. This client process handles all requests to and from the client and talks to the backend process to get the data required for the request. You will have one postgres client connection running for each concurrent db connection required by the web server. With the scripting used for building your web pages - each time you open a connection you start a postgres client process running as you have seen happen. Then when you close the connection the client process for that will finish. If you are using persistant connections - then when you close a connection the web server will keep the client process running and use it again for the next new connection saving time in starting the process up. > Also, at some crude level, if I were to direct every alternate connection to > a different install box of postgresql, won't that help with *some* > load-balance? > Cheers, Bill > All of these postgres processes will be running on the one machine - this may be the same machine as the web server or a separate one. You can use replication to store the same data on more than one server and use all of them for responding to selects for the web server. Most replication options go for only using one of these servers for updates and the others for selects only. You can then use pooling options such as pgpool (or code it into your scripting if you wish) to distribute your connection requests between these replicated servers. -- Shane Ambler pgSQL (at) Sheeky (dot) Biz Get Sheeky @ http://Sheeky.Biz