Thread: What else are OIDs used for?
I know that an OID is generated each time I add a new row to a table (unless the table was created without oids). I also know that OIDs are generated for entries in the pg_largeobject table. Are OIDs generated for anything else?
Hello all: I'm curious about the limits to postgres' scalability. More exactly, I'm curious exactly how to make it scale. I'm trying to develop specs for an application which must support a minimum of 800 concurrent users, up to double that in short peak periods. I'm hoping to do this with a Postgres back end and an Apache/PHP front end. On the front end, as I understand it (I know this is not really Postgres-specific), with Apache and mod_php I need one process per concurrent user. Anyone care to speculate how many concurrent users I could get on a single box? I really don't know what's reasonable to expect. On the back end, is there any direct relationship between the number of open client connections and the number of processes used? (I guess that's another way of asking whether client connections are handled in a multithreaded fashion). My question again would be, is it at all reasonable to think that the postgres back end, running on a single box, could handle 800-1200 concurrent users? Is it a matter of running multiple postmasters? If I can't get all those users on one back-end box, how do I distribute them across multiple servers but have them all access the same data store? Thanks, steve
On Sat, 13 Apr 2002 17:00:19 -0500 "Steve Lane" <slane@fmpro.com> wrote: > On the front end, as I understand it (I know this is not really > Postgres-specific), with Apache and mod_php I need one process per > concurrent user. Anyone care to speculate how many concurrent users I could > get on a single box? I really don't know what's reasonable to expect. For the front-end, it totally depends on the hardware you're using, the OS you're running this on, and the design/performance requirements of your application. For example, a good caching layer could easily improve performance by 100% or more. > On the back end, is there any direct relationship between the number of open > client connections and the number of processes used? Yes, there is 1 postgres process per database connection. Whether you create 1 database connection per client would depend on how you design your application. > My question again would be, is it at all reasonable to think that > the postgres back end, running on a single box, could handle 800-1200 > concurrent users? Not really sure. By "concurrent users", do you mean "executing queries at any given time", or "logged in" (so that perhaps 10% of those will actually be hitting the DB)? > Is it a matter of running multiple postmasters? If you mean running multiple postmasters on a single machine, that is unlikely to help. > If I can't get all those users on one back-end box, how do I distribute them > across multiple servers but have them all access the same data store? There might be support for replication in 7.3; until then, there are some projects like erServer you can take a look at. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
On 4/13/02 5:10 PM, "Neil Conway" <nconway@klamath.dyndns.org> wrote: > On Sat, 13 Apr 2002 17:00:19 -0500 > "Steve Lane" <slane@fmpro.com> wrote: >> On the front end, as I understand it (I know this is not really >> Postgres-specific), with Apache and mod_php I need one process per >> concurrent user. Anyone care to speculate how many concurrent users I could >> get on a single box? I really don't know what's reasonable to expect. > > For the front-end, it totally depends on the hardware you're using, > the OS you're running this on, and the design/performance requirements > of your application. For example, a good caching layer could easily > improve performance by 100% or more. I'm less concerned with performance (at the moment) than concurrency. My worry is that (lacking a multithreaded web server, which Apache 2.0 appears to give me), I need to have 800-1200 processes, one per connection, running on the web server or servers. I don't know if that's feasible under any circumstances. I guess I'm less worried about the front end though, because load-balancing across multiple web servers is not a huge deal. > >> On the back end, is there any direct relationship between the number of open >> client connections and the number of processes used? > > Yes, there is 1 postgres process per database connection. Whether > you create 1 database connection per client would depend on how you > design your application. Can you clarify that second sentence a bit? I wasn't aware I had much choice -- meaning that, since Apache 1.x + PHP is not multithreaded and does not do connection pooling, I think I'm stuck with one database connection per front-side client connection. > >> My question again would be, is it at all reasonable to think that >> the postgres back end, running on a single box, could handle 800-1200 >> concurrent users? > > Not really sure. 
By "concurrent users", do you mean "executing queries > at any given time", or "logged in" (so that perhaps 10% of those will > actually be hitting the DB)? More the latter -- open database connections that may be active or idle. >> If I can't get all those users on one back-end box, how do I distribute them >> across multiple servers but have them all access the same data store? > > There might be support for replication in 7.3; until then, there > are some projects like erServer you can take a look at. Thanks for the detailed response. -- steve
On Sat, 13 Apr 2002, Steve Lane wrote: > On 4/13/02 5:10 PM, "Neil Conway" <nconway@klamath.dyndns.org> wrote: > > > On Sat, 13 Apr 2002 17:00:19 -0500 > > "Steve Lane" <slane@fmpro.com> wrote: > >> On the front end, as I understand it (I know this is not really > >> Postgres-specific), with Apache and mod_php I need one process per > >> concurrent user. Anyone care to speculate how many concurrent users I could > >> get on a single box? I really don't know what's reasonable to expect. > > > > For the front-end, it totally depends on the hardware you're using, > > the OS you're running this on, and the design/performance requirements > > of your application. For example, a good caching layer could easily > > improve performance by 100% or more. > > I'm less concerned with performance (at the moment) than concurrency. My > worry is that (lacking a multithreaded web server, which Apache 2.0 appears > to give me), I need to have 800-1200 processes, one per connection, running > on the web server or servers. I don't know if that's feasible under any > circumstances. It's not something I've done, but I don't see why not if the OS is configured with appropriate limits on processes, open file descriptors, etc. > [snip] > > >> On the back end, is there any direct relationship between the number of open > >> client connections and the number of processes used? > > > > Yes, there is 1 postgres process per database connection. Whether > > you create 1 database connection per client would depend on how you > > design your application. > > Can you clarify that second sentence a bit? I wasn't aware I had much choice > -- meaning that, since Apache 1.x + PHP is not multithreaded and does not do > connection pooling, I think I'm stuck with one database connection per > front-side client connection. He means each frontend process opens one and only one connection to the DB. 
Since an application can open more than one connection, and connections map one-to-one onto backend processes (plus the postmaster, stats, etc.), it is possible to have one application process served by many backend processes. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
On Sat, Apr 13, 2002 at 05:24:48PM -0500, Steve Lane wrote: > On 4/13/02 5:10 PM, "Neil Conway" <nconway@klamath.dyndns.org> wrote: > > > On Sat, 13 Apr 2002 17:00:19 -0500 > > "Steve Lane" <slane@fmpro.com> wrote: > >> On the front end, as I understand it (I know this is not really > >> Postgres-specific), with Apache and mod_php I need one process per > >> concurrent user. Anyone care to speculate how many concurrent users I could > >> get on a single box? I really don't know what's reasonable to expect. > > > > For the front-end, it totally depends on the hardware you're using, > > the OS you're running this on, and the design/performance requirements > > of your application. For example, a good caching layer could easily > > improve performance by 100% or more. > > I'm less concerned with performance (at the moment) than concurrency. My > worry is that (lacking a multithreaded web server, which Apache 2.0 appears > to give me), I need to have 800-1200 processes, one per connection, running > on the web server or servers. I don't know if that's feasible under any > circumstances. In a sense they are tied together. Say all your queries for each transaction are done in less than 100ms, then you can handle many more clients for the same load than if they took 10 times as long. Also, an idling client generally does not keep a connection open to the Apache server. So if you have 800 people changing webpage once a minute, you're really only going to be handling 15 processes at the same time. > I guess I'm less worried about the front end though, because load-balancing > across multiple web servers is not a huge deal. I find it helpful if 1) you make all your queries simple and fast and 2) if your query does require some trawling, cache and/or precalculate. > > Yes, there is 1 postgres process per database connection. Whether > > you create 1 database connection per client would depend on how you > > design your application. 
> > Can you clarify that second sentence a bit? I wasn't aware I had much choice > -- meaning that, since Apache 1.x + PHP is not multithreaded and does not do > connection pooling, I think I'm stuck with one database connection per > front-side client connection. Well, generally, number of Apache connections == number of database connections == number of postgres backends. The connection pooling in PHP only stops it reconnecting for each page. But as I state above, the number of Apache connections will probably be *far* less than the number of clients. HTH, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Ignorance continues to thrive when intelligent people choose to do > nothing. Speaking out against censorship and ignorance is the imperative > of all intelligent people.
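The estimate in the message above can be reproduced with a small back-of-the-envelope sketch. It assumes Little's law (average requests in flight = arrival rate x time per request); the per-request times used here are illustrative guesses, not figures from the thread.

```python
def concurrent_requests(users: int, seconds_between_hits: float,
                        seconds_per_request: float) -> float:
    """Average number of requests in flight (Little's law)."""
    arrival_rate = users / seconds_between_hits  # requests per second
    return arrival_rate * seconds_per_request


# 800 users, each changing page once a minute.
# The two per-request times are assumptions for illustration only.
for label, service_time in [("fast pages (0.1 s)", 0.1), ("slow pages (1.0 s)", 1.0)]:
    busy = concurrent_requests(800, 60.0, service_time)
    print(f"{label}: about {busy:.1f} busy httpd children")
```

With roughly one second per page this lands near the "15 processes" quoted above, and it also shows why faster queries translate directly into fewer simultaneous processes.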
On Sun, 14 Apr 2002 10:38:06 +1000 "Martijn van Oosterhout" <kleptog@svana.org> wrote: > Also, an idling client generally does not keep a connection open to the > Apache server. So if you have 800 people changing webpage once a minute, > you're really only going to be handling 15 processes at the same time. This assumes you're not using KeepAlives, in which case an httpd child will wait around for KeepAliveTimeout seconds before serving other clients. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
On Sat, Apr 13, 2002 at 08:45:56PM -0400, Neil Conway wrote: > On Sun, 14 Apr 2002 10:38:06 +1000 > "Martijn van Oosterhout" <kleptog@svana.org> wrote: > > Also, an idling client generally does not keep a connection open to the > > Apache server. So if you have 800 people changing webpage once a minute, > > you're really only going to be handling 15 processes at the same time. > > This assumes you're not using KeepAlives, in which case an httpd child > will wait around for KeepAliveTimeout seconds before serving other > clients. Hmm, the default is 15 seconds. So if you are expecting lots of short transactions, this could blow out your connection count to 200 or so. Depending on the situation I'd be tempted to drop that down, since the cost of setting up connections is much lower on a LAN than over the internet (assuming he's running on a LAN). But a valid point nonetheless... -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
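A rough sketch of where the "200 or so" figure comes from. It assumes every page hit opens a fresh connection and that the httpd child then sits idle for the full KeepAliveTimeout before it is freed; real browsers may open several connections per page, which this ignores. The 0.1 s page time is an assumption, not a number from the thread.

```python
def children_held(users: int, seconds_between_hits: float,
                  seconds_per_request: float, keepalive_timeout: float) -> float:
    """Average httpd children tied up, counting the idle keep-alive wait."""
    arrival_rate = users / seconds_between_hits  # new connections per second
    return arrival_rate * (seconds_per_request + keepalive_timeout)


# 800 users hitting once a minute, for a few candidate KeepAliveTimeout values.
for timeout in (15, 5, 1, 0):
    held = children_held(800, 60.0, 0.1, timeout)
    print(f"KeepAliveTimeout={timeout:>2}s: about {held:.0f} children held")
```

Under these assumptions the idle wait dominates: almost all of the children held at a 15 second timeout are doing nothing but waiting for a follow-up request.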
On 4/13/02 7:38 PM, "Martijn van Oosterhout" <kleptog@svana.org> wrote: >> I'm less concerned with performance (at the moment) than concurrency. My >> worry is that (lacking a multithreaded web server, which Apache 2.0 appears >> to give me), I need to have 800-1200 processes, one per connection, running >> on the web server or servers. I don't know if that's feasible under any >> circumstances. > > In a sense they are tied together. Say all your queries for each transaction > are done in less than 100ms, then you can handle many more clients for the > same load than if they took 10 times as long. > > Also, an idling client generally does not keep a connection open to the > Apache server. So if you have 800 people changing webpage once a minute, > you're really only going to be handling 15 processes at the same time. Actually, I have a lot of trouble with this if I instruct PHP to make persistent connections. If I do this, the idle connections just pile up and never close. A few hours or days, and postgres maxes out on the number of open connections. The idle ones are never reclaimed. I have heard rumors that PHP's persistent connection to postgres has some troubles, and this seems to be so. > >> I guess I'm less worried about the front end though, because load-balancing >> across multiple web servers is not a huge deal. > > I find it helpful if 1) you make all your queries simple and fast and 2) if > your query does require some trawling, cache and/or precalculate. For the largest application I currently run, we are fairly denormalized and push a lot of data around the back end with triggers. I imagine this is helping to some degree. >> >> Can you clarify that second sentence a bit? I wasn't aware I had much choice >> -- meaning that, since Apache 1.x + PHP is not multithreaded and does not do >> connection pooling, I think I'm stuck with one database connection per >> front-side client connection. 
> > Well, number of apache connections == number database connections == number > of postgres backend generally. The connection pooling in PHP only stops it > reconnecting for each page. But as I state above, the number of apache > connections will probably be *far* less than the number of clients. Ah, that's the kind of simple math I was looking for. Thanks. -- steve
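One possible reading of the "connections pile up" symptom, which the thread does not confirm: PHP persistent connections are held per httpd child, so in the worst case every child keeps its own backend open. The sketch below compares that worst case against the server's connection limit. `MaxClients` (Apache 1.x) and `max_connections` (postgres) are real settings, but the values used are hypothetical.

```python
def worst_case_backends(max_clients: int, conn_strings_per_child: int = 1) -> int:
    """Upper bound on backends if every httpd child keeps its persistent connections."""
    return max_clients * conn_strings_per_child


def fits(max_clients: int, max_connections: int, conn_strings_per_child: int = 1) -> bool:
    """True if the worst case stays within the postgres connection limit."""
    return worst_case_backends(max_clients, conn_strings_per_child) <= max_connections


# Hypothetical settings, not taken from the thread.
for max_clients, max_connections in [(150, 64), (50, 64)]:
    need = worst_case_backends(max_clients)
    verdict = "ok" if fits(max_clients, max_connections) else "can run out of connections"
    print(f"MaxClients={max_clients}, max_connections={max_connections}: "
          f"up to {need} backends, {verdict}")
```

If this reading applies, keeping the number of httpd children at or below the postgres limit (or raising the limit) would be the thing to check first.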
On 4/13/02 7:45 PM, "Neil Conway" <nconway@klamath.dyndns.org> wrote: > On Sun, 14 Apr 2002 10:38:06 +1000 > "Martijn van Oosterhout" <kleptog@svana.org> wrote: >> Also, an idling client generally does not keep a connection open to the >> Apache server. So if you have 800 people changing webpage once a minute, >> you're really only going to be handling 15 processes at the same time. > > This assumes you're not using KeepAlives, in which case an httpd child > will wait around for KeepAliveTimeout seconds before serving other > clients. Ah ... Good point. I should check this. My problem with persistent connections might be related to exactly this.
On 4/13/02 8:04 PM, "Martijn van Oosterhout" <kleptog@svana.org> wrote: > On Sat, Apr 13, 2002 at 08:45:56PM -0400, Neil Conway wrote: >> On Sun, 14 Apr 2002 10:38:06 +1000 >> "Martijn van Oosterhout" <kleptog@svana.org> wrote: >>> Also, an idling client generally does not keep a connection open to the >>> Apache server. So if you have 800 people changing webpage once a minute, >>> you're really only going to be handling 15 processes at the same time. >> >> This assumes you're not using KeepAlives, in which case an httpd child >> will wait around for KeepAliveTimeout seconds before serving other >> clients. > > Hmm, the default is 15 seconds. So if you are expecting lots of short > transactions, this could blow out your connection count to 200 or so. > Depending on the situation I'd be tempted to drop that down since the costs > of setting up connections is much lower on a LAN than over the internet > (assuming he's running on a LAN). > > But a valid point notheless... Apache and Postgres are actually right on the same box. Any suggestions as to an appropriate value for KeepAliveTimeout under those circumstances?
On Sat, 13 Apr 2002 22:24:07 -0500 "Steve Lane" <slane@fmpro.com> wrote: > Apache and Postgres are actually right on the same box. Any suggestions as > to an appropriate value for KeepAliveTimeout under those circumstances? I think the usefulness of KeepAlives varies directly with the proportion of your clients that are on slow connections. If this is a LAN situation, I'd be inclined to disable KeepAlives entirely. If it's a more heterogeneous environment, I'm not sure I can give you any canonical figures -- if you can set up a benchmarking environment in which you have multiple clients using different connection speeds (perhaps simulated), that would probably give you the best data. Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
... comments inline Steve Lane wrote: >I'm less concerned with performance (at the moment) than concurrency. My >worry is that (lacking a multithreaded web server, which Apache 2.0 appears >to give me), I need to have 800-1200 processes, one per connection, running >on the web server or servers. I don't know if that's feasible under any >circumstances. > >I guess I'm less worried about the front end though, because load-balancing >across multiple web servers is not a huge deal. > With PHP and, say, a dual box with PIII Xeons @ 800 MHz, you can expect 600-800 users with a 'moderate' dynamic content volume. Your app probably falls more towards the heavy column, and it requires loads of RAM. Apache 2 may solve some of that but... well... I have no idea, as I've yet to even look at it :) > >Can you clarify that second sentence a bit? I wasn't aware I had much choice >-- meaning that, since Apache 1.x + PHP is not multithreaded and does not do >connection pooling, I think I'm stuck with one database connection per >front-side client connection. > PHP with mysql does do DB connection pooling, and MAY do connection pooling for postgres (check the docs), and in fact probably does.
Martijn van Oosterhout wrote:
> On Sat, Apr 13, 2002 at 08:45:56PM -0400, Neil Conway wrote:
>> On Sun, 14 Apr 2002 10:38:06 +1000
>> "Martijn van Oosterhout" <kleptog@svana.org> wrote:
>>> Also, an idling client generally does not keep a connection open to the
>>> Apache server. So if you have 800 people changing webpage once a minute,
>>> you're really only going to be handling 15 processes at the same time.
>> This assumes you're not using KeepAlives, in which case an httpd child
>> will wait around for KeepAliveTimeout seconds before serving other
>> clients.
> Hmm, the default is 15 seconds. So if you are expecting lots of short
> transactions, this could blow out your connection count to 200 or so.
> Depending on the situation I'd be tempted to drop that down since the costs
> of setting up connections is much lower on a LAN than over the internet
> (assuming he's running on a LAN).
> But a valid point nonetheless...
Keep-alives must be enabled on both client and server. Netscape 4 doesn't do keep-alives; NS6 I don't know; most all versions of MSIE do keep-alive if the server allows them.
> >>>Also, an idling client generally does not keep a connection open > >>>to the Apache server. So if you have 800 people changing webpage > >>>once a minute, you're really only going to be handling 15 > >>>processes at the same time. > >>> > >>This assumes you're not using KeepAlives, in which case an httpd > >>child will wait around for KeepAliveTimeout seconds before serving > >>other clients. > > > >Hmm, the default is 15 seconds. So if you are expecting lots of > >short transactions, this could blow out your connection count to > >200 or so. Depending on the situation I'd be tempted to drop that > >down since the costs of setting up connections is much lower on a > >LAN than over the internet (assuming he's running on a LAN). > > > >But a valid point notheless... > > KEepAlives must be enabled on client and server, Netscape 4 doesn't > do keep-alives, NS6 I don't know, mosta ll versions of MSIE do keep > alive if the server allows them. There is no point to keep alives with a value greater than 1 second unless all of your clients are over high latency links and _want_ them on because of the number of images on a site. This is getting pretty OT, but if you're using persistent database connections, turning keep alives OFF is an absolute must. The only reason to turn keep alives on is if you're trying to avoid the overhead of bringing up and tearing down a TCP connection, which, with props to most OS implementations, is very quick and efficient (save HPUX, AIX, and Solaris... errr.. Slowaris). Expense of bringing up a new TCP connection versus the number of requests that could be served while _WAITING_ for the client to send its next HTTP request. No brainer: keep alives, in almost every context/situation, must die a painful and agonizing death. -sc PS The only instance where someone could reasonably justify using keep alives would be for an Intranet site with 50+ images per HTML page... 
and even then, that's a non-issue given that browsers cache the images after the 1st page load. -- Sean Chittenden
AFAIK, connections aren't linked to clients per se, but rather to active connections. In PHP4, using pg_pconnect, connections are recycled after a client's page hit is finished, so they're available for the next page hit. Meaning, to service 800 simultaneous users with 10% active saturation, you'd have 80 active DB connections - something an appropriately configured server should handle easily. BTW - 10% active connections is high. Those 800 people would have to be working it pretty darn hard to get numbers like that. My experience would be more like (tops) 2-5%. In most circumstances, 1/2 or more of the wait from click to view on a web hit is download/render time on the client, not the web server. -Ben ----- Original Message ----- From: Nigel J. Andrews <nandrews@investsystems.co.uk> To: <pgsql-general@postgresql.org> Sent: Saturday, April 13, 2002 4:30 PM Subject: Re: [GENERAL] Scaling postgres > [snip]
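Ben's percentages can be turned into connection counts with a short sketch. The 2%, 5% and 10% fractions and the 800 users come from his message; the 1600-user row reflects the "up to double that" peak in the original question. Treating "active saturation" as a simple fraction of logged-in users is an assumption.

```python
def active_connections(users: int, active_fraction: float) -> int:
    """Database connections busy at once if a fixed fraction of users is active."""
    return round(users * active_fraction)


# Normal load and the peak of roughly double that.
for users in (800, 1600):
    for fraction in (0.02, 0.05, 0.10):
        busy = active_connections(users, fraction)
        print(f"{users} users at {fraction:.0%} active: {busy} connections")
```

Even the pessimistic 10% case at peak stays far below one connection per logged-in user, which is the point of his estimate.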
I think you mean pg_connect(). With pg_pconnect(), the connection stays open until you close it manually with pg_close(), whereas a pg_connect() connection is only open until the script has finished executing. ----- Original Message ----- From: "Benjamin Smith" <bens@effortlessis.com> To: <pgsql-general@postgresql.org> Sent: Saturday, April 13, 2002 4:40 PM Subject: Re: [GENERAL] Scaling postgres > [snip]