Re: Performance Bottleneck - Mailing list pgsql-performance

From Matt Clark
Subject Re: Performance Bottleneck
Msg-id 007f01c47d54$233b4bf0$8300a8c0@solent
In response to Re: Performance Bottleneck  (Jan Wieck <JanWieck@Yahoo.com>)
Responses Re: Performance Bottleneck
Re: Performance Bottleneck
List pgsql-performance
> And this is exactly where the pgpool advantage lies. Especially with the
> TPC-W, the Apache is serving a mix of PHP (or whatever CGI technique is
> used) and static content like images. Since the 200+ Apache kids serve
> any of that content by random and the emulated browsers very much
> encourage it to ramp up MaxClients children by using up to 4 concurrent
> image connections, one does end up with MaxClients DB connections that
> are all relatively low frequently used. In contrast to that the real
> pgpool causes lesser, more active DB connections, which is better for
> performance.

There are two well-worn and very mature techniques for dealing with the
issue of web apps using one DB connection per apache process, both of which
work extremely well and attack the issue at its source.

1)    Use a front-end caching proxy like Squid as an accelerator.  Static
content will be served by the accelerator 99% of the time.  Additionally,
large pages can be served immediately to the accelerator by Apache, which
can then go on to serve another request without waiting for the end user's
dial-up connection to pull the data down.  Massive speedup, fewer apache
processes needed.

2)    Serve static content from an entirely separate Apache server, apart
from the one serving dynamic content, using a separate domain for it (e.g.
'static.foo.com').
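
To put rough numbers on the first option, here is a back-of-envelope sketch
using Little's law (average busy processes = request rate x time each request
holds a process). Every figure in it (request rate, page size, link speeds,
generation time) is a made-up assumption for illustration, not a measurement
from any real site.

```python
def busy_backend_processes(requests_per_sec, page_bytes,
                           drain_bytes_per_sec, generate_secs):
    """Little's law: busy processes = arrival rate * time a request holds one."""
    # A process is tied up while it generates the page and until the
    # response has been fully handed off to whoever is reading it.
    hold_secs = generate_secs + page_bytes / drain_bytes_per_sec
    return requests_per_sec * hold_secs


# Hypothetical workload: 50 requests/s, 60 KB pages, 0.05 s to build a page.
RATE = 50.0
PAGE = 60_000
GEN = 0.05

# No accelerator: Apache drains to the end user's dial-up link (~5 KB/s assumed).
direct = busy_backend_processes(RATE, PAGE, 5_000, GEN)

# With accelerator: Apache drains to the proxy over the LAN (~10 MB/s assumed).
proxied = busy_backend_processes(RATE, PAGE, 10_000_000, GEN)

print(f"direct:  {direct:.1f} busy Apache processes")
print(f"proxied: {proxied:.1f} busy Apache processes")
```

Under these assumed numbers nearly all of the hold time in the direct case is
spent waiting on the slow client, which is the part the accelerator takes off
Apache's hands.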

Personally I favour number 1.  Our last biggish peak saw 6000 open HTTP and
HTTPS connections and only 200 apache children, all of them nice and busy,
not hanging around on street corners looking bored.  During quiet times
Apache drops back to its configured minimum of 40 kids.  Option 2 has the
advantage that you can use a leaner build for the 'dynamic' apache server,
but with RAM so plentiful these days that's a less useful property.

Basically this puts the 'pooling' back in the stateless HTTP area where it
truly belongs and can be proven not to have any peculiar side effects
(especially when it comes to transaction safety).  Even better, provided
you use URL parameters for searches and the like, you can have the
accelerator cache those pages for a certain time too, as long as slightly
stale results are OK.
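
The idea of caching parameterised pages for a limited time can be sketched as
a small time-to-live cache keyed by URL. This is only a toy model, not how
any particular accelerator implements it: the class name, the 60-second
lifetime and the choice to sort query parameters before building the key are
all assumptions made for the example.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit


class TTLPageCache:
    """Toy page cache: entries are reused until they are ttl_secs old."""

    def __init__(self, ttl_secs, clock=time.monotonic):
        self.ttl_secs = ttl_secs
        self.clock = clock      # injectable so the cache can be tested
        self._store = {}        # key -> (time stored, page body)

    @staticmethod
    def key(url):
        # Assumption: parameter order does not change the result, so sort
        # the parameters to let equivalent URLs share one cache entry.
        parts = urlsplit(url)
        params = sorted(parse_qsl(parts.query, keep_blank_values=True))
        return f"{parts.path}?{urlencode(params)}"

    def get(self, url, render):
        """Return a cached page if it is fresh enough, else call render(url)."""
        k = self.key(url)
        now = self.clock()
        hit = self._store.get(k)
        if hit is not None and now - hit[0] < self.ttl_secs:
            return hit[1]       # possibly slightly stale, by design
        body = render(url)      # only here does the backend (and DB) do work
        self._store[k] = (now, body)
        return body


if __name__ == "__main__":
    cache = TTLPageCache(ttl_secs=60)
    render = lambda url: f"results for {url}"
    print(cache.get("/search?q=pgpool&page=1", render))  # rendered
    print(cache.get("/search?page=1&q=pgpool", render))  # served from cache
```

This only works because the whole request is described by the URL; a search
submitted as a POST body would give the cache nothing to key on.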

I'm sure pgpool and the like have their place, but being band-aids for
poorly configured websites probably isn't the best use for them.

M

