Re: Anyone working on better transaction locking? - Mailing list pgsql-hackers

From Shridhar Daithankar
Subject Re: Anyone working on better transaction locking?
Msg-id 200304121221.12377.shridhar_daithankar@nospam.persistent.co.in
In response to Re: Anyone working on better transaction locking?  (Kevin Brown <kevin@sysexperts.com>)
Responses Re: Anyone working on better transaction locking?  (Kevin Brown <kevin@sysexperts.com>)
Re: Anyone working on better transaction locking?  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
On Saturday 12 April 2003 03:02, you wrote:
> Ron Peacetree wrote:
> > Zeus had a performance ceiling roughly 3x that of Apache when Zeus
> > supported threading as well as pre-forking and Apache only supported
> > pre forking.  The Apache folks now support both.  DB2, Oracle, and SQL
> > Server all use threads.  Etc, etc.
>
> You can't use Apache as an example of why you should thread a database
> engine, except for the cases where the database is used much like the
> web server is: for numerous short transactions.

OK, let me share my experience. These are benchmarks on an intranet (100MBps
LAN), run against a 1GHz P-III/IV web server on Mandrake 9, serving a single
8K file.

Apache 2.0.44: 1300 rps
boa:           4500 rps
Zeus:          6500 rps

Apache does too many things to be a speed demon, and given everything it
offers, its performance is pretty impressive.

But a database is not a web server. It is not supposed to handle tons of
concurrent requests. That is a fundamental difference.

>
> > That's an awful lot of very bright programmers and some serious $$
> > voting that threads are worth it.  Given all that, if PostgreSQL
> > specific thread support is =not= showing itself to be a win that's
> > an unexpected enough outcome that we should be asking hard questions
> > as to why not.
>
> It's not that there won't be any performance benefits to be had from
> threading (there surely will, on some platforms), but gaining those
> benefits comes at a very high development and maintenance cost.  You
> lose a *lot* of robustness when all of your threads share the same
> memory space, and make yourself vulnerable to classes of failures that
> simply don't happen when you don't have shared memory space.

Well, threading does not necessarily imply a one-thread-per-connection model.
Threads can also be used to keep the CPU busy during I/O and to take advantage
of SMP for things like sorting. This is especially true on 2.4.x Linux
kernels, where async I/O cannot be used in threaded applications because
threads and signals do not mix well.
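
To illustrate the "keep the CPU busy during I/O" point, here is a minimal
sketch in Python. It is not PostgreSQL code; `slow_read` and
`sort_while_reading` are hypothetical names, and the blocking read is
simulated with a sleep. It only shows one helper thread waiting on I/O while
the calling thread sorts.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_read(delay_s: float) -> bytes:
    """Stand-in for a blocking disk read (hypothetical: sleeps, reads nothing)."""
    time.sleep(delay_s)
    return b"page"


def sort_while_reading(values, delay_s: float = 0.2):
    """Start the blocking read on a helper thread and sort on the calling thread."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(slow_read, delay_s)  # the I/O wait starts here
        ordered = sorted(values)                   # CPU work overlaps the wait
        page = pending.result()                    # block only for what is left
    return ordered, page
```

The connection handling stays as it is in this sketch; the thread exists only
to hide the I/O wait, which is the use of threads meant above.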

One connection per thread is not a good model for PostgreSQL, since it has
already built a robust product around the process paradigm. If I had to start
a new database project today, a mix of processes and threads is what I would
choose, but PostgreSQL is not at that stage of its life.

> > At their core, threads are a context switching efficiency tweak.
>
> This is the heart of the matter.  Context switching is an operating
> system problem, and *that* is where the optimization belongs.  Threads
> exist in large part because operating system vendors didn't bother to
> do a good job of optimizing process context switching and
> creation/destruction.

But why would a database need tons of context switches if it is not supposed
to service loads of requests simultaneously? If there are 50 concurrent
connections, how much context switching overhead is involved, regardless of
the amount of work done in a single connection? Remember that database state
is maintained in shared memory. It does not take a context switch to access it.
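
As a rough illustration of state living in shared memory, here is a minimal
POSIX-only sketch in Python. It is not how PostgreSQL sets up its shared
memory; `shared_counter_demo` is a hypothetical name. A child process updates
a counter in an anonymous shared mapping, and the parent reads the result
directly from the same memory, with no message passing between the two.

```python
import mmap
import os
import struct


def shared_counter_demo(increments: int = 1000) -> int:
    """Fork a child that bumps a counter stored in memory shared with the parent."""
    # Anonymous mapping; on Unix, mmap defaults to MAP_SHARED, so the region
    # stays shared across fork() instead of being copied.
    buf = mmap.mmap(-1, 8)
    struct.pack_into("q", buf, 0, 0)

    pid = os.fork()
    if pid == 0:
        # Child: update the shared counter in place, then exit immediately.
        try:
            for _ in range(increments):
                (value,) = struct.unpack_from("q", buf, 0)
                struct.pack_into("q", buf, 0, value + 1)
        finally:
            os._exit(0)

    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    (result,) = struct.unpack_from("q", buf, 0)
    buf.close()
    return result
```

Only one process writes here, so no locking is shown; a real multi-writer
setup would need it.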

That assumption stems from the database being very efficient at creating and
servicing a new connection. I am not very comfortable with that argument.

> Under Linux, from what I've read, process creation/destruction and
> context switching happens almost as fast as thread context switching
> on other operating systems (Windows in particular, if I'm not
> mistaken).

I hear Solaris also has very heavy processes. But PostgreSQL has other issues
with Solaris as well.
>
> > Since DB's switch context a lot under many circumstances, threads
> > should be a win under such circumstances.  At the least, it should be
> > helpful in situations where we have multiple CPUs to split query
> > execution between.

Can you give an example where a database does a lot of context switching for
a moderate number of connections?
Shridhar


