Re: Spinlock performance improvement proposal - Mailing list pgsql-hackers

From D. Hageman
Subject Re: Spinlock performance improvement proposal
Date
Msg-id Pine.LNX.4.33.0109262224040.1173-100000@typhon.dracken.com
In response to Re: Spinlock performance improvement proposal  (Alex Pilosov <alex@pilosoft.com>)
Responses Re: Spinlock performance improvement proposal
Re: Spinlock performance improvement proposal
List pgsql-hackers
On Wed, 26 Sep 2001, Alex Pilosov wrote:

> On Wed, 26 Sep 2001, D. Hageman wrote:
> 
> > When you need data that is specific to a thread you use a TSD (Thread 
> > Specific Data).  

> Which Linux does not support with a vengeance, to my knowledge.

I am not sure what that means.  If it works it works. 
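
For reference, here is a minimal sketch of TSD using the POSIX pthreads 
calls pthread_key_create(), pthread_setspecific() and 
pthread_getspecific().  The names counter_key, my_counter, worker and 
run_worker_pair are made up for illustration; nothing here is taken from 
PostgreSQL or from the threaded port discussed in this thread.  Each 
thread bumps what looks like one global counter, but only ever sees its 
own copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical key: one slot per thread, all reached through this handle. */
static pthread_key_t counter_key;
static pthread_once_t counter_key_once = PTHREAD_ONCE_INIT;

static void make_counter_key(void)
{
    /* free() is the destructor run on each thread's non-NULL value at exit. */
    int rc = pthread_key_create(&counter_key, free);

    assert(rc == 0);
    (void) rc;
}

/* Return the calling thread's private counter, allocating it on first use. */
static int *my_counter(void)
{
    int *p;

    pthread_once(&counter_key_once, make_counter_key);
    p = pthread_getspecific(counter_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p == NULL || pthread_setspecific(counter_key, p) != 0)
            abort();
    }
    return p;
}

/* Increment the thread-specific counter n times and hand back its value. */
static void *worker(void *arg)
{
    int n = *(const int *) arg;
    int i;

    for (i = 0; i < n; i++)
        (*my_counter())++;
    return (void *) (intptr_t) *my_counter();
}

/* Run two workers at once; each reports the count it saw.  0 on success. */
int run_worker_pair(int n1, int n2, int *seen1, int *seen2)
{
    pthread_t t1, t2;
    void *r1, *r2;

    if (pthread_create(&t1, NULL, worker, &n1) != 0)
        return -1;
    if (pthread_create(&t2, NULL, worker, &n2) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    *seen1 = (int) (intptr_t) r1;
    *seen2 = (int) (intptr_t) r2;
    return 0;
}
```

Calling run_worker_pair(3, 5, &a, &b) leaves a == 3 and b == 5, since 
neither thread ever touches the other's counter.  Build with -pthread.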

> As a matter of fact, quote from Linus on the matter was something like
> "Solution to slow process switching is fast process switching, not another
> kernel abstraction [referring to threads and TSD]". TSDs make
> implementation of thread switching complex, and fork() complex.

Linus does have some interesting ideas.  I always like to hear his 
perspective on matters, but just like the government - I don't always 
agree with him.  I don't see why TSDs would make the implementation of 
thread switching complex - seems to me that would be something that is 
implemented in the userland part of the pthreads implementation and 
not the kernel side.  I don't really like to talk specifics, but both the 
lightweight process and the system call fork() are implemented using the 
__clone kernel function with the parameters slightly different (This is 
in the Linux kernel, btw since you wanted to use that as an example).  The 
speed improvements the kernel has given the fork() call (like copy on 
write) only last until the process writes to memory.  The next time it 
comes around - it is for all intents and purposes a full context switch 
again.  With threads ... the cost is relatively consistent.
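
To make the fork() side concrete, here is a small sketch, assuming a 
POSIX system, of what copy on write looks like from user space.  The 
child's write lands in its own private copy of the page, so the parent's 
value does not change.  This only shows the semantics; it measures 
nothing, and the page sharing itself is a kernel detail the program 
cannot observe directly.  The names page_value and fork_write_demo are 
hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variable standing in for "some memory the process writes". */
static volatile int page_value;

/*
 * Fork, let the child overwrite page_value, then report what the parent
 * still sees and how the child exited.  Returns 0 on success, -1 on error.
 */
int fork_write_demo(int initial, int *parent_sees, int *child_exit)
{
    pid_t pid;
    int status;

    page_value = initial;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write goes to the child's own copy of the page. */
        page_value = initial + 1;
        _exit(page_value == initial + 1 ? 0 : 1);
    }

    /* Parent: wait for the child, then read our (unchanged) copy. */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    *parent_sees = page_value;
    *child_exit = WEXITSTATUS(status);
    return 0;
}
```

After fork_write_demo(7, &p, &c) the parent still reads 7 while the child 
exited having seen 8 in its own copy.  With threads the same write would 
be visible to every thread, which is exactly where TSD comes in.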

> The question about threads boils down to: Is there far more data that is
> shared than unshared? If yes, threads are better, if not, you'll be
> abusing TSD and slowing things down. 

I think the question about threads boils down to whether the core members of 
the PostgreSQL team want to try it or not.  At this time, I would have to 
say they pretty much agree they like things the way they are now, which is 
completely fine.  They are the ones that spend most of the time on it and 
want to support it.

> I believe right now, postgresql' model of sharing only things that need to
> be shared is pretty damn good. The only slight problem is overhead of
> forking another backend, but its still _fast_.

Oh, man ... am I reading stuff into what you are writing or are you 
reading stuff into what I am writing?  Maybe a little bit of both?  My 
original contention was that the best way to get the full 
potential out of SMP machines is to use a threads model.  I didn't say the 
present way wasn't fast.  

>  Actually, if I remember, there was someone who ported postgresql (I think
> it was 6.5) to be multithreaded with major pain, because the requirement
> was to integrate with CORBA. I believe that person posted some benchmarks
> which were essentially identical to non-threaded postgres...

Actually, it was 7.0.2 and the performance gain was interesting.  The 
posting can be found at:

http://candle.pha.pa.us/mhonarc/todo.detail/thread/msg00007.html

The results are:

20 clients, 900 inserts per client, 1 insert per transaction, 4 different
tables.

Build             Average completion (min:sec)
7.0.2             about 10:52
multi-threaded    2:42
7.1beta3          1:13

If the multi-threaded version was 7.0.2 and threads increased performance 
that much - I would have to say that was a bonus.  However, the 
performance increases that the PostgreSQL team implemented later ... 
pushed the regular version ahead again.  That kinda says to me that the 
potential is there.
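
Putting numbers on that: a quick sketch of the arithmetic, using only the 
three timings quoted above and the stated load of 20 clients x 900 
inserts.  The 7.0.2 figure is given as "about" 10:52, so treat the ratios 
as approximate.  The helper names are made up.

```c
#include <stdio.h>

/* Convert a min:sec completion time to seconds. */
int to_seconds(int minutes, int seconds)
{
    return minutes * 60 + seconds;
}

/* How many times longer the slower run took than the faster one. */
double ratio(int slower_s, int faster_s)
{
    return (double) slower_s / faster_s;
}

/* Aggregate insert rate for the whole test run. */
double inserts_per_second(int clients, int per_client, int elapsed_s)
{
    return (double) (clients * per_client) / elapsed_s;
}

/* Print the comparison for the three builds quoted in the post. */
void print_summary(void)
{
    int stock = to_seconds(10, 52);     /* 7.0.2, approximate */
    int threaded = to_seconds(2, 42);   /* multi-threaded 7.0.2 */
    int beta = to_seconds(1, 13);       /* 7.1beta3 */

    printf("7.0.2 vs multi-threaded:    %.2fx\n", ratio(stock, threaded));
    printf("multi-threaded vs 7.1beta3: %.2fx\n", ratio(threaded, beta));
    printf("inserts/sec: %.1f, %.1f, %.1f\n",
           inserts_per_second(20, 900, stock),
           inserts_per_second(20, 900, threaded),
           inserts_per_second(20, 900, beta));
}
```

By this arithmetic the threaded 7.0.2 build finished in roughly a quarter 
of the time of stock 7.0.2 (162 s against about 652 s), and 7.1beta3 in a 
bit under half the time of the threaded build (73 s).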

If you look at Myron Scott's post today you will see that it had other 
advantages going for it (like auto-vacuum!) and disadvantages ... rogue 
thread corruption (already debated today).

-- 
//========================================================\\
||  D. Hageman                    <dhageman@dracken.com>  ||
\\========================================================//




