Thread: Latch implementation
Hi,

I've been playing around with measuring the latch implementation in 9.1,
and here are the results of a ping-pong test with 2 processes signalling
and waiting on the latch. I did three variations (linux 2.6.18, nehalem
processor).

One is the current one.

The second is built on native semaphores on linux. This one cannot
implement WaitLatchOrSocket, since there's no select() involved.

The third is an implementation based on pipe() and poll(). Note: in its
current incarnation it's essentially a hack to measure performance; it's
not usable in postgres as-is, since it assumes all latches are created
before any process is forked. We'd need to use mkfifo or similar to sort
that out if we really want to go this route.

- Current implementation: 1 pingpong is avg 15 usecs
- Pipe+poll: 9 usecs
- Semaphore: 6 usecs

The test program & modified unix_latch.c are attached; you can compile it
like "gcc -DPIPE -O2 sema.c", "gcc -DLINUX_SEM -O2 sema.c", or
"gcc -O2 sema.c".

Thanks,
--Ganesh
On Wed, Sep 22, 2010 at 4:31 PM, Ganesh Venkitachalam-1 <ganesh@vmware.com> wrote:
> I've been playing around with measuring the latch implementation in 9.1, and
> here are the results of a ping-pong test with 2 processes signalling and
> waiting on the latch. I did three variations (linux 2.6.18, nehalem
> processor).
>
> One is the current one.
>
> The second is built on native semaphores on linux. This one cannot
> implement WaitLatchOrSocket, there's no select involved.
>
> The third is an implementation based on pipe() and poll. Note: in its
> current incarnation it's essentially a hack to measure performance, it's not
> usable in postgres, this assumes all latches are created before any process
> is forked. We'd need to use mkfifo to sort that out if we really want to go
> this route, or similar.
>
> - Current implementation: 1 pingpong is avg 15 usecs
> - Pipe+poll: 9 usecs
> - Semaphore: 6 usecs

Interesting numbers. I guess one question is how much improving the
performance of the latch implementation would affect overall system
performance. Synchronous replication is obviously going to be highly
sensitive to latency, but even in that context I'm not really sure
whether this is enough to matter. Do you have any sense of that?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
On 22/09/10 23:31, Ganesh Venkitachalam-1 wrote:
> I've been playing around with measuring the latch implementation in 9.1,
> and here are the results of a ping-pong test with 2 processes signalling
> and waiting on the latch. I did three variations (linux 2.6.18, nehalem
> processor).
>
> One is the current one.
>
> The second is built on native semaphores on linux. This one cannot
> implement WaitLatchOrSocket, there's no select involved.
>
> The third is an implementation based on pipe() and poll. Note: in its
> current incarnation it's essentially a hack to measure performance, it's
> not usable in postgres, this assumes all latches are created before any
> process is forked. We'd need to use mkfifo to sort that out if we really
> want to go this route, or similar.
>
> - Current implementation: 1 pingpong is avg 15 usecs
> - Pipe+poll: 9 usecs
> - Semaphore: 6 usecs
>
> The test program & modified unix_latch.c is attached, you can compile it
> like "gcc -DPIPE -O2 sema.c" or "gcc -DLINUX_SEM -O2 sema.c" or "gcc -O2
> sema.c".

Interesting, thanks for the testing! Could you also test how much faster
the current implementation gets by just replacing select() with poll()?
That should shave off some of the overhead.

--
Heikki Linnakangas
EnterpriseDB   http://www.enterprisedb.com
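The change Heikki is suggesting amounts to swapping the fd_set machinery in the wait loop for a single pollfd. A minimal sketch of the poll() side, assuming a self-pipe read fd like the one unix_latch.c uses (the function name and timeout handling here are illustrative, not the real code):

```c
#include <poll.h>

/* Wait for the self-pipe to become readable, or for timeout_ms to
 * elapse (-1 = block forever). Returns >0 when readable, 0 on timeout,
 * -1 on error. */
static int
wait_on_selfpipe(int selfpipe_readfd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = selfpipe_readfd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Unlike select(), poll() has no FD_SETSIZE ceiling and no fd_set
     * bitmaps to zero and rebuild on every call, which is where the
     * per-wait overhead saving comes from. */
    return poll(&pfd, 1, timeout_ms);
}
```

The latch semantics are unchanged; only the kernel entry point differs, so this is a drop-in experiment for the benchmark.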
On Wed, 2010-09-22 at 13:31 -0700, Ganesh Venkitachalam-1 wrote:
> Hi,
>
> I've been playing around with measuring the latch implementation in 9.1,
> and here are the results of a ping-pong test with 2 processes signalling
> and waiting on the latch. I did three variations (linux 2.6.18, nehalem
> processor).
>
> One is the current one.
>
> The second is built on native semaphores on linux. This one cannot
> implement WaitLatchOrSocket, there's no select involved.

That looks interesting. If we had a need for a latch that did not need
to wait on a socket as well, this would be better. In sync rep, we
certainly do. Thanks for measuring this.

The question is: in that case, would we use latches or a PGSemaphore?

If the answer is "latch", then we could just have an additional boolean
option when we request InitLatch() to say which kind of latch we want.

> The third is an implementation based on pipe() and poll. Note: in its
> current incarnation it's essentially a hack to measure performance, it's
> not usable in postgres, this assumes all latches are created before any
> process is forked. We'd need to use mkfifo to sort that out if we really
> want to go this route, or similar.
>
> - Current implementation: 1 pingpong is avg 15 usecs
> - Pipe+poll: 9 usecs
> - Semaphore: 6 usecs

Pipe+poll is not worth it, then.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services
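Simon's boolean-option idea could look something like the toy sketch below: a flag at init time chooses the backing implementation, and only self-pipe latches support multiplexing with a socket. Everything here (InitLatchEx, the enum, the struct layout) is invented for illustration; it is not the real latch API or a proposed patch.

```c
#include <stdbool.h>

typedef enum
{
    LATCH_SELFPIPE,     /* select/poll-able: supports WaitLatchOrSocket */
    LATCH_SEMAPHORE     /* faster, but latch-only waits */
} LatchKind;

typedef struct
{
    LatchKind kind;
    bool is_set;
} Latch;

/* waitable_on_socket = true forces the self-pipe implementation,
 * since only a file descriptor can be multiplexed with a socket;
 * otherwise the cheaper semaphore backing can be used. */
static void
InitLatchEx(Latch *latch, bool waitable_on_socket)
{
    latch->kind = waitable_on_socket ? LATCH_SELFPIPE : LATCH_SEMAPHORE;
    latch->is_set = false;
}
```

Callers that only ever WaitLatch() (e.g. a sync-rep commit wait, per this subthread) would pass false and get the 6 usec path; walsender-style loops that mix socket and latch waits would pass true.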
Attached is the current implementation redone with poll(). It lands at
around 10.5 usecs: just above pipe, but better than the current
implementation.

As to the other questions: yes, this would matter for sync replication.
Consider an enterprise use case with a 10Gb network & SSDs (not at all
uncommon): a 10Gb network can do a roundtrip with the commit log in
<10 usecs, and SSDs have write latency <50 usecs. Now if the latch takes
tens of usecs (this stuff scales somewhat with the number of processes;
my data is all with 2 processes), that becomes a very significant part
of the net commit latency. So I'd think this is worth fixing.

Thanks,
--Ganesh

On Thu, 23 Sep 2010, Simon Riggs wrote:

> Date: Thu, 23 Sep 2010 06:56:38 -0700
> From: Simon Riggs <simon@2ndQuadrant.com>
> To: Ganesh Venkitachalam <ganesh@vmware.com>
> Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
> Subject: Re: [HACKERS] Latch implementation
>
> On Wed, 2010-09-22 at 13:31 -0700, Ganesh Venkitachalam-1 wrote:
>> Hi,
>>
>> I've been playing around with measuring the latch implementation in 9.1,
>> and here are the results of a ping-pong test with 2 processes signalling
>> and waiting on the latch. I did three variations (linux 2.6.18, nehalem
>> processor).
>>
>> One is the current one.
>>
>> The second is built on native semaphores on linux. This one cannot
>> implement WaitLatchOrSocket, there's no select involved.
>
> That looks interesting. If we had a need for a latch that would not need
> to wait on a socket as well, this would be better. In sync rep, we
> certainly do. Thanks for measuring this.
>
> Question is: in that case would we use latches or a PGsemaphore?
>
> If the answer is "latch" then we could just have an additional boolean
> option when we request InitLatch() to see what kind of latch we want.
>
>> The third is an implementation based on pipe() and poll. Note: in its
>> current incarnation it's essentially a hack to measure performance, it's
>> not usable in postgres, this assumes all latches are created before any
>> process is forked. We'd need to use mkfifo to sort that out if we really
>> want to go this route, or similar.
>>
>> - Current implementation: 1 pingpong is avg 15 usecs
>> - Pipe+poll: 9 usecs
>> - Semaphore: 6 usecs
>
> Pipe+poll not worth it then.
>
> --
> Simon Riggs           www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Training and Services
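For reference, the pipe-based variant being dismissed here is conceptually tiny: SetLatch writes a byte, WaitLatch polls and drains it. The sketch below shows the shape and why the fork-ordering limitation exists; the names are illustrative, not the measured hack.

```c
#include <poll.h>
#include <unistd.h>

typedef struct
{
    int fds[2];        /* fds[0] = read end, fds[1] = write end */
} PipeLatch;

/* The pipe must be created BEFORE fork() so both processes inherit the
 * same fds -- the limitation Ganesh notes above. A named fifo (mkfifo)
 * would let unrelated processes open it later instead. */
static int
pipe_latch_init(PipeLatch *latch)
{
    return pipe(latch->fds);
}

/* "SetLatch": one byte in the pipe marks the latch set. */
static void
pipe_latch_set(PipeLatch *latch)
{
    char c = 0;
    (void) write(latch->fds[1], &c, 1);
}

/* "WaitLatch": block until readable, then drain the byte so the next
 * wait blocks again. */
static void
pipe_latch_wait(PipeLatch *latch)
{
    struct pollfd pfd;
    char c;

    pfd.fd = latch->fds[0];
    pfd.events = POLLIN;
    (void) poll(&pfd, 1, -1);
    (void) read(latch->fds[0], &c, 1);
}
```

Because the wait side is just a pollable fd, this variant (unlike the semaphore one) could still back WaitLatchOrSocket, which is why it stayed in the comparison despite being slower than semaphores.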