Re: Performance degradation in commit ac1d794 - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Performance degradation in commit ac1d794
Date
Msg-id CA+Tgmoaay7n0HbqH4BS+XhdOOYhvYJ7DqZGrMXvduZM8UUXedQ@mail.gmail.com
Whole thread Raw
In response to Re: Performance degradation in commit ac1d794  (Andres Freund <andres@anarazel.de>)
Responses Re: Performance degradation in commit ac1d794
Re: Performance degradation in commit ac1d794
List pgsql-hackers
On Thu, Jan 14, 2016 at 9:39 AM, Andres Freund <andres@anarazel.de> wrote:
> I finally got back to working on this. Attached is a WIP patch series
> implementing:
> 0001: Allow to easily choose between the readiness primitives in unix_latch.c
>       Pretty helpful for testing, not useful for anything else.

Looks good.

> 0002: Error out if waiting on socket readiness without a specified socket.

Looks good.

> 0003: Only clear unix_latch.c's self-pipe if it actually contains data.
>       ~2% on high qps workloads

everytime -> every time

+        if ((wakeEvents & WL_LATCH_SET) && latch->is_set)
+        {
+            result |= WL_LATCH_SET;
+        }

Excess braces.

Doesn't this code make it possible for the self-pipe to fill up,
self-deadlocking the process?  Suppose we repeatedly enter
WaitLatchOrSocket().  Each time we do, just after waiting = true is
set, somebody sets the latch.  We handle the signal and put a byte
into the pipe.  Returning from the signal handler, we then notice that
is_set is true and return at once, without draining the pipe.  Repeat
until something bad happens.

> 0004: Support using epoll as the polling primitive in unix_latch.c.
>       ~3% on high qps workloads, massive scalability improvements (x3)
>       on very large machines.
>
> With 0004 obviously being the relevant bit for this thread. I verified
> that using epoll addresses the performance problem, using the hardware
> the OP noticed the performance problem on.

+                /*
+                 * Unnecessarily pass data for delete due to bug errorneously
+                 * requiring it in the past.
+                 */

This is pretty vague.  And it has a spelling mistake.

Further down, nodified -> notified.

+        if (wakeEvents != latch->lastmask || latch->lastwatchfd != sock)

I don't like this very much.  I think it's a bad idea to test
latch->lastwatchfd != sock.  That has an excellent change of letting
people write code that appears to work but then doesn't.  I think it
would be better, if we're going to change the API contract, to make it
a hard break, as I see Tom has also suggested while I've been writing
this.

Incidentally, if we're going to whack around the latch API, it would
be nice to pick a design which wouldn't be too hard to extend to
waiting on multiple sockets.  The application I have in mind is to
send of queries to several foreign servers at once and then wait until
bytes come back from any of them.  It's mostly pie in the sky at this
point, but it seems highly likely to me that we'd want to do such a
thing by waiting for bytes from any of the sockets involved OR a latch
event.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Performance degradation in commit ac1d794
Next
From: Tom Lane
Date:
Subject: Re: Performance degradation in commit ac1d794