Re: Escaping from blocked send() reprised. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Escaping from blocked send() reprised.
Date
Msg-id 20150110163502.GM12509@alap3.anarazel.de
In response to Re: Escaping from blocked send() reprised.  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Escaping from blocked send() reprised.
List pgsql-hackers
On 2014-09-04 08:49:22 -0400, Robert Haas wrote:
> On Tue, Sep 2, 2014 at 3:01 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I'm slightly worried about the added overhead due to the latch code. In
> > my implementation I only use latches after a nonblocking read, but
> > still. Every WaitLatchOrSocket() does a drainSelfPipe(). I wonder if
> > that can be made problematic.
> 
> I think that's not the word you're looking for.

There's a "less" missing...

> At some point I hacked up a very crude prototype that made LWLocks use
> latches to sleep instead of semaphores.  It was slow.

Interesting. I dimly remembered you mentioning this; that's how I
rediscovered this message.

Do you remember any details?

My guess is that it's not so much the overhead of the latch itself, but
the lack of the directed wakeup stuff the OS provides for semaphores.


If we could replace all uses of semaphores that set immediate
interrupts to ok, we could quite easily make the deadlock detector
et al. run outside of signal handlers. That would imo make it more
robust and easier to understand - right now the correctness of the
locking done in the deadlock detector isn't obvious.  With the
infrastructure in place it'd also allow your new parallelism code to run
outside of signal handlers.

Unfortunately, semaphores currently can't be unlocked in a signal handler
(as SysV semaphores aren't signal safe)... It also wouldn't be very nice
to set both a latch and semaphores in every signal handler.


> AIUI, the only reason why we need the self-pipe thing is because on
> some platforms signals don't interrupt system calls.  But my
> impression was that those platforms were somewhat obscure.

To the contrary, I think it's only very obscure platforms where signals
still interrupt syscalls - we set SA_RESTART for pretty much
everything. There are a couple of system calls that ignore SA_RESTART.
For some that's defined in POSIX, for others it's operating system
specific. E.g. on Linux semop(), poll(), select() are defined to always
return EINTR when interrupted.

Anyway, the discussion has since cleared up that we need the self-pipe
byte to handle a race.

> Basically, it doesn't feel like a good thing that we've got two sets
> of primitives for making a backend wait that (1) don't really know
> about each other and (2) use different operating system primitives.
> Presumably one of the two systems is better; let's figure out which
> one it is, use that one all the time, and get rid of the other one.

I think the latch interface is clearly better for what we use
sema/latches for, as it allows waiting for signals (latch sets), a
socket and timeouts. So let's try to figure out how to make it perform
comparably to or better than semaphores.

There's imo only one semaphore user that can't trivially be replaced by
latches: the semaphore spinlock emulation. Both proc.c and lwlock.c
can be converted quite easily - in the latter case, it might actually
end up saving some code.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


