Our poll() based WaitLatch implementation is broken - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Our poll() based WaitLatch implementation is broken
Msg-id CAEYLb_XU6iNpPa_ybHW_EbDaTWs3yXmhECOt7a96_yft1ddjBQ@mail.gmail.com
List pgsql-hackers
Build Postgres master on Linux, or on another platform that will use
the poll() implementation rather than the older select() one. Send
SIGKILL to the postmaster. Observe that the WAL Writer lives on. This
amounts to a denial of service, as it stays attached to shared memory,
busy waiting (evident from the fact that it quickly leaks memory).

WAL writer shouldn't do this. It isn't doing anything stupid like
relying on the return value of WaitLatch(), which is documented to
only reliably indicate certain wake events but not others. The main
event loop calls PostmasterIsAlive(), which is supposed to be totally
reliable. If I use the select() based latch implementation, it behaves
just fine.
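
As background on how the two system calls can differ, here is a minimal
sketch that asks both poll() and select() about the read end of a pipe
whose write end has been closed. It is not a diagnosis of the bug above
and it is not PostgreSQL code: the helper names (make_orphaned_pipe,
poll_revents, select_readable, demo_orphaned_pipe) are hypothetical, and
the closed pipe is only an assumed stand-in for "the other process has
gone away".

```c
#include <assert.h>
#include <poll.h>
#include <stdio.h>
#include <sys/select.h>
#include <unistd.h>

/* Create a pipe and close its write end; return the read end, or -1. */
static int make_orphaned_pipe(void)
{
    int fds[2];

    if (pipe(fds) != 0)
        return -1;
    close(fds[1]);              /* no writers remain */
    return fds[0];
}

/* Return the revents poll() reports for fd, or -1 on error or timeout. */
static int poll_revents(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;        /* we only ask about readability */
    pfd.revents = 0;
    if (poll(&pfd, 1, 1000) <= 0)
        return -1;
    return pfd.revents;
}

/* Return 1 if select() reports fd readable, 0 if not, -1 on error. */
static int select_readable(int fd)
{
    fd_set rfds;
    struct timeval tv = {1, 0}; /* one-second timeout */
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}

/* Print what each call reports for the orphaned pipe. */
static void demo_orphaned_pipe(void)
{
    int fd = make_orphaned_pipe();
    int revents;

    assert(fd >= 0);
    revents = poll_revents(fd);
    printf("poll:   POLLIN=%d POLLHUP=%d\n",
           (revents & POLLIN) != 0, (revents & POLLHUP) != 0);
    printf("select: readable=%d\n", select_readable(fd));
    close(fd);
}
```

Calling demo_orphaned_pipe() from a small main() shows the difference. On
Linux, poll() typically reports POLLHUP without POLLIN here, while
select() marks the descriptor readable; other platforms may set both
bits. Code that tests only POLLIN can therefore see the two
implementations behave differently.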

I have some doubts about the latch usage within WAL Writer as things
stand - it needs to be cleaned up a bit (I think it should use the
process latch, because I'm paranoid about timeout invalidation issues
now and in the future, plus it doesn't record errno in the handlers).
These smaller issues are covered in passing in the group commit patch
that Simon Riggs and I are currently working on in advance of the
final 9.2 commitfest.

In case it matters:

[peter@peterlaptop postmaster]$ uname -a
Linux peterlaptop 3.1.6-1.fc16.x86_64 #1 SMP Wed Dec 21 22:41:17 UTC
2011 x86_64 x86_64 x86_64 GNU/Linux

I'd debug this myself, but I'm a little bit preoccupied with group
commit right now.

The rationale for introducing the poll()-based implementation where
available was that it performed better than a select()-based one. I
wonder, how compelling a win is that expected to be?

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

