Re: Reducing WaitEventSet syscall churn - Mailing list pgsql-hackers

From Kyotaro Horiguchi
Subject Re: Reducing WaitEventSet syscall churn
Date
Msg-id 20200330.141458.2257302020194885474.horikyota.ntt@gmail.com
Whole thread Raw
In response to Re: Reducing WaitEventSet syscall churn  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
At Fri, 13 Mar 2020 16:21:13 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in 
> > 0007: "Use a WaitEventSet for postgres_fdw."
> 
> Continues..

At Thu, 27 Feb 2020 12:17:45 +1300, Thomas Munro <thomas.munro@gmail.com> wrote in 
> 0007: "Use a WaitEventSet for postgres_fdw."
> 
> Create a single WaitEventSet and use it for all FDW connections.  By
> having our own dedicated WES, we can do the bookkeeping required to
> know when sockets have been closed or need to removed from kernel wait
> sets explicitly (which would be much harder to achieve if
> CommonWaitSet were to be used like that; you need to know when sockets
> are closed by other code, so you can provide fd_closed to
> RemoveWaitEvent()).

It is straightforward and looks fine. If we use the libpq-event-based
solution instead of using the changecount, pgfdw_wait_for_socket and
disconnect_pg_server would be a bit simpler.

> Concretely, if you use just one postgres_fdw connection, you'll see
> just epoll_wait()/kevent() calls for waits, but whever you switch
> between different connections, you'll see eg EPOLL_DEL/EV_DELETE
> followed by EPOLL_ADD/EV_ADD when the set is adjusted (in the kqueue
> implementation these could be collapse into the following wait, but I

Such syscall sequences is shorter than or equal to what we issue now,
so this patch is still an improvement.

> haven't done the work for that).  An alternative would be to have one
> WES per FDW connection, but that seemed wasteful of file descriptors.

In the multi-connection case, we can save both fds and syscalls by one
WaitEventSet containing all connection fds together.  WaitEventSetWait
should take event mask parameter or ModifyWaitEvent should set the
mask in that case. The event is now completely level-triggered so we
can safely ignore unwanted events.

> 0008: "Use WL_EXIT_ON_PM_DEATH in FeBeWaitSet."
> 
> The FATAL message you get if you happen to be waiting for IO rather
> than waiting somewhere else seems arbitrarily different.  By switching
> to a standard automatic exit, it opens the possibility of using
> FeBeWaitSet in a couple more places that would otherwise need to
> create their own WES (see also [1]).  Thoughts?

Don't we really need any message on backend disconnection due to
postmaster death?  As for authentication code, it seems to me the
rationale for not writing the log is that the connection has not been
established at the time. (That depends on what we think the
"connection" is.)

And I suppose that the reason for the authentication logic ignoring
signals is that clients expect that authentication should be completed
as far as possible.

> 0009: "Use FeBeWaitSet for walsender.c."
> 
> Enabled by 0008.

It works and doesn't change behavior. But I found it a bit difficult
to find what events the WaitEventSetWait waits.  Maybe a comment at
the caller sites would be sufficient. I think any comment about the
bare number "0" as the event position of ModifyWaitEvent is also
needed.

> 0010: "Introduce a WaitEventSet for the stats collector."

This looks fine.  The variable event is defined outside its effective
scope but almost all of its function variables the same way.



By the way I found that pqcomm.c uses -1 instead of PGINVALID_SOCKET
for AddWaitEventToSet.

> [1]
https://www.postgresql.org/message-id/flat/CA%2BhUKGK%3Dm9dLrq42oWQ4XfK9iDjGiZVwpQ1HkHrAPfG7Kh681g%40mail.gmail.com

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Laurenz Albe
Date:
Subject: Re: Creating a database with a non-predefined collation
Next
From: Fabien COELHO
Date:
Subject: Re: pg11+: pg_ls_*dir LIMIT 1: temporary files .. not closed atend-of-transaction