WaitLatchOrSocket optimization - Mailing list pgsql-hackers

From	Konstantin Knizhnik
Subject	WaitLatchOrSocket optimization
Date	March 15, 2018 22:01:40
Msg-id	9326cdf1-e3e7-d892-d796-ec0150b4be32@postgrespro.ru
Responses	Re: WaitLatchOrSocket optimization (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

Hi hackers,

Right now the function WaitLatchOrSocket is implemented in a very
inefficient way: on each invocation it creates an epoll instance,
registers events, and then closes the instance.
It is certainly possible to create a wait event set once with
CreateWaitEventSet and then use WaitEventSetWait,
and this is done in the most performance-critical places.
But there are still a lot of places in Postgres where WaitLatchOrSocket
or WaitLatch is used.
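
To illustrate the difference outside of Postgres, here is a minimal sketch
using the plain Linux epoll API rather than the actual latch code. The
function names (wait_fresh, make_wait_set, wait_reused) are hypothetical
and are not part of Postgres; they only mirror the two usage patterns
described above.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Per-call pattern: create an epoll instance, register fd, wait, close.
 * This is the shape of work the message attributes to WaitLatchOrSocket.
 */
int
wait_fresh(int fd, int timeout_ms)
{
    struct epoll_event ev = {0};
    struct epoll_event out;
    int         epfd = epoll_create1(EPOLL_CLOEXEC);
    int         n;

    if (epfd < 0)
        return -1;
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
    {
        close(epfd);
        return -1;
    }
    n = epoll_wait(epfd, &out, 1, timeout_ms);
    close(epfd);                /* paid again on every call */
    return n;
}

/* Reuse pattern, step 1: build the epoll instance once. */
int
make_wait_set(int fd)
{
    struct epoll_event ev = {0};
    int         epfd = epoll_create1(EPOLL_CLOEXEC);

    if (epfd < 0)
        return -1;
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
    {
        close(epfd);
        return -1;
    }
    return epfd;
}

/* Reuse pattern, step 2: each wait is a single epoll_wait call. */
int
wait_reused(int epfd, int timeout_ms)
{
    struct epoll_event out;

    assert(epfd >= 0);
    return epoll_wait(epfd, &out, 1, timeout_ms);
}
```

In the reuse pattern the caller keeps the descriptor returned by
make_wait_set and closes it only once, when the connection goes away.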

One of them is postgres_fdw.
If we run pgbench through postgres_fdw (just redirecting the pgbench
tables via postgres_fdw to localhost),
then on a system with a large number (72) of CPU cores, "pgbench -S -M
prepared -c 100 -j 32" shows about 38k TPS with the following profile:

-   73.83%     0.12%  postgres         postgres [.] PostgresMain
    - 73.82% PostgresMain
       + 28.18% PortalRun
       + 26.48% finish_xact_command
       + 13.66% PortalStart
       + 1.83% exec_simple_query
       + 1.22% pq_getbyte
       + 0.89% ReadyForQuery
-   66.04%     0.03%  postgres         [kernel.kallsyms] [k] entry_SYSCALL_64_fastpath
    - 66.01% entry_SYSCALL_64_fastpath
       + 61.39% syscall_return_slowpath
       + 1.52% sys_epoll_create1
       + 1.30% SYSC_sendto
       + 0.94% sys_epoll_pwait
         0.57% SYSC_recvfrom
-   65.61%     0.03%  postgres         postgres_fdw.so [.] pgfdw_get_result
    - 65.58% pgfdw_get_result
       - 65.00% WaitLatchOrSocket
          + 62.60% __close
          + 1.62% CreateWaitEventSet
-   65.09%     0.02%  postgres         postgres [.] WaitLatchOrSocket
    - 65.08% WaitLatchOrSocket
       + 62.68% __close
       + 1.62% CreateWaitEventSet
+   62.69%     0.02%  postgres         libpthread-2.26.so [.] __close


So you can see that more than 60% of CPU time is spent in close.

If we cache the wait event sets that have been used, performance
increases to 225k TPS: almost six times higher!
On systems with a smaller number of cores the effect of this patch is
not so large: on my desktop with 4 cores I get only about a 10%
improvement on the same test.

There are two possible ways of fixing this issue:
1. Patch postgres_fdw to store a WaitEventSet in the connection.
2. Patch WaitLatchOrSocket to cache the wait event sets it creates.

The second approach is more generic and covers all uses of WaitLatch.
The attached patch implements this approach.
The most challenging part of this approach is using the socket
descriptor as part of the hash code.
A socket may already have been closed and its descriptor reused, so the
cached epoll set will no longer work.
To solve this issue I always try to add the socket to the epoll set,
ignoring the EEXIST error.
This certainly adds some extra overhead, but it looks negligible
compared with the overhead of close (if I comment out this branch,
pgbench performance is almost the same: 227k TPS).
But if there are other arguments against using a cache in
WaitLatchOrSocket, we also have a patch specifically for postgres_fdw.
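
The "always add, ignore EEXIST" idea can be sketched with plain epoll
as follows. This is not the attached patch: the single static cache
slot and the name get_cached_wait_set are assumptions made for
illustration, whereas the patch is described as keeping a hash of wait
event sets keyed partly by the socket descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical one-slot, process-local cache of an epoll instance. */
static int  cached_epfd = -1;

/*
 * Return the cached epoll instance with fd registered for input,
 * creating the instance on first use.  Returns -1 on failure.
 */
int
get_cached_wait_set(int fd)
{
    struct epoll_event ev = {0};

    assert(fd >= 0);
    if (cached_epfd < 0)
    {
        cached_epfd = epoll_create1(EPOLL_CLOEXEC);
        if (cached_epfd < 0)
            return -1;
    }

    ev.events = EPOLLIN;
    ev.data.fd = fd;

    /*
     * Always try to add the descriptor.  EEXIST means it is still
     * registered from an earlier call, which is fine.  If the old
     * socket was closed and the number was reused, the kernel has
     * normally dropped the old registration, so the add succeeds.
     */
    if (epoll_ctl(cached_epfd, EPOLL_CTL_ADD, fd, &ev) < 0 &&
        errno != EEXIST)
        return -1;

    return cached_epfd;
}
```

One caveat from epoll(7): closing a descriptor removes it from the
epoll set only when no duplicates of the underlying open file
description remain (for example after dup or fork), so a real
implementation would need to consider that case.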



-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment

latch.patch
