Re: pgbench stopped supporting large number of client connections on Windows - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: pgbench stopped supporting large number of client connections on Windows
Msg-id alpine.DEB.2.22.394.2011062159530.1605435@pseudo
In response to pgbench stopped supporting large number of client connections on Windows  (Marina Polyakova <m.polyakova@postgrespro.ru>)
List pgsql-hackers
Hello Marina,

> While trying to test a patch that adds a synchronization barrier in pgbench 
> [1] on Windows,

Thanks for trying that. I do not have a Windows setup for testing, and the 
sync code I wrote for Windows is basically blind coding :-(

> I found that since the commit "Use ppoll(2), if available, to 
> wait for input in pgbench." [2] I cannot use a large number of client 
> connections in pgbench on my Windows virtual machines (Windows Server 2008 R2 
> and Windows 2019), for example:
>
>> bin\pgbench.exe -c 90 -S -T 3 postgres
> starting vacuum...end.

ISTM that 1 thread with 90 clients is a bad idea, see below.

> The almost same thing happens with reindexdb and vacuumdb (build on 
> commit [3]):

The Windows fd implementation is somewhat buggy in that it does not return 
the smallest number available. The check assumes that select uses a dense 
array indexed by fd number (true on Linux, less so on Windows, which 
probably uses a sparse array), so an fd number can go over the limit even 
when fewer fds are actually in use, hence the check triggers, as you noted.
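
To make the dense-versus-sparse distinction concrete, here is a minimal 
sketch of the two capacity checks. Everything in it is hypothetical 
(SKETCH_SETSIZE, dense_can_add, sketch_sparse_set, sparse_add are names 
made up for illustration, not pgbench or Winsock code); the sparse struct 
is only a simplified model of a "counter plus array of handles" layout.

```c
#include <stdbool.h>

#define SKETCH_SETSIZE 64	/* hypothetical stand-in for FD_SETSIZE */

/*
 * Dense model: the set is indexed by the descriptor value itself, so the
 * value must stay below the set size, however few descriptors are in use.
 */
static bool
dense_can_add(int fd)
{
	return fd >= 0 && fd < SKETCH_SETSIZE;
}

/*
 * Sparse model: the set stores the handles plus a counter, so only the
 * number of entries is bounded, not the handle values.
 */
typedef struct
{
	unsigned int count;
	unsigned long long handles[SKETCH_SETSIZE];
} sketch_sparse_set;

/* Returns false once the set is full; duplicates are not checked here. */
static bool
sparse_add(sketch_sparse_set *set, unsigned long long handle)
{
	if (set->count >= SKETCH_SETSIZE)
		return false;
	set->handles[set->count++] = handle;
	return true;
}
```

With these definitions a handle value of 1000 is refused by the dense 
check but accepted by the sparse one, which only refuses the 65th entry.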

Another point is that Windows has a hardcoded number of objects one thread 
can really wait for, typically 64, so waiting for more requires actually 
spawning threads to do the waiting. But if you are ready to spawn threads 
just to wait, then probably you could have started pgbench with more 
threads in the first place. That would probably not make the problem go 
away, because fd numbers are per process, not per thread, but it really 
suggests that one should not load a thread with more than 64 clients.

> IIUC the checks below are not correct on Windows, since on this system 
> sockets can have values equal to or greater than FD_SETSIZE (see Windows 
> documentation [4] and pgbench debug output in attached pgbench_debug.txt).

Okay.

But then, how may one detect that there are too many fds in the set?

I think that an earlier version of the code needed to make assumptions 
about the internal implementation of Windows (there is a counter somewhere 
in the Windows fd_set struct), which was rejected because it was breaking 
the interface. Now your patch is basically resurrecting that. Why not, if 
there is no other solution, but this is quite depressing, and because it 
breaks the interface it would be broken if Windows changed its internals 
for some reason :-(

Doesn't Windows have "ppoll"? Should we implement the stuff on top of 
Windows polling capabilities and coldly skip its failed POSIX portability 
attempts? This again raises the issue that you should not have more than 
64 clients per thread anyway, because it is an intrinsic limit on Windows.

I think that at one point it was suggested to error or warn if 
nclients/nthreads is too great, but that was not kept in the end.
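
For illustration, such a check could look like the sketch below. It is 
not the code that was proposed back then: the helper names and the 
SKETCH_MAX_WAIT_PER_THREAD constant are hypothetical, the value 64 is 
simply the per-thread limit mentioned above, and clients are assumed to 
be spread over threads as evenly as possible, with both counts positive.

```c
#include <stdbool.h>

/* Assumption: per-thread wait limit taken from the discussion above. */
#define SKETCH_MAX_WAIT_PER_THREAD 64

/* Largest number of clients any single thread gets under an even split. */
static int
max_clients_per_thread(int nclients, int nthreads)
{
	return (nclients + nthreads - 1) / nthreads;
}

/* True if at least one thread would exceed the per-thread limit. */
static bool
too_many_clients_per_thread(int nclients, int nthreads)
{
	return max_clients_per_thread(nclients, nthreads) >
		SKETCH_MAX_WAIT_PER_THREAD;
}

/* Smallest thread count that keeps every thread within the limit. */
static int
min_threads_for(int nclients)
{
	return (nclients + SKETCH_MAX_WAIT_PER_THREAD - 1) /
		SKETCH_MAX_WAIT_PER_THREAD;
}
```

For the "-c 90" run quoted above, one thread is flagged and two threads 
(45 clients each) are not; 1000 clients would need at least 16 threads.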

> I tried to fix this, see attached fix_max_client_conn_on_Windows.patch (based 
> on commit [3]). I checked it for reindexdb and vacuumdb, and it works for 
> simple databases (1025 jobs are not allowed and 1024 jobs is ok). 
> Unfortunately, pgbench was getting connection errors when it tried to use 
> 1000 jobs on my virtual machines, although there were no errors for fewer 
> jobs (500) and the same number of clients (1000)...

It seems that the maximum number of threads you can start depends on 
available memory, because each thread is given its own stack, so it would 
depend on your VM settings?

> Any suggestions are welcome!

Use ppoll, and start more threads but not too many?

-- 
Fabien.


