Re: pgbench bug / limitation - Mailing list pgsql-bugs

From Fabien COELHO
Subject Re: pgbench bug / limitation
Msg-id alpine.DEB.2.22.394.2006150902260.646816@pseudo
In response to Re: pgbench bug / limitation  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-bugs
Hello David,

>> I suggest that we might as well get all the way in and dodge the 
>> FD_SETSIZE limitation altogether, as per the attached utterly-untested 
>> draft patch.
>
> I compiled this on Visual Studio 2017 and tested it. I didn't
> encounter any problems.
>
> The only thing I see in Winsock2.h that relies on FD_SETSIZE being the
> same size as the fd_set array is the FD_SET macro. So, I think it
> should be safe if these differ, like they will with your patch. We'll
> just need to make sure we don't use FD_SET in the future.
>
>> A remaining problem with this is that in theory, repeatedly applying
>> socket_has_input() after the wait could also be O(N^2) (unless FD_ISSET
>> is way smarter than I suspect it is).
>
> The FD_ISSET() just calls a function, so I don't know what's going on
> under the hood.
>
> #define FD_ISSET(fd, set) __WSAFDIsSet((SOCKET)(fd), (fd_set FAR *)(set))
>
> However, I don't see what else it could do other than loop over the
> array until it finds a match.

Independently of the wisdom of handling many client connections with just 
one pgbench thread, I'm really wondering what goes on under the hood in 
the Windows implementation of select().

AFAICS from the online docs, the native Windows interfaces for waiting on 
I/O are:

  - WaitForMultipleObjects
    with a MAXIMUM_WAIT_OBJECTS limit which is 64.

  - WSAWaitForMultipleEvents
    with a WSA_MAXIMUM_WAIT_EVENTS limit which is 64.

Then their (strange) implementation of POSIX select() uses FD_SETSIZE, 
which defaults to, you may have guessed, 64.

This is all consistent. The M$ docs do suggest that FD_SETSIZE can be 
extended, but then why would the underlying implementation do a better job 
(handle more fds) than the native interfaces? How can one really tell that 
"it works"? Maybe it just waits on the first few objects? Maybe it actively 
scans the objects to check their status? Maybe it forks threads to do the 
waiting? Maybe something else?

Having some idea of what is really happening would help to know what is 
best to do in pgbench.

I'd suggest the following test, with pgbench compiled with the extended 
FD_SETSIZE on Windows:

   script sleep.sql:
   SELECT pg_sleep(150 - :client_id);

Then run something like the following under a debugger:

   pgbench -c 128 -f "sleep.sql"

and look at where the process is when it is interrupted inside select(). 
I cannot run this test myself, because I do not have access to a Windows 
host.

-- 
Fabien.


