Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE' - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE'
Date
Msg-id 20190820160553.37nd5jxs4dk2qujy@alap3.anarazel.de
Whole thread Raw
In response to Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE'  (Michael Paquier <michael@paquier.xyz>)
Responses Re: BUG #15964: vacuumdb.c:187:10: error: use of undeclaredidentifier 'FD_SETSIZE'
List pgsql-bugs
Hi,

On 2019-08-19 14:12:51 +0900, Michael Paquier wrote:
> On Mon, Aug 19, 2019 at 12:32:51AM -0400, Tom Lane wrote:
> > I think Andres' suggestion is probably fine: don't try to detect
> > it in advance.  Just open the files, and error out if we need to
> > put an fd index >= FD_SETSIZE into an fd_set.  It'll be a shade
> > less user-friendly, in that the program might run for a bit before
> > failing; but I doubt that such cases arise often enough to be worth
> > working harder.
> 
> Thanks.  I have somewhat not catched what Andres was suggesting here.
> So attached are two patches:
> - 0001 should take care of the compilation failure, by moving
> FD_SETSIZE into scripts_parallel.c.

I don't understand why we would still want to keep the check? The check
is pretty much useless, imo. And it adds code to multiple different
files?


> - 0002 makes vacuumdb and reindexdb fail when trying to assign a
> socket with an unsupported range.  Should this bit be backpatched?  We
> are doing that for vacuumdb for some time now, and the error is
> confusing so I would prefer fixing it on older branches as well.

Yea, I think we clearly need to. The code right now essentially is a
wild write, albeit with a somewhat limited range of what it can impact.


> diff --git a/src/bin/scripts/scripts_parallel.c b/src/bin/scripts/scripts_parallel.c
> index 2b571a470e..b124db0d48 100644
> --- a/src/bin/scripts/scripts_parallel.c
> +++ b/src/bin/scripts/scripts_parallel.c
> @@ -160,6 +160,17 @@ ParallelSlotsGetIdle(ParallelSlot *slots, int numslots)
>              if (sock < 0)
>                  continue;
>  
> +            /*
> +             * Fail immediately if trying to use an index in an unsupported
> +             * range.  Doing a hard exit here is not beautiful, but that's
> +             * not worth complicating the logic.
> +             */
> +            if (sock >= FD_SETSIZE)
> +            {
> +                fprintf(stderr, "too many client connections for select()\n");
> +                exit(1);
> +            }
> +
>              FD_SET(sock, &slotset);
>              if (sock > maxFd)
>                  maxFd = sock;
> -- 
> 2.23.0.rc1
> 

I think the error message ought be reformulated, so users have a chance
to actually understand what they need to change to avoid the error. At
least something roughly like "too many jobs for this platform's select()".

ISTM that we should fail in ParallelSlotsSetup(), rather than
ParallelSlotsGetIdle() though? That's always going to be earlier, and
there's no way to get into the problematic situation at a later point,
no?

Greetings,

Andres Freund



pgsql-bugs by date:

Previous
From: Christoph Ziegenberg
Date:
Subject: Re: BUG #15967: Sequence generation using NEXTVAL() fails on 64bit systems
Next
From: Merlin Moncure
Date:
Subject: Re: BUG #15967: Sequence generation using NEXTVAL() fails on 64bit systems