Thread: pgsql: Fix error handling of vacuumdb and reindexdb when running outof

pgsql: Fix error handling of vacuumdb and reindexdb when running outof

From
Michael Paquier
Date:
Fix error handling of vacuumdb and reindexdb when running out of fds

When trying to use a high number of jobs, vacuumdb (and more recently
reindexdb) has only checked for a maximum number of jobs used, causing
confusing failures when running out of file descriptors when the jobs
open connections to Postgres.  This commit changes the error handling so
as we do not check anymore for a maximum number of allowed jobs when
parsing the option value with FD_SETSIZE, but check instead if a file
descriptor is within the supported range when opening the connections
for the jobs so as this is detected at the earliest time possible.

Also, improve the error message to give a hint about the number of jobs
recommended, using a wording given by the reviewers of the patch.

Reported-by: Andres Freund
Author: Michael Paquier
Reviewed-by: Andres Freund, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/20190818001858.ho3ev4z57fqhs7a5@alap3.anarazel.de
Backpatch-through: 9.5

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/71d84efba714db3b8a330a54be15c4d385719ad6

Modified Files
--------------
src/bin/scripts/reindexdb.c        |  6 ------
src/bin/scripts/scripts_parallel.c | 26 ++++++++++++--------------
src/bin/scripts/scripts_parallel.h |  2 --
src/bin/scripts/vacuumdb.c         |  6 ------
4 files changed, 12 insertions(+), 28 deletions(-)


Re: pgsql: Fix error handling of vacuumdb and reindexdb when runningout of

From
Michael Paquier
Date:
On Mon, Aug 26, 2019 at 02:17:06AM +0000, Michael Paquier wrote:
> Fix error handling of vacuumdb and reindexdb when running out of fds
>
> When trying to use a high number of jobs, vacuumdb (and more recently
> reindexdb) has only checked for a maximum number of jobs used, causing
> confusing failures when running out of file descriptors when the jobs
> open connections to Postgres.  This commit changes the error handling so
> as we do not check anymore for a maximum number of allowed jobs when
> parsing the option value with FD_SETSIZE, but check instead if a file
> descriptor is within the supported range when opening the connections
> for the jobs so as this is detected at the earliest time possible.
>
> Also, improve the error message to give a hint about the number of jobs
> recommended, using a wording given by the reviewers of the patch.

jacana does not like that, and has reported an error for 9.6:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2019-08-26 03%3A51%3A19
# Running: vacuumdb -zj2 postgres
vacuumdb: vacuuming database "postgres"
vacuumdb: too many jobs for this platform -- try 1

I am able to reproduce the problem locally, and the problem is that we
don't declare FD_SETSIZE on Windows before winsock2.h in
scripts_parallel.c (or vacuumdb.c in ~12) so all the Windows machines
running TAP are going to complain on that.

This matches with the same problem reported here
https://www.postgresql.org/message-id/1248903114.6348.195.camel@lapdragon

We have never done that before in vacuumdb.c, and because of that I
think that with a high number of jobs we run into buffer overflows.

The patch attached does the job on my end, any objections?  There
is an argument to do that in win32_port.h, but for now I'd rather take
a safe path, or just do for the change in win32_port.h on HEAD.

(Noticed the missing newline as well in the error string in
back-branches...  I'll address it at the same time.)
--
Michael

Attachment

Re: pgsql: Fix error handling of vacuumdb and reindexdb when runningout of

From
Andrew Dunstan
Date:
On 8/26/19 1:40 AM, Michael Paquier wrote:
> On Mon, Aug 26, 2019 at 02:17:06AM +0000, Michael Paquier wrote:
>> Fix error handling of vacuumdb and reindexdb when running out of fds
>>
>> When trying to use a high number of jobs, vacuumdb (and more recently
>> reindexdb) has only checked for a maximum number of jobs used, causing
>> confusing failures when running out of file descriptors when the jobs
>> open connections to Postgres.  This commit changes the error handling so
>> as we do not check anymore for a maximum number of allowed jobs when
>> parsing the option value with FD_SETSIZE, but check instead if a file
>> descriptor is within the supported range when opening the connections
>> for the jobs so as this is detected at the earliest time possible.
>>
>> Also, improve the error message to give a hint about the number of jobs
>> recommended, using a wording given by the reviewers of the patch.
> jacana does not like that, and has reported an error for 9.6:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2019-08-26 03%3A51%3A19
> # Running: vacuumdb -zj2 postgres
> vacuumdb: vacuuming database "postgres"
> vacuumdb: too many jobs for this platform -- try 1
>
> I am able to reproduce the problem locally, and the problem is that we
> don't declare FD_SETSIZE on Windows before winsock2.h in
> scripts_parallel.c (or vacuumdb.c in ~12) so all the Windows machines
> running TAP are going to complain on that.
>
> This matches with the same problem reported here
> https://www.postgresql.org/message-id/1248903114.6348.195.camel@lapdragon
>
> We have never done that before in vacuumdb.c, and because of that I
> think that with a high number of jobs we run into buffer overflows.
>
> The patch attached does the job on my end, any objections?  There
> is an argument to do that in win32_port.h, but for now I'd rather take
> a safe path, or just do for the change in win32_port.h on HEAD.
>
> (Noticed the missing newline as well in the error string in
> back-branches...  I'll address it at the same time.)



Looks OK.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: Fix error handling of vacuumdb and reindexdb when runningout of

From
Michael Paquier
Date:
On Mon, Aug 26, 2019 at 11:36:50AM -0400, Andrew Dunstan wrote:
> Looks OK.

Thanks Andrew for looking at the patch.  I have now committed it on
all the impacted branches, after running vcregress bincheck for all of
them with MSVC.
--
Michael

Attachment

Re: pgsql: Fix error handling of vacuumdb and reindexdb when runningout of

From
Michael Paquier
Date:
On Tue, Aug 27, 2019 at 09:18:27AM +0900, Michael Paquier wrote:
> Thanks Andrew for looking at the patch.  I have now committed it on
> all the impacted branches, after running vcregress bincheck for all of
> them with MSVC.

bowerbird has turned back to green on 9.5 and 9.6 now, so it seems
that we are fine.
--
Michael

Attachment