Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] - Mailing list pgsql-hackers

From Andres Freund
Subject Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date
Msg-id 20150104015713.GG3064@awork2.anarazel.de
Whole thread Raw
In response to Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
List pgsql-hackers
On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
> +      <term><option>-j <replaceable class="parameter">jobs</replaceable></option></term>
> +      <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
> +      <listitem>
> +       <para>
> +        Number of concurrent connections to perform the operation.
> +        This option will enable the vacuum operation to run on asynchronous
> +        connections, at a time one table will be operated on one connection.
> +        So at one time as many tables will be vacuumed parallely as number of
> +        jobs.  If number of jobs given are more than number of tables then
> +        number of jobs will be set to number of tables.

"asynchronous connections" isn't a very well defined term. Also, the
second part of that sentence doesn't seem to be gramattically correct.

> +       </para>
> +       <para>
> +        <application>vacuumdb</application> will open
> +        <replaceable class="parameter"> njobs</replaceable> connections to the
> +        database, so make sure your <xref linkend="guc-max-connections">
> +        setting is high enough to accommodate all connections.
> +       </para>

Isn't it njobs+1?

> @@ -141,6 +199,7 @@ main(int argc, char *argv[])
>          }
>      }
>  
> +    optind++;

Hm, where's that coming from?

> +    PQsetnonblocking(connSlot[0].connection, 1);
> +
> +    for (i = 1; i < concurrentCons; i++)
> +    {
> +        connSlot[i].connection = connectDatabase(dbname, host, port, username,
> +                                  prompt_password, progname, false);
> +
> +        PQsetnonblocking(connSlot[i].connection, 1);
> +        connSlot[i].isFree = true;
> +        connSlot[i].sock = PQsocket(connSlot[i].connection);
> +    }

Are you sure about this global PQsetnonblocking()? This means that you
might not be able to send queries... And you don't seem to be waiting
for sockets waiting for writes in the select loop - which means you
might end up being stuck waiting for reads when you haven't submitted
the query.

I think you might need a more complex select() loop. On nonfree
connections also wait for writes if PQflush() returns != 0.


> +/*
> + * GetIdleSlot
> + * Process the slot list, if any free slot is available then return
> + * the slotid else perform the select on all the socket's and wait
> + * until atleast one slot becomes available.
> + */
> +static int
> +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
> +            const char *progname, bool completedb)
> +{
> +    int        i;
> +    fd_set    slotset;


Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.

> +    int     firstFree = -1;
> +    pgsocket maxFd;
> +
> +    for (i = 0; i < max_slot; i++)
> +        if (pSlot[i].isFree)
> +            return i;

> +    FD_ZERO(&slotset);
> +
> +    maxFd = pSlot[0].sock;
> +
> +    for (i = 0; i < max_slot; i++)
> +    {
> +        FD_SET(pSlot[i].sock, &slotset);
> +        if (pSlot[i].sock > maxFd)
> +            maxFd = pSlot[i].sock;
> +    }

So we're waiting for idle connections?

I think you'll have to have to use two fdsets here, and set the write
set based on PQflush() != 0.

> +/*
> + * A select loop that repeats calling select until a descriptor in the read
> + * set becomes readable. On Windows we have to check for the termination event
> + * from time to time, on Unix we can just block forever.
> + */

Should a) mention why we have to check regularly on windows b) that on
linux we don't have to because we send a cancel event from the signal
handler.

> +static int
> +select_loop(int maxFd, fd_set *workerset)
> +{
> +    int            i;
> +    fd_set        saveSet = *workerset;
>
> +#ifdef WIN32
> +    /* should always be the master */

Hm?


I have to say, this is a fairly large patch for such a minor feature...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: logical column ordering
Next
From: Peter Geoghegan
Date:
Subject: Re: Problems with approach #2 to value locking (INSERT ... ON CONFLICT UPDATE/IGNORE patch)