Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] - Mailing list pgsql-hackers

From Dilip kumar
Subject Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date
Msg-id 4205E661176A124FAF891E0A6BA913526638CA9E@szxeml509-mbs.china.huawei.com
In response to Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers

On 06 December 2014 20:01, Amit Kapila wrote:

 

>I wanted to understand what exactly the above loop is doing.

 

>a.

>first of all the comment on top of it says "Some of the slot

>are free, ...", if some slot is free, then why do you want

>to process the results? (Do you mean to say that *None* of

>the slot is free....?)

 

This comment is wrong; I will remove it.

 

>b.

>IIUC, you have called function select_loop(maxFd, &slotset)

>to check if socket descriptor is readable, if yes then why

>in do..while loop the same maxFd is checked always, don't

>you want to check different socket descriptors?  I am not sure

>if I am missing something here

 

select_loop(maxFd, &slotset)

 

maxFd is the highest descriptor across all the slots, and slotset contains all the descriptors. If any descriptor receives a message, select_loop returns. Once it returns, we need to find out which descriptors have received a message from the server, so we loop over the slots and process the results.

 

So the check is not only for maxFd; it covers all the descriptors. It is in a do..while loop because select_loop may return because of an intermediate message on one of the sockets while the query is still not complete. If no slot has become free yet (which we check in the for loop below), we go back to select_loop.
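The role of maxFd described above can be sketched with plain select(). This is only an illustration, not the patch's select_loop: pipes stand in for the server sockets, and wait_for_any, ready_slots_demo and NUM_SLOTS are hypothetical names. maxFd only bounds how far select() scans; the fd_set names every descriptor being watched, and FD_ISSET afterwards tells which ones are readable.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define NUM_SLOTS 3		/* hypothetical number of connections */

/*
 * Block until at least one descriptor in socks[] is readable.
 * On return, 'ready' holds only the readable descriptors and the
 * return value is their count (or -1 on error).
 */
static int
wait_for_any(const int *socks, int nsocks, fd_set *ready)
{
	int			maxFd = -1;
	int			i;

	assert(nsocks > 0);
	FD_ZERO(ready);
	for (i = 0; i < nsocks; i++)
	{
		FD_SET(socks[i], ready);	/* watch every descriptor */
		if (socks[i] > maxFd)
			maxFd = socks[i];		/* only an upper bound for select() */
	}
	return select(maxFd + 1, ready, NULL, NULL, NULL);
}

/*
 * Demo: data is written to slots 0 and 2 only.  Returns a bitmask of
 * the slots that select() reported as readable, or -1 on error.
 */
static int
ready_slots_demo(void)
{
	int			pipes[NUM_SLOTS][2];
	int			socks[NUM_SLOTS];
	fd_set		ready;
	int			mask = 0;
	int			i;

	for (i = 0; i < NUM_SLOTS; i++)
	{
		if (pipe(pipes[i]) != 0)
			return -1;
		socks[i] = pipes[i][0];
	}
	if (write(pipes[0][1], "x", 1) != 1 || write(pipes[2][1], "x", 1) != 1)
		return -1;

	if (wait_for_any(socks, NUM_SLOTS, &ready) < 0)
		return -1;
	for (i = 0; i < NUM_SLOTS; i++)
		if (FD_ISSET(socks[i], &ready))
			mask |= 1 << i;

	for (i = 0; i < NUM_SLOTS; i++)
	{
		close(pipes[i][0]);
		close(pipes[i][1]);
	}
	return mask;
}
```

A single call reports slots 0 and 2 together, even though neither is necessarily the one that owns maxFd.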

 

 

>c.

>After checking the socket descriptor for maxFd why you want

>to run the below for loop for all slots?

>for (i = 0; i < max_slot; i++)

After select_loop returns, we may have results on multiple connections, so for each one we consume the input and check whether it is still busy. If it is busy, there is nothing to do; if it has finished, we process the result and mark the connection free.

And if any connection is free, we break out of the do..while loop.
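The control flow described above can be sketched without libpq. This is a simplified stand-in, not the patch itself: pipes replace the connections, reading one byte replaces PQconsumeInput/PQisBusy, and the byte 'Z' is an assumed "query finished" marker while any other byte is an intermediate message. DemoSlot, get_idle_slot and idle_slot_demo are hypothetical names, and all slots are assumed busy on entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define NUM_SLOTS 2		/* hypothetical number of connections */

typedef struct
{
	int			sock;		/* read end of a pipe, standing in for a socket */
	bool		isFree;
} DemoSlot;

/* Wait until some slot finishes; returns its index, or -1 on error. */
static int
get_idle_slot(DemoSlot *slots, int nslots, int *rounds)
{
	int			firstFree = -1;

	*rounds = 0;
	do
	{
		fd_set		ready;
		int			maxFd = -1;
		int			n;
		int			i;

		FD_ZERO(&ready);
		for (i = 0; i < nslots; i++)
		{
			FD_SET(slots[i].sock, &ready);
			if (slots[i].sock > maxFd)
				maxFd = slots[i].sock;
		}
		n = select(maxFd + 1, &ready, NULL, NULL, NULL);
		if (n < 0)
			return -1;
		assert(n != 0);		/* no timeout was given, so 0 is impossible */
		(*rounds)++;

		/* Several slots may have woken us up, so check every one. */
		for (i = 0; i < nslots; i++)
		{
			char		msg;

			if (!FD_ISSET(slots[i].sock, &ready))
				continue;
			if (read(slots[i].sock, &msg, 1) != 1)
				return -1;
			if (msg != 'Z')
				continue;	/* intermediate message: still busy */
			slots[i].isFree = true;
			if (firstFree < 0)
				firstFree = i;
		}
	} while (firstFree < 0);	/* nobody finished: wait again */

	return firstFree;
}

/* Demo: slot 0 gets only an intermediate message, slot 1 also finishes. */
static int
idle_slot_demo(int *rounds)
{
	int			pipes[NUM_SLOTS][2];
	DemoSlot	slots[NUM_SLOTS];
	int			result;
	int			i;

	for (i = 0; i < NUM_SLOTS; i++)
	{
		if (pipe(pipes[i]) != 0)
			return -1;
		slots[i].sock = pipes[i][0];
		slots[i].isFree = false;
	}
	if (write(pipes[0][1], "n", 1) != 1 || write(pipes[1][1], "nZ", 2) != 2)
		return -1;

	result = get_idle_slot(slots, NUM_SLOTS, rounds);

	for (i = 0; i < NUM_SLOTS; i++)
	{
		close(pipes[i][0]);
		close(pipes[i][1]);
	}
	return result;
}
```

In the demo the first wakeup delivers only intermediate messages on both slots, so the loop waits a second time before slot 1 is reported free.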

 

 

 

 

 

 

 

From: Amit Kapila [mailto:amit.kapila16@gmail.com]
Sent: 06 December 2014 20:01
To: Dilip kumar
Cc: Magnus Hagander; Alvaro Herrera; Jan Lentfer; Tom Lane; PostgreSQL-development; Sawada Masahiko; Euler Taveira
Subject: Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

 


On Mon, Dec 1, 2014 at 12:18 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 24 November 2014 11:29, Amit Kapila Wrote,
>

 

I have verified that all previous comments are addressed and
the new version is much better than the previous version.

 

>
> here we are setting each target once and doing for all the tables..
>

Hmm, theoretically I think the new behaviour could lead to more I/O in
certain cases compared to the existing behaviour.  The reason is that in
the new behaviour, between the Analyze runs of a particular table at
different targets, other tables are analyzed as well, so the pages of
that table in shared buffers or the OS cache may need to be reloaded
for each new target.  Currently all stages of Analyze for a particular
table are done in one go, which means that each stage of Analyze can
benefit from the pages of the table loaded by the previous stage.  If
you agree, then we should try to avoid this change in the new behaviour.
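The two orderings being compared can be sketched as two nested loops. This is only an illustration of the argument above, not a measurement: the table and stage counts are assumed, the function names are hypothetical, and "number of times consecutive Analyze commands switch to a different table" is used as a rough proxy for lost cache reuse.

```c
#include <assert.h>

#define NUM_TABLES 3	/* hypothetical number of tables */
#define NUM_STAGES 3	/* hypothetical number of statistics targets */

/* Count how often consecutive commands move to a different table. */
static int
count_table_switches(const int *order, int n)
{
	int			switches = 0;
	int			i;

	assert(n > 0);
	for (i = 1; i < n; i++)
		if (order[i] != order[i - 1])
			switches++;
	return switches;
}

/* Ordering described as existing: all stages of one table, then the next. */
static int
table_major_switches(void)
{
	int			order[NUM_TABLES * NUM_STAGES];
	int			n = 0;
	int			t;
	int			s;

	for (t = 0; t < NUM_TABLES; t++)
		for (s = 0; s < NUM_STAGES; s++)
			order[n++] = t;
	return count_table_switches(order, n);
}

/* Ordering described as new: one stage for every table, then the next stage. */
static int
stage_major_switches(void)
{
	int			order[NUM_TABLES * NUM_STAGES];
	int			n = 0;
	int			t;
	int			s;

	for (s = 0; s < NUM_STAGES; s++)
		for (t = 0; t < NUM_TABLES; t++)
			order[n++] = t;
	return count_table_switches(order, n);
}
```

With three tables and three stages, the table-major order changes table twice, while the stage-major order changes table on every one of the eight steps between commands.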

  
>
> Please provide your opinion.

 

I have a few questions regarding the function GetIdleSlot()

 

+ static int
+ GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
+             const char *progname, bool completedb)
{
..
+        /*
+         * Some of the slot are free, Process the results for slots whichever
+         * are free
+         */
+        do
+        {
+                SetCancelConn(pSlot[0].connection);
+                i = select_loop(maxFd, &slotset);
+                ResetCancelConn();
+                if (i < 0)
+                {
+                        /*
+                         * This can only happen if user has sent the cancel request using
+                         * Ctrl+C, Cancel is handled by 0th slot, so fetch the error result.
+                         */
+                        GetQueryResult(pSlot[0].connection, dbname, progname,
+                                       completedb);
+                        return NO_SLOT;
+                }
+                Assert(i != 0);
+                for (i = 0; i < max_slot; i++)
+                {
+                        if (!FD_ISSET(pSlot[i].sock, &slotset))
+                                continue;
+                        PQconsumeInput(pSlot[i].connection);
+                        if (PQisBusy(pSlot[i].connection))
+                                continue;
+                        pSlot[i].isFree = true;
+                        if (!GetQueryResult(pSlot[i].connection, dbname, progname,
+                                            completedb))
+                                return NO_SLOT;
+                        if (firstFree < 0)
+                                firstFree = i;
+                }
+        } while(firstFree < 0);
}

 

I wanted to understand what exactly the above loop is doing.

 

a.

first of all the comment on top of it says "Some of the slot

are free, ...", if some slot is free, then why do you want

to process the results? (Do you mean to say that *None* of

the slot is free....?)

 

b.

IIUC, you have called function select_loop(maxFd, &slotset)

to check if socket descriptor is readable, if yes then why

in do..while loop the same maxFd is checked always, don't

you want to check different socket descriptors?  I am not sure

if I am missing something here

 

c.

After checking the socket descriptor for maxFd why you want

to run the below for loop for all slots?

for (i = 0; i < max_slot; i++)



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
