Thread: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

This patch implements the following TODO item:

 

Allow parallel cores to be used by vacuumdb

http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com

 

Like parallel pg_dump, vacuumdb now has an option to run the vacuum of multiple tables in parallel. [ vacuumdb -j ]

 

1. A new option is provided with vacuumdb to specify the number of workers.

2. All workers are started at the beginning, and all wait for a vacuum instruction from the master.

3. If a table list is provided in the vacuumdb command using -t, the master sends the vacuum of one table to one IDLE worker, the next table to the next IDLE worker, and so on.

4. If vacuum is requested for a whole database, the master runs a select on pg_class to get the table list, fetches the table names one by one, and assigns the vacuum work to IDLE workers (a sketch of such a catalog query appears below).
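
A minimal libpq sketch of how the master might build that per-database work queue; the query text, connection string, and overall structure here are illustrative assumptions, not the patch's actual code:

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
    PGconn     *conn = PQconnectdb("dbname=postgres");
    PGresult   *res;
    int         i;

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        exit(1);
    }

    /* Ordinary relations only, pre-quoted for the later VACUUM commands. */
    res = PQexec(conn,
                 "SELECT quote_ident(n.nspname) || '.' || quote_ident(c.relname) "
                 "FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace "
                 "WHERE c.relkind = 'r' "
                 "AND n.nspname NOT IN ('pg_catalog', 'information_schema')");
    if (PQresultStatus(res) != PGRES_TUPLES_OK)
    {
        fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
        exit(1);
    }

    /* Each entry would be handed to the next IDLE worker. */
    for (i = 0; i < PQntuples(res); i++)
        printf("queue: %s\n", PQgetvalue(res, i, 0));

    PQclear(res);
    PQfinish(conn);
    return 0;
}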

 

Performance Data by parallel vacuumdb:

Machine Configuration:

                                Core : 8

                                RAM: 24GB        

Test Scenario:

                                16 tables, each with 4M records. [Many records are deleted and inserted using some pattern; the file is attached in the mail.]

 

Test Result

 

{Base Code}

    Time(s)    %CPU Usage    Avg Read(kB/s)    Avg Write(kB/s)
    521        3%            12000             20000

{With Parallel Vacuum Patch}

    Workers    Time(s)    %CPU Usage    Avg Read(kB/s)    Avg Write(kB/s)
    1          518        3%            12000             20000    --> takes the same path as the base code
    2          390        5%            14000             30000
    8          235        7%            18000             40000
    16         197        8%            20000             50000

 

Conclusion:

                By running vacuumdb in parallel, CPU and I/O throughput increase, and it can give a >50% performance improvement.

 

Work to be Done:

1. Documentation of the new option.

2. Parallel support for vacuuming all databases.

 

Is it required to move the common code for the parallel operation of pg_dump and vacuumdb to one place and reuse it?

 

A prototype patch is attached to this mail; please provide your feedback/suggestions.

 

                Thanks & Regards,

                Dilip Kumar

 

 

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Euler Taveira
Date:
On 07-11-2013 09:42, Dilip kumar wrote:

Dilip, this is on my TODO for 9.4. I already have a half-baked patch
for it. Let's see what I can come up with.

> Is it required to move the common code for the parallel operation of pg_dump and vacuumdb to one place and reuse it?
> 
I'm not sure about that, because the pg_dump parallel code is tied to the
TOC entry. Also, dependency matters for pg_dump, while in the scripts
case whatever order is chosen will be used. However, vacuumdb can share
the parallel code with clusterdb and reindexdb (my patch does it).

Of course, a refactor to unify parallel code (pg_dump and scripts) can
be done in a separate patch.

> A prototype patch is attached to this mail; please provide your feedback/suggestions.
> 
I'll try to merge your patch with the one I have here until the next CF.


--
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
 



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jan Lentfer
Date:

On 07.11.2013 12:42, Dilip kumar wrote:

This patch implements the following TODO item:

 

Allow parallel cores to be used by vacuumdb

http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com

 

Like parallel pg_dump, vacuumdb now has an option to run the vacuum of multiple tables in parallel. [ vacuumdb -j ]

1. A new option is provided with vacuumdb to specify the number of workers.

2. All workers are started at the beginning, and all wait for a vacuum instruction from the master.

3. If a table list is provided in the vacuumdb command using -t, the master sends the vacuum of one table to one IDLE worker, the next table to the next IDLE worker, and so on.

4. If vacuum is requested for a whole database, the master runs a select on pg_class to get the table list, fetches the table names one by one, and assigns the vacuum work to IDLE workers.

  [...]

For this use case, would it make sense to queue the work (tables) in order of their size, starting with the largest one?

For the case where you have tables of varying size, this would reduce overall processing time, as it prevents large (read: long processing time) tables from being processed in the last step. Processing the large tables first and filling up "processing slots/jobs" with smaller tables as they become free would save overall execution time. For example, with two workers and tables taking 10, 2, 2, and 2 minutes, largest-first finishes in 10 minutes, while leaving the big table for last takes 12.

 

Regards

Jan

 

-- 
professional: http://www.oscar-consult.de

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
 On 08 November 2013 03:22, Euler Taveira Wrote

> On 07-11-2013 09:42, Dilip kumar wrote:
>
> Dilip, this is on my TODO for 9.4. I already have a half-baked patch
> for it. Let's see what I can come up with.

OK, let me know if I can contribute to this.

> > Is it required to move the common code for parallel operation of
> pg_dump and vacuumdb to one place and reuse it ?
> >
> I'm not sure about that, because the pg_dump parallel code is tied to
> the TOC entry. Also, dependency matters for pg_dump, while in the
> scripts case whatever order is chosen will be used. However, vacuumdb
> can share the parallel code with clusterdb and reindexdb (my patch does it).

+1

>
> Of course, a refactor to unify parallel code (pg_dump and scripts) can
> be done in a separate patch.
>
> > Prototype patch is attached in the mail, please provide your
> feedback/Suggestions...
> >
> I'll try to merge your patch with the one I have here until the next CF.

Regards,
Dilip



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 08 November 2013 13:38, Jan Lentfer

 

> For this use case, would it make sense to queue the work (tables) in order of their size, starting with the largest one?

> For the case where you have tables of varying size, this would reduce overall processing time, as it prevents large (read: long processing time) tables from being processed in the last step. Processing the large tables first and filling up "processing slots/jobs" with smaller tables as they become free would save overall execution time.

Good point, I have made the change and attached the modified patch.

 

 

Regards,

Dilip

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Euler Taveira
Date:
On 08-11-2013 05:07, Jan Lentfer wrote:
> For the case where you have tables of varying size, this would reduce
> overall processing time, as it prevents large (read: long processing
> time) tables from being processed in the last step. Processing the
> large tables first and filling up "processing slots/jobs" with smaller
> tables as they become free would save overall execution time.
> 
That is certainly a good strategy (not the optimal [1] -- that is hard
to achieve). Also, the strategy must:

(i) consider the relation age before size (for vacuum);
(ii) consider that you can't pick two indexes of the same relation at
once (for reindex).
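
For the vacuum case, a hypothetical sketch of an ordering that honors point (i); the query text is an assumption, not from any posted patch:

/*
 * Queue-ordering query: sort by relation age first, so wraparound-threatened
 * tables are vacuumed before merely large ones, falling back to size.
 */
const char *task_order_query =
    "SELECT c.oid::regclass "
    "FROM pg_class c "
    "WHERE c.relkind = 'r' "
    "ORDER BY age(c.relfrozenxid) DESC, pg_relation_size(c.oid) DESC";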


[1]
http://www.postgresql.org/message-id/CA+TgmobwxqsagXKtyQ1S8+gMpqxF_MLXv=4350tFZVqAwKEqgQ@mail.gmail.com


--
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
 



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Michael Paquier
Date:
On Thu, Nov 7, 2013 at 8:42 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> This patch implements the following TODO item:
>
> Allow parallel cores to be used by vacuumdb
> http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com
>
> [...]



--
Michael



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Michael Paquier
Date:
On Thu, Nov 7, 2013 at 8:42 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> This patch implements the following TODO item:
>
> Allow parallel cores to be used by vacuumdb
> http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com
Cool. Could you add this patch to the next commit fest for 9.4? It
begins officially in a couple of days. Here is the URL to it:
https://commitfest.postgresql.org/action/commitfest_view?id=20

Regards,
-- 
Michael



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Euler Taveira
Date:
On 08-11-2013 06:20, Dilip kumar wrote:
> On 08 November 2013 13:38, Jan Lentfer
> 
> 
>> For this use case, would it make sense to queue the work (tables) in order of their size, starting with the largest one?
> 
>> For the case where you have tables of varying size, this would reduce
>> overall processing time, as it prevents large (read: long processing
>> time) tables from being processed in the last step. Processing the
>> large tables first and filling up "processing slots/jobs" with smaller
>> tables as they become free would save overall execution time.
 
> Good point, I have made the change and attached the modified patch.
> 
You didn't submit it for a CF, did you? Is it too late for this CF?


--
Euler Taveira                   Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
 



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Euler Taveira wrote:
> On 08-11-2013 06:20, Dilip kumar wrote:
> > On 08 November 2013 13:38, Jan Lentfer
> > 
> > 
> >> For this use case, would it make sense to queue the work (tables) in
> >> order of their size, starting with the largest one?
> >
> >> For the case where you have tables of varying size, this would reduce
> >> overall processing time, as it prevents large (read: long processing
> >> time) tables from being processed in the last step. Processing the
> >> large tables first and filling up "processing slots/jobs" with smaller
> >> tables as they become free would save overall execution time.
 
> > Good point, I have made the change and attached the modified patch.
> > 
> You didn't submit it for a CF, did you? Is it too late for this CF?

Not too late.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 16 January 2014 19:53, Euler Taveira Wrote,

> >
> >> For the case where you have tables of varying size, this would reduce
> >> overall processing time, as it prevents large (read: long processing
> >> time) tables from being processed in the last step. Processing the
> >> large tables first and filling up "processing slots/jobs" with smaller
> >> tables as they become free would save overall execution time.
> > Good point, I have made the change and attached the modified patch.
> >
> You didn't submit it for a CF, did you? Is it too late for this CF?

 Attached is the latest updated patch:
 1. Rebased the patch to current GIT head.
 2. Doc is updated.
 3. Supported parallel execution for all db option also.
 
I will add the same to the currently open commitfest.

Regards,
Dilip

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Fri, Mar 21, 2014 at 12:48 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 16 January 2014 19:53, Euler Taveira Wrote,
>
>> >
>> >> For the case where you have tables of varying size, this would reduce
>> >> overall processing time, as it prevents large (read: long processing
>> >> time) tables from being processed in the last step. Processing the
>> >> large tables first and filling up "processing slots/jobs" with smaller
>> >> tables as they become free would save overall execution time.
>> > Good point, I have made the change and attached the modified patch.
>> >
>> You didn't submit it for a CF, did you? Is it too late for this CF?
>
>  Attached the latest updated patch
>  1. Rebased the patch to current GIT head.
>  2. Doc is updated.
>  3. Supported parallel execution for all db option also.

This patch needs to be rebased after the analyze-in-stages patch,
c92c3d50d7fbe7391b5fc864b44434.

Although that patch still needs some work itself, despite being
committed, as it still loops over the stages for each db, rather than
the dbs for each stage.

So I don't know if this patch is really reviewable at this point, as
it is not clear how those things are going to interact with each
other.

Cheers,

Jeff



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 24 June 2014 03:31, Jeff Wrote,

> >  Attached the latest updated patch
> >  1. Rebased the patch to current GIT head.
> >  2. Doc is updated.
> >  3. Supported parallel execution for all db option also.
> 
> This patch needs to be rebased after the analyze-in-stages patch,
> c92c3d50d7fbe7391b5fc864b44434.

Thank you for giving your attention to this; I will rebase it.
> Although that patch still needs some work itself, despite being
> committed, as it still loops over the stages for each db, rather than
> the dbs for each stage.

If I understood your comment properly, you mean that in
vacuum_all_databases, instead of running all DBs in parallel, we are
running db by db in parallel?

I think we can fix this.

> 
> So I don't know if this patch is really reviewable at this point, as it
> is not clear how those things are going to interact with each other.

Exactly which points do you want to mention here?

Regards,
Dilip Kumar

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Monday, June 23, 2014, Dilip kumar <dilip.kumar@huawei.com> wrote:
On 24 June 2014 03:31, Jeff Wrote,

> >  Attached the latest updated patch
> >  1. Rebased the patch to current GIT head.
> >  2. Doc is updated.
> >  3. Supported parallel execution for all db option also.
>
> This patch needs to be rebased after the analyze-in-stages patch,
> c92c3d50d7fbe7391b5fc864b44434.

Thank you for giving your attention to this; I will rebase it.

> Although that patch still needs some work itself, despite being
> committed, as it still loops over the stages for each db, rather than
> the dbs for each stage.

If I understood your comment properly, you mean that in
vacuum_all_databases, instead of running all DBs in parallel, we are running db by db in parallel?

I mean that the other commit, the one conflicting with your patch, is still not finished.  It probably would not have been committed if we realized the problem at the time.  That other patch runs analyze in stages at different settings of default_statistics_target, but it has the loops in the wrong order, so it analyzes one database in all three stages, then moves to the next database.  I think that these two changes are going to interact with each other.  But I can't predict right now what that interaction will look like.   So it is hard for me to evaluate your patch, until the other one is resolved.
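
The staged settings being described look roughly like this (an approximate rendering of the committed vacuumdb code, quoted from memory rather than verbatim):

static const char *const stage_commands[] = {
    "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
    "SET default_statistics_target=10; RESET vacuum_cost_delay;",
    "RESET default_statistics_target;"
};

/*
 * Jeff's point: the nesting should be "for each stage: for each database",
 * so every database gets crude statistics quickly before any database gets
 * the expensive full-target pass.
 */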

Normally I would evaluate your patch in isolation, but since the conflicting patch is already committed (and is in the 9.4 branch) that would probably not be very useful in this case.

Cheers,

Jeff

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
<div class="WordSection1"><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">On</span><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"></span><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">24June 2014 11:02 Jeff Wrote,</span><p
class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><p class="MsoNormal">>I
meanthat the other commit, the one conflicting with your patch, is still not finished.  It probably would not have been
committedif we realized the problem at the time.  That other patch runs analyze in stages at  <p class="MsoNormal">>
differentsettings of default_statistics_target, but it has the loops in the wrong order, so it analyzes one database in
allthree stages, then moves to the next database.  I think that these two changes are going to<p class="MsoNormal">>
interactwith each other.  But I can't predict right now what that interaction will look like.   So it is hard for me to
evaluateyour patch, until the other one is resolved.<p class="MsoNormal"> <p class="MsoNormal">>Normally I would
evaluateyour patch in isolation, but since the conflicting patch is already committed (and is in the 9.4 branch) that
wouldprobably not be very useful in this case.<p class="MsoNormal"> <p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><pclass="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Ohk,  Got your point, I will also try to think how these two
patchcan interact together.. </span><p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Regards,</span><pclass="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Dilip</span></div>

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Sawada Masahiko
Date:
Hi,

I got the following FAILED hunks when I applied v3 to HEAD.

$ patch -d. -p1 < ../patch/vacuumdb_parallel_v3.patch
patching file doc/src/sgml/ref/vacuumdb.sgml
Hunk #1 succeeded at 224 (offset 20 lines).
patching file src/bin/scripts/Makefile
Hunk #2 succeeded at 65 with fuzz 2 (offset -1 lines).
patching file src/bin/scripts/vac_parallel.c
patching file src/bin/scripts/vac_parallel.h
patching file src/bin/scripts/vacuumdb.c
Hunk #3 succeeded at 61 with fuzz 2.
Hunk #4 succeeded at 87 (offset 2 lines).
Hunk #5 succeeded at 143 (offset 2 lines).
Hunk #6 succeeded at 158 (offset 5 lines).
Hunk #7 succeeded at 214 with fuzz 2 (offset 5 lines).
Hunk #8 FAILED at 223.
Hunk #9 succeeded at 374 with fuzz 1 (offset 35 lines).
Hunk #10 FAILED at 360.
Hunk #11 FAILED at 387.
3 out of 11 hunks FAILED -- saving rejects to file src/bin/scripts/vacuumdb.c.rej

---
Sawada Masahiko

On Friday, March 21, 2014, Dilip kumar <dilip.kumar@huawei.com> wrote:
On 16 January 2014 19:53, Euler Taveira Wrote,

> >
> >> For the case where you have tables of varying size, this would reduce
> >> overall processing time, as it prevents large (read: long processing
> >> time) tables from being processed in the last step. Processing the
> >> large tables first and filling up "processing slots/jobs" with smaller
> >> tables as they become free would save overall execution time.
> > Good point, I have made the change and attached the modified patch.
> >
> You didn't submit it for a CF, did you? Is it too late for this CF?

 Attached the latest updated patch
 1. Rebased the patch to current GIT head.
 2. Doc is updated.
 3. Supported parallel execution for all db option also.

I will add the same to the currently open commitfest.

Regards,
Dilip


--
Regards,

-------
Sawada Masahiko

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 25 June 2014 23:37 Sawada Masahiko Wrote


>I got following FAILED when I patched v3 to HEAD.

>$ patch -d. -p1 < ../patch/vacuumdb_parallel_v3.patch
>patching file doc/src/sgml/ref/vacuumdb.sgml
>Hunk #1 succeeded at 224 (offset 20 lines).
>patching file src/bin/scripts/Makefile
>Hunk #2 succeeded at 65 with fuzz 2 (offset -1 lines).
>patching file src/bin/scripts/vac_parallel.c
>patching file src/bin/scripts/vac_parallel.h
>patching file src/bin/scripts/vacuumdb.c
>Hunk #3 succeeded at 61 with fuzz 2.
>Hunk #4 succeeded at 87 (offset 2 lines).
>Hunk #5 succeeded at 143 (offset 2 lines).
>Hunk #6 succeeded at 158 (offset 5 lines).
>Hunk #7 succeeded at 214 with fuzz 2 (offset 5 lines).
>Hunk #8 FAILED at 223.
>Hunk #9 succeeded at 374 with fuzz 1 (offset 35 lines).
>Hunk #10 FAILED at 360.
>Hunk #11 FAILED at 387.
>3 out of 11 hunks FAILED -- saving rejects to file src/bin/scripts/vacuumdb.c.rej

 

Thank you for giving your time. Please review the updated patch attached to this mail.

1. Rebased the patch.

2. Implemented parallel execution for the new option --analyze-in-stages.

 

 

Regards,

Dilip Kumar

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Thu, Jun 26, 2014 at 2:35 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:

>
> Thank you for giving your time,  Please review the updated patch attached in
> the mail.
>
>
>
> 1.      Rebased the patch
>
> 2.      Implemented parallel execution for new option --analyze-in-stages

Hi Dilip,

Thanks for rebasing.

I haven't done an architectural or code review on it, I just applied
it and used it a little on Linux.

Based on that, I find most importantly that it doesn't seem to
correctly vacuum tables which have upper case letters in the name,
because it does not quote the table names when they need quotes.

Of course that needs to be fixed, but taking it as it is, the
resulting error message to the console is just:
: Execute failed

Which is not very informative.  I get the same error if I do a "pg_ctl
shutdown -mi" while running the parallel vacuumdb. Without the -j
option it produces a more descriptive error message "FATAL:
terminating connection due to administrator command", so something
about the new feature suppresses the informative error messages.

I get some compiler warnings with the new patch:

vac_parallel.c: In function 'parallel_msg_master':
vac_parallel.c:147: warning: function might be possible candidate for
'gnu_printf' format attribute
vac_parallel.c:147: warning: function might be possible candidate for
'gnu_printf' format attribute
vac_parallel.c: In function 'exit_horribly':
vac_parallel.c:1071: warning: 'noreturn' function does return


In the usage message, the string has a tab embedded within it
(immediately before "use") that should be converted to literal spaces,
otherwise the output of --help gets misaligned:

printf(_("  -j, --jobs=NUM                  use this many parallel
jobs to vacuum\n"));

Thanks for the work on this.

Cheers,

Jeff



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 27 June 2014 02:57, Jeff Wrote,

> Based on that, I find most importantly that it doesn't seem to
> correctly vacuum tables which have upper case letters in the name,
> because it does not quote the table names when they need quotes.

Thanks for your comments....

There are two problems:

First -> When vacuuming a complete database, any table with an uppercase letter in its name gave an error.
--FIXED by adding quotes around the table name

Second -> When the user passes a table using the -t option and it has an uppercase letter.
--This is an existing problem (present even without the parallel implementation).

One solution is to always quote the relation name passed by the user, but this could break existing applications for some users.
 

 
> Of course that needs to be fixed, but taking it as it is, the resulting
> error message to the console is just:

FIXED

> 
> Which is not very informative.  I get the same error if I do a "pg_ctl
> shutdown -mi" while running the parallel vacuumdb. Without the -j
> option it produces a more descriptive error message "FATAL:
> terminating connection due to administrator command", so something
> about the new feature suppresses the informative error messages.
> 
> I get some compiler warnings with the new patch:
> 
> vac_parallel.c: In function 'parallel_msg_master':
> vac_parallel.c:147: warning: function might be possible candidate for
> 'gnu_printf' format attribute
> vac_parallel.c:147: warning: function might be possible candidate for
> 'gnu_printf' format attribute
> vac_parallel.c: In function 'exit_horribly':
> vac_parallel.c:1071: warning: 'noreturn' function does return

FIXED

> In the usage message, the string has a tab embedded within it
> (immediately before "use") that should be converted to literal spaces,
> otherwise the output of --help gets misaligned:
> 
> printf(_("  -j, --jobs=NUM                  use this many parallel
> jobs to vacuum\n"));
> 

FIXED

The updated patch is attached to this mail.


Thanks & Regards,
Dilip Kumar

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Fri, Jun 27, 2014 at 4:10 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
...
>
> Updated patch is attached in the mail..

Thanks Dilip.

I get a compiler warning when building on Windows.  When I started
looking into that, I saw that two files have too much code duplication
between them:

src/bin/scripts/vac_parallel.c   (new file)
src/bin/pg_dump/parallel.c      (existing file)

In particular, pgpipe is almost an exact duplicate between them,
except the copy in vac_parallel.c has fallen behind changes made to
parallel.c.  (Those changes would have fixed the Windows warnings).  I
think that this function (and perhaps other parts as
well--"exit_horribly" for example) need to refactored into a common
file that both files can include.  I don't know where the best place
for that would be, though.  (I haven't done this type of refactoring
myself.)

Also, there are several places in the patch which use spaces for
indentation where tabs are called for by the coding style. It looks
like you may have copied the code from one terminal window and pasted
it into another one, converting tabs to spaces in the process.  This
makes it hard to evaluate the amount of code duplication.

In some places the code spins in a tight loop while waiting for a
worker process to become free.  If I strace the process, I got a long
list of selects with 0 time outs:

select(13, [6 8 10 12], NULL, NULL, {0, 0}) = 0 (Timeout)

I have not tried to track down the code that causes it.  I did notice
that vacuumdb spends an awful lot of time at the top of the Linux
"top" output, and this is probably why.


Cheers,

Jeff



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Jeff Janes wrote:

> In particular, pgpipe is almost an exact duplicate between them,
> except the copy in vac_parallel.c has fallen behind changes made to
> parallel.c.  (Those changes would have fixed the Windows warnings).  I
> think that this function (and perhaps other parts as
> well--"exit_horribly" for example) need to refactored into a common
> file that both files can include.  I don't know where the best place
> for that would be, though.  (I haven't done this type of refactoring
> myself.)

I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
Maybe we should move pgpipe back to src/port and have pg_dump and this
new thing use that.  I'm not sure about the rest of duplication in
vac_parallel.c; there might be a lot in common with what
pg_dump/parallel.c does too.  Having two copies of code is frowned upon
for good reasons.  This patch introduces 1200 lines of new code in
vac_parallel.c, ugh.

If we really require 1200 lines to get parallel vacuum working for
vacuumdb, I would question the wisdom of this effort.  To me, it seems
better spent improving autovacuum to cover whatever it is that this
patch is supposed to be good for --- or maybe just enable having a shell
script that launches multiple vacuumdb instances in parallel ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 01 July 2014 03:31, Jeff Janes Wrote,
> 
> I get a compiler warning when building on Windows.  When I started
> looking into that, I see that two files have too much code duplication
> between them:

Thanks for Reviewing, 

> 
> src/bin/scripts/vac_parallel.c   (new file)
> src/bin/pg_dump/parallel.c      (existing file)
> 
> In particular, pgpipe is almost an exact duplicate between them, except
> the copy in vac_parallel.c has fallen behind changes made to parallel.c.
> (Those changes would have fixed the Windows warnings).  I think that
> this function (and perhaps other parts as well--"exit_horribly" for
> example) need to refactored into a common file that both files can
> include.  I don't know where the best place for that would be, though.
> (I haven't done this type of refactoring
> myself.)

When I started this patch, I thought of sharing the common code between vacuumdb and pg_dump. But the pg_dump code is
tightly coupled with ArchiveHandle: almost all functions take it as a parameter or operate on it, and other functions
use structures like ParallelState or ParallelSlot, which have an ArchiveHandle member. I think making this code common
may require changing the complete parallel pg_dump code.

However, there are some functions that are independent of ArchiveHandle and can move directly to common code, as you
mention: pgpipe, piperead, readMessageFromPipe, select_loop.
To move them to a common place, we need to decide where the common file should be placed.

Thoughts?

> 
> Also, there are several places in the patch which use spaces for
> indentation where tabs are called for by the coding style. It looks
> like you may have copied the code from one terminal window and copied
> it into another one, converting tabs to spaces in the process.  This
> makes it hard to evaluate the amount of code duplication.
> 
> In some places the code spins in a tight loop while waiting for a
> worker process to become free.  If I strace the process, I got a long
> list of selects with 0 time outs:
> 
> select(13, [6 8 10 12], NULL, NULL, {0, 0}) = 0 (Timeout)
> 
> I have not tried to track down the code that causes it.  I did notice
> that vacuumdb spends an awful lot of time at the top of the Linux "top"
> output, and this is probably why.

I will look into these and fix them.

Thanks & Regards,
Dilip Kumar


Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 01 July 2014 03:48, Alvaro Wrote,

> > In particular, pgpipe is almost an exact duplicate between them,
> > except the copy in vac_parallel.c has fallen behind changes made to
> > parallel.c.  (Those changes would have fixed the Windows warnings).
> I
> > think that this function (and perhaps other parts as
> > well--"exit_horribly" for example) need to refactored into a common
> > file that both files can include.  I don't know where the best place
> > for that would be, though.  (I haven't done this type of refactoring
> > myself.)
>
> I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
> Maybe we should move pgpipe back to src/port and have pg_dump and this
> new thing use that.  I'm not sure about the rest of duplication in
> vac_parallel.c; there might be a lot in common with what
> pg_dump/parallel.c does too.  Having two copies of code is frowned upon
> for good reasons.  This patch introduces 1200 lines of new code in
> vac_parallel.c, ugh.

>
> If we really require 1200 lines to get parallel vacuum working for
> vacuumdb, I would question the wisdom of this effort.  To me, it seems
> better spent improving autovacuum to cover whatever it is that this
> patch is supposed to be good for --- or maybe just enable having a
> shell script that launches multiple vacuumdb instances in parallel ...

Thanks for looking into the patch,

I think if we use a shell script to launch parallel vacuumdb, we cannot get complete control over dividing the task.
If we divide the tables between multiple processes up front, some process may get the very big tables, and then it
will be almost as if one process were doing the whole operation.

In this patch we assign only one table at a time to each process, and whichever process finishes first gets a new
table; this way all processes share the task equally.


Thanks & Regards,
Dilip Kumar




Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Sawada Masahiko
Date:
On Tue, Jul 1, 2014 at 1:25 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 01 July 2014 03:48, Alvaro Wrote,
>
>> > In particular, pgpipe is almost an exact duplicate between them,
>> > except the copy in vac_parallel.c has fallen behind changes made to
>> > parallel.c.  (Those changes would have fixed the Windows warnings).
>> I
>> > think that this function (and perhaps other parts as
>> > well--"exit_horribly" for example) need to refactored into a common
>> > file that both files can include.  I don't know where the best place
>> > for that would be, though.  (I haven't done this type of refactoring
>> > myself.)
>>
>> I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
>> Maybe we should move pgpipe back to src/port and have pg_dump and this
>> new thing use that.  I'm not sure about the rest of duplication in
>> vac_parallel.c; there might be a lot in common with what
>> pg_dump/parallel.c does too.  Having two copies of code is frowned upon
>> for good reasons.  This patch introduces 1200 lines of new code in
>> vac_parallel.c, ugh.
>
>>
>> If we really require 1200 lines to get parallel vacuum working for
>> vacuumdb, I would question the wisdom of this effort.  To me, it seems
>> better spent improving autovacuum to cover whatever it is that this
>> patch is supposed to be good for --- or maybe just enable having a
>> shell script that launches multiple vacuumdb instances in parallel ...
>
> Thanks for looking into the patch,
>
> I think if we use a shell script to launch parallel vacuumdb, we cannot get complete control over dividing the task.
> If we divide the tables between multiple processes up front, some process may get the very big tables, and then it
> will be almost as if one process were doing the whole operation.
>
> In this patch we assign only one table at a time to each process, and whichever process finishes first gets a new
> table; this way all processes share the task equally.
 
>
>
> Thanks & Regards,
> Dilip Kumar
>

I have executed the latest patch.
One question: is this the correct way to use the --jobs option?
$ vacuumdb  -d postgres  --jobs=30

I got the following error:
vacuumdb: unrecognized option '--jobs=30'
Try "vacuumdb --help" for more information.

Regards,
-------
Sawada Masahiko



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 01 July 2014 22:17, Sawada Masahiko Wrote,

 
> I have executed the latest patch.
> One question: is this the correct way to use the --jobs option?
> $ vacuumdb  -d postgres  --jobs=30
> 
> I got the following error:
> vacuumdb: unrecognized option '--jobs=30'
> Try "vacuumdb --help" for more information.
> 

Thanks for the comments. Your usage is correct, but there were some problems in the code, and I have fixed them in the
attached patch.

Apart from this fix, I am currently working on Jeff's comments about making the code common between
pg_dump/parallel.c and scripts/vac_parallel.c.

I found that we can move almost 300 lines of code to a common place; the only problem is where to keep the common code.

I am thinking of
1. Keeping a common folder under bin --> src/bin/common, and moving the common code that is specific to parallel
operation into src/bin/common/parallel_common.c.
2. Both vacuumdb and pg_dump will compile this file in when generating their executables (in the future, other
executables like reindexdb can also use it for parallel functionality).

Thoughts?

Thanks & Regards,
Dilip Kumar




Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Mon, Jun 30, 2014 at 3:17 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Jeff Janes wrote:
>
>> In particular, pgpipe is almost an exact duplicate between them,
>> except the copy in vac_parallel.c has fallen behind changes made to
>> parallel.c.  (Those changes would have fixed the Windows warnings).  I
>> think that this function (and perhaps other parts as
>> well--"exit_horribly" for example) need to refactored into a common
>> file that both files can include.  I don't know where the best place
>> for that would be, though.  (I haven't done this type of refactoring
>> myself.)
>
> I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
> Maybe we should move pgpipe back to src/port and have pg_dump and this
> new thing use that.  I'm not sure about the rest of duplication in
> vac_parallel.c; there might be a lot in common with what
> pg_dump/parallel.c does too.  Having two copies of code is frowned upon
> for good reasons.  This patch introduces 1200 lines of new code in
> vac_parallel.c, ugh.
>
> If we really require 1200 lines to get parallel vacuum working for
> vacuumdb, I would question the wisdom of this effort.  To me, it seems
> better spent improving autovacuum to cover whatever it is that this
> patch is supposed to be good for --- or maybe just enable having a shell
> script that launches multiple vacuumdb instances in parallel ...

I would only envision using the parallel feature for vacuumdb after a
pg_upgrade or some other major maintenance window (that is the only
time I ever envision using vacuumdb at all).  I don't think autovacuum
can be expected to handle such situations well, as it is designed to
be a smooth background process.

I guess the ideal solution would be for manual VACUUM to have a
PARALLEL option, then vacuumdb could just invoke that one table at a
time.  That way you would get within-table parallelism which would be
important if one table dominates the entire database cluster. But I
don't foresee that happening any time soon.

I don't know how to calibrate the number of lines that is worthwhile.
If you write in C and need to have cross-platform compatibility and
robust error handling, it seems to take hundreds of lines to do much
of anything.  The code duplication is a problem, but I don't think
just raw line count is, especially since it has already been written.

The trend in this project seems to be for shell scripts to eventually
get converted into C programs.  In fact, src/bin/scripts now has no
scripts at all.  Also it is important to vacuum/analyze tables in the
same database at the same time, otherwise you will not get much
speed-up in the ordinary case where there is only one meaningful
database.  Doing that in a shell script would be fairly hard.  It
should be pretty easy in Perl (at least for me--I'm sure others
disagree), but that also doesn't seem to be the way we do things for
programs intended for end users.

Cheers,

Jeff



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Jeff Janes wrote:

> I would only envision using the parallel feature for vacuumdb after a
> pg_upgrade or some other major maintenance window (that is the only
> time I ever envision using vacuumdb at all).  I don't think autovacuum
> can be expected to handle such situations well, as it is designed to
> be a smooth background process.

That's a fair point.  One thing that would be pretty neat but I don't
think I would get anyone to implement it, is having the user control the
autovacuum launcher in some way.  For instance "please vacuum this set
of tables as quickly as possible", and it would launch as many workers
are configured.  It would take months to get a UI settled for this,
however.

> I guess the ideal solution would be for manual VACUUM to have a
> PARALLEL option, then vacuumdb could just invoke that one table at a
> time.  That way you would get within-table parallelism which would be
> important if one table dominates the entire database cluster. But I
> don't foresee that happening any time soon.

I see this as a completely different feature, which might also be pretty
neat, at least if you're open to spending more I/O bandwidth processing
a single table: have several processes scanning the heap simultaneously.
Since I think vacuum is mostly I/O bound at the moment, I'm not sure
there is much point in this currently.

> I don't know how to calibrate the number of lines that is worthwhile.
> If you write in C and need to have cross-platform compatibility and
> robust error handling, it seems to take hundreds of lines to do much
> of anything.  The code duplication is a problem, but I don't think
> just raw line count is, especially since it has already been written.

Well, there are (at least) two types of duplicate code: first you have
these common routines such as pgpipe that are duplicates for no good
reason.  Just move them to src/port or something and it's all good.  But
the OP said there is code that cannot be shared even though it's very
similar in both incarnations.  That means we cannot (or it's difficult
to) just have one copy, which means as they fix bugs in one copy we need
to update the other.  This is bad -- witness the situation with ecpg's
copy of date/time code, where there are bugs fixed in the backend
version but the ecpg version does not have the fix.  It's difficult to
keep track of these things.

> The trend in this project seems to be for shell scripts to eventually
> get converted into C programs.  In fact, src/bin/scripts now has no
> scripts at all.  Also it is important to vacuum/analyze tables in the
> same database at the same time, otherwise you will not get much
> speed-up in the ordinary case where there is only one meaningful
> database.  Doing that in a shell script would be fairly hard.  It
> should be pretty easy in Perl (at least for me--I'm sure others
> disagree), but that also doesn't seem to be the way we do things for
> programs intended for end users.

Yeah, shipping shell scripts doesn't work very well for us.  I'm
thinking perhaps we can have sample scripts in which we show how to use
parallel(1) to run multiple vacuumdb's in parallel in Unix and some
similar mechanism in Windows, and that's it.  So we wouldn't provide the
complete toolset, but the platform surely has ways to make it happen.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Sawada Masahiko
Date:
On Wed, Jul 2, 2014 at 2:27 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 01 July 2014 22:17, Sawada Masahiko Wrote,
>
>
>> I have executed the latest patch.
>> One question: is this the correct way to use the --jobs option?
>> $ vacuumdb  -d postgres  --jobs=30
>>
>> I got the following error:
>> vacuumdb: unrecognized option '--jobs=30'
>> Try "vacuumdb --help" for more information.
>>
>
> Thanks for the comments. Your usage is correct, but there were some problems in the code, and I have fixed them in the
> attached patch.
>

Does this patch allow setting 0 for the -j option?
When I set -j to 0, I think the behavior is the same
as when I set it to 1.

Regards,

-------
Sawada Masahiko



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 03 July 2014 00:01, Sawada Masahiko Wrote,
 >
> 
> Does this patch allow setting 0 for the -j option?
> When I set -j to 0, I think the behavior is the same as
> when I set it to 1.
>

I have changed the patch; it now returns an error if -j is set to 0 or less:
"vacuumdb: Number of parallel "jobs" should be at least 1"

Thanks & Regards,
Dilip



Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Wed, Jul 2, 2014 at 11:45 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Jeff Janes wrote:
>
> > I would only envision using the parallel feature for vacuumdb after a
> > pg_upgrade or some other major maintenance window (that is the only
> > time I ever envision using vacuumdb at all).  I don't think autovacuum
> > can be expected to handle such situations well, as it is designed to
> > be a smooth background process.
>
> That's a fair point.  One thing that would be pretty neat but I don't
> think I would get anyone to implement it, is having the user control the
> autovacuum launcher in some way.  For instance "please vacuum this set
> of tables as quickly as possible", and it would launch as many workers
> are configured.  It would take months to get a UI settled for this,
> however.

This sounds like a better way to have multiple workers working
on vacuuming tables.  For vacuum, as we already have some sort
of infrastructure (vacuum workers) to perform tasks in parallel, why
not leverage that instead of inventing a new one, even if we assume
that we can reduce the duplicate code.

> > I don't know how to calibrate the number of lines that is worthwhile.
> > If you write in C and need to have cross-platform compatibility and
> > robust error handling, it seems to take hundreds of lines to do much
> > of anything.  The code duplication is a problem, but I don't think
> > just raw line count is, especially since it has already been written.
>
> Well, there are (at least) two types of duplicate code: first you have
> these common routines such as pgpipe that are duplicates for no good
> reason.  Just move them to src/port or something and it's all good.  But
> the OP said there is code that cannot be shared even though it's very
> similar in both incarnations.  That means we cannot (or it's difficult
> to) just have one copy, which means as they fix bugs in one copy we need
> to update the other.

I briefly checked the duplicate code in both versions and I think
we might be able to reduce it by a significant amount by making common
functions and passing AH where needed (as an example, I have checked
the function ParallelBackupStart(), which is more than 100 lines).  If you
see code duplication as a major reason not to prefer this patch,
then I think that can be ameliorated, or at least it is worth a try.
However, I think it might be better to achieve this in the way you
suggested, using the autovacuum launcher.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 02 July 2014 23:45, Alvaro Herrera Wrote,
>
> Well, there are (at least) two types of duplicate code: first you have
> these common routines such as pgpipe that are duplicates for no good
> reason.  Just move them to src/port or something and it's all good.
> But the OP said there is code that cannot be shared even though it's
> very similar in both incarnations.  That means we cannot (or it's
> difficult
> to) just have one copy, which means as they fix bugs in one copy we
> need to update the other.  This is bad -- witness the situation with
> ecpg's copy of date/time code, where there are bugs fixed in the
> backend version but the ecpg version does not have the fix.  It's
> difficult to keep track of these things.

In the attached patch, I have moved the pgpipe and piperead functions to src/port/pipe.c.

There are some more common functions that Jeff and Amit also mentioned moving to a common place;
currently I am not sure where to move them.
Can we move the other parallel functions to src/port, maybe in one new file parallel.c under src/port?

Thanks & Regards,
Dilip Kumar

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Robert Haas
Date:
On Fri, Jul 4, 2014 at 1:15 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> In attached patch, I have moved pgpipe, piperead functions to src/port/pipe.c

If we want to consider proceeding with this approach, you should
probably separate this into a refactoring patch that doesn't do
anything but move code around and a feature patch that applies on top
of it.

(As to whether this is the right approach, I'm not sure.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 07 July 2014 17:55 Robert Haas Wrote,

> 
> On Fri, Jul 4, 2014 at 1:15 AM, Dilip kumar <dilip.kumar@huawei.com>
> wrote:
> > In attached patch, I have moved pgpipe, piperead functions to
> > src/port/pipe.c
> 
> If we want to consider proceeding with this approach, you should
> probably separate this into a refactoring patch that doesn't do
> anything but move code around and a feature patch that applies on top
> of it.
> 
> (As to whether this is the right approach, I'm not sure.)

I have done the refactoring of the code.

Two patches are attached:

1. vacuumdb_parallel_refactor.patch --> Moved the pg_dump parallel code to port/parallel_utils.c (almost 800 lines are
moved to the common code).

2. vacuumdb_parallel_v9 --> Feature changes for vacuumdb parallel (created on top of the first patch).

I think with these changes we are able to address all the concerns we had about duplicate code.


Thanks & Regards,
Dilip Kumar


Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Fri, Jun 27, 2014 at 4:10 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 27 June 2014 02:57, Jeff Wrote,
>
>> Based on that, I find most importantly that it doesn't seem to
>> correctly vacuum tables which have upper case letters in the name,
>> because it does not quote the table names when they need quotes.
>
> Thanks for your comments....
>
> There are two problem
> First -> When vacuuming a complete database, any table with an uppercase letter in its name gave an error.
> --FIXED by adding quotes around the table name
>
> Second -> When the user passes a table using the -t option and it has an uppercase letter.
> --This is an existing problem (present even without the parallel implementation).

Just for the record, I don't think the second one is actually a bug.
If someone uses -t option from the command line, they are required to
provide the quotes if quotes are needed, just like they would need to
in psql.  That can be annoying to do from a shell, as you then need to
protect the quotes themselves from the shell, but that is the way it
is.

vacuumdb -t '"CrAzY QuOtE"'
or
vacuumdb -t \"CrAzY\ QuOtE\"

Cheers,

Jeff



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Magnus Hagander
Date:
On Tue, Jul 1, 2014 at 6:25 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 01 July 2014 03:48, Alvaro Wrote,
>
>> > In particular, pgpipe is almost an exact duplicate between them,
>> > except the copy in vac_parallel.c has fallen behind changes made to
>> > parallel.c.  (Those changes would have fixed the Windows warnings).
>> I
>> > think that this function (and perhaps other parts as
>> > well--"exit_horribly" for example) need to refactored into a common
>> > file that both files can include.  I don't know where the best place
>> > for that would be, though.  (I haven't done this type of refactoring
>> > myself.)
>>
>> I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
>> Maybe we should move pgpipe back to src/port and have pg_dump and this
>> new thing use that.  I'm not sure about the rest of duplication in
>> vac_parallel.c; there might be a lot in common with what
>> pg_dump/parallel.c does too.  Having two copies of code is frowned upon
>> for good reasons.  This patch introduces 1200 lines of new code in
>> vac_parallel.c, ugh.
>
>>
>> If we really require 1200 lines to get parallel vacuum working for
>> vacuumdb, I would question the wisdom of this effort.  To me, it seems
>> better spent improving autovacuum to cover whatever it is that this
>> patch is supposed to be good for --- or maybe just enable having a
>> shell script that launches multiple vacuumdb instances in parallel ...
>
> Thanks for looking into the patch,
>
> I think if we use a shell script to launch parallel vacuumdb, we cannot get complete control over dividing the task.
> If we divide the tables between multiple processes up front, some process may get the very big tables, and then it
> will be almost as if one process were doing the whole operation.
>
> In this patch we assign only one table at a time to each process, and whichever process finishes first gets a new
> table; this way all processes share the task equally.
 


I am late to this game, but the first thing to my mind was - do we
really need the whole forking/threading thing on the client at all? We
need it for things like pg_dump/pg_restore because they can themselves
benefit from parallelism at the client level, but for something like
this, might the code become a lot simpler if we just use multiple
database connections and async queries? That would also bring the
benefit of less platform dependent code, less cleanup needs etc.

(Oh, and for some reason in my quick review I also noticed - you added
quoting of the table name, but forgot to do it for the schema name.
You should probably also look at using something like
quote_identifier(); that'll make things easier.)
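
On the client side, libpq's PQescapeIdentifier() is the counterpart of the server-side quote_identifier(); a small sketch under that assumption (the helper shown is hypothetical, not code from the patch):

#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

/* Build "VACUUM schema.table;" with both identifiers safely quoted. */
static void
append_vacuum_command(PGconn *conn, char *buf, size_t buflen,
                      const char *nspname, const char *relname)
{
    char   *qnsp = PQescapeIdentifier(conn, nspname, strlen(nspname));
    char   *qrel = PQescapeIdentifier(conn, relname, strlen(relname));

    if (qnsp == NULL || qrel == NULL)
        return;                 /* real code would report the error */

    snprintf(buf, buflen, "VACUUM %s.%s;", qnsp, qrel);

    PQfreemem(qnsp);
    PQfreemem(qrel);
}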

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 15 July 2014 19:01, Magnus Hagander Wrote,
 
> I am late to this game, but the first thing to my mind was - do we
> really need the whole forking/threading thing on the client at all? We
> need it for things like pg_dump/pg_restore because they can themselvse
> benefit from parallelism at the client level, but for something like
> this, might the code become a lot simpler if we just use multiple
> database connections and async queries? That would also bring the
> benefit of less platform dependent code, less cleanup needs etc.

Thanks for the review. I understand your point, but I think if we do this directly with independent connections,
it's difficult to divide the jobs equally between multiple independent connections.

As per this implementation we are able to share the load b/w the processes quite well,
1. If one process finishes the work faster it can take the other load.
2. Specially while vacuuming whole database, it's very difficult to divide the load if they are not centralized
control.

By above points, I think that we can have this patch.. 

> 
> (Oh, and for some reason at my quick review I also noticed - you added
> quoting of the table name, but forgot to do it for the schema name.
> You should probably also look at using something like
> quote_identifier(), that'll make things easier).

Thanks for the comments, I have attached the updated patch.

vacuumdb_parallel_refactor --> No change
vacuumdb_parallel_v9  --> Quotes added for namespace


Thanks & Regards,
Dilip

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Tom Lane
Date:
Dilip kumar <dilip.kumar@huawei.com> writes:
> On 15 July 2014 19:01, Magnus Hagander Wrote,
>> I am late to this game, but the first thing to my mind was - do we
>> really need the whole forking/threading thing on the client at all?

> Thanks for the review, I understand your point, but I think if we do this directly with independent connections,
> it's difficult to divide the jobs equally b/w multiple independent connections.

That argument seems like complete nonsense.  You're confusing work
allocation strategy with the implementation technology for the multiple
working threads.  I see no reason why a good allocation strategy couldn't
work with either approach; indeed, I think it would likely be easier to
do some things *without* client-side physical parallelism, because that
makes it much simpler to handle feedback between the results of different
operational threads.
        regards, tom lane



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Dilip kumar <dilip.kumar@huawei.com> writes:
> > On 15 July 2014 19:01, Magnus Hagander Wrote,
> >> I am late to this game, but the first thing to my mind was - do we
> >> really need the whole forking/threading thing on the client at all?
> 
> > Thanks for the review, I understand your point, but I think if we do this directly with independent connections,
> > it's difficult to divide the jobs equally b/w multiple independent connections.
> 
> That argument seems like complete nonsense.  You're confusing work
> allocation strategy with the implementation technology for the multiple
> working threads.  I see no reason why a good allocation strategy couldn't
> work with either approach; indeed, I think it would likely be easier to
> do some things *without* client-side physical parallelism, because that
> makes it much simpler to handle feedback between the results of different
> operational threads.

So you would have one initial connection, which generates a task list;
then open N libpq connections.  Launch one vacuum on each, and then
sleep on select() on the N sockets.  Whenever one returns
read-ready, the vacuuming is done and we send another item from the task
list.  Repeat until the task list is empty.  No need to fork anything.
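
For illustration, a minimal sketch of that loop in client-side C, using only
documented libpq calls (the Slot struct and run_tasklist() are invented names
for this example, and error handling is elided):

    #include <stdbool.h>
    #include <sys/select.h>
    #include "libpq-fe.h"

    typedef struct
    {
        PGconn *conn;
        bool    busy;
    } Slot;

    static void
    run_tasklist(Slot *slots, int nslots, const char **tasks, int ntasks)
    {
        int     next = 0;       /* next task to hand out */
        int     running = 0;    /* connections with a command in flight */

        while (next < ntasks || running > 0)
        {
            fd_set  rfds;
            int     maxfd = -1;
            int     i;

            /* hand one task to every idle connection */
            for (i = 0; i < nslots && next < ntasks; i++)
            {
                if (!slots[i].busy && PQsendQuery(slots[i].conn, tasks[next]))
                {
                    slots[i].busy = true;
                    next++;
                    running++;
                }
            }

            /* sleep until at least one busy socket becomes read-ready */
            FD_ZERO(&rfds);
            for (i = 0; i < nslots; i++)
            {
                if (slots[i].busy)
                {
                    int sock = PQsocket(slots[i].conn);

                    FD_SET(sock, &rfds);
                    if (sock > maxfd)
                        maxfd = sock;
                }
            }
            if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
                continue;       /* real code must examine errno here */

            /* collect results from whichever connections have finished */
            for (i = 0; i < nslots; i++)
            {
                PGresult   *res;

                if (!slots[i].busy || !FD_ISSET(PQsocket(slots[i].conn), &rfds))
                    continue;
                PQconsumeInput(slots[i].conn);
                if (PQisBusy(slots[i].conn))
                    continue;   /* command not finished yet */
                while ((res = PQgetResult(slots[i].conn)) != NULL)
                    PQclear(res);       /* real code checks PQresultStatus */
                slots[i].busy = false;
                running--;
            }
        }
    }

The same loop gives the one-table-at-a-time scheduling discussed upthread: a
connection only receives its next task after its previous result has been
drained.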

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Magnus Hagander
Date:
<p dir="ltr"><br /> On Jul 16, 2014 7:05 AM, "Alvaro Herrera" <<a
href="mailto:alvherre@2ndquadrant.com">alvherre@2ndquadrant.com</a>>wrote:<br /> ><br /> > Tom Lane wrote:<br
/>> > Dilip kumar <<a href="mailto:dilip.kumar@huawei.com">dilip.kumar@huawei.com</a>> writes:<br /> >
>> On 15 July 2014 19:01, Magnus Hagander Wrote,<br /> > > >> I am late to this game, but the first
thingto my mind was - do we<br /> > > >> really need the whole forking/threading thing on the client at
all?<br/> > ><br /> > > > Thanks for the review, I understand you point, but I think if we have do this
directlyby independent connection,<br /> > > > It's difficult to equally divide the jobs b/w multiple
independentconnections.<br /> > ><br /> > > That argument seems like complete nonsense.  You're confusing
work<br/> > > allocation strategy with the implementation technology for the multiple<br /> > > working
threads. I see no reason why a good allocation strategy couldn't<br /> > > work with either approach; indeed, I
thinkit would likely be easier to<br /> > > do some things *without* client-side physical parallelism, because
that<br/> > > makes it much simpler to handle feedback between the results of different<br /> > >
operationalthreads.<br /> ><br /> > So you would have one initial connection, which generates a task list;<br />
>then open N libpq connections.  Launch one vacuum on each, and then<br /> > sleep on select() on the three
sockets. Whenever one returns<br /> > read-ready, the vacuuming is done and we send another item from the task<br />
>list.  Repeat until tasklist is empty.  No need to fork anything.<br /> ><p dir="ltr">Yeah, those are exactly my
points.I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier
tomake portable... <p dir="ltr">(and as a  optimization on Alvaros suggestion, you can of course reuse the initial
connectionas one of the workers as long as you got the full list of tasks from it up front, which I think you  do
anywayin order to do sorting of tasks...) <p dir="ltr">/Magnus <br /> 

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 16 July 2014 12:13 Magnus Hagander Wrote,

>>Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...

>>(and as an optimization on Alvaro's suggestion, you can of course reuse the initial connection as one of the workers as long as you got the full list of tasks from it up front, which I think you do anyway in order to do sorting of tasks...)

Oh, I got your point; I will update my patch and send it.

Now we can completely remove the vac_parallel.h file, and no refactoring is needed either :)

Thanks & Regards,

Dilip Kumar

 

 


Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 16 July 2014 12:13, Magnus Hagander Wrote,

 

>Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...

>(and as an optimization on Alvaro's suggestion, you can of course reuse the initial connection as one of the workers as long as you got the full list of tasks from it up front, which I think you do anyway in order to do sorting of tasks...)

I have modified the patch as per the suggestion.

Now all connections are created at the beginning, and the first connection is used for getting the table list; after that, all connections take part in the vacuum task.

Please have a look and provide your opinion…

Thanks & Regards,

Dilip Kumar

 

 

 

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Wed, Jul 16, 2014 at 5:30 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:

On 16 July 2014 12:13 Magnus Hagander Wrote,

>>Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...

>>(and as an optimization on Alvaro's suggestion, you can of course reuse the initial connection as one of the workers as long as you got the full list of tasks from it up front, which I think you do anyway in order to do sorting of tasks...)

Oh, I got your point; I will update my patch and send it.

Now we can completely remove the vac_parallel.h file, and no refactoring is needed either :)

Thanks & Regards,

Dilip Kumar


Should we push the refactoring through anyway?  I have a hard time believing that pg_dump is going to be the only client program we ever have that will need process-level parallelism, even if this feature itself does not need it.  Why make the next person who comes along re-invent that re-factoring of this wheel?

Cheers,

Jeff


Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Jeff Janes wrote:

> Should we push the refactoring through anyway?  I have a hard time
> believing that pg_dump is going to be the only client program we ever have
> that will need process-level parallelism, even if this feature itself does
> not need it.  Why make the next person who comes along re-invent that
> re-factoring of this wheel?

I gave the refactoring patch a look some days ago, and my conclusion was
that it is reasonably sound but it needed quite some cleanup in order
for it to be committable.  Without any immediate use case, it's hard to
justify going through all that effort.  Maybe we can add a TODO item and
have it point to the posted patch, so that if in the future we see a
need for another parallel client program we can easily rebase the
current patch.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Fri, Jul 18, 2014 at 10:22 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 16 July 2014 12:13, Magnus Hagander Wrote,
> >Yeah, those are exactly my points. I think it would be significantly simpler to do it that way, rather than forking and threading. And also easier to make portable...
>
> >(and as an optimization on Alvaro's suggestion, you can of course reuse the initial connection as one of the workers as long as you got the full list of tasks from it up front, which I think you do anyway in order to do sorting of tasks...)
>
> I have modified the patch as per the suggestion.
>
> Now all connections are created at the beginning, and the first connection is used for getting the table list; after that, all connections take part in the vacuum task.
>
> Please have a look and provide your opinion…

1.
+ connSlot = (ParallelSlot*)pg_malloc(parallel * sizeof(ParallelSlot));
+ for (i = 0; i < parallel; i++)
+ {
+ connSlot[i].connection = connectDatabase(dbname, host, port, username,
+  prompt_password, progname, false);
+ PQsetnonblocking(connSlot[i].connection, 1);
+ connSlot[i].isFree = true;
+ connSlot[i].sock = PQsocket(connSlot[i].connection);
+ }
Here it seems to me that you are opening connections before
getting or checking the table list, so in case you have a smaller
number of tables, won't the extra connections always be idle?
A simple case to verify the same is the example below:

vacuumdb -t t1 -d postgres -j 4

2.
+         res = executeQuery(conn,
+             "select relname, nspname from pg_class c, pg_namespace ns"
+             " where relkind= \'r\' and c.relnamespace = ns.oid"
+             " order by relpages desc",
+             progname, echo);

Here it is just trying to get the list of relations; however, the
Vacuum command processes materialized views as well, so
I think the list should include materialized views too,
unless you have any specific reason for not including those.
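
For reference, a sketch of how the query could be extended to cover both
(this is not the exact text of any posted patch):

    res = executeQuery(conn,
            "SELECT c.relname, ns.nspname FROM pg_class c, pg_namespace ns"
            " WHERE c.relkind IN ('r', 'm')"    /* plain tables and matviews */
            " AND c.relnamespace = ns.oid"
            " ORDER BY c.relpages DESC",
            progname, echo);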

3. In function vacuum_parallel(), if the user has not provided a list of tables,
then it retrieves all the tables in the database, and then in run_parallel_vacuum()
it tries to do Vacuum for each table using the async mechanism; now
consider a case when, after getting the list, a table is dropped by a user
from some other session: then the patch will error out.  However, without the
patch, the Vacuum command will internally ignore such a case and complete
the Vacuum for the other tables.  Don't you think the patch should maintain
the existing behaviour?


4.
+        <term><option>-j <replaceable class="parameter">jobs</replaceable></></term>
+        Number of parallel process to perform the operation.

Change this description as per new implementation. Also I think
there is a need of some explanation for this new option.


5.
It seems there is no change in the below function declaration:
static void vacuum_one_database(const char *dbname, bool full, bool verbose,
! bool and_analyze, bool analyze_only, bool analyze_in_stages,
! bool freeze, const char *table, const char *host,
! const char *port, const char *username,
! enum trivalue prompt_password,
  const char *progname, bool echo);

6.
+ printf(_("  -j, --jobs=NUM                  use this many parallel jobs to vacuum\n"));
Change the description as per new implementation.



7. 
/* This will give the free connection slot, if no slot is free it will
wait for atleast one slot to get free.*/

Multiline comments should be written like (refer other places)
/*
 * This will give the free connection slot, if no slot is free it will
 * wait for atleast one slot to get free.
 */
Kindly correct at other places if similar instance exist in patch.

8.
Isn't it a good idea to check the performance of this new patch,
especially for some worst cases like when there is not much
to vacuum in the tables inside a database?  The reason I wanted
to check is that with the new algorithm (for a vacuum of a database,
it will now get the list of tables and perform vacuum on individual
tables) we have to repeat certain actions on the server side, like
allocation/deallocation of contexts and sending stats, which would
otherwise have been done once.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 31 July 2014 10:59, Amit kapila Wrote,

 

Thanks for the review and valuable comments.

I have fixed all the comments and attached the revised patch.

 

As per your suggestion, I have also collected the performance data…

 

Test1:

Machine Configuration:

                                Core : 8 (Intel(R) Xeon(R) CPU  E5520  @ 2.27GHz)

                                RAM: 48GB

Test Scenario:

8 tables all with 1M+ records. [many records are deleted and inserted using some pattern, (file is attached in the mail)]

 

Test Result

                Base Code:  43.126s

                Parallel Vacuum Code

                                    2 Threads :  29.687s

                                    8 Threads :  14.647s

 

Test2: (as per your scenario, where actual vacuum time is very less)

            Vacuum done for complete DB

            8 tables all with 10000 records and few dead tuples

 

Test Result

Base Code:  0.59s

Parallel Vacuum Code

                    2 Threads  :  0.50s

                    4 Threads  :  0.29s        

                    8 Threads  :  0.18s

 

Regards,

Dilip Kumar

 

 

 

 

 


Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
<div dir="ltr">On Mon, Aug 4, 2014 at 11:41 AM, Dilip kumar <<a
href="mailto:dilip.kumar@huawei.com">dilip.kumar@huawei.com</a>>wrote:<br />><br />> On 31 July 2014 10:59,
Amitkapila Wrote,<br />><br />>  <br />><br /> > Thanks for the review and valuable comments.<br />> I
havefixed all the comments and attached the revised patch.<br /><br />I have again looked into your revised patch and
wouldlike<br />to share my findings with you.<br /><br />1. <br />+        Number of parallel connections to perform
theoperation. This option will enable the vacuum<br />+        operation to run on parallel connections, at a time one
tablewill be operated on one connection.<br /><br />a. How about describing w.r.t asynchronous connections<br />instead
ofparallel connections?<br />b. It is better to have line length as lesser than 80.<br />c. As you are using multiple
connectionsto achieve parallelism,<br /> I suggest you add a line in your description indicating user should<br
/>verifymax_connections parameters.  Something similar to pg_dump:<br /><br />"pg_dump will open njobs + 1 connections
tothe database, so make<br />  sure your max_connections setting is high enough to accommodate<br />   all
connections."<br/> <br /><br />2. <br />+        So at one time as many tables will be vacuumed parallely as number of
jobs.<br/><br />can you briefly mention about the case when number of jobs<br />is more than number of tables?<br /><br
/>3.<br/>+ /* When user is giving the table list, and list is smaller then<br />+ * number of tables<br />+ */<br />+
if(tbl_count && (parallel > tbl_count))<br />+ parallel = tbl_count;<br />+ <br /><br />Again here multiline
commentsare wrong.<br /><br />Some other instances are as below:<br />a.<br />/* This will give the free connection
slot,if no slot is free it will<br />* wait for atleast one slot to get free.<br />*/<br />b.<br />/* if table list is
notprovided then we need to do vaccum for whole DB<br /> * get the list of all tables and prpare the list<br />*/<br
/>c.<br/>/* Some of the slot are free, Process the results for slots whichever<br />*  are free<br />*/<br /><br /><br
/>4.<br/>src/bin/scripts/vacuumdb.c:51: indent with spaces.<br /> +                     bool analyze_only, bool freeze,
PQExpBuffersql);<br />src/bin/scripts/vacuumdb.c:116: indent with spaces.<br />+    int parallel = 0;<br
/>src/bin/scripts/vacuumdb.c:198:indent with spaces.<br />+    optind++;<br /> src/bin/scripts/vacuumdb.c:299: space
beforetab in indent.<br />+                       vacuum_one_database(dbname, full, verbose, and_analyze,<br /><br
/>Thereare lot of redundant whitespaces, check with<br />git diff --check and fix them.<br /><br /><br />5.<br />res =
executeQuery(conn,<br/>            "select relname, nspname from pg_class c, pg_namespace ns"<br />            " where
(relkind= \'r\' or relkind = \'m\')"<br />            " and c.relnamespace = ns.oid order by relpages desc",<br />    
       progname, echo);<br /><br />a. Here you need to use SQL keywords in capital letters, refer one<br />of the other
callerof executeQuery() in vacuumdb.c<br />b. Why do you need this condition c.relnamespace = ns.oid in above<br />
query?<br/>I think to get the list of required objects from pg_class, you don't<br />need to have a join with
pg_namespace.<br/><br />6.<br />vacuum_parallel()<br />{<br />..<br />if (!tables || !tables->head)<br />{<br
/>..<br/>tbl_count++;<br /> }<br />..<br />}<br /><br />a. Here why you need a separate variable (tbl_count) to verify
number<br/>asynchronous/parallel connections you want, why can't we use ntuple?<br />b. there is a warning in code (I
havecompiled it on windows) as well<br /> related to this variable.<br />vacuumdb.c(695): warning C4700: uninitialized
localvariable 'tbl_count' used<br /><br /><br />7.<br />Fix for one of my previous comment is as below:<br
/>GetQueryResult()<br/>{<br />..<br />if (!r && !completedb)<br /> ..<br />}<br /><br />Here I think some
genericerrors like connection broken or others<br />will also get ignored.  Is it possible that we can ignore
particular<br/>error which we want to ignore without complicating the code?<br /><br /> Also in anycase add comments to
explainwhy you are ignoring<br />error for particular case.<br /><br /><br />8.<br />+ fprintf(stderr, _("%s: Number of
parallel\"jobs\" should be at least 1\n"),<br />+ progname);<br /> formatting of 2nd line progname is not as per
standard(you can refer other fprintf in the same file).<br /><br />9. +     int parallel = 0;<br />I think it is better
toname it as numAsyncCons or something similar.<br /><br />10. It is better if you can add function header for newly
added<br/> functions. <br />  <br />><br />> Test2: (as per your scenario, where actual vacuum time is very
less)<br/>><br />>             Vacuum done for complete DB<br />><br />>             8 tables all with
10000records and few dead tuples<br /><br />I think this test is missing in attached file.  Few means?<br />Can you try
with0.1%, 1% of dead tuples in table and try to<br />take time in milliseconds if it is taking less time to complete<br
/>thetest.<br /><br /><br />PS - <br /> It is better if you mention against each review comment/suggestion<br />what
youhave done, because in some cases it will help reviewer to<br />understand your fix easily and as author you will
alsobe sure that<br />all got fixed.<br /><br />With Regards,<br />Amit Kapila.<br />EnterpriseDB: <a
href="http://www.enterprisedb.com">http://www.enterprisedb.com</a></div>

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 11 August 2014 10:29, Amit kapila wrote,

 

1.     I have fixed all the review comments except a few; the modified patch is attached.

2.     For the comments not fixed, find my inline replies in the mail.

 

>1.

>+        Number of parallel connections to perform the operation. This option will enable the vacuum

>+        operation to run on parallel connections, at a time one table will be operated on one connection.

>a. How about describing w.r.t asynchronous connections

>instead of parallel connections?

>b. It is better to have line length as lesser than 80.

>c. As you are using multiple connections to achieve parallelism,

>I suggest you add a line in your description indicating user should

>verify max_connections parameters.  Something similar to pg_dump:

>2.

>+        So at one time as many tables will be vacuumed parallely as number of jobs.

>can you briefly mention about the case when number of jobs

>is more than number of tables?

 

 

1 and 2 ARE FIXED in DOC.

 

 

>3.

>+ /* When user is giving the table list, and list is smaller then

>+ * number of tables

>+ */

>+ if (tbl_count && (parallel > tbl_count))

>+ parallel = tbl_count;

>+

>Again here multiline comments are wrong.

>Some other instances are as below:

>a.

>/* This will give the free connection slot, if no slot is free it will

>* wait for atleast one slot to get free.

>*/

>b.

>/* if table list is not provided then we need to do vaccum for whole DB

>* get the list of all tables and prpare the list

>*/

>c.

>/* Some of the slot are free, Process the results for slots whichever

>*  are free

>*/

 

COMMENTS are FIXED

 

 

>4.

>src/bin/scripts/vacuumdb.c:51: indent with spaces.

>+                     bool analyze_only, bool freeze, PQExpBuffer sql);

>src/bin/scripts/vacuumdb.c:116: indent with spaces.

>+    int parallel = 0;

>src/bin/scripts/vacuumdb.c:198: indent with spaces.

>+    optind++;

>src/bin/scripts/vacuumdb.c:299: space before tab in indent.

>+                       vacuum_one_database(dbname, full, verbose, and_analyze,

>There are lot of redundant whitespaces, check with

>git diff --check and fix them.

 

ALL are FIXED

 

>5.

>res = executeQuery(conn,

>            "select relname, nspname from pg_class c, pg_namespace ns"

>            " where (relkind = \'r\' or relkind = \'m\')"

>            " and c.relnamespace = ns.oid order by relpages desc",

>            progname, echo);

>a. Here you need to use SQL keywords in capital letters, refer one

 

>of the other caller of executeQuery() in vacuumdb.c

>b. Why do you need this condition c.relnamespace = ns.oid in above

>query?

 

IT IS POSSIBLE THAT TWO NAMESPACES HAVE THE SAME TABLE NAME, SO WHEN WE SEND THE COMMAND FROM THE CLIENT WE NEED TO QUALIFY IT BY NAMESPACE, BECAUSE WE NEED TO VACUUM ALL THE

TABLES. (OTHERWISE TWO TABLES WITH THE SAME NAME FROM DIFFERENT NAMESPACES WOULD BE TREATED AS THE SAME.)

 

>I think to get the list of required objects from pg_class, you don't

>need to have a join with pg_namespace.

 

DONE

 

>6.

>vacuum_parallel()

>{

>..

>if (!tables || !tables->head)

>{

>..

>tbl_count++;

>}

>..

>}

>a. Here why you need a separate variable (tbl_count) to verify number

>asynchronous/parallel connections you want, why can't we use ntuple?

>b. there is a warning in code (I have compiled it on windows) as well

>related to this variable.

>vacuumdb.c(695): warning C4700: uninitialized local variable 'tbl_count' used

 

Variable REMOVED

 

>7.

>Fix for one of my previous comment is as below:

>GetQueryResult()

>{

>..

>if (!r && !completedb)

>..

>}

>Here I think some generic errors like connection broken or others

>will also get ignored.  Is it possible that we can ignore particular

>error which we want to ignore without complicating the code?

>Also in anycase add comments to explain why you are ignoring

>error for particular case.

 

Here we get a message string; I think if we need to find the error code then we need to parse the string, and after that compare it with error codes.

Is there any other way to do this?

Comments are added

 

 

>8.

>+ fprintf(stderr, _("%s: Number of parallel \"jobs\" should be at least 1\n"),

>+ progname);

>formatting of 2nd line progname is not as per standard (you can refer other fprintf in the same file).

 

DONE

 

>9. +     int parallel = 0;

>I think it is better to name it as numAsyncCons or something similar.

 

CHANGED as PER SUGGESTION

 

>10. It is better if you can add function header for newly added

>functions.

 

ADDED  

>> 

>> Test2: (as per your scenario, where actual vacuum time is very less)

>> 

>>             Vacuum done for complete DB

>> 

>>             8 tables all with 10000 records and few dead tuples

>I think this test is missing in attached file.  Few means?

>Can you try with 0.1%, 1% of dead tuples in table and try to

>take time in milliseconds if it is taking less time to complete

>the test.

 

TESTED with 1%, 0.1% and 0.01% dead tuples; results are as follows

1.       With 1% (file test1%.sql)

 

Base Code  :  22.26 s

2 Threads   : 12.82 s

4 Threads  : 9.19s

8 Threads :  8.89s

 

2.       With 0.1%

 

Base Code  :  3.83 s

2 Threads   : 2.01 s

4 Threads  : 2.02s

8 Threads :  2.25s

 

3.       With 0.01%

 

Base Code  :  0.60 s

2 Threads   :  0.32 s

4 Threads  :  0.26s

8 Threads :   0.31s

 

Thanks & Regards,

Dilip

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Robert Haas
Date:
On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> 1.
> +        Number of parallel connections to perform the operation. This
> option will enable the vacuum
> +        operation to run on parallel connections, at a time one table will
> be operated on one connection.
>
> a. How about describing w.r.t asynchronous connections
> instead of parallel connections?

I don't think "asynchronous" is a good choice of word.  Maybe "simultaneous"?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Wed, Aug 13, 2014 at 4:01 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 11 August 2014 10:29, Amit kapila wrote,
> >5.
>
> >res = executeQuery(conn,
>
> >            "select relname, nspname from pg_class c, pg_namespace ns"
>
> >            " where (relkind = \'r\' or relkind = \'m\')"
>
> >            " and c.relnamespace = ns.oid order by relpages desc",
>
> >            progname, echo);
>
> >
>
> >a. Here you need to use SQL keywords in capital letters, refer one
> >of the other caller of executeQuery() in vacuumdb.c
>
> >b. Why do you need this condition c.relnamespace = ns.oid in above
>
> >query?
>
>  
>
> IT IS POSSIBLE THAT TWO NAMESPACES HAVE THE SAME TABLE NAME, SO WHEN WE SEND THE COMMAND FROM THE CLIENT WE NEED TO QUALIFY IT BY NAMESPACE, BECAUSE WE NEED TO VACUUM ALL THE
>
> TABLES. (OTHERWISE TWO TABLES WITH THE SAME NAME FROM DIFFERENT NAMESPACES WOULD BE TREATED AS THE SAME.)

That's right; however, writing the query in the below way might
make it more understandable:
+ "SELECT relname, nspname FROM pg_class c, pg_namespace ns"

"SELECT c.relname, ns.nspname FROM pg_class c, pg_namespace ns"



> >7.
>
>
> Here we get a message string; I think if we need to find the error code then we need to parse the string, and after that compare it with error codes.
>
> Is there any other way to do this?

You can compare against SQLSTATE by using below API.
val = PQresultErrorField(res, PG_DIAG_SQLSTATE);

You need to handle *42P01* SQLSTATE, also please refer below
usage where we are checking SQLSTATE.

fe-connect.c
PQresultErrorField(conn->result, PG_DIAG_SQLSTATE),
  ERRCODE_INVALID_PASSWORD) == 0)
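
A sketch of what that check could look like when collecting a result (the
ERRCODE_UNDEFINED_TABLE define is assumed to be local to vacuumdb.c, the way
fe-connect.c defines ERRCODE_INVALID_PASSWORD, since the server's errcodes.h
is not available to frontend code):

    #define ERRCODE_UNDEFINED_TABLE  "42P01"    /* assumed local define */

    res = PQgetResult(conn);
    if (res != NULL && PQresultStatus(res) != PGRES_COMMAND_OK)
    {
        const char *sqlState = PQresultErrorField(res, PG_DIAG_SQLSTATE);

        /*
         * During a whole-db vacuum a table may be dropped concurrently;
         * ignore only that case and report every other error.
         */
        if (!completedb ||
            sqlState == NULL ||
            strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) != 0)
            fprintf(stderr, _("%s: vacuuming of database \"%s\" failed: %s"),
                    progname, dbname, PQerrorMessage(conn));
    }
    PQclear(res);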

Few more comments:

1.
* If user has not given the vacuum of complete db, then if

I think here you have said the reverse of what the code is doing.
You don't need *not* in the above sentence.

2.
+ appendPQExpBuffer(&sql, "\"%s\".\"%s\"", nspace, relName);
I think here you need to use function fmtQualifiedId() or fmtId()
or something similar to handle quotes appropriately.
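
For example, a sketch using fmtId(); note that fmtId() returns a pointer into
a static buffer, so the schema and relation calls must stay separate rather
than being combined into one appendPQExpBuffer() format:

    /* Sketch: emit a quoted, schema-qualified name into the command buffer. */
    appendPQExpBufferStr(&sql, fmtId(nspace));      /* adds quotes only if needed */
    appendPQExpBufferChar(&sql, '.');
    appendPQExpBufferStr(&sql, fmtId(relName));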

3.

+ */
+ if (!r && !completedb)
Here the usage of the completedb variable is reversed, which means
that it goes into the error path when actually the whole database is
getting vacuumed, and the reason is that you are setting it
to false in the below code:
+ /* Vaccuming full database*/
+ vacuum_tables = false;

4.
Functions prepare_command() and vacuum_one_database() contain
duplicate code; is there any problem in using the prepare_command()
function in vacuum_one_database()?  Another point in this context
is that I think it is better to name the function prepare_command()
as append_vacuum_options() or something on those lines; also it will
be better if you can write a function header for this function as well.

5. 
+ if (error)
+ {
+ for (i = 0; i < max_slot; i++)
+ {
+ DisconnectDatabase(&connSlot[i]);
+ }

Here why do we need DisconnectDatabase() type of function?
Why can't we simply call PQfinish() as in base code?

6.
+ /*
+ * if table list is not provided then we need to do vaccum for whole DB
+ * get the list of all tables and prpare the list
+ */
spelling of prepare is wrong. I have noticed spelling mistakes
in comments at some other places as well; please check all
comments once.

7. I think in new mechanism cancel handler will not work.
In single connection vacuum it was always set/reset
in function executeMaintenanceCommand(). You might need
to set/reset it in function run_parallel_vacuum().
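
A sketch of that set/reset pattern, modeled on the way
executeMaintenanceCommand() brackets its query; where exactly it sits in
run_parallel_vacuum() is the author's call:

    SetCancelConn(slot->connection);    /* Ctrl-C now targets this connection */
    PQsendQuery(slot->connection, sql.data);
    /* ... select()/PQconsumeInput()/PQgetResult() handling here ... */
    ResetCancelConn();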


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Fri, Aug 15, 2014 at 12:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > 1.
> > +        Number of parallel connections to perform the operation. This
> > option will enable the vacuum
> > +        operation to run on parallel connections, at a time one table will
> > be operated on one connection.
> >
> > a. How about describing w.r.t asynchronous connections
> > instead of parallel connections?
>
> I don't think "asynchronous" is a good choice of word.

Agreed.

>Maybe "simultaneous"?

Not sure. How about *concurrent* or *multiple*?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Robert Haas
Date:
On Tue, Aug 19, 2014 at 7:08 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Aug 15, 2014 at 12:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila <amit.kapila16@gmail.com>
>> wrote:
>> > 1.
>> > +        Number of parallel connections to perform the operation. This
>> > option will enable the vacuum
>> > +        operation to run on parallel connections, at a time one table
>> > will
>> > be operated on one connection.
>> >
>> > a. How about describing w.r.t asynchronous connections
>> > instead of parallel connections?
>>
>> I don't think "asynchronous" is a good choice of word.
>
> Agreed.
>
>>Maybe "simultaneous"?
>
> Not sure. How about *concurrent* or *multiple*?

multiple isn't right, but we could say concurrent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Thu, Aug 21, 2014 at 12:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Aug 19, 2014 at 7:08 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Fri, Aug 15, 2014 at 12:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >>
> >> On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila <amit.kapila16@gmail.com>
> >> wrote:
> >> >
> >> > a. How about describing w.r.t asynchronous connections
> >> > instead of parallel connections?
> >>
> >> I don't think "asynchronous" is a good choice of word.
> >
> > Agreed.
> >
> >>Maybe "simultaneous"?
> >
> > Not sure. How about *concurrent* or *multiple*?
>
> multiple isn't right, but we could say concurrent.

I also find concurrent more appropriate.
Dilip, could you please change it to concurrent in doc updates,
variables, functions unless you see any objection for the same.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
<div class="WordSection1"><p class="MsoNormal">On<span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"></span><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">21August 2014 08:31, Amit Kapila Wrote, </span><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"></span><pclass="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal"
style="margin-bottom:12.0pt">>>><br /> > > >Not sure. How about *concurrent* or *multiple*?<br />
>><br /> > >multiple isn't right, but we could say concurrent.<p class="MsoNormal">>I also find
concurrentmore appropriate.<p class="MsoNormal">>Dilip, could you please change it to concurrent in doc updates,<p
class="MsoNormal">>variables,functions unless you see any objection for the same.<p class="MsoNormal"> <p
class="MsoNormal">Ok,I will take care along with other comments fix..<p class="MsoNormal"> <p
class="MsoNormal">Regards,<pclass="MsoNormal">Dilip Kumar</div> 

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Tue, Aug 19, 2014 at 4:27 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> Few more comments:
>
Some more comments:

1. I could see one shortcoming in the way the patch currently parallelizes the
   work for --analyze-in-stages. Basically the patch performs the work for each stage
   for multiple tables in concurrent connections, which seems okay for the cases when
   the number of parallel connections is less than or equal to the number of tables, but for
   the case when the user has asked for more connections than there are tables,
   I think this strategy will not be able to use the extra connections.

2. Similarly for the case of multiple databases, currently it will not be able
   to use more connections than the number of tables in each database, because the
   parallelizing strategy is to just use the concurrent connections for
   tables inside a single database.

I am not completely sure whether current strategy is good enough or
we should try to address the above problems.  What do you think?

3.  
+ do
+ {
+ i = select_loop(maxFd, &slotset);
+ Assert(i != 0);

Could you explain the reason for using this loop? I think you
want to wait for data on a socket descriptor, but why for maxFd?
Also, it would be better to explain this logic in comments.

4.
+ for (i = 0; i < max_slot; i++)
+ {
+ if (!FD_ISSET(pSlot[i].sock, &slotset))
+ continue;
+ PQconsumeInput(pSlot[i].connection);
+ if (PQisBusy(pSlot[i].connection))
+ continue;

I think it is better to call PQconsumeInput() only if you find
connection is busy.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 24 August 2014 11:33, Amit Kapila Wrote

 

Thanks for your comments; I have worked on both the review comment lists, sent on 19 August and 24 August.

 

The latest patch is attached with the mail.

 

 

on 19 August:

------------

>You can compare against SQLSTATE by using below API.

>val = PQresultErrorField(res, PG_DIAG_SQLSTATE);

 

>You need to handle *42P01* SQLSTATE, also please refer below

>usage where we are checking SQLSTATE.

 

>fe-connect.c

>PQresultErrorField(conn->result, PG_DIAG_SQLSTATE),

>                                               ERRCODE_INVALID_PASSWORD) == 0)

 

DONE

 

>Few more comments:

>1.

>* If user has not given the vacuum of complete db, then if

>I think here you have said reverse of what code is doing.

>You don't need *not* in above sentence.

 

DONE

 

 

>2.

>+                                           appendPQExpBuffer(&sql, "\"%s\".\"%s\"", nspace, relName);

>I think here you need to use function fmtQualifiedId() or fmtId()

>or something similar to handle quotes appropriately.

 

DONE

 

>3.

>+           */

>+           if (!r && !completedb)

>Here the usage of completedb variable is reversed which means

>that it goes into error path when actually whole database is

>getting vacuumed and the reason is that you are setting it

>to false in below code:

>+                           /* Vaccuming full database*/

>+                           vacuum_tables = false;

 

FIXED

 

>4.

>Functions prepare_command() and vacuum_one_database() contain

>duplicate code, is there any problem in using prepare_command()

>function in vacuum_one_database(). Another point in this context

>is that I think it is better to name function prepare_command()

>as append_vacuum_options() or something on that lines, also it will

>be better if you can write function header for this function as well.

 

DONE

 

>5.

>+           if (error)

>+           {

>+                           for (i = 0; i < max_slot; i++)

>+                           {

>+                                           DisconnectDatabase(&connSlot[i]);

>+                           }

>Here why do we need DisconnectDatabase() type of function?

>Why can't we simply call PQfinish() as in base code?

 

Because the base code just needs to handle the main connection, and when it sends PQfinish it means the operation either completed or errored;

in both cases, control is with the client. But in the multi-connection case, if one connection fails then we need to send

a cancel on the other connections, and that is what DisconnectDatabase does.

 

>6.

>+           /*

>+           * if table list is not provided then we need to do vaccum for whole DB

>+           * get the list of all tables and prpare the list

>+           */

>spelling of prepare is wrong. I have noticed spell mistake

>in comments at some other place as well, please check all

>comments once

 

FIXED

 

>7. I think in new mechanism cancel handler will not work.

>In single connection vacuum it was always set/reset

>in function executeMaintenanceCommand(). You might need

>to set/reset it in function run_parallel_vacuum().

 

Good catch. Now I have called SetCancelConn(pSlot[0].connection) on the first connection. This will enable the cancel

handler to cancel the query on the first connection so that the select loop will come out.

 

 

24 August

---------

>1. I could see one shortcoming in the way the patch has currently parallelize the

>   work for --analyze-in-stages. Basically patch is performing the work for each stage

>   for multiple tables in concurrent connections that seems okay for the cases when

>   number of parallel connections is less than equal to number of tables, but for

>   the case when user has asked for more number of connections than number of tables,

>   then I think this strategy will not be able to use the extra connections.

 

I think --analyze-in-stages should maintain the order.

 

>2. Similarly for the case of multiple databases, currently it will not be able

>   to use connections more than number of tables in each database because the

>   parallelizing strategy is to just use the conncurrent connections for

>   tables inside single database.

 

I think for multiple databases, to do the parallel execution we need to maintain multiple connections to multiple databases,

and we need to maintain a table list for all the databases together to run them concurrently. I think this may impact the startup cost,

as we need to create multiple connections and disconnect them for preparing the list, and I think it will increase the complexity also.

 

 

>I am not completely sure whether current strategy is good enough or

>we should try to address the above problems.  What do you think?

 

>3. 

>+           do

>+           {

>+                           i = select_loop(maxFd, &slotset);

>+                           Assert(i != 0);

>Could you explain the reason of using this loop, I think you

>want to wait for data on socket descriptor, but why for maxFd?

>Also it is better if you explain this logic in comments.

 

We send a vacuum job to a connection, and when none of the connection slots is free, we wait on all the sockets

until one of them gets freed.

 

>4.

>+                           for (i = 0; i < max_slot; i++)

>+                           {

>+                                           if (!FD_ISSET(pSlot[i].sock, &slotset))

>+                                                           continue;

>+

>+                                           PQconsumeInput(pSlot[i].connection);

>+                                           if (PQisBusy(pSlot[i].connection))

>+                                                           continue;

>I think it is better to call PQconsumeInput() only if you find

>connection is busy.

 

I think the logic here is a bit different: in other places in the code, PQconsumeInput is called until PQisBusy reports false, in a loop, to consume all data,

but in our case, we have sent the query, so we consume whatever is on the n/w and then check PQisBusy;

if it is not busy, it means this connection is freed.

And PQconsumeInput only consumes input when input is available, so I think we need not check PQisBusy first.

 

However, in the case where the user has to fetch the complete data, they need to call PQisBusy and then PQconsumeInput in a loop, so that PQconsumeInput

will not get called in a tight loop and will be called only when the connection is busy.

 

 

Regards,

Dilip

                              

 

 

 

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Wed, Sep 24, 2014 at 2:48 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:

On 24 August 2014 11:33, Amit Kapila Wrote

 

Thanks for your comments; I have worked on both the review comment lists, sent on 19 August and 24 August.

 

The latest patch is attached with the mail.


Hi Dilip,

I think you have an off-by-one error in the index into the array of file handles.

vacuumdb runs at full CPU, and if I run:

strace -ttt -T -f ../parallel_vac/bin/vacuumdb -z -a -j 8

I get the select returning immediately with a bad file descriptor error:

1411663937.641177 select(55, [4 5 6 7 8 9 10 54], NULL, NULL, NULL) = -1 EBADF (Bad file descriptor) <0.000012>
1411663937.641232 recvfrom(3, 0x104e3f0, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000012>
1411663937.641279 recvfrom(4, 0x1034bc0, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000011>
1411663937.641326 recvfrom(5, 0x10017c0, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000014>
1411663937.641415 recvfrom(6, 0x10097e0, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000020>
1411663937.641487 recvfrom(7, 0x1012330, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000013>
1411663937.641538 recvfrom(8, 0x101af00, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000011>
1411663937.641584 recvfrom(9, 0x1023af0, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000012>
1411663937.641631 recvfrom(10, 0x1054600, 16384, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) <0.000012>

Cheers,

Jeff

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Thu, Sep 25, 2014 at 10:00 AM, Jeff Janes <jeff.janes@gmail.com> wrote:
On Wed, Sep 24, 2014 at 2:48 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:

On 24 August 2014 11:33, Amit Kapila Wrote

 

Thanks for your comments; I have worked on both the review comment lists, sent on 19 August and 24 August.

 

The latest patch is attached with the mail.


Hi Dilip,

I think you have an off-by-one error in the index into the array of file handles.

Actually the problem is that the socket for the master connection was not getting initialized; see my one-line addition here.

   connSlot = (ParallelSlot*)pg_malloc(concurrentCons * sizeof(ParallelSlot));
   connSlot[0].connection = conn;
+   connSlot[0].sock = PQsocket(conn);


However, I don't think it is good to just ignore errors from the select call (like the EBADF) and go into a busy loop instead, so there are more changes needed than this.
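
For what it's worth, a sketch of a select_loop() that retries on EINTR and
surfaces persistent failures such as EBADF to the caller (pg_dump's parallel.c
has a helper along these lines):

    #include <errno.h>
    #include <sys/select.h>

    static int
    select_loop(int maxFd, fd_set *workerset)
    {
        fd_set  saveSet = *workerset;

        for (;;)
        {
            int     i;

            *workerset = saveSet;
            i = select(maxFd + 1, workerset, NULL, NULL, NULL);
            if (i < 0 && errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return i;       /* < 0 means a real error; caller must bail out */
        }
    }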

Also, cancelling the run (by hitting ctrl-C in the shell that invoked it) does not seem to work on linux.  I get a message that says "Cancel request sent", but then it continues to finish the job anyway.

Cheers,

Jeff

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Wed, Sep 24, 2014 at 3:18 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 24 August 2014 11:33, Amit Kapila Wrote
>
> >7. I think in new mechanism cancel handler will not work.
>
> >In single connection vacuum it was always set/reset
>
> >in function executeMaintenanceCommand(). You might need
>
> >to set/reset it in function run_parallel_vacuum().
>
>  
>
> Good catch. Now I have called SetCancelConn(pSlot[0].connection) on the first connection. This will enable the cancel
>
> handler to cancel the query on the first connection so that the select loop will come out.
>

I don't think this can handle cancel requests properly because
you are just setting it in GetIdleSlot(); what if the cancel
request comes during GetQueryResult(), after sending sql for
all connections? (That is probably the reason why Jeff is not able
to cancel vacuumdb when using the parallel option.)

>
>  
>
> >1. I could see one shortcoming in the way the patch currently parallelizes the
>
> >   work for --analyze-in-stages. Basically the patch performs the work for each stage
>
> >   for multiple tables in concurrent connections, which seems okay for the cases when
>
> >   the number of parallel connections is less than or equal to the number of tables, but for
>
> >   the case when the user has asked for more connections than there are tables,
>
> >   I think this strategy will not be able to use the extra connections.
>
>  
>
> I think --analyze-in-stages should maintain the order.

Yes, you are right.  So lets keep the code as it is for this case.

>  
>
> >2. Similarly for the case of multiple databases, currently it will not be able
>
> >   to use more connections than the number of tables in each database, because the
>
> >   parallelizing strategy is to just use the concurrent connections for
>
> >   tables inside a single database.
>
>  
>
> I think for multiple databases, to do the parallel execution we need to maintain multiple connections to multiple databases,
> and we need to maintain a table list for all the databases together to run them concurrently. I think this may impact the startup cost,
> as we need to create multiple connections and disconnect them for preparing the list

Yeah, probably the startup cost will be a bit higher, but that cost we
will anyway incur during the overall operation.

> and I think it will increase the complexity also.

I understand that the complexity of the code might increase, and the strategy
to parallelize can also be different in case we want to parallelize
for the all-databases case, so let's leave it as it is unless someone else
raises a voice for the same.

Today while again thinking about the strategy used in the patch to
parallelize the operation (vacuum database), I think we can
improve the same for cases when the number of connections is
lesser than the number of tables in the database (which I presume
will normally be the case).  Currently we are sending a command
to vacuum one table per connection; how about sending multiple
commands (example Vacuum t1; Vacuum t2) on one connection?
It seems to me there is an extra roundtrip for cases when there
are many small tables in the database and few large tables.  Do
you think we should optimize for any such cases?

Few other points
1.
+ vacuum_parallel(const char *dbname, bool full, bool verbose,
{
..
+ connSlot = (ParallelSlot*)pg_malloc(concurrentCons * sizeof(ParallelSlot));
+ connSlot[0].connection = conn;

a.
Does the above memory get freed anywhere? If not, isn't it
a good idea to do the same?
b. For slot 0, you are not setting it with PQsetnonblocking,
whereas I think it can be used to run commands like any other
connection.

2.
+ /*
+ * If user has given the vacuum of complete db, then if
+ * any of the object vacuum failed it can be ignored and vacuuming
+ * of other object can be continued, this is the same behavior as
+ * vacuuming of complete db is handled without --jobs option
+ */

s/object/object's

3.
+ if (!completedb ||
+     (sqlState && strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) != 0))
+ {
+     fprintf(stderr, _("%s: vacuuming of database \"%s\" failed: %s"),
+             progname, dbname, PQerrorMessage(conn));

Indentation in both places is wrong.  Check other places for
similar issues.

4.
+ bool analyze_only, bool freeze, int numAsyncCons,

In the code there is still a reference to AsyncCons; as decided, let's
change it to concurrent_connections | conc_cons.




With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Amit Kapila wrote:

> Today, while again thinking about the strategy used in the patch to
> parallelize the operation (vacuum database), I think we can
> improve it for cases when the number of connections is
> smaller than the number of tables in the database (which I presume
> will normally be the case).  Currently we are sending a command
> to vacuum one table per connection; how about sending multiple
> commands (for example, Vacuum t1; Vacuum t2) on one connection?
> It seems to me there is an extra roundtrip for cases when there
> are many small tables in a database and few large tables.  Do
> you think we should optimize for any such cases?

I don't think this is a good idea; at least not in a first cut of this
patch.  It's easy to imagine that a table you initially think is small
enough turns out to have grown much larger since the last analyze.  In that
case, putting one worker to process that one together with some other
table could end up being bad for parallelism, if later it turns out that
some other worker has no table to process.  (Table t2 in your example
could have grown between the time the command is sent and t1 is vacuumed.)

It's simpler to have workers do one thing at a time only.

I don't think it's a very good idea to call pg_relation_size() on every
table in the database from vacuumdb.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Fri, Sep 26, 2014 at 7:06 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Amit Kapila wrote:
>
> > Today, while again thinking about the strategy used in the patch to
> > parallelize the operation (vacuum database), I think we can
> > improve it for cases when the number of connections is
> > smaller than the number of tables in the database (which I presume
> > will normally be the case).  Currently we are sending a command
> > to vacuum one table per connection; how about sending multiple
> > commands (for example, Vacuum t1; Vacuum t2) on one connection?
> > It seems to me there is an extra roundtrip for cases when there
> > are many small tables in a database and few large tables.  Do
> > you think we should optimize for any such cases?
>
> I don't think this is a good idea; at least not in a first cut of this
> patch.  It's easy to imagine that a table you initially think is small
> enough turns out to have grown much larger since the last analyze.

That could be possible, but currently it vacuums even system tables
one by one (where I think the chances of growth would be comparatively
less), which was the main reason I thought it might be worth
considering whether the current work distribution strategy is good enough.

> In that
> case, putting one worker to process that one together with some other
> table could end up being bad for parallelism, if later it turns out that
> some other worker has no table to process.  (Table t2 in your example
> could have grown between the time the command is sent and t1 is vacuumed.)
>
> It's simpler to have workers do one thing at a time only.

Yeah, probably that is best, at least for the initial patch.

> I don't think it's a very good idea to call pg_relation_size() on every
> table in the database from vacuumdb.

You might be right; however, I was a bit skeptical about the current
strategy where the work unit is one object and each object is considered
the same irrespective of its size/bloat.  OTOH, I agree with you that it is
good to keep the first version simpler.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Gavin Flower
Date:
On 27/09/14 01:36, Alvaro Herrera wrote:
> Amit Kapila wrote:
>
>> Today, while again thinking about the strategy used in the patch to
>> parallelize the operation (vacuum database), I think we can
>> improve it for cases when the number of connections is
>> smaller than the number of tables in the database (which I presume
>> will normally be the case).  Currently we are sending a command
>> to vacuum one table per connection; how about sending multiple
>> commands (for example, Vacuum t1; Vacuum t2) on one connection?
>> It seems to me there is an extra roundtrip for cases when there
>> are many small tables in a database and few large tables.  Do
>> you think we should optimize for any such cases?
> I don't think this is a good idea; at least not in a first cut of this
> patch.  It's easy to imagine that a table you initially think is small
> enough turns out to have grown much larger since the last analyze.  In that
> case, putting one worker to process that one together with some other
> table could end up being bad for parallelism, if later it turns out that
> some other worker has no table to process.  (Table t2 in your example
> could have grown between the time the command is sent and t1 is vacuumed.)
>
> It's simpler to have workers do one thing at a time only.
>
> I don't think it's a very good idea to call pg_relation_size() on every
> table in the database from vacuumdb.
>
Curious: would it be both feasible and useful to have multiple workers
process a 'large' table, without complicating things too much?  They
could each start at a different position in the file.


Cheers,
Gavin



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Gavin Flower wrote:

> Curious: would it be both feasible and useful to have multiple
> workers process a 'large' table, without complicating things too
> much?  They could each start at a different position in the file.

Feasible: no.  Useful: maybe, we don't really know.  (You could just as
well have a worker at double the speed, i.e. double vacuum_cost_limit).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Gregory Smith
Date:
On 9/26/14, 2:38 PM, Gavin Flower wrote:
> Curious: would it be both feasible and useful to have multiple workers
> process a 'large' table, without complicating things too much?  They
> could each start at a different position in the file.

Not really feasible without a major overhaul.  It might be mildly useful
in one rare case.  Occasionally I'll find very hot single tables that
vacuum is constantly processing, despite mostly living in RAM because
the server has a lot of memory.  You can set vacuum_cost_page_hit=0 in
order to get vacuum to chug through such a table as fast as possible.

However, the speed at which that happens will often then be limited by
how fast a single core can read from memory, for things in
shared_buffers.  That is limited by the speed of memory transfers from a
single NUMA memory bank.  Which bank you get will vary depending on the
core that owns that part of shared_buffers' memory, but it's only one at
a time.

On large servers, that can be only a small fraction of the total memory
bandwidth the server is able to reach.  I've attached a graph showing
how this works on a system with many NUMA banks of RAM, and this is only
a medium sized system.  This server can hit 40GB/s of memory transfers
in total; no one process will ever see more than 8GB/s.

If we had more vacuum processes running against the same table, there
would then be more situations where they were doing work against
different NUMA memory banks at the same time, therefore making faster
progress through the hits in shared_buffers possible. In the real world,
this situation is rare enough compared to disk-bound vacuum work that I
doubt it's worth getting excited over.  Systems with lots of RAM where
performance is regularly dominated by one big ugly table are common
though, so I wouldn't just rule the idea out as not useful either.

--
Greg Smith greg.smith@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Jeff Janes
Date:
On Fri, Sep 26, 2014 at 11:47 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
Gavin Flower wrote:

> Curious: would it be both feasible and useful to have multiple
> workers process a 'large' table, without complicating things too
> much?  They could each start at a different position in the file.

Feasible: no.  Useful: maybe, we don't really know.  (You could just as
well have a worker at double the speed, i.e. double vacuum_cost_limit).

Vacuum_cost_delay is already 0 by default.  So unless you changed that, vacuum_cost_limit does not take effect under vacuumdb.

It is pretty easy for vacuum to be CPU limited, and even easier for analyze to be CPU limited (It does a lot of sorting).  I think analyzing is the main use case for this patch, to shorten the pg_upgrade window.  At least, that is how I anticipate using it.
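
As a concrete illustration of that use case, the analyze step after a
pg_upgrade could be driven with the -j option this patch adds, along the
lines of (connection options elided):

./vacuumdb --analyze-in-stages -j 8 -d postgres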

Cheers,

Jeff

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Gavin Flower
Date:
On 27/09/14 11:36, Gregory Smith wrote:
> On 9/26/14, 2:38 PM, Gavin Flower wrote:
>> Curious: would it be both feasible and useful to have multiple
>> workers process a 'large' table, without complicating things too
>> much?  They could each start at a different position in the file.
>
> Not really feasible without a major overhaul.  It might be mildly 
> useful in one rare case.  Occasionally I'll find very hot single 
> tables that vacuum is constantly processing, despite mostly living in 
> RAM because the server has a lot of memory.  You can set 
> vacuum_cost_page_hit=0 in order to get vacuum to chug through such a 
> table as fast as possible.
>
> However, the speed at which that happens will often then be limited by 
> how fast a single core can read from memory, for things in 
> shared_buffers.  That is limited by the speed of memory transfers from 
> a single NUMA memory bank.  Which bank you get will vary depending on 
> the core that owns that part of shared_buffers' memory, but it's only 
> one at a time.
>
> On large servers, that can be only a small fraction of the total 
> memory bandwidth the server is able to reach.  I've attached a graph 
> showing how this works on a system with many NUMA banks of RAM, and 
> this is only a medium sized system.  This server can hit 40GB/s of 
> memory transfers in total; no one process will ever see more than 8GB/s.
>
> If we had more vacuum processes running against the same table, there 
> would then be more situations where they were doing work against 
> different NUMA memory banks at the same time, therefore making faster 
> progress through the hits in shared_buffers possible. In the real 
> world, this situation is rare enough compared to disk-bound vacuum 
> work that I doubt it's worth getting excited over.  Systems with lots 
> of RAM where performance is regularly dominated by one big ugly table 
> are common though, so I wouldn't just rule the idea out as not useful 
> either.
>
Thanks for your very detailed reply, and for the comments from others.

Cheers,
Gavin



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 26 September 2014 01:24, Jeff Janes Wrote,

>I think you have an off-by-one error in the index into the array of file handles.

>Actually the problem is that the socket for the master connection was not getting initialized, see my one line addition here.

>   connSlot = (ParallelSlot*)pg_malloc(concurrentCons * sizeof(ParallelSlot));
>   connSlot[0].connection = conn;
>+   connSlot[0].sock = PQsocket(conn);

Thanks for the review, I have fixed this.

>However, I don't think it is good to just ignore errors from the select call (like the EBADF) and go into a busy loop instead, so there are more changes needed than this.

Actually, this select_loop function I have implemented the same way other client applications handle it, i.e. pg_dump in parallel.c; however, parallel.c handles the case where the process is in abort (if Ctrl+C is received), and we need to handle the same, so I have fixed this in the attached patch.
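
A minimal sketch of the select_loop shape being described (in_abort() and
the abort bookkeeping are the patch's own; the rest is illustrative, not
the exact code from the attachment):

#include <errno.h>
#include <sys/select.h>

static int
select_loop(int maxFd, fd_set *workerset)
{
    fd_set  saveSet = *workerset;
    int     i;

    for (;;)
    {
        *workerset = saveSet;
        i = select(maxFd + 1, workerset, NULL, NULL, NULL);

        if (in_abort())         /* user cancelled; stop waiting */
            return -1;

        if (i < 0 && errno == EINTR)
            continue;           /* interrupted by a signal; retry */

        return i;               /* ready descriptors, or a real error (e.g. EBADF) */
    }
}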

 

>Also, cancelling the run (by hitting ctrl-C in the shell that invoked it) does not seem to work on linux.  I get a message that says "Cancel request sent", but then it continues to finish the job anyway.

Apart from the above mentioned reason, GetQueryResult was also not calling SetCancelConn, as Amit pointed out; now this is also fixed.

Regards,

Dilip Kumar

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 26 September 2014 12:24, Amit Kapila Wrote,

>I don't think this can handle cancel requests properly because
>you are just setting it in GetIdleSlot() what if the cancel
>request came during GetQueryResult() after sending sql for
>all connections (probably that's the reason why Jeff is not able
>to cancel the vacuumdb when using parallel option).

You are right, I have fixed it in the latest patch, please check the latest patch @ (4205E661176A124FAF891E0A6BA9135266363710@szxeml509-mbs.china.huawei.com)

dilip@linux-ltr9:/home/dilip/9.4/install/bin> ./vacuumdb -z -a -j 8 -p 9005
vacuumdb: vacuuming database "db1"
vacuumdb: vacuuming database "postgres"
Cancel request sent
vacuumdb: vacuuming of database "postgres" failed: ERROR: canceling statement due to user request

>Few other points
>1.
>+ vacuum_parallel(const char *dbname, bool full, bool verbose,
>{
>..
>+ connSlot = (ParallelSlot*)pg_malloc(concurrentCons * sizeof(ParallelSlot));
>+ connSlot[0].connection = conn;

Fixed

>a.
>Does the above memory get freed anywhere? If not, isn't it
>a good idea to do the same
>b. For slot 0, you are not setting it with PQsetnonblocking,
>whereas I think it can be used to run commands like any other
>connection.

Yes, this was missing in the code, I have fixed it.

>2.
>+ /*
>+ * If user has given the vacuum of complete db, then if
>+ * any of the object vacuum failed it can be ignored and vacuuming
>+ * of other object can be continued, this is the same behavior as
>+ * vacuuming of complete db is handled without --jobs option
>+ */
>
>s/object/object's

FIXED

>3.
>+ if(!completedb ||
>+ (sqlState && strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) != 0))
>+ {
>+ fprintf(stderr, _("%s: vacuuming of database \"%s\" failed: %s"),
>+ progname, dbname, PQerrorMessage(conn));
>
>Indentation in both places is wrong.  Check other places for
>similar issues.

FIXED

>4.
>+ bool analyze_only, bool freeze, int numAsyncCons,
>
>In the code there are still references to AsyncCons; as decided, let's
>change it to concurrent_connections | conc_cons.

FIXED

Regards,

Dilip

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Simon Riggs
Date:
On 27 September 2014 03:55, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Fri, Sep 26, 2014 at 11:47 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
>>
>> Gavin Flower wrote:
>>
>> > Curious: would it be both feasible and useful to have multiple
>> > workers process a 'large' table, without complicating things too
>> > much?  They could each start at a different position in the file.
>>
>> Feasible: no.  Useful: maybe, we don't really know.  (You could just as
>> well have a worker at double the speed, i.e. double vacuum_cost_limit).
>
>
> Vacuum_cost_delay is already 0 by default.  So unless you changed that,
> vacuum_cost_limit does not take effect under vacuumdb.
>
> It is pretty easy for vacuum to be CPU limited, and even easier for analyze
> to be CPU limited (It does a lot of sorting).  I think analyzing is the main
> use case for this patch, to shorten the pg_upgrade window.  At least, that
> is how I anticipate using it.

I've been trying to review this thread with the thought "what does
this give me?". I am keen to encourage contributions and also keen to
extend our feature set, but I do not wish to complicate our code base.
Dilip's developments do seem to be good quality; what I question is
whether we want this feature.

This patch seems to allow me to run multiple VACUUMs at once. But I
can already do this, with autovacuum.

Is there anything this patch can do that cannot be already done with autovacuum?

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Thu, Oct 16, 2014 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
>
> I've been trying to review this thread with the thought "what does
> this give me?". I am keen to encourage contributions and also keen to
> extend our feature set, but I do not wish to complicate our code base.
> Dilip's developments do seem to be good quality; what I question is
> whether we want this feature.
>
> This patch seems to allow me to run multiple VACUUMs at once. But I
> can already do this, with autovacuum.
>
> Is there anything this patch can do that cannot be already done with autovacuum?

The difference lies in the fact that vacuumdb (or VACUUM) gives
the user the option to control the vacuum activity for cases when
autovacuum doesn't suffice; one example is to perform
vacuum via vacuumdb after pg_upgrade or some other maintenance
activity (as mentioned by Jeff upthread).  So I think in all such cases
having a parallel option can give a benefit in terms of performance, which
is already shown by Dilip upthread by running some tests (with and
without the patch).

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Simon Riggs
Date:
On 16 October 2014 06:05, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, Oct 16, 2014 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>
>>
>> I've been trying to review this thread with the thought "what does
>> this give me?". I am keen to encourage contributions and also keen to
>> extend our feature set, but I do not wish to complicate our code base.
>> Dilip's developments do seem to be good quality; what I question is
>> whether we want this feature.
>>
>> This patch seems to allow me to run multiple VACUUMs at once. But I
>> can already do this, with autovacuum.
>>
>> Is there anything this patch can do that cannot be already done with
>> autovacuum?
>
> The difference lies in the fact that vacuumdb (or VACUUM) gives
> the user the option to control the vacuum activity for cases when
> autovacuum doesn't suffice; one example is to perform
> vacuum via vacuumdb after pg_upgrade or some other maintenance
> activity (as mentioned by Jeff upthread).  So I think in all such cases
> having a parallel option can give a benefit in terms of performance, which
> is already shown by Dilip upthread by running some tests (with and
> without the patch).

Why do we need 2 ways to do the same thing?

Why not ask autovacuum to do this for you?

Just send a message to autovacuum to request an immediate action. Let
it manage the children and the tasks.
  SELECT pg_autovacuum_immediate(nworkers = N, list_of_tables);

Request would allocate an additional N workers and immediately begin
vacuuming the stated tables.

vacuumdb can still issue the request, but the guts of this are done by
the server, not a heavily modified client.

If we are going to heavily modify a client then it needs to be able to
run more than just one thing. Parallel psql would be nice. pg_batch?

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Thu, Oct 16, 2014 at 1:56 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 16 October 2014 06:05, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > On Thu, Oct 16, 2014 at 8:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >>
> >> This patch seems to allow me to run multiple VACUUMs at once. But I
> >> can already do this, with autovacuum.
> >>
> >> Is there anything this patch can do that cannot be already done with
> >> autovacuum?
> >
> > The difference lies in the fact that vacuumdb (or VACUUM) gives
> > the user the option to control the vacuum activity for cases when
> > autovacuum doesn't suffice; one example is to perform
> > vacuum via vacuumdb after pg_upgrade or some other maintenance
> > activity (as mentioned by Jeff upthread).  So I think in all such cases
> > having a parallel option can give a benefit in terms of performance, which
> > is already shown by Dilip upthread by running some tests (with and
> > without the patch).
>
> Why do we need 2 ways to do the same thing?

Don't multiple ways to do the same thing already exist, like via
vacuumdb | VACUUM and autovacuum?

> Why not ask autovacuum to do this for you?
>
> Just send a message to autovacuum to request an immediate action. Let
> it manage the children and the tasks.
>
>    SELECT pg_autovacuum_immediate(nworkers = N, list_of_tables);
>
> Request would allocate an additional N workers and immediately begin
> vacuuming the stated tables.

I think doing anything on the server side can have higher complexity, like:
a. Does this function return immediately after sending the request to
autovacuum? If yes, then the behaviour of this new functionality
will be different as compared to vacuumdb, which the user might not
expect.
b. We need to have some way to handle errors that can occur in
autovacuum (maybe we need to find a way to pass them back to the user);
vacuumdb or VACUUM can report errors to the user.
c. How does the nworkers input relate to autovacuum_max_workers,
which is needed at start for shared memory initialization and in the
calculation of maxbackends?
d. How to handle database-wide vacuum, which is possible via vacuumdb?
e. What is the best UI (a command like above or via config parameters)?

I think we can find a way for the above, and maybe other similar things
need to be taken care of, but still I think it is better that we have this feature
via vacuumdb rather than adding complexity in server code.  Also, the
current algorithm used in the patch was discussed and agreed upon in this
thread, and if now we want to go via some other method (autovacuum),
it might take much more time to build consensus on all the points.


> vacuumdb can still issue the request, but the guts of this are done by
> the server, not a heavily modified client.
>
> If we are going to heavily modify a client then it needs to be able to
> run more than just one thing. Parallel psql would be nice. pg_batch?

Could you be more specific on this point? I am not able to see how the
vacuumdb utility has anything to do with parallel psql.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Simon Riggs
Date:
On 16 October 2014 15:09, Amit Kapila <amit.kapila16@gmail.com> wrote:

>> Just send a message to autovacuum to request an immediate action. Let
>> it manage the children and the tasks.
>>
>>    SELECT pg_autovacuum_immediate(nworkers = N, list_of_tables);
>>
>> Request would allocate an additional N workers and immediately begin
>> vacuuming the stated tables.
>
> I think doing anything on the server side can have higher complexity, like:
> a. Does this function return immediately after sending the request to
> autovacuum? If yes, then the behaviour of this new functionality
> will be different as compared to vacuumdb, which the user might not
> expect.
> b. We need to have some way to handle errors that can occur in
> autovacuum (maybe we need to find a way to pass them back to the user);
> vacuumdb or VACUUM can report errors to the user.
> c. How does the nworkers input relate to autovacuum_max_workers,
> which is needed at start for shared memory initialization and in the
> calculation of maxbackends?
> d. How to handle database-wide vacuum, which is possible via vacuumdb?
> e. What is the best UI (a command like above or via config parameters)?


c) seems like the only issue that needs any thought. I don't think it's
going to be that hard.

I don't see any problems with the other points. You can make a
function wait, if you wish.

> I think we can find a way for the above, and maybe other similar things
> need to be taken care of, but still I think it is better that we have this
> feature via vacuumdb rather than adding complexity in server code.  Also, the
> current algorithm used in the patch was discussed and agreed upon in this
> thread, and if now we want to go via some other method (autovacuum),
> it might take much more time to build consensus on all the points.

Well, I read Alvaro's point from earlier in the thread and agreed with
it. All we really need is an instruction to autovacuum to say "be
aggressive".

Just because somebody added something to the TODO list doesn't make it
a good idea. I apologise to Dilip for saying this, it is not anything
against him, just the idea.

Perhaps we just accept the patch and change AV in the future.


>> vacuumdb can still issue the request, but the guts of this are done by
>> the server, not a heavily modified client.
>>
>> If we are going to heavily modify a client then it needs to be able to
>> run more than just one thing. Parallel psql would be nice. pg_batch?
>
> Could you be more specific on this point? I am not able to see how the
> vacuumdb utility has anything to do with parallel psql.

That's my point. All this code in vacuumdb just for this one isolated
use case? Twice the maintenance burden.

A more generic utility to run commands in parallel would be useful.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Fri, Oct 17, 2014 at 1:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On 16 October 2014 15:09, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > I think doing anything on the server side can have higher complexity, like:
> > a. Does this function return immediately after sending the request to
> > autovacuum? If yes, then the behaviour of this new functionality
> > will be different as compared to vacuumdb, which the user might not
> > expect.
> > b. We need to have some way to handle errors that can occur in
> > autovacuum (maybe we need to find a way to pass them back to the user);
> > vacuumdb or VACUUM can report errors to the user.
> > c. How does the nworkers input relate to autovacuum_max_workers,
> > which is needed at start for shared memory initialization and in the
> > calculation of maxbackends?
> > d. How to handle database-wide vacuum, which is possible via vacuumdb?
> > e. What is the best UI (a command like above or via config parameters)?
>
>
> c) seems like the only issue that needs any thought. I don't think its
> going to be that hard.
>
> I don't see any problems with the other points. You can make a
> function wait, if you wish.

It is quite possible, but still I think to accomplish such a function
we need to have some mechanism where it can inform autovacuum,
and then some changes in autovacuum to receive/read that information
and reply back.  I don't think any such mechanism exists.

> > I think we can find a way for the above, and maybe other similar things
> > need to be taken care of, but still I think it is better that we have this
> > feature via vacuumdb rather than adding complexity in server code.  Also, the
> > current algorithm used in the patch was discussed and agreed upon in this
> > thread, and if now we want to go via some other method (autovacuum),
> > it might take much more time to build consensus on all the points.
>
> Well, I read Alvaro's point from earlier in the thread and agreed with
> it. All we really need is an instruction to autovacuum to say "be
> aggressive".
>
> Just because somebody added something to the TODO list doesn't make it
> a good idea. I apologise to Dilip for saying this, it is not anything
> against him, just the idea.
>
> Perhaps we just accept the patch and change AV in the future.

So shall we move ahead with the review of this patch and make a note
of changing AV in the future (maybe as a TODO)?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Simon Riggs
Date:
On 17 October 2014 12:52, Amit Kapila <amit.kapila16@gmail.com> wrote:

> It is quite possible, but still I think to accomplish such a function
> we need to have some mechanism where it can inform autovacuum,
> and then some changes in autovacuum to receive/read that information
> and reply back.  I don't think any such mechanism exists.

That's exactly how the CHECKPOINT command works, in conjunction with
the checkpointer process.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Amit Kapila wrote:
> On Fri, Oct 17, 2014 at 1:31 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> >
> > On 16 October 2014 15:09, Amit Kapila <amit.kapila16@gmail.com> wrote:

> > c) seems like the only issue that needs any thought. I don't think its
> > going to be that hard.
> >
> > I don't see any problems with the other points. You can make a
> > function wait, if you wish.
> 
> It is quite possible, but still I think to accomplish such a function
> we need to have some mechanism where it can inform autovacuum,
> and then some changes in autovacuum to receive/read that information
> and reply back.  I don't think any such mechanism exists.

You're right, it doesn't.  I think we have plenty more infrastructure
for that than we had when autovacuum was initially developed.  It
shouldn't be that hard.

Of course, this is a task that requires much more thinking, design, and
discussion than just adding multi-process capability to vacuumdb ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Simon Riggs
Date:
On 17 October 2014 14:05, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Of course, this is a task that requires much more thinking, design, and
> discussion than just adding multi-process capability to vacuumdb ...

Yes, please proceed with this patch as originally envisaged. No more
comments from me.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Tue, Oct 7, 2014 at 11:10 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 26 September 2014 12:24, Amit Kapila Wrote,
>
> >I don't think this can handle cancel requests properly because
> >you are just setting it in GetIdleSlot() what if the cancel
> >request came during GetQueryResult() after sending sql for
> >all connections (probably that's the reason why Jeff is not able
> >to cancel the vacuumdb when using parallel option).
>
> You are right, I have fixed it in the latest patch, please check the latest patch @ (4205E661176A124FAF891E0A6BA9135266363710@szxeml509-mbs.china.huawei.com)
>


***************
*** 358,363 **** handle_sigint(SIGNAL_ARGS)
--- 358,364 ----

  /* Send QueryCancel if we are processing a database query */
  if (cancelConn != NULL)
  {
+ inAbort = true;
  if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
  fprintf(stderr, _("Cancel request sent\n"));
  else

Do we need to set the inAbort flag only in case PQcancel is successful?
Basically, if PQcancel fails due to any reason, I think the behaviour
can be undefined, as the executing thread can assume that the cancel is
done.
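
A sketch of the ordering being suggested here (a hypothetical rearrangement
of the quoted hunk, not code from the patch):

/* Send QueryCancel if we are processing a database query */
if (cancelConn != NULL)
{
    if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
    {
        inAbort = true;         /* mark abort only once the request was sent */
        fprintf(stderr, _("Cancel request sent\n"));
    }
    else
        fprintf(stderr, _("Could not send cancel request: %s"), errbuf);
}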

*** 391,396 **** consoleHandler(DWORD dwCtrlType)
--- 392,399 ----
  EnterCriticalSection(&cancelConnLock);
  if (cancelConn != NULL)
  {
+ inAbort = true;

You have set this flag in the case of the windows handler; however, the same
is never used in case of windows. Are you expecting any use of this
flag for windows?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Sat, Oct 25, 2014 at 5:52 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> ***************
> *** 358,363 **** handle_sigint(SIGNAL_ARGS)
> --- 358,364 ----
>
>   /* Send QueryCancel if we are processing a database query */
>   if (cancelConn != NULL)
>   {
> + inAbort = true;
>   if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
>   fprintf(stderr, _("Cancel request sent\n"));
>   else
>
> Do we need to set the inAbort flag only in case PQcancel is successful?
> Basically, if PQcancel fails due to any reason, I think the behaviour
> can be undefined, as the executing thread can assume that the cancel is
> done.
>
> *** 391,396 **** consoleHandler(DWORD dwCtrlType)
> --- 392,399 ----
>   EnterCriticalSection(&cancelConnLock);
>   if (cancelConn != NULL)
>   {
> + inAbort = true;
> +
>
> You have set this flag in the case of the windows handler; however, the same
> is never used in case of windows. Are you expecting any use of this
> flag for windows?

Going further with verification of this patch, I found the below issue:
Run the testcase.sql file at the below link:
http://www.postgresql.org/message-id/4205E661176A124FAF891E0A6BA9135266347F25@szxeml509-mbs.china.huawei.com
./vacuumdb --analyze-in-stages -j 8 -d postgres
Generating minimal optimizer statistics (1 target)
Segmentation fault

Server Log:
ERROR:  syntax error at or near "minimal" at character 12
STATEMENT:  ANALYZE ng minimal optimizer statistics (1 target)
LOG:  could not receive data from client: Connection reset by peer


Fixed the below issues and attached an updated patch with this mail:

1.
make check for docs gives below errors:
{ \
  echo "<!ENTITY version \"9.5devel\">"; \
  echo "<!ENTITY majorversion \"9.5\">"; \
} > version.sgml
'/usr/bin/perl' ./mk_feature_tables.pl YES ../../../src/backend/catalog/sql_feature_packages.txt 
../../../src/backend/catalog/sql_features.txt > features-supported.sgml
'/usr/bin/perl' ./mk_feature_tables.pl NO ../../../src/backend/catalog/sql_feature_packages.txt 
../../../src/backend/catalog/sql_features.txt > features-unsupported.sgml
'/usr/bin/perl' ./generate-errcodes-table.pl ../../../src/backend/utils/errcodes.txt > errcodes-table.sgml
onsgmls -wall -wno-unused-param -wno-empty -wfully-tagged -D . -D . -s postgres.sgml
onsgmls:ref/vacuumdb.sgml:224:15:E: end tag for "LISTITEM" omitted, but OMITTAG NO was specified
onsgmls:ref/vacuumdb.sgml:209:8: start tag was here
onsgmls:ref/vacuumdb.sgml:224:15:E: end tag for "VARLISTENTRY" omitted, but OMITTAG NO was specified
onsgmls:ref/vacuumdb.sgml:206:5: start tag was here
onsgmls:ref/vacuumdb.sgml:224:15:E: end tag for "VARIABLELIST" omitted, but OMITTAG NO was specified
onsgmls:ref/vacuumdb.sgml:79:4: start tag was here
onsgmls:ref/vacuumdb.sgml:225:18:E: end tag for element "LISTITEM" which is not open
onsgmls:ref/vacuumdb.sgml:226:21:E: end tag for element "VARLISTENTRY" which is not open
onsgmls:ref/vacuumdb.sgml:228:18:E: document type does not allow element "VARLISTENTRY" here; assuming 
missing "VARIABLELIST" start-tag
onsgmls:ref/vacuumdb.sgml:260:9:E: end tag for element "PARA" which is not open
make: *** [check] Error 1

Fixed.

2. 
Below multi-line comment is wrong:
/* Otherwise, we got a stage from vacuum_all_databases(), so run
* only that one. */

Fixed.

3.
fprintf(stderr, _("%s: vacuuming of database \"%s\" failed: %s"),
progname, dbname, PQerrorMessage(conn));
indentation of fprintf is not proper.

Fixed.

4.
/* This can only happen if user has sent the cacle request using
 * Ctrl+C, Cancle is handled by 0th slot, so fetch the error result
 */
spelling of cancel is wrong and multi-line comment is not proper.

Fixed

5.
/* This can only happen if user has sent the cacle request using
 * Ctrl+C, Cancle is handled by 0th slot, so fetch the error result
*/

GetQueryResult(pSlot[0].connection, dbname, progname,   
completedb);

indentation of completedb parameter is wrong.

Fixed.

6.
/*
 * vacuum_parallel
 * This function will open the multiple concurrent connections as
 * suggested by used, it will derive the table list using server call
 * if table list is not given by user and perform the vacuum call
 */

s/used/user

Fixed.

In general, I think you can once go through all the comments
and code to see if similar issues exist at other places as well.

I have done some performance test with the patch, data for which is
as below:
Performance Data
------------------------------
IBM POWER-7 16 cores, 64 hardware threads
RAM = 64GB
max_connections = 128
checkpoint_segments=256
checkpoint_timeout    =15min
shared_buffers = 1GB

Before each test, run the testcase.sql file at the below link:
http://www.postgresql.org/message-id/4205E661176A124FAF891E0A6BA9135266347F25@szxeml509-mbs.china.huawei.com

Un-patched -
time ./vacuumdb -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -d postgres
real 0m2.454s
user 0m0.002s
sys 0m0.006s

Patched -
time ./vacuumdb -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 4 -d postgres

real 0m1.691s
user 0m0.001s
sys 0m0.004s

time ./vacuumdb -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres

real 0m1.496s
user 0m0.002s
sys 0m0.004s


The above data indicates that the patch improves performance when used
with a larger number of concurrent connections.  I have done similar tests
for the whole database as well, and the results are quite similar to the above.

I think you can run the performance test for the --analyze-in-stages
option as well, after fixing the issue reported above in this mail.
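
For instance, the timing could be captured the same way as the runs above
(using the same tables t1..t8 from the earlier examples):

time ./vacuumdb --analyze-in-stages -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres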


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 25 October 2014 17:52, Amit Kapila Wrote,

>***************
>*** 358,363 **** handle_sigint(SIGNAL_ARGS)
>--- 358,364 ----
>
>  /* Send QueryCancel if we are processing a database query */
>  if (cancelConn != NULL)
>  {
>+ inAbort = true;
>  if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
>  fprintf(stderr, _("Cancel request sent\n"));
>  else
>
>Do we need to set the inAbort flag only in case PQcancel is successful?
>Basically, if PQcancel fails due to any reason, I think the behaviour
>can be undefined, as the executing thread can assume that the cancel is
>done.
>
>*** 391,396 **** consoleHandler(DWORD dwCtrlType)
>--- 392,399 ----
>                             EnterCriticalSection(&cancelConnLock);
>                             if (cancelConn != NULL)
>                             {
>+                                           inAbort = true;
>+

In the "handle_sigint" function, if we are going to cancel the query, at that time I am setting the flag inAbort (even when it succeeds), so that in the "select_loop" function,
if select(maxFd + 1, workerset, NULL, NULL, &tv); comes out, we can know whether it came out because of a cancel query and handle it accordingly.

                i = select(maxFd + 1, workerset, NULL, NULL, NULL);

                if (in_abort())  // loop break because of cancel query, so return fail
                {
                        return -1;
                }

                if (i < 0 && errno == EINTR)
                        continue;

Regards,
Dilip Kumar

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Tue, Oct 28, 2014 at 9:03 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 25 October 2014 17:52, Amit Kapila Wrote,
>
> >***************
> >*** 358,363 **** handle_sigint(SIGNAL_ARGS)
> >--- 358,364 ----
> >
> >  /* Send QueryCancel if we are processing a database query */
> >  if (cancelConn != NULL)
> >  {
> >+ inAbort = true;
> >  if (PQcancel(cancelConn, errbuf, sizeof(errbuf)))
> >  fprintf(stderr, _("Cancel request sent\n"));
> >  else
> >
> >Do we need to set the inAbort flag only in case PQcancel is successful?
> >Basically, if PQcancel fails due to any reason, I think the behaviour
> >can be undefined, as the executing thread can assume that the cancel is
> >done.
> >
> >*** 391,396 **** consoleHandler(DWORD dwCtrlType)
> >--- 392,399 ----
> >                             EnterCriticalSection(&cancelConnLock);
> >                             if (cancelConn != NULL)
> >                             {
> >+                                           inAbort = true;
> >+
>
> In the "handle_sigint" function, if we are going to cancel the query, at that time I am setting the flag inAbort (even when it succeeds), so that in the "select_loop" function,
>

I am worried about the case where, after setting the inAbort flag,
PQcancel() fails (returns an error).

> If select(maxFd + 1, workerset, NULL, NULL, &tv); comes out, we can know whether it came out because of a cancel query and handle it accordingly.
>

Yeah, it is fine for the case when PQcancel() is successful, but what
if it fails?
I think even if select comes out due to any other reason, it will behave
as if it came out due to a cancel, even though the cancel actually failed;
how are you planning to handle that case?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 28 October 2014 09:18, Amit Kapila Wrote,

>I am worried about the case where, after setting the inAbort flag,
>PQcancel() fails (returns an error).
>
>> If select(maxFd + 1, workerset, NULL, NULL, &tv); comes out, we can know whether it came out because of a cancel query and handle it accordingly.
>>
>
>Yeah, it is fine for the case when PQcancel() is successful, but what
>if it fails?
>I think even if select comes out due to any other reason, it will behave
>as if it came out due to a cancel, even though the cancel actually failed;
>how are you planning to handle that case?

I think if PQcancel fails then also there is no problem, because we are setting the inAbort flag in the handle_sigint handler, which means the user has tried to terminate.

So in this case as well we will find that inAbort is set, so we return an error, and in the error case we finally call DisconnectDatabase, and in this function we will send the PQcancel for all active connections and then only we disconnect.

Regards,
Dilip

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Tue, Oct 28, 2014 at 9:27 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 28 October 2014 09:18, Amit Kapila Wrote,
>
> >I am worried about the case if after setting the inAbort flag,
> >PQCancel() fails (returns error).
> >
> >> If select(maxFd + 1, workerset, NULL, NULL, &tv); come out, we can know whether it came out because of cancel query and handle it accordingly.
> >>
> >
> >Yeah, it is fine for the case when PQCancel() is successful, but what
> >if it fails?
> >I think that even if select comes out for some other reason, the code will
> >behave as if it came out due to a cancel, even though the cancel actually
> >failed. How are you planning to handle that case?
>
> I think if PQcancel fails then also there is no problem, because we set the inAbort flag in the handle_sigint handler, which means the user has tried to terminate.
>

Yeah, the user has tried to terminate; however, the utility will emit the
message "Could not send cancel request" in such a case and still
silently try to cancel and disconnect all connections.
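
For reference, a minimal sketch of the mechanism under discussion (inAbort and handle_sigint are the patch's names; the cancel-handle bookkeeping is illustrative):

#include <signal.h>
#include <stdio.h>
#include "libpq-fe.h"

static volatile sig_atomic_t inAbort = 0;		/* set once the user hits Ctrl+C */
static PGcancel *volatile cancelConn = NULL;	/* obtained earlier via PQgetCancel() */

static void
handle_sigint(int signum)
{
	char		errbuf[256];

	inAbort = 1;				/* remember the user asked to terminate */

	/* Try to cancel the in-flight query; this can legitimately fail. */
	if (cancelConn != NULL &&
		!PQcancel(cancelConn, errbuf, sizeof(errbuf)))
		fprintf(stderr, "Could not send cancel request: %s", errbuf);
}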

One other related point is that I think the cancel handling mechanism
is still not completely right: the code does it only when there are not
enough free slots, but otherwise it won't handle the cancel request.
Basically I am referring to the below part of the code:

run_parallel_vacuum()
{
..
	for (cell = tables->head; cell; cell = cell->next)
	{
		/*
		 * This will give the free connection slot; if no slot is free it
		 * will wait for at least one slot to get free.
		 */
		free_slot = GetIdleSlot(connSlot, max_slot, dbname, progname,
								completedb);
		if (free_slot == NO_SLOT)
		{
			error = true;
			goto fail;
		}

		prepare_command(connSlot[free_slot].connection, full, verbose,
						and_analyze, analyze_only, freeze, &sql);
		appendPQExpBuffer(&sql, " %s", cell->val);

		connSlot[free_slot].isFree = false;
		slotconn = connSlot[free_slot].connection;
		PQsendQuery(slotconn, sql.data);
		resetPQExpBuffer(&sql);
	}
..
}

I am wondering if it would be better to set the cancel connection (SetCancelConn) in the above loop as well.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Mon, Oct 27, 2014 at 5:26 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> Going further with verification of this patch, I found below issue:
> Run the testcase.sql file at below link:
> http://www.postgresql.org/message-id/4205E661176A124FAF891E0A6BA9135266347F25@szxeml509-mbs.china.huawei.com
> ./vacuumdb --analyze-in-stages -j 8 -d postgres
> Generating minimal optimizer statistics (1 target)
> Segmentation fault
>
> Server Log:
> ERROR:  syntax error at or near "minimal" at character 12
> STATEMENT:  ANALYZE ng minimal optimizer statistics (1 target)
> LOG:  could not receive data from client: Connection reset by peer
>

As you mentioned offlist that you are not able to reproduce this
issue, I have tried again, and what I observe is that I can
reproduce it only on a *release* build, and some cases work without
this issue as well,
example:
./vacuumdb --analyze-in-stages -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

So to me it looks like a timing issue; also please notice how the
statement in the error looks: "ANALYZE ng minimal optimizer statistics
(1 target)".  That is not a valid statement.

Let me know if you still cannot reproduce it.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
<div class="WordSection1"><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">On</span><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"></span><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">13November 2014 15:35 Amit Kapila Wrote, </span><p
class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><p class="MsoNormal">>As
mentionedby you offlist that you are not able reproduce this<p class="MsoNormal">>issue, I have tried again and what
Iobserve is that I am able to<p class="MsoNormal">>reproduce it only on *release* build and some cases work
without<pclass="MsoNormal">>this issue as well,<p class="MsoNormal">>example:<p class="MsoNormal">>./vacuumdb
--analyze-in-stages-t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres<p class="MsoNormal">>Generating
minimaloptimizer statistics (1 target)<p class="MsoNormal">>Generating medium optimizer statistics (10 targets)<p
class="MsoNormal">>Generatingdefault (full) optimizer statistics<p class="MsoNormal"><br clear="all" /><p
class="MsoNormal">>Soto me, it looks like this is a timing issue and please notice<p class="MsoNormal">>why in
errorthe statement looks like<p class="MsoNormal">>"ANALYZE ng minimal optimizer statistics (1 target)".  I think
this<pclass="MsoNormal" style="margin-bottom:12.0pt">>is not a valid statement.<p class="MsoNormal">>Let me know
ifyou still could not reproduce it.<p class="MsoNormal"> <p class="MsoNormal">Thank you for looking into it once
again..<pclass="MsoNormal"> <p class="MsoNormal">I have tried with the release mode, but could not reproduce the same.
Bylooking at server LOG sent by you “"ANALYZE ng minimal optimizer statistics (1 target)". ”, seems like some
corruption.<pclass="MsoNormal"> <p class="MsoNormal">So actually looks like two issues here.<p class="MsoListParagraph"
style="text-indent:-18.0pt;mso-list:l0level1 lfo1"><span style="mso-list:Ignore">1.<span style="font:7.0pt "Times New
Roman"">     </span></span>Query string sent to server is getting corrupted.<p class="MsoListParagraph"
style="text-indent:-18.0pt;mso-list:l0level1 lfo1"><span style="mso-list:Ignore">2.<span style="font:7.0pt "Times New
Roman"">     </span></span>Client is getting crashed.<p class="MsoNormal"> <p class="MsoNormal">I will review the code
andtry to find the same, meanwhile if you can find some time to debug this, it will be really helpful.<p
class="MsoNormal"> <pclass="MsoNormal"> <p class="MsoNormal">Regards,<p class="MsoNormal">Dilip</div> 

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Mon, Nov 17, 2014 at 8:55 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 13 November 2014 15:35 Amit Kapila Wrote,
>
> >As you mentioned offlist that you are not able to reproduce this
> >issue, I have tried again, and what I observe is that I can
> >reproduce it only on a *release* build, and some cases work without
> >this issue as well,
> >example:
> >./vacuumdb --analyze-in-stages -t t1 -t t2 -t t3 -t t4 -t t5 -t t6 -t t7 -t t8 -j 8 -d postgres
> >Generating minimal optimizer statistics (1 target)
> >Generating medium optimizer statistics (10 targets)
> >Generating default (full) optimizer statistics
> >
> >So to me it looks like a timing issue; also please notice how the
> >statement in the error looks: "ANALYZE ng minimal optimizer statistics
> >(1 target)".  That is not a valid statement.
> >Let me know if you still cannot reproduce it.
>
> I will review the code and try to find the cause; meanwhile if you can find some time to debug this, it will be really helpful.
>

I think I have found the problem and the fix for the same.

The stack trace of the crash is as below:

#0  0x00000080108cf3a4 in .strlen () from /lib64/libc.so.6
#1  0x00000080108925bc in ._IO_vfprintf () from /lib64/libc.so.6
#2  0x00000080108bc1e0 in .__GL__IO_vsnprintf_vsnprintf () from /lib64/libc.so.6
#3  0x00000fff7e581590 in .appendPQExpBufferVA () from /data/akapila/workspace/master/installation/lib/libpq.so.5
#4  0x00000fff7e581774 in .appendPQExpBuffer () from /data/akapila/workspace/master/installation/lib/libpq.so.5
#5  0x0000000010003748 in .run_parallel_vacuum ()
#6  0x0000000010003f60 in .vacuum_parallel ()
#7  0x0000000010002ae4 in .main ()
(gdb) f 5
#5  0x0000000010003748 in .run_parallel_vacuum ()

So the real reason here is that the list of tables passed to
the function is corrupted.  The below code seems to be the culprit:

vacuum_parallel()
{
..
	if (!tables || !tables->head)
	{
		SimpleStringList dbtables = {NULL, NULL};
		...
		..
		tables = &dbtables;
	}
..
}

In the above code dbtables is local to the if block, and the code takes
its address and assigns it to tables, which is used outside the if
block's scope; moving the declaration to the outer scope fixes the
problem in my environment.  Find the updated patch that fixes this
problem attached with this mail.  Let me know your opinion about the same.
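
As a self-contained illustration of this bug class (hypothetical names, not the patch code):

#include <stdio.h>

typedef struct SimpleList
{
	const char *head;
} SimpleList;

int
main(void)
{
	SimpleList *tables = NULL;
	SimpleList	dbtables = {NULL};	/* declared in the outer scope: the fix */

	if (!tables || !tables->head)
	{
		/*
		 * Had dbtables been declared inside this block, &dbtables would
		 * dangle as soon as the block ends, which is exactly the kind of
		 * corruption described above.
		 */
		dbtables.head = "t1";
		tables = &dbtables;
	}

	printf("first table: %s\n", tables->head);
	return 0;
}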

While looking at this problem, I have noticed a couple of other
improvements:

a. In the prepare_command() function, the patch does an init of the sql
buffer (initPQExpBuffer(sql);) which I think is not required, as both
places from which this function is called already do it.  I think this
will lead to a memory leak.

b. In the prepare_command() function, for fixed strings you can
use appendPQExpBufferStr(), which is what is used in the original code
as well.

c. 
run_parallel_vacuum()
{
..
prepare_command(connSlot[free_slot].connection, full, verbose,
and_analyze, analyze_only, freeze, &sql);

appendPQExpBuffer(&sql, " %s", cell->val);
..
}

I think it is better to end the command with ';' by using
appendPQExpBufferStr(&sql, ";"); in the above code.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 23 November 2014 14:45, Amit Kapila Wrote

Thanks a lot for debugging and fixing the issue..

>The stack trace of the crash is as below:
>
>#0  0x00000080108cf3a4 in .strlen () from /lib64/libc.so.6
>#1  0x00000080108925bc in ._IO_vfprintf () from /lib64/libc.so.6
>#2  0x00000080108bc1e0 in .__GL__IO_vsnprintf_vsnprintf () from /lib64/libc.so.6
>#3  0x00000fff7e581590 in .appendPQExpBufferVA () from /data/akapila/workspace/master/installation/lib/libpq.so.5
>#4  0x00000fff7e581774 in .appendPQExpBuffer () from /data/akapila/workspace/master/installation/lib/libpq.so.5
>#5  0x0000000010003748 in .run_parallel_vacuum ()
>#6  0x0000000010003f60 in .vacuum_parallel ()
>#7  0x0000000010002ae4 in .main ()
>(gdb) f 5
>#5  0x0000000010003748 in .run_parallel_vacuum ()
>
>So the real reason here is that the list of tables passed to
>the function is corrupted.  The below code seems to be the culprit:
>
>vacuum_parallel()
>{
>..
>	if (!tables || !tables->head)
>	{
>		SimpleStringList dbtables = {NULL, NULL};
>		...
>		..
>		tables = &dbtables;
>	}
>..
>}
>
>In the above code dbtables is local to the if block, and the code takes
>its address and assigns it to tables, which is used outside the if
>block's scope; moving the declaration to the outer scope fixes the
>problem in my environment.  Find the updated patch that fixes this
>problem attached with this mail.  Let me know your opinion about the same.

Yes, that is the reason for the corruption; this must be causing both the issues: sending a corrupted query to the server as well as the crash at the client side.

>While looking at this problem, I have noticed a couple of other
>improvements:
>
>a. In the prepare_command() function, the patch does an init of the sql
>buffer (initPQExpBuffer(sql);) which I think is not required, as both
>places from which this function is called already do it.  I think this
>will lead to a memory leak.

Fixed..

>b. In the prepare_command() function, for fixed strings you can
>use appendPQExpBufferStr(), which is what is used in the original code
>as well.

Changed as per comment..

>c.
>run_parallel_vacuum()
>{
>..
>prepare_command(connSlot[free_slot].connection, full, verbose,
>and_analyze, analyze_only, freeze, &sql);
>
>appendPQExpBuffer(&sql, " %s", cell->val);
>..
>}
>
>I think it is better to end the command with ';' by using
>appendPQExpBufferStr(&sql, ";"); in the above code.

Done

Latest patch is attached, please have a look.

Regards,
Dilip Kumar
Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Mon, Nov 24, 2014 at 7:34 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 23 November 2014 14:45, Amit Kapila Wrote
>
>  
>
> Thanks a lot for debugging and fixing the issue..
>
>  
>
> Latest patch is attached, please have a look.
>

I think some of the comments given upthread are still not handled:

a.  About cancel handling
b.  Setting of inAbort flag for the case where PQCancel is successful
c.  Performance data of the --analyze-in-stages switch
d.  Have one pass over the comments in the patch.  I can still see some
wrongly formatted multiline comments.  Refer below:
+  /* Otherwise, we got a stage from vacuum_all_databases(), so run
+   * only that one. */

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 24 November 2014 11:29, Amit Kapila Wrote,

>I think some of the comments given upthread are still not handled:
>a.  About cancel handling

Your actual comment was:

>One other related point is that I think the cancel handling mechanism
>is still not completely right: the code does it only when there are not
>enough free slots, but otherwise it won't handle the cancel request.
>Basically I am referring to the below part of the code:

run_parallel_vacuum()
{
..
	for (cell = tables->head; cell; cell = cell->next)
	{
	..
		free_slot = GetIdleSlot(connSlot, max_slot, dbname, progname,
								completedb);

		PQsendQuery(slotconn, sql.data);
		resetPQExpBuffer(&sql);
	}

1.	I think the connection will block only when it is going for a slot wait (GetIdleSlot) or while in GetQueryResult, so we handle the cancel request in both those places.

2.	Now for the case (as you mentioned) when there are enough slots and the above for loop is running: if the user does Ctrl+C then this will not break. I have handled this by checking the inAbort flag inside the for loop before sending a new command. I think we cannot do this by setting the cancel connection, because only when the query receives some data will it realize that it was canceled and fail; until the connection receives data it will not see the failure. So I have handled inAbort directly, as sketched below.
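
A minimal sketch of that check in the dispatch loop (the surrounding statements mirror the patch excerpt quoted above; the inAbort test at the top is the addition being described):

	for (cell = tables->head; cell; cell = cell->next)
	{
		/* Ctrl+C may have arrived while earlier commands were running. */
		if (inAbort)
		{
			error = true;
			goto fail;
		}

		free_slot = GetIdleSlot(connSlot, max_slot, dbname, progname,
								completedb);
		if (free_slot == NO_SLOT)
		{
			error = true;
			goto fail;
		}

		prepare_command(connSlot[free_slot].connection, full, verbose,
						and_analyze, analyze_only, freeze, &sql);
		appendPQExpBuffer(&sql, " %s", cell->val);

		connSlot[free_slot].isFree = false;
		PQsendQuery(connSlot[free_slot].connection, sql.data);
		resetPQExpBuffer(&sql);
	}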



 

>b.  Setting of inAbort flag for the case where PQCancel is successful

Your actual comment was:

>Yeah, the user has tried to terminate; however, the utility will emit the
>message "Could not send cancel request" in such a case and still
>silently try to cancel and disconnect all connections.

You are right, I have fixed the code; now in case of failure we do not set the inAbort flag..

>c.  Performance data of the --analyze-in-stages switch

Performance Data
------------------------------
CPU: 8 cores
RAM: 16GB
checkpoint_segments=256

Before each test, run the test.sql (attached)

Un-patched -
dilip@linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb  -p 9005  --analyze-in-stages  -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real    0m0.843s
user    0m0.000s
sys     0m0.000s

Patched
dilip@linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb  -p 9005  --analyze-in-stages  -j 2 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real    0m0.593s
user    0m0.004s
sys     0m0.004s

dilip@linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb  -p 9005  --analyze-in-stages  -j 4 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real    0m0.421s
user    0m0.004s
sys     0m0.004s

I think with 2 connections we can get a 30% improvement.

>d.  Have one pass over the comments in the patch.  I can still see some
>wrongly formatted multiline comments.  Refer below:
>+  /* Otherwise, we got a stage from vacuum_all_databases(), so run
>+   * only that one. */

Checked all, and fixed..

While testing, I found one more behavior difference compared to the base code:

Base Code:
dilip@linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb  -p 9005 -t "t1" -t "t2" -t "t3" -t "t4" --analyze-in-stages  -d Postgres

Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real    0m0.605s
user    0m0.004s
sys     0m0.000s

I think it should be like:
SET default_statistics_target=1; do it for all the tables
SET default_statistics_target=10; do it for all the tables, and so on..

With Patched
dilip@linux-ltr9:/home/dilip/9.4/install/bin> time ./vacuumdb  -p 9005 -t "t1" -t "t2" -t "t3" -t "t4" --analyze-in-stages -j 2 -d postgres
Generating minimal optimizer statistics (1 target)
Generating medium optimizer statistics (10 targets)
Generating default (full) optimizer statistics

real    0m0.395s
user    0m0.000s
sys     0m0.004s

Here we are setting each target once and doing it for all the tables..

Please provide your opinion.

Regards,
Dilip Kumar

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:

On Mon, Dec 1, 2014 at 12:18 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 24 November 2014 11:29, Amit Kapila Wrote,
>

I have verified that all the previous comments are addressed and
the new version is much better than the previous one.

>
> Here we are setting each target once and doing it for all the tables..
>

Hmm, theoretically I think the new behaviour could lead to more I/O in
certain cases as compared to the existing behaviour.  The reason for more I/O
is that in the new behaviour, while doing Analyze for a particular table at
different targets, in between it has Analyze of a different table as well,
so the pages in shared buffers or the OS cache for a particular table need to
be reloaded again for a new target, whereas currently it will do all stages
of Analyze for a particular table in one go, which means that each stage
of Analyze could benefit from the pages of a table loaded by the previous
stage.  If you agree, then we should try to avoid this change in the new
behaviour.
>
> Please provide your opinion.

I have a few questions regarding the function GetIdleSlot()

+ static int
+ GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
+             const char *progname, bool completedb)
{
..
+     /*
+      * Some of the slot are free, Process the results for slots whichever
+      * are free
+      */
+     do
+     {
+         SetCancelConn(pSlot[0].connection);
+         i = select_loop(maxFd, &slotset);
+         ResetCancelConn();
+         if (i < 0)
+         {
+             /*
+              * This can only happen if user has sent the cancel request using
+              * Ctrl+C, Cancel is handled by 0th slot, so fetch the error result.
+              */
+             GetQueryResult(pSlot[0].connection, dbname, progname,
+                            completedb);
+             return NO_SLOT;
+         }
+         Assert(i != 0);
+         for (i = 0; i < max_slot; i++)
+         {
+             if (!FD_ISSET(pSlot[i].sock, &slotset))
+                 continue;
+             PQconsumeInput(pSlot[i].connection);
+             if (PQisBusy(pSlot[i].connection))
+                 continue;
+             pSlot[i].isFree = true;
+             if (!GetQueryResult(pSlot[i].connection, dbname, progname,
+                                 completedb))
+                 return NO_SLOT;
+             if (firstFree < 0)
+                 firstFree = i;
+         }
+     } while (firstFree < 0);
}

I wanted to understand what exactly the above loop is doing.

a.
first of all, the comment on top of it says "Some of the slot
are free, ...": if some slot is free, then why do you want
to process the results? (Do you mean to say that *none* of
the slots is free...?)

b.
IIUC, you have called the function select_loop(maxFd, &slotset)
to check if a socket descriptor is readable; if yes, then why
is the same maxFd always checked in the do..while loop? Don't
you want to check different socket descriptors?  I am not sure
if I am missing something here.

c.
After checking the socket descriptors, why do you want
to run the below for loop for all slots?
for (i = 0; i < max_slot; i++)


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Michael Paquier
Date:
On Sat, Dec 6, 2014 at 9:01 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> If you agree, then we should try to avoid this change in the new behaviour.
Still seeing many concerns about this patch, so marking it as returned
with feedback. If possible, switching it to the next CF would be fine,
I guess, as this work is still ongoing.
-- 
Michael



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On 06 December 2014 20:01 Amit Kapila Wrote

>I wanted to understand what exactly the above loop is doing.
>
>a.
>first of all, the comment on top of it says "Some of the slot
>are free, ...": if some slot is free, then why do you want
>to process the results? (Do you mean to say that *none* of
>the slots is free...?)

This comment is wrong, I will remove it.

>b.
>IIUC, you have called the function select_loop(maxFd, &slotset)
>to check if a socket descriptor is readable; if yes, then why
>is the same maxFd always checked in the do..while loop? Don't
>you want to check different socket descriptors?  I am not sure
>if I am missing something here.

select_loop(maxFd, &slotset)

maxFd is the highest descriptor among all the sets, and slotset contains all the descriptors, so if any of the descriptors gets some message, select_loop will come out; once the select loop comes out, we need to check how many descriptors have got a message from the server, so we loop over them and process the results.

So it's not only for maxFd, it's for all the descriptors. And it's in a do..while loop because it is possible that select_loop comes out due to some intermediate message on any of the sockets while the query is still not complete; if none of the sockets is free yet (which we check in the for loop below), we go to select_loop again.

>c.
>After checking the socket descriptors, why do you want
>to run the below for loop for all slots?
>for (i = 0; i < max_slot; i++)

After the select loop comes out, it is possible that we have got results on multiple connections, so for each one we consume the input and check whether it is still busy; if so, there is nothing to do, but if it has finished we process the result and mark the connection free. And if any of the connections is free, we break out of the do..while loop.
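
For reference, a minimal self-contained sketch of this consume-and-check pattern (the slot bookkeeping is illustrative; PQsocket, PQconsumeInput and PQisBusy are the actual libpq calls):

#include <sys/select.h>
#include "libpq-fe.h"

/* Block until at least one of the nconn busy connections finishes its query. */
static void
wait_for_any_result(PGconn **conns, int nconn)
{
	for (;;)
	{
		fd_set		slotset;
		int			i;
		int			maxFd = -1;

		FD_ZERO(&slotset);
		for (i = 0; i < nconn; i++)
		{
			int			sock = PQsocket(conns[i]);

			FD_SET(sock, &slotset);
			if (sock > maxFd)
				maxFd = sock;
		}

		/* Comes out as soon as any descriptor has data to read. */
		if (select(maxFd + 1, &slotset, NULL, NULL, NULL) < 0)
			return;				/* real code must inspect errno here */

		for (i = 0; i < nconn; i++)
		{
			if (!FD_ISSET(PQsocket(conns[i]), &slotset))
				continue;
			PQconsumeInput(conns[i]);	/* real code must check for failure */
			if (!PQisBusy(conns[i]))
				return;			/* a complete result is now available */
		}
		/* Only intermediate data arrived; none finished, so select again. */
	}
}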



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Mon, Dec 8, 2014 at 7:33 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> On 06 December 2014 20:01 Amit Kapila Wrote
>
> >I wanted to understand what exactly the above loop is doing.
> >
> >a.
> >first of all, the comment on top of it says "Some of the slot
> >are free, ...": if some slot is free, then why do you want
> >to process the results? (Do you mean to say that *none* of
> >the slots is free...?)
>
> This comment is wrong, I will remove it.
>

I suggest rather than removing it, edit the comment to indicate
the idea behind the code at that place.

> >b.
> >IIUC, you have called the function select_loop(maxFd, &slotset)
> >to check if a socket descriptor is readable; if yes, then why
> >is the same maxFd always checked in the do..while loop? Don't
> >you want to check different socket descriptors?  I am not sure
> >if I am missing something here.
>
> select_loop(maxFd, &slotset)
>
> So it's not only for maxFd, it's for all the descriptors. And it's in a do..while loop because it is possible that select_loop comes out due to some intermediate message on any of the sockets while the query is still not complete,
>

Okay, I think this part of the code is somewhat similar to what
is done in pg_dump/parallel.c, with some differences related
to the handling of inAbort.  One thing I have noticed here that
could lead to a problem is that the caller of the select_loop() function
assumes that the return value is less than zero only if there is a
cancel request, which I think is wrong, because the select system
call can also return -1 in case of an error.  I am referring to the below
code in the above context:

+ if (i < 0)
+ {
+ /*
+  * This can only happen if user has sent the cancel request using
+  * Ctrl+C, Cancel is handled by 0th slot, so fetch the error result.
+  */
+
+ GetQueryResult(pSlot[0].connection, dbname, progname,
+    completedb);



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:

On December 2014 17:31 Amit Kapila Wrote,

>I suggest rather than removing it, edit the comment to indicate
>the idea behind the code at that place.

Done

>Okay, I think this part of the code is somewhat similar to what
>is done in pg_dump/parallel.c, with some differences related
>to the handling of inAbort.  One thing I have noticed here that
>could lead to a problem is that the caller of the select_loop() function
>assumes that the return value is less than zero only if there is a
>cancel request, which I think is wrong, because the select system
>call can also return -1 in case of an error.  I am referring to the below
>code in the above context:
>
>+ if (i < 0)
>+ {
>+ /*
>+  * This can only happen if user has sent the cancel request using
>+  * Ctrl+C, Cancel is handled by 0th slot, so fetch the error result.
>+  */
>+
>+ GetQueryResult(pSlot[0].connection, dbname, progname,
>+    completedb);

Now for the abort case I am using a special error code, and other than that case we will assert; this behavior is the same as in pg_dump.

>Hmm, theoretically I think the new behaviour could lead to more I/O in
>certain cases as compared to the existing behaviour.  The reason for more I/O
>is that in the new behaviour, while doing Analyze for a particular table at
>different targets, in between it has Analyze of a different table as well,
>so the pages in shared buffers or the OS cache for a particular table need to
>be reloaded again for a new target, whereas currently it will do all stages
>of Analyze for a particular table in one go, which means that each stage
>of Analyze could benefit from the pages of a table loaded by the previous
>stage.  If you agree, then we should try to avoid this change in the new
>behaviour.

I will work on this comment and post the updated patch..

I will move this patch to the latest commitfest.

Regards,
Dilip

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Mon, Dec 15, 2014 at 4:18 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On December 2014 17:31 Amit Kapila Wrote,
>
> >Hmm, theoretically I think the new behaviour could lead to more I/O in
> >certain cases as compared to the existing behaviour.  The reason for more I/O
> >is that in the new behaviour, while doing Analyze for a particular table at
> >different targets, in between it has Analyze of a different table as well,
> >so the pages in shared buffers or the OS cache for a particular table need to
> >be reloaded again for a new target, whereas currently it will do all stages
> >of Analyze for a particular table in one go, which means that each stage
> >of Analyze could benefit from the pages of a table loaded by the previous
> >stage.  If you agree, then we should try to avoid this change in the new
> >behaviour.
>
> I will work on this comment and post the updated patch..
>

One idea is to send all the stages and corresponding Analyze commands
to server in one go which means something like
"BEGIN; SET default_statistics_target=1; SET vacuum_cost_delay=0;
  Analyze t1; COMMIT;" 
"BEGIN; SET default_statistics_target=10; RESET vacuum_cost_delay;
  Analyze t1; COMMIT;"
"BEGIN; RESET default_statistics_target;
 Analyze t1; COMMIT;"

Now, still parallel operations in other backends could lead to
page misses, but I think the impact will be minimized.
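
As a sketch of how a client might issue those per-stage batches on one connection (the stage strings are the ones shown above; the function and buffer handling are illustrative):

#include <stdio.h>
#include "libpq-fe.h"

static const char *stage_prefix[] = {
	"BEGIN; SET default_statistics_target=1; SET vacuum_cost_delay=0;",
	"BEGIN; SET default_statistics_target=10; RESET vacuum_cost_delay;",
	"BEGIN; RESET default_statistics_target;"
};

/* Run all three analyze stages for one table, back to back. */
static void
analyze_in_stages_one_table(PGconn *conn, const char *table)
{
	char		sql[1024];
	int			stage;

	for (stage = 0; stage < 3; stage++)
	{
		snprintf(sql, sizeof(sql), "%s ANALYZE %s; COMMIT;",
				 stage_prefix[stage], table);
		PQclear(PQexec(conn, sql));		/* real code must check the result */
	}
}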

  
>
> I will move this patch to the latest commitfest.
>

By the way, I think this patch should be in Waiting On Author stage.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
<div class="WordSection1"><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">On</span><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"></span><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">19December 2014 16:41, Amit Kapila Wrote,</span><p
class="MsoNormal"><spanstyle="font-size:10.0pt;font-family:"Tahoma","sans-serif""> </span><p class="MsoNormal">>One
ideais to send all the stages and corresponding Analyze commands<p class="MsoNormal">>to server in one go which
meanssomething like<p class="MsoNormal">>"BEGIN; SET default_statistics_target=1; SET vacuum_cost_delay=0;<p
class="MsoNormal">> Analyze t1; COMMIT;" <p class="MsoNormal">>"BEGIN; SET default_statistics_target=10; RESET
vacuum_cost_delay;<pclass="MsoNormal">>  Analyze t1; COMMIT;"<p class="MsoNormal">>"BEGIN; RESET
default_statistics_target;<pclass="MsoNormal">> Analyze t1; COMMIT;"<p class="MsoNormal"> <p
class="MsoNormal">Case1:<u>InCase for CompleteDB:</u><p class="MsoNormal"> <p class="MsoNormal">In base code first it
willprocess all the tables in stage 1 then in stage2 and so on, so that at some time all the tables are analyzed at
leastup to certain stage.<p class="MsoNormal"> <p class="MsoNormal">But If we process all the stages for one table
first,and then take the other table for processing the stage 1, then it may happen that for some table all the stages
areprocessed, <p class="MsoNormal">but others are waiting for even first stage to be processed, this will affect the
functionalityfor analyze-in-stages.<p class="MsoNormal"> <p class="MsoNormal"><u>Case2: In case for independent tables
like–t “t1” –t “t2”</u><p class="MsoNormal"> <p class="MsoNormal">In base code also currently we are processing all the
stagesfor first table and processing same for next table and so on.<p class="MsoNormal"> <p class="MsoNormal">I think,
ifuser is giving multiple tables together then his purpose might be to analyze those tables together stage by stage, <p
class="MsoNormal">butin our code we analyze table1 in all stages and then only considering the next table.<p
class="MsoNormal"> <pclass="MsoNormal">So for tables also it should be like<p class="MsoNormal">Stage1:<p
class="MsoNormal">           T1<p class="MsoNormal">            T2<p class="MsoNormal">            ..<p
class="MsoNormal">Stage2:<pclass="MsoNormal">            T1<p class="MsoNormal">            T2<p class="MsoNormal">…<p
class="MsoNormal"> <pclass="MsoNormal">Thoughts?<p class="MsoNormal"> <p class="MsoNormal">>Now, still parallel
operationsin other backends could lead to<p class="MsoNormal">>page misses, but I think the impact will be
minimized.<pclass="MsoNormal"> <p class="MsoNormal">Regards,<p class="MsoNormal">Dilip<p class="MsoNormal"> <p
class="MsoNormal"> <pclass="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></div>

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Wed, Dec 24, 2014 at 4:00 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> Case1: In the case of a complete DB:
>
> In the base code first it will process all the tables in stage 1, then in stage 2 and so on, so that at some point all the tables are analyzed at least up to a certain stage.
>
> But if we process all the stages for one table first, and then take the other table for processing stage 1, then it may happen that for some table all the stages are processed,
>
> but others are waiting for even the first stage to be processed; this will affect the functionality of analyze-in-stages.
>
> Case2: In the case of independent tables like -t "t1" -t "t2":
>
> In the base code also, currently we are processing all the stages for the first table, then the same for the next table and so on.
>
> I think if the user is giving multiple tables together then his purpose might be to analyze those tables together stage by stage,
> but in our code we analyze table1 in all stages and only then consider the next table.
>
>

So basically you want to say that currently the processing for
tables with the --analyze-in-stages switch is different when the user
executes vacuumdb for the whole database versus when it does it for
individual tables (multiple tables together).  In the proposed patch
the processing for tables will be the same in either case (whole
database or independent tables).  I think your point has merit, so
let's proceed with this as it is in your patch.

Do you have anything more to handle in the patch, or shall I take one
more look and pass it to the committer if it is ready?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
<div class="WordSection1"><p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">On</span><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">29December 2014 10:22 Amit Kapila Wrote,</span><p
class="MsoNormal"><spanstyle="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><p
class="MsoNormal">>>Case1:In Case for CompleteDB:<br /> >><br /> >> In base code first it will
processall the tables in stage 1 then in stage2 and so on, so that at some time all the tables are analyzed at least up
tocertain stage.<br /> >><br /> >> But If we process all the stages for one table first, and then take the
othertable for processing the stage 1, then it may happen that for some table all the stages are processed,<br />
>><br/> >> but others are waiting for even first stage to be processed, this will affect the functionality
foranalyze-in-stages.<br /> >><br /> >> Case2: In case for independent tables like –t “t1” –t “t2”<br />
>><br/> > In base code also currently we are processing all the stages for first table and processing same for
nexttable and so on.<br /> >><br /> >> I think, if user is giving multiple tables together then his purpose
mightbe to analyze those tables together stage by stage,<br /> >> but in our code we analyze table1 in all stages
andthen only considering the next table.<br /> >><br /> >So basically you want to say that currently the
processingfor<p class="MsoNormal">>tables with --analyze-in-stages switch is different when the user<p
class="MsoNormal">>executesvacuumdb for whole database versus when it does for<p class="MsoNormal">>individual
tables(multiple tables together).  In the proposed patch<p class="MsoNormal">>the processing for tables will be same
foreither cases (whole<p class="MsoNormal">>database or independent tables).  I think your point has merit, so<p
class="MsoNormal">>letsproceed with this as it is in your patch.<p class="MsoNormal"> <p class="MsoNormal">>Do
youhave anything more to handle in patch or shall I take one<p class="MsoNormal">>another look and pass it to
committerif it is ready for the same.<p class="MsoNormal"> <p class="MsoNormal">I think nothing more to be handled from
myside, you can go ahead with review..<p class="MsoNormal"> <p class="MsoNormal">Regards,<p class="MsoNormal">Dilip<p
class="MsoNormal"><spanstyle="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><p
class="MsoNormal"><spanstyle="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></div> 

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Mon, Dec 29, 2014 at 11:10 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 29 December 2014 10:22 Amit Kapila Wrote,
>
>
> I think nothing more needs to be handled from my side; you can go ahead with the review..
>

The patch looks good to me.  I have done a couple of
cosmetic changes (spelling mistakes, improved some comments,
etc.); check them once and if you are okay, we can move
ahead.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
<div class="WordSection1"><p class="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">On</span><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"></span><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">31December 2014 18:36, Amit Kapila Wrote,</span><p
class="MsoNormal"><spanstyle="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><p
class="MsoNormal">>Thepatch looks good to me.  I have done couple of<p class="MsoNormal">>cosmetic changes
(spellingmistakes, improve some comments,<p class="MsoNormal">>etc.), check the same once and if you are okay, we
canmove<p class="MsoNormal">>ahead. <p class="MsoNormal"> <p class="MsoNormal">Thanks for review and changes,
changeslooks fine to me..<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span><pclass="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Regards,</span><pclass="MsoNormal"><span
style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">Dilip</span><pclass="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></div>

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Fri, Jan 2, 2015 at 11:47 AM, Dilip kumar <dilip.kumar@huawei.com> wrote:
>
> On 31 December 2014 18:36, Amit Kapila Wrote,
>
> >The patch looks good to me.  I have done a couple of
> >cosmetic changes (spelling mistakes, improved some comments,
> >etc.); check them once and if you are okay, we can move
> >ahead.
>
> Thanks for the review and the changes; the changes look fine to me..
>

Okay, I have marked this patch as "Ready For Committer"

Notes for Committer -
There is one behavioural difference in the handling of the --analyze-in-stages
switch: when individual tables (using the -t option) are analyzed with
this switch, the patch will process (in the case of concurrent jobs) all the
given tables for stage-1, then for stage-2 and so on, whereas the
unpatched code will process all three stages table by table
(table-1 all three stages, table-2 all three stages, and so on).  I think
the new behaviour is okay, as the same is done when this utility
vacuums the whole database.  As there was no input from any committer
on this point, I thought it better to proceed rather than wait longer
just for this one point.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Kevin Grittner
Date:
Amit Kapila <amit.kapila16@gmail.com> wrote:

> Notes for Committer -
> There is one behavioural difference in the handling of the --analyze-in-stages
> switch: when individual tables (using the -t option) are analyzed with
> this switch, the patch will process (in the case of concurrent jobs) all the
> given tables for stage-1, then for stage-2 and so on, whereas the
> unpatched code will process all three stages table by table
> (table-1 all three stages, table-2 all three stages, and so on).  I think
> the new behaviour is okay, as the same is done when this utility
> vacuums the whole database.

IMV, the change is for the better.  The whole point of
--analyze-in-stages is to get minimal statistics so that "good
enough" plans will be built for most queries to allow a production
database to be brought back on-line quickly, followed by generating
increasing granularity (which takes longer but should help ensure
"best plan") while the database is in use with the initial
statistics.  Doing all three levels for one table before generating
the rough statistics for the others doesn't help with that, so I
see this change as fixing a bug.  Is it feasible to break that part 
out as a separate patch?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Fri, Jan 2, 2015 at 8:34 PM, Kevin Grittner <kgrittn@ymail.com> wrote:
>
> Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > Notes for Committer -
> > There is one behavioural difference in the handling of the --analyze-in-stages
> > switch: when individual tables (using the -t option) are analyzed with
> > this switch, the patch will process (in the case of concurrent jobs) all the
> > given tables for stage-1, then for stage-2 and so on, whereas the
> > unpatched code will process all three stages table by table
> > (table-1 all three stages, table-2 all three stages, and so on).  I think
> > the new behaviour is okay, as the same is done when this utility
> > vacuums the whole database.
>
> IMV, the change is for the better.  The whole point of
> --analyze-in-stages is to get minimal statistics so that "good
> enough" plans will be built for most queries to allow a production
> database to be brought back on-line quickly, followed by generating
> increasing granularity (which takes longer but should help ensure
> "best plan") while the database is in use with the initial
> statistics.  Doing all three levels for one table before generating
> the rough statistics for the others doesn't help with that, so I
> see this change as fixing a bug.  Is it feasible to break that part
> out as a separate patch?
>

Currently, as the patch stands, the fix (or new behaviour) is only
implemented for the multiple-jobs option; to fix this in the base
code a separate patch is required.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Andres Freund
Date:
On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
> +      <term><option>-j <replaceable class="parameter">jobs</replaceable></option></term>
> +      <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
> +      <listitem>
> +       <para>
> +        Number of concurrent connections to perform the operation.
> +        This option will enable the vacuum operation to run on asynchronous
> +        connections, at a time one table will be operated on one connection.
> +        So at one time as many tables will be vacuumed parallely as number of
> +        jobs.  If number of jobs given are more than number of tables then
> +        number of jobs will be set to number of tables.

"asynchronous connections" isn't a very well defined term. Also, the
second part of that sentence doesn't seem to be gramattically correct.

> +       </para>
> +       <para>
> +        <application>vacuumdb</application> will open
> +        <replaceable class="parameter"> njobs</replaceable> connections to the
> +        database, so make sure your <xref linkend="guc-max-connections">
> +        setting is high enough to accommodate all connections.
> +       </para>

Isn't it njobs+1?

> @@ -141,6 +199,7 @@ main(int argc, char *argv[])
>          }
>      }
>  
> +    optind++;

Hm, where's that coming from?

> +    PQsetnonblocking(connSlot[0].connection, 1);
> +
> +    for (i = 1; i < concurrentCons; i++)
> +    {
> +        connSlot[i].connection = connectDatabase(dbname, host, port, username,
> +                                  prompt_password, progname, false);
> +
> +        PQsetnonblocking(connSlot[i].connection, 1);
> +        connSlot[i].isFree = true;
> +        connSlot[i].sock = PQsocket(connSlot[i].connection);
> +    }

Are you sure about this global PQsetnonblocking()? This means that you
might not be able to send queries... And you don't seem to be waiting
for sockets waiting for writes in the select loop - which means you
might end up being stuck waiting for reads when you haven't submitted
the query.

I think you might need a more complex select() loop. On nonfree
connections also wait for writes if PQflush() returns != 0.
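
A sketch of the two-fd_set shape being suggested here (the slot fields mirror the patch's ParallelSlot; the rest is illustrative):

#include <stdbool.h>
#include <sys/select.h>
#include "libpq-fe.h"

typedef struct					/* mirrors the patch's slot layout */
{
	PGconn	   *connection;
	int			sock;
	bool		isFree;
} ParallelSlot;

/* Build both sets: read for every busy slot, write where PQflush() != 0. */
static int
build_slot_sets(ParallelSlot *pSlot, int max_slot,
				fd_set *readset, fd_set *writeset)
{
	int			i;
	int			maxFd = -1;

	FD_ZERO(readset);
	FD_ZERO(writeset);
	for (i = 0; i < max_slot; i++)
	{
		if (pSlot[i].isFree)
			continue;
		FD_SET(pSlot[i].sock, readset);
		/* Unsent data is still queued on this nonblocking connection. */
		if (PQflush(pSlot[i].connection) != 0)
			FD_SET(pSlot[i].sock, writeset);
		if (pSlot[i].sock > maxFd)
			maxFd = pSlot[i].sock;
	}
	return maxFd;				/* caller passes maxFd + 1 to select() */
}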


> +/*
> + * GetIdleSlot
> + * Process the slot list, if any free slot is available then return
> + * the slotid else perform the select on all the socket's and wait
> + * until atleast one slot becomes available.
> + */
> +static int
> +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
> +            const char *progname, bool completedb)
> +{
> +    int        i;
> +    fd_set    slotset;


Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.

> +    int     firstFree = -1;
> +    pgsocket maxFd;
> +
> +    for (i = 0; i < max_slot; i++)
> +        if (pSlot[i].isFree)
> +            return i;

> +    FD_ZERO(&slotset);
> +
> +    maxFd = pSlot[0].sock;
> +
> +    for (i = 0; i < max_slot; i++)
> +    {
> +        FD_SET(pSlot[i].sock, &slotset);
> +        if (pSlot[i].sock > maxFd)
> +            maxFd = pSlot[i].sock;
> +    }

So we're waiting for idle connections?

I think you'll have to have to use two fdsets here, and set the write
set based on PQflush() != 0.

> +/*
> + * A select loop that repeats calling select until a descriptor in the read
> + * set becomes readable. On Windows we have to check for the termination event
> + * from time to time, on Unix we can just block forever.
> + */

Should a) mention why we have to check regularly on Windows, and b) that on
Linux we don't have to, because we send a cancel event from the signal
handler.

> +static int
> +select_loop(int maxFd, fd_set *workerset)
> +{
> +    int            i;
> +    fd_set        saveSet = *workerset;
>
> +#ifdef WIN32
> +    /* should always be the master */

Hm?


I have to say, this is a fairly large patch for such a minor feature...

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Michael Paquier
Date:
On Sun, Jan 4, 2015 at 10:57 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
>> +      <term><option>-j <replaceable class="parameter">jobs</replaceable></option></term>
>> +      <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
>> +      <listitem>
>> +       <para>
>> +        Number of concurrent connections to perform the operation.
>> +        This option will enable the vacuum operation to run on asynchronous
>> +        connections, at a time one table will be operated on one connection.
>> +        So at one time as many tables will be vacuumed parallely as number of
>> +        jobs.  If number of jobs given are more than number of tables then
>> +        number of jobs will be set to number of tables.
>
> "asynchronous connections" isn't a very well defined term. Also, the
> second part of that sentence doesn't seem to be gramattically correct.
>
>> +       </para>
>> +       <para>
>> +        <application>vacuumdb</application> will open
>> +        <replaceable class="parameter"> njobs</replaceable> connections to the
>> +        database, so make sure your <xref linkend="guc-max-connections">
>> +        setting is high enough to accommodate all connections.
>> +       </para>
>
> Isn't it njobs+1?
>
>> @@ -141,6 +199,7 @@ main(int argc, char *argv[])
>>               }
>>       }
>>
>> +     optind++;
>
> Hm, where's that coming from?
>
>> +     PQsetnonblocking(connSlot[0].connection, 1);
>> +
>> +     for (i = 1; i < concurrentCons; i++)
>> +     {
>> +             connSlot[i].connection = connectDatabase(dbname, host, port, username,
>> +                                                               prompt_password, progname, false);
>> +
>> +             PQsetnonblocking(connSlot[i].connection, 1);
>> +             connSlot[i].isFree = true;
>> +             connSlot[i].sock = PQsocket(connSlot[i].connection);
>> +     }
>
> Are you sure about this global PQsetnonblocking()? This means that you
> might not be able to send queries... And you don't seem to be waiting
> for sockets waiting for writes in the select loop - which means you
> might end up being stuck waiting for reads when you haven't submitted
> the query.
>
> I think you might need a more complex select() loop. On nonfree
> connections also wait for writes if PQflush() returns != 0.
>
>
>> +/*
>> + * GetIdleSlot
>> + * Process the slot list, if any free slot is available then return
>> + * the slotid else perform the select on all the socket's and wait
>> + * until atleast one slot becomes available.
>> + */
>> +static int
>> +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
>> +                     const char *progname, bool completedb)
>> +{
>> +     int             i;
>> +     fd_set  slotset;
>
>
> Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.
>
>> +     int     firstFree = -1;
>> +     pgsocket maxFd;
>> +
>> +     for (i = 0; i < max_slot; i++)
>> +             if (pSlot[i].isFree)
>> +                     return i;
>
>> +     FD_ZERO(&slotset);
>> +
>> +     maxFd = pSlot[0].sock;
>> +
>> +     for (i = 0; i < max_slot; i++)
>> +     {
>> +             FD_SET(pSlot[i].sock, &slotset);
>> +             if (pSlot[i].sock > maxFd)
>> +                     maxFd = pSlot[i].sock;
>> +     }
>
> So we're waiting for idle connections?
>
> I think you'll have to have to use two fdsets here, and set the write
> set based on PQflush() != 0.
>
>> +/*
>> + * A select loop that repeats calling select until a descriptor in the read
>> + * set becomes readable. On Windows we have to check for the termination event
>> + * from time to time, on Unix we can just block forever.
>> + */
>
> Should a) mention why we have to check regularly on Windows, and b) that on
> Linux we don't have to, because we send a cancel event from the signal
> handler.
>
>> +static int
>> +select_loop(int maxFd, fd_set *workerset)
>> +{
>> +     int                     i;
>> +     fd_set          saveSet = *workerset;
>>
>> +#ifdef WIN32
>> +     /* should always be the master */
>
> Hm?
>
>
> I have to say, this is a fairly large patch for such a minor feature...
Andres, this patch needs more effort from the author, right? So
marking it as returned with feedback.
-- 
Michael



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Michael Paquier wrote:

> Andres, this patch needs more effort from the author, right? So
> marking it as returned with feedback.

I will give this patch a look in the current commitfest.  If you can
please set it as 'needs review' instead, with me as reviewer so that I
don't forget, I would appreciate it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Michael Paquier
Date:
On Fri, Jan 16, 2015 at 12:53 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Michael Paquier wrote:
>
>> Andres, this patch needs more effort from the author, right? So
>> marking it as returned with feedback.
>
> I will give this patch a look in the current commitfest.  If you can
> please set it as 'needs review' instead, with me as reviewer so that I
> don't forget, I would appreciate it.
Fine for me, done this way.
-- 
Michael



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 04 January 2015 07:27, Andres Freund Wrote,
> On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:
> > +      <term><option>-j <replaceable
> class="parameter">jobs</replaceable></option></term>
> > +      <term><option>--jobs=<replaceable
> class="parameter">njobs</replaceable></option></term>
> > +      <listitem>
> > +       <para>
> > +        Number of concurrent connections to perform the operation.
> > +        This option will enable the vacuum operation to run on
> asynchronous
> > +        connections, at a time one table will be operated on one
> connection.
> > +        So at one time as many tables will be vacuumed parallely as
> number of
> > +        jobs.  If number of jobs given are more than number of
> tables then
> > +        number of jobs will be set to number of tables.
>
> "asynchronous connections" isn't a very well defined term. Also, the
> second part of that sentence doesn't seem to be grammatically correct.

I have changed this to "concurrent connections"; is this OK?


> > +       </para>
> > +       <para>
> > +        <application>vacuumdb</application> will open
> > +        <replaceable class="parameter"> njobs</replaceable>
> connections to the
> > +        database, so make sure your <xref linkend="guc-max-
> connections">
> > +        setting is high enough to accommodate all connections.
> > +       </para>
>
> Isn't it njobs+1?

The main connection, which we use for getting the table information, is reused as the first slot's connection, so the
total number of connections is still njobs.
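
In sketch form (connSlot and connectDatabase as in the quoted patch; conn is the connection already opened for the
catalog queries):

    /*
     * Slot 0 reuses the existing connection, so concurrentCons connections
     * exist in total -- not concurrentCons + 1.
     */
    connSlot[0].connection = conn;
    connSlot[0].isFree = true;
    connSlot[0].sock = PQsocket(conn);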


> > @@ -141,6 +199,7 @@ main(int argc, char *argv[])
> >          }
> >      }
> >
> > +    optind++;
>
> Hm, where's that coming from?

This is wrong, I have removed it.

>
> > +    PQsetnonblocking(connSlot[0].connection, 1);
> > +
> > +    for (i = 1; i < concurrentCons; i++)
> > +    {
> > +        connSlot[i].connection = connectDatabase(dbname, host, port,
> username,
> > +                                  prompt_password,
> progname, false);
> > +
> > +        PQsetnonblocking(connSlot[i].connection, 1);
> > +        connSlot[i].isFree = true;
> > +        connSlot[i].sock = PQsocket(connSlot[i].connection);
> > +    }
>
> Are you sure about this global PQsetnonblocking()? This means that you
> might not be able to send queries... And you don't seem to be waiting
> for sockets waiting for writes in the select loop - which means you
> might end up being stuck waiting for reads when you haven't submitted
> the query.
>
> I think you might need a more complex select() loop. On nonfree
> connections also wait for writes if PQflush() returns != 0.

1. In GetIdleSlot we make sure that we wait only if a connection is busy, i.e. only if we have already sent a query on
that connection.
2. When all the connections are busy, we select() on all the FDs so that we wake up as soon as there is a response on
any connection. When select returns, we consume the input and check whether the connection has become idle or the data
was just an intermediate response; if it is no longer busy, we process all its results and mark the slot as free.

>
> > +/*
> > + * GetIdleSlot
> > + * Process the slot list; if any free slot is available then return
> > + * the slot id, else perform a select on all the sockets and wait
> > + * until at least one slot becomes available.
> > + */
> > +static int
> > +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
> > +            const char *progname, bool completedb) {
> > +    int        i;
> > +    fd_set    slotset;
>
>
> Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.

I will change this in the next patch.

> > +    int     firstFree = -1;
> > +    pgsocket maxFd;
> > +
> > +    for (i = 0; i < max_slot; i++)
> > +        if (pSlot[i].isFree)
> > +            return i;
>
> > +    FD_ZERO(&slotset);
> > +
> > +    maxFd = pSlot[0].sock;
> > +
> > +    for (i = 0; i < max_slot; i++)
> > +    {
> > +        FD_SET(pSlot[i].sock, &slotset);
> > +        if (pSlot[i].sock > maxFd)
> > +            maxFd = pSlot[i].sock;
> > +    }
>
> So we're waiting for idle connections?
>
> I think you'll have to use two fdsets here, and set the write
> set based on PQflush() != 0.

I did not get this?

The logic here is: we wait for a response on any connection, using select on all the FDs. When select returns, we
check all the sockets to see which are no longer busy, and mark all the finished connections as idle at once. If none
of the connections is free, we go back to select; otherwise we return the first idle connection.
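
To make Andres's two-fdset suggestion concrete, here is a minimal sketch -- not the patch's code; ParallelSlot, pSlot
and max_slot are borrowed from the quoted patch:

    fd_set      readset;
    fd_set      writeset;
    pgsocket    maxFd = 0;
    int         i;

    FD_ZERO(&readset);
    FD_ZERO(&writeset);
    for (i = 0; i < max_slot; i++)
    {
        pgsocket    sock = PQsocket(pSlot[i].connection);

        FD_SET(sock, &readset);         /* always watch for incoming results */
        if (PQflush(pSlot[i].connection) != 0)
            FD_SET(sock, &writeset);    /* also watch until output is flushed */
        if (sock > maxFd)
            maxFd = sock;
    }

    if (select(maxFd + 1, &readset, &writeset, NULL, NULL) > 0)
    {
        for (i = 0; i < max_slot; i++)
        {
            /* error checking omitted for brevity */
            PQconsumeInput(pSlot[i].connection);
            if (!PQisBusy(pSlot[i].connection))
            {
                /*
                 * A complete result is ready; it must be drained before the
                 * slot is reused (the patch does this in GetQueryResult).
                 */
                pSlot[i].isFree = true;
            }
        }
    }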


> > +/*
> > + * A select loop that repeats calling select until a descriptor in
> > +the read
> > + * set becomes readable. On Windows we have to check for the
> > +termination event
> > + * from time to time, on Unix we can just block forever.
> > + */
>
> Should a) mention why we have to check regularly on Windows b) that on
> Linux we don't have to because we send a cancel event from the signal
> handler.

I have added the comments.

> > +static int
> > +select_loop(int maxFd, fd_set *workerset) {
> > +    int            i;
> > +    fd_set        saveSet = *workerset;
> >
> > +#ifdef WIN32
> > +    /* should always be the master */
>
> Hm?
>
>
> I have to say, this is a fairly large patch for such a minor feature...
>
> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund                       http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
I didn't understand the coding in GetQueryResult(); why do we check the
result status of the last returned result only?  It seems simpler to me
to check it inside the loop, but maybe there's a reason you didn't do it
like that?

Also, what is the reason we were ignoring those errors only in
"completedb" mode?  It doesn't seem like it would cause any harm if we
did it in all cases.  That means we can just not have the "completeDb"
flag at all.

Finally, I think it's better to report the "missing relation" error,
even if we're going to return true to continue processing other tables.
That makes the situation clearer to the user.

So the function would end up looking like this:

/*
 * GetQueryResult
 *
 * Process the query result.  Returns true if there's no error, false
 * otherwise -- but errors about trying to vacuum a missing relation are
 * ignored.
 */
static bool
GetQueryResult(PGconn *conn, errorOptions *erropts)
{
    PGresult   *result = NULL;

    SetCancelConn(conn);
    while ((result = PQgetResult(conn)) != NULL)
    {
        /*
         * If errors are found, report them.  Errors about a missing table
         * are harmless so we continue processing, but die for other errors.
         */
        if (PQresultStatus(result) != PGRES_COMMAND_OK)
        {
            char   *sqlState = PQresultErrorField(result, PG_DIAG_SQLSTATE);

            fprintf(stderr, _("%s: vacuuming of database \"%s\" failed: %s"),
                    erropts->progname, erropts->dbname, PQerrorMessage(conn));

            if (sqlState && strcmp(sqlState, ERRCODE_UNDEFINED_TABLE) != 0)
            {
                PQclear(result);
                return false;
            }
        }

        PQclear(result);
    }
    ResetCancelConn();

    return true;
}



-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Wed, Jan 21, 2015 at 8:51 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> I didn't understand the coding in GetQueryResult(); why do we check the
> result status of the last returned result only?  It seems simpler to me
> to check it inside the loop, but maybe there's a reason you didn't do it
> like that?
>
> Also, what is the reason we were ignoring those errors only in
> "completedb" mode?  It doesn't seem like it would cause any harm if we
> did it in all cases.  That means we can just not have the "completeDb"
> flag at all.
>

IIRC it is done to match the existing behaviour, where such errors are
ignored when we use this utility to vacuum a whole database.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Amit Kapila wrote:
> On Wed, Jan 21, 2015 at 8:51 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
> wrote:
> >
> > I didn't understand the coding in GetQueryResult(); why do we check the
> > result status of the last returned result only?  It seems simpler to me
> > to check it inside the loop, but maybe there's a reason you didn't do it
> > like that?
> >
> > Also, what is the reason we were ignoring those errors only in
> > "completedb" mode?  It doesn't seem like it would cause any harm if we
> > did it in all cases.  That means we can just not have the "completeDb"
> > flag at all.
> 
> IIRC it is done to match the existing behaviour, where such errors are
> ignored when we use this utility to vacuum a whole database.

I think that's fine, but we should do it always, not just in
whole-database mode.

I've been hacking this patch today BTW; hope to have something to show
tomorrow.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Amit Kapila
Date:
On Thu, Jan 22, 2015 at 8:22 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> Amit Kapila wrote:
> > On Wed, Jan 21, 2015 at 8:51 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
> > wrote:
> > >
> > > I didn't understand the coding in GetQueryResult(); why do we check the
> > > result status of the last returned result only?  It seems simpler to me
> > > to check it inside the loop, but maybe there's a reason you didn't do it
> > > like that?
> > >
> > > Also, what is the reason we were ignoring those errors only in
> > > "completedb" mode?  It doesn't seem like it would cause any harm if we
> > > did it in all cases.  That means we can just not have the "completeDb"
> > > flag at all.
> >
> > IIRC it is done to match the existing behaviour, where such errors are
> > ignored when we use this utility to vacuum a whole database.
>
> I think that's fine, but we should do it always, not just in
> whole-database mode.
>
> I've been hacking this patch today BTW; hope to have something to show
> tomorrow.
>

Great!


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Here's v23.

I reworked a number of things.  First, I changed trivial stuff like
grouping all the vacuuming options in a struct, to avoid passing an
excessive number of arguments to functions.  full, freeze, analyze_only,
and_analyze and verbose are all in a single struct now.  Also, the
stage_commands and stage_messages were duplicated by your patch; I moved
them to a file-level static struct.

I made prepare_command reset the string buffer and receive an optional
table name, so that it can append it to the generated command, and
append the semicolon as well.  Forcing the callers to reset the string
before calling, and having them add the table name and semicolon
afterwards was awkward and unnecessarily verbose.
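
A minimal sketch of that shape -- illustrative, not the committed function; it assumes libpq's PQExpBuffer (from
pqexpbuffer.h), and the flag names are hypothetical:

    static void
    prepare_vacuum_command(PQExpBuffer sql, bool full, bool verbose,
                           const char *table)
    {
        resetPQExpBuffer(sql);          /* callers no longer reset the buffer */
        appendPQExpBufferStr(sql, "VACUUM");
        if (full)
            appendPQExpBufferStr(sql, " FULL");
        if (verbose)
            appendPQExpBufferStr(sql, " VERBOSE");
        if (table)
            appendPQExpBuffer(sql, " %s", table);   /* optional table name */
        appendPQExpBufferStr(sql, ";");             /* semicolon added here too */
    }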

You had a new in_abort() function in common.c which seems an unnecessary
layer; in its place I just exported the inAbort boolean flag it was
returning, and renamed it to CancelRequested.

I was then troubled by the fact that vacuum_one_database() was being
called in a loop by main() when multiple tables are vacuumed, but
vacuum_parallel() was doing the loop internally.  I found this
discrepancy confusing, so I renamed that new function to
vacuum_one_database_parallel and modified the original
vacuum_one_database to do the loop internally as well.  Now they are, in
essence, a mirror of each other, one doing the parallel stuff and one
doing it serially.  This seems to make more sense to me -- but see
below.

I also modified some underlying stuff like GetIdleSlot returning a
ParallelSlot pointer instead of an array index.  Since its caller always
has to dereference the array with the given index, it makes more sense
to return the right element pointer instead, so I made it do that.
Also, that way, instead of returning NO_SLOT in case of error it can
just return NULL; no need for extra cognitive burden.

I also changed select_loop.  In your patch it had two implementations,
one WIN32 and another one for the rest.  It looks nicer to me to have
only one with small exceptions in the places that need it.  (I haven't
tested the WIN32 path.)  Also, instead of returning ERROR_IN_ABORT I
made it set a boolean flag in case of error, which seems cleaner.
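
In outline, the merged select_loop could look something like this sketch (the one-second timeout and variable names
are illustrative; CancelRequested is the renamed flag mentioned above):

    static int
    select_loop(int maxFd, fd_set *workerset, bool *aborted)
    {
        int         i;
        fd_set      saveSet = *workerset;
    #ifdef WIN32
        struct timeval tv;
    #endif

        *aborted = false;
        for (;;)
        {
            *workerset = saveSet;
    #ifdef WIN32
            /* Wake up periodically so a cancel request gets noticed. */
            tv.tv_sec = 1;
            tv.tv_usec = 0;
            i = select(maxFd + 1, workerset, NULL, NULL, &tv);
            if (i == SOCKET_ERROR && WSAGetLastError() == WSAEINTR)
                errno = EINTR;
            if (i == 0 && !CancelRequested)
                continue;           /* plain timeout: wait again */
    #else
            /* Elsewhere, the cancel signal handler interrupts select(). */
            i = select(maxFd + 1, workerset, NULL, NULL, NULL);
    #endif
            if (i < 0 && errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            if (i < 0 || CancelRequested)
                *aborted = true;    /* a flag instead of a magic return value */
            return i;
        }
    }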

I changed GetQueryResult as I described in a previous message.


There are two things that continue to bother me and I would like you,
dear patch author, to change them before committing this patch:

1. I don't like having vacuum_one_database() and a separate
vacuum_one_database_parallel().  I think we should merge them into one
function, which does either thing according to parameters.  There's
plenty in there that's duplicated.

2. In particular, the above means that run_parallel_vacuum can no longer
exist as it is.  Right now vacuum_one_database_parallel relies on
run_parallel_vacuum to do the actual job parallelization.  I would like
to have that looping in the improved vacuum_one_database() function
instead.


Looking forward to v24,

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 22 January 2015 23:16, Alvaro Herrera Wrote,

> Here's v23.
>
> There are two things that continue to bother me and I would like you,
> dear patch author, to change them before committing this patch:
>
> 1. I don't like having vacuum_one_database() and a separate
> vacuum_one_database_parallel().  I think we should merge them into one
> function, which does either thing according to parameters.  There's
> plenty in there that's duplicated.
>
> 2. in particular, the above means that run_parallel_vacuum can no
> longer exist as it is.  Right now vacuum_one_database_parallel relies
> on run_parallel_vacuum to do the actual job parallellization.  I would
> like to have that looping in the improved vacuum_one_database()
> function instead.

> Looking forward to v24,

Thank you for your effort. I have tried to change the patch as per your instructions and come up with v24.

Changes:
1. In the current patch, vacuum_one_database (for a table list) has the table loop outside and the analyze_stage loop
inside, so it will finish all three stages for one table first and then pick the next table. But
vacuum_one_database_parallel does the stage loop outside and calls run_parallel_vacuum, which has the table loop, so
for one stage all the tables are vacuumed first before moving to the next stage.
To merge the two functions, their behaviours should be identical. I think that if the user has given a list of tables
with analyze-in-stages, doing all the tables for one stage and then picking the next stage is the better solution, so
I have done it that way.

2. In select_loop, for WIN32 I replaced the TranslateSocketError function with
                        if (WSAGetLastError() == WSAEINTR)
                                errno = EINTR;

otherwise I would have to expose TranslateSocketError from the socket code and include an extra header.

I have tested on Windows as well; it's working fine.

Regards,
Dilip




Attachment

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Dilip kumar wrote:

> Changes:
> 1. In the current patch, vacuum_one_database (for a table list) has the table loop outside and the analyze_stage
> loop inside, so it will finish all three stages for one table first and then pick the next table. But
> vacuum_one_database_parallel does the stage loop outside and calls run_parallel_vacuum, which has the table loop,
> so for one stage all the tables are vacuumed first before moving to the next stage.
> To merge the two functions, their behaviours should be identical. I think that if the user has given a list of
> tables with analyze-in-stages, doing all the tables for one stage and then picking the next stage is the better
> solution, so I have done it that way.

Yeah, I think the stages loop should be outermost, as discussed upthread
somewhere -- it's better to have initial stats for all tables as soon as
possible, and improve them later, than have some tables/dbs with no
stats for a longer period while full stats are computed for some
specific tables/database.
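
In outline, the control flow becomes the following (a sketch, not the committed code; the stage-setup commands shown
model the coarse-to-fine idea described above, and analyze_one_table is a hypothetical helper):

    static void
    analyze_in_stages(const char *const *tables, int ntables)
    {
        static const char *const stage_commands[] = {
            "SET default_statistics_target=1; SET vacuum_cost_delay=0;",
            "SET default_statistics_target=10; RESET vacuum_cost_delay;",
            "RESET default_statistics_target;"
        };
        int         stage;

        /*
         * Stages outermost: every table gets coarse stats before any table
         * gets refined stats.
         */
        for (stage = 0; stage < 3; stage++)
        {
            int         t;

            for (t = 0; t < ntables; t++)
                analyze_one_table(stage_commands[stage], tables[t]);
        }
    }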

I'm tweaking your v24 a bit more now, thanks -- main change is to make
vacuum_one_database be called only to run one analyze stage, so it never
iterates for each stage; callers must iterate calling it multiple times
in those cases.  (There's only one callsite that needs changing anyway.)

> 2. In select_loop, for WIN32 I replaced the TranslateSocketError function with
>                         if (WSAGetLastError() == WSAEINTR)
>                                 errno = EINTR;
> 
> otherwise I would have to expose TranslateSocketError from the socket code and include an extra header.

Grumble.  Don't like this bit, but moving TranslateSocketError to
src/common is outside the scope of this patch, so okay.  (pg_dump's
parallel stuff has the same issue anyway.)

In case you're up for doing some more work later on, there are two ideas
here: move the backend's TranslateSocketError to src/common, and try to
merge pg_dump's select_loop function with the one in this new code.  But
that's for another patch anyway (and this new function needs a little
battle-testing, of course.)

> I have tested on Windows as well; it's working fine.

Great, thanks.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Andres Freund wrote:
> On 2014-12-31 18:35:38 +0530, Amit Kapila wrote:

> > +    PQsetnonblocking(connSlot[0].connection, 1);
> > +
> > +    for (i = 1; i < concurrentCons; i++)
> > +    {
> > +        connSlot[i].connection = connectDatabase(dbname, host, port, username,
> > +                                  prompt_password, progname, false);
> > +
> > +        PQsetnonblocking(connSlot[i].connection, 1);
> > +        connSlot[i].isFree = true;
> > +        connSlot[i].sock = PQsocket(connSlot[i].connection);
> > +    }
> 
> Are you sure about this global PQsetnonblocking()? This means that you
> might not be able to send queries... And you don't seem to be waiting
> for sockets waiting for writes in the select loop - which means you
> might end up being stuck waiting for reads when you haven't submitted
> the query.
> 
> I think you might need a more complex select() loop. On nonfree
> connections also wait for writes if PQflush() returns != 0.

I removed the PQsetnonblocking() calls.  They were a misunderstanding, I
think.

> > +/*
> > + * GetIdleSlot
> > + * Process the slot list, if any free slot is available then return
> > + * the slotid else perform the select on all the socket's and wait
> > + * until atleast one slot becomes available.
> > + */
> > +static int
> > +GetIdleSlot(ParallelSlot *pSlot, int max_slot, const char *dbname,
> > +            const char *progname, bool completedb)
> > +{
> > +    int        i;
> > +    fd_set    slotset;
> 
> 
> Hm, you probably need to limit -j to FD_SETSIZE - 1 or so.

I tried using 1500 connections without the check, and select() didn't
even blink -- everything worked fine vacuuming 1500 tables in parallel
on a set of 2000 tables.  Not sure what's the actual limit but my
FD_SETSIZE says 1024, so I'm clearly over the limit.  (I tried to run it
with -j2000 but the server didn't start with that many connections.  I
didn't try any intermediate numbers.)  Anyway I added the check.
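
The added check amounts to something like this (a sketch; the exact message wording is illustrative):

    /* select() cannot track more than FD_SETSIZE descriptors */
    if (concurrentCons > FD_SETSIZE - 1)
    {
        fprintf(stderr, _("%s: too many parallel jobs requested (maximum: %d)\n"),
                progname, FD_SETSIZE - 1);
        exit(1);
    }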

I fixed some more minor issues and pushed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> I'm tweaking your v24 a bit more now, thanks -- main change is to make
> vacuum_one_database be called only to run one analyze stage, so it never
> iterates for each stage; callers must iterate calling it multiple times
> in those cases.  (There's only one callsite that needs changing anyway.)

I made some more changes, particularly so that the TAP tests pass (we
were missing the semicolon when a table name was not specified to
prepare_vacuum_command).  I reordered the code in a more sensible
manner, removed the vacuum_database_stage layer (which was pretty
useless), and changed the analyze-in-stages mode: if we pass a valid
stage number, run that stage; if not, then we're not in analyze-in-stages
mode.  So I got rid of the boolean flag altogether.  I also moved the
per-stage commands and messages back into a struct inside a function,
since there's no need to have them be file-level variables anymore.

-j1 is now the same as not specifying anything, and vacuum_one_database
uses more common code in the parallel and not-parallel cases: the
not-parallel case is just the parallel case with a single connection, so
the setup and shutdown is mostly the same in both cases.

I pushed the result.  Please test, particularly on Windows.  If you can
use configure --enable-tap-tests and run them ("make check" in the
src/bin/scripts subdir) that would be good too .. not sure whether
that's expected to work on Windows.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 23 January 2015 21:10, Alvaro Herrera Wrote,


> In case you're up for doing some more work later on, there are two
> ideas
> here: move the backend's TranslateSocketError to src/common, and try to
> merge pg_dump's select_loop function with the one in this new code.
> But that's for another patch anyway (and this new function needs a
> little battle-testing, of course.)
>
Thanks for your effort; I will take it up for the next commitfest.



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Dilip kumar
Date:
On 23 January 2015 23:55, Alvaro Herrera,

> -j1 is now the same as not specifying anything, and vacuum_one_database
> uses more common code in the parallel and not-parallel cases: the not-
> parallel case is just the parallel case with a single connection, so
> the setup and shutdown is mostly the same in both cases.
>
> I pushed the result.  Please test, particularly on Windows.  If you can
> use configure --enable-tap-tests and run them ("make check" in the
> src/bin/scripts subdir) that would be good too .. not sure whether
> that's expected to work on Windows.
>

I have tested on Windows; it's working fine.

Not sure how to enable the TAP tests on Windows; I will check and run them if possible.


Thanks,
Dilip



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Pavel Stehule
Date:
Hi

I am testing this feature on a relatively complex schema (38619 tables in the db) and I got a deadlock:

[pavel@localhost bin]$ /usr/local/pgsql/bin/vacuumdb test2 -fz -j 4
vacuumdb: vacuuming database "test2"
vacuumdb: vacuuming of database "test2" failed: ERROR:  deadlock detected
DETAIL:  Process 24689 waits for RowExclusiveLock on relation 2608 of database 194769; blocked by process 24690.
Process 24690 waits for AccessShareLock on relation 1249 of database 194769; blocked by process 24689.
HINT:  See server log for query details.

ERROR:  deadlock detected
DETAIL:  Process 24689 waits for RowExclusiveLock on relation 2608 of database 194769; blocked by process 24690.
        Process 24690 waits for AccessShareLock on relation 1249 of database 194769; blocked by process 24689.
        Process 24689: VACUUM (FULL, ANALYZE) pg_catalog.pg_attribute;
        Process 24690: VACUUM (FULL, ANALYZE) pg_catalog.pg_depend;
HINT:  See server log for query details.
STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_attribute;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_depend;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_class;
ERROR:  canceling statement due to user request
STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_proc;
LOG:  could not send data to client: Broken pipe
STATEMENT:  VACUUM (FULL, ANALYZE) pg_catalog.pg_proc;
FATAL:  connection to client lost
LOG:  could not send data to client: Broken pipe
ERROR:  canceling statement due to user request
FATAL:  connection to client lost

   Schema   |          Name           | Type  |  Owner   |    Size    | Description
------------+-------------------------+-------+----------+------------+-------------
 pg_catalog | pg_attribute            | table | postgres | 439 MB     |
 pg_catalog | pg_rewrite              | table | postgres | 314 MB     |
 pg_catalog | pg_proc                 | table | postgres | 136 MB     |
 pg_catalog | pg_depend               | table | postgres | 133 MB     |
 pg_catalog | pg_class                | table | postgres | 69 MB      |
 pg_catalog | pg_attrdef              | table | postgres | 55 MB      |
 pg_catalog | pg_trigger              | table | postgres | 47 MB      |
 pg_catalog | pg_type                 | table | postgres | 31 MB      |
 pg_catalog | pg_description          | table | postgres | 23 MB      |
 pg_catalog | pg_index                | table | postgres | 20 MB      |
 pg_catalog | pg_constraint           | table | postgres | 17 MB      |
 pg_catalog | pg_shdepend             | table | postgres | 17 MB      |
 pg_catalog | pg_statistic            | table | postgres | 928 kB     |
 pg_catalog | pg_operator             | table | postgres | 552 kB     |
 pg_catalog | pg_collation            | table | postgres | 232 kB     |

Regards

Pavel Stehule


2015-01-27 3:26 GMT+01:00 Dilip kumar <dilip.kumar@huawei.com>:

> Thanks for your effort; I will take it up for the next commitfest.

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Fabrízio de Royes Mello
Date:


On Thursday, 29 January 2015, Pavel Stehule <pavel.stehule@gmail.com> wrote:

> I am testing this feature on a relatively complex schema (38619 tables in the db) and I got a deadlock:
>
> [pavel@localhost bin]$ /usr/local/pgsql/bin/vacuumdb test2 -fz -j 4
> vacuumdb: vacuuming database "test2"
> vacuumdb: vacuuming of database "test2" failed: ERROR:  deadlock detected
> [...]


 
There is a warning in the docs to be careful when combining the -f (full) option with -j.

Regards,

Fabrízio 
 


--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Pavel Stehule
Date:


2015-01-29 10:28 GMT+01:00 Fabrízio de Royes Mello <fabriziomello@gmail.com>:


> On Thursday, 29 January 2015, Pavel Stehule wrote:
>
> > I am testing this feature on a relatively complex schema (38619 tables in the db) and I got a deadlock:
> > [...]
>
> There is a warning in the docs to be careful when combining the -f (full) option with -j.
Shouldn't pessimistic, controlled locking be used instead?

Regards

Pavel
 



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Pavel Stehule wrote:

> Shouldn't pessimistic, controlled locking be used instead?
> 

Patches welcome.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Laurent Laborde
Date:
On Fri, Jan 2, 2015 at 3:18 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:


> Okay, I have marked this patch as "Ready For Committer".
>
> Notes for Committer -
> There is one behavioural difference in the handling of the --analyze-in-stages
> switch: when individual tables (given with the -t option) are analyzed using
> this switch, the patch will process (in the case of concurrent jobs) all the
> given tables for stage 1, then for stage 2, and so on, whereas the unpatched
> code processes all three stages table by table (table 1 all three stages,
> table 2 all three stages, and so on).  I think the new behaviour is okay, as
> the same is done when this utility vacuums a whole database.  As there was no
> input from any committer on this point, I thought it better to proceed rather
> than wait longer just for this one point.

Friendly greetings!

What's the status of parallel clusterdb, please?
I'm having fun (and trouble) applying the vacuumdb patch to clusterdb.

This thread also talks about unifying the code for parallelizing clusterdb and reindexdb.
Was anything done about it? Because I can't see it, and my work currently involves a lot of copy/pasting from vacuumdb to clusterdb :)

And no, (I'm pretty sure) I don't have the required PostgreSQL knowledge to do this unification, if it isn't done already.

Thank you :)

(And sorry about the thread-necromancy)

--
Laurent "ker2x" Laborde
DBA \o/ Gandi.net

Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Alvaro Herrera
Date:
Laurent Laborde wrote:

> Friendly greetings!
> 
> What's the status of parallel clusterdb, please?
> I'm having fun (and trouble) applying the vacuumdb patch to clusterdb.
> 
> This thread also talks about unifying the code for parallelizing clusterdb and
> reindexdb.
> Was anything done about it? Because I can't see it, and my work currently
> involves a lot of copy/pasting from vacuumdb to clusterdb :)

Honestly, I have to wonder whether there are really valid use cases for
clusterdb.  Are you actually using it and want to see it improved, or is
this just an academic exercise?

> And no, (I'm pretty sure) I don't have the required PostgreSQL knowledge to
> do this unification, if it isn't done already.

You may or may not lack it *now*, but that doesn't mean you will
continue to lack it forever.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

From
Laurent Laborde
Date:
On 23 July 2015 19:27, "Alvaro Herrera" <alvherre@2ndquadrant.com> wrote:
>
> Laurent Laborde wrote:
>
> > Friendly greetings!
> >
> > What's the status of parallel clusterdb, please?
> > I'm having fun (and trouble) applying the vacuumdb patch to clusterdb.
> >
> > This thread also talks about unifying the code for parallelizing clusterdb and
> > reindexdb.
> > Was anything done about it? Because I can't see it, and my work currently
> > involves a lot of copy/pasting from vacuumdb to clusterdb :)
>
> Honestly, I have to wonder whether there are really valid use cases for
> clusterdb.  Are you actually using it and want to see it improved, or is
> this just an academic exercise?

Purely academic. I don't use it.

> > And no, (I'm pretty sure) I don't have the required PostgreSQL knowledge to
> > do this unification, if it isn't done already.
>
> You may or may not lack it *now*, but that doesn't mean you will
> continue to lack it forever.

That's why I'm working on it :)

--
Laurent "ker2x" Laborde
DBA \o/ gandi.net