Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ] - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Date
Msg-id CAB7nPqQH4XAksne0X47Fbd0YG0QApHFHLZ09tQp7WzkF+gCPpg@mail.gmail.com
In response to TODO : Allow parallel cores to be used by vacuumdb [ WIP ]  (Dilip kumar <dilip.kumar@huawei.com>)
List pgsql-hackers
On Thu, Nov 7, 2013 at 8:42 PM, Dilip kumar <dilip.kumar@huawei.com> wrote:
> This patch implements the following TODO item:
>
> Allow parallel cores to be used by vacuumdb
> http://www.postgresql.org/message-id/4F10A728.7090403@agliodbs.com
>
>
>
> Like parallel pg_dump, vacuumdb is given an option to vacuum multiple
> tables in parallel. [ vacuumdb -j ]
>
>
>
> 1. One new option is added to vacuumdb to set the number of workers.
>
> 2. All workers are started at the beginning and wait for vacuum
> instructions from the master.
>
> 3. If a table list is given in the vacuumdb command using -t, the master
> sends the vacuum of one table to one IDLE worker, the next table to the
> next IDLE worker, and so on.
>
> 4. If vacuum is requested for a whole DB, the master runs a select on
> pg_class to get the table list, fetches the table names one by one, and
> assigns each vacuum to an IDLE worker in the same way.
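
The dispatch scheme in steps 1-4 can be sketched roughly as below. This is only an illustration in Python, not the patch itself (the patch is C code in vacuumdb and is not shown in this mail). A shared queue stands in for the master handing tables to idle workers, and the names run_parallel_vacuum, vacuum_one and TABLE_LIST_SQL are hypothetical. The pg_class query is an assumed shape for step 4, not the query the patch uses.

```python
import queue
import threading

# Assumed shape of the step-4 query; the patch's real query is not shown.
TABLE_LIST_SQL = "SELECT relname FROM pg_class WHERE relkind = 'r'"


def run_parallel_vacuum(tables, n_workers, vacuum_one):
    """Give each table to the next idle worker; return {table: worker_id}.

    vacuum_one is a caller-supplied callable (hypothetical) that would
    issue VACUUM for a single table on that worker's connection.
    """
    work = queue.Queue()
    for table in tables:
        work.put(table)

    assigned = {}
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                # An idle worker picks up the next pending table.
                table = work.get_nowait()
            except queue.Empty:
                return  # nothing left to vacuum, worker exits
            vacuum_one(table)
            with lock:
                assigned[table] = worker_id

    # All workers are started up front, as in step 2.
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return assigned
```

With n_workers = 1 the tables are processed one after another, which matches the remark in the results that one worker takes the same path as the base code.
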
>
>
>
> Performance Data by parallel vacuumdb:
>
> Machine Configuration:
>
>     Cores: 8
>     RAM: 24GB
>
> Test Scenario:
>
>     16 tables, each with 4M records. [Many records are deleted and
>     inserted using some pattern (files are attached to the mail).]
>
>
>
> Test Result
>
>
>
> {Base Code}
>
>    Time(s)   %CPU Usage   Avg Read(kB/s)   Avg Write(kB/s)
>    521       3%           12000            20000
>
> {With Parallel Vacuum Patch}
>
>    Workers   Time(s)   %CPU Usage   Avg Read(kB/s)   Avg Write(kB/s)
>    1         518       3%           12000            20000
>              --> this will take the same path as base code
>    2         390       5%           14000            30000
>    8         235       7%           18000            40000
>    16        197       8%           20000            50000
>
>
>
> Conclusion:
>
>     Running vacuumdb in parallel increases CPU and I/O throughput and can
>     cut run time by more than 50% (521 s down to 235 s with 8 workers and
>     197 s with 16 workers in this test).
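
For reference, the arithmetic behind the ">50%" figure can be checked with a few lines of Python. The timings are copied from the quoted test result; the function name time_reduction_pct is just an illustrative choice. With these numbers, 8 workers come to about 55% less run time and 16 workers to about 62%, while 2 workers give about 25%.

```python
# Timings in seconds, taken from the quoted test result.
BASE_SECONDS = 521
PATCHED_SECONDS = {1: 518, 2: 390, 8: 235, 16: 197}


def time_reduction_pct(base, patched):
    """Percentage of run time saved relative to the base code."""
    return 100.0 * (base - patched) / base


for workers, secs in PATCHED_SECONDS.items():
    saved = time_reduction_pct(BASE_SECONDS, secs)
    speedup = BASE_SECONDS / secs
    print(f"{workers:>2} workers: {secs} s, "
          f"{saved:.1f}% less time, {speedup:.2f}x speedup")
```
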
>
>
>
> Work to be Done:
>
> 1. Documentation of the new option.
>
> 2. Parallel support for vacuuming all databases.
>
>
>
> Should the common code for parallel operation in pg_dump and vacuumdb be
> moved to one place and reused?
>
>
>
> A prototype patch is attached to this mail; please provide your feedback
> and suggestions.
>
>
>
>                 Thanks & Regards,
>
>                 Dilip Kumar
>



--
Michael


