Re: [HACKERS] Block level parallel vacuum - Mailing list pgsql-hackers

From Haribabu Kommi
Subject Re: [HACKERS] Block level parallel vacuum
Date
Msg-id CAJrrPGe_11PeypnMpWrsZh4W_+yMdwWWJKstBBW0jL=sDm2KfA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [HACKERS] Block level parallel vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers

On Fri, Jan 18, 2019 at 11:42 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Fri, Jan 18, 2019 at 10:38 AM Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
>
>
> On Tue, Jan 15, 2019 at 6:00 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>
>>
>> Rebased.
>
>
> I started reviewing the patch, I didn't finish my review yet.
> Following are some of the comments.

Thank you for reviewing the patch.

>
> +    <term><literal>PARALLEL <replaceable class="parameter">N</replaceable></literal></term>
> +    <listitem>
> +     <para>
> +      Execute index vacuum and cleanup index in parallel with
>
> I doubt that user can understand the terms index vacuum and cleanup index.
> May be it needs some more detailed information.
>

Agreed. Table 27.22 "Vacuum phases" has a good description of vacuum
phases. So maybe adding the referencint to it would work.

OK.
 
>
> -typedef enum VacuumOption
> +typedef enum VacuumOptionFlag
>  {
>
> I don't find the new name quite good, how about VacuumFlags?
>

Agreed with removing "Option" from the name but I think VacuumFlag
would be better because this enum represents only one flag. Thoughts?

OK.
 

> postgres=# vacuum (parallel 2, verbose) tbl;
>
> With verbose, no parallel workers related information is available.
> I feel giving that information is required even when it is not parallel
> vacuum also.
>

Agreed. How about the folloiwng verbose output? I've added the number
of launched, planned and requested vacuum workers and purpose (vacuum
or cleanup).

postgres(1:91536)=# vacuum (verbose, parallel 30) test; -- table
'test' has 3 indexes
INFO:  vacuuming "public.test"
INFO:  launched 2 parallel vacuum workers for index vacuum (planned:
2, requested: 30)
INFO:  scanned index "test_idx1" to remove 2000 row versions
DETAIL:  CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.12 s
INFO:  scanned index "test_idx2" to remove 2000 row versions by
parallel vacuum worker
DETAIL:  CPU: user: 0.07 s, system: 0.05 s, elapsed: 0.12 s
INFO:  scanned index "test_idx3" to remove 2000 row versions by
parallel vacuum worker
DETAIL:  CPU: user: 0.09 s, system: 0.05 s, elapsed: 0.14 s
INFO:  "test": removed 2000 row versions in 10 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  launched 2 parallel vacuum workers for index cleanup (planned:
2, requested: 30)
INFO:  index "test_idx1" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "test_idx2" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "test_idx3" now contains 991151 row versions in 2745 pages
DETAIL:  2000 index row versions were removed.
24 index pages have been deleted, 18 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  "test": found 2000 removable, 367 nonremovable row versions in
41 out of 4425 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 500
There were 6849 unused item pointers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.12 s, system: 0.01 s, elapsed: 0.17 s.
VACUUM
 
The verbose output is good.

Since the previous patch conflicts with 285d8e12 I've attached the
latest version patch that incorporated the review comment I got.

Thanks for the latest patch. I have some more minor comments.

+      Execute index vacuum and cleanup index in parallel with

Better to use vacuum index and cleanup index? This is in same with
the description of vacuum phases. It is better to follow same notation
in the patch.


+ dead_tuples = lazy_space_alloc(lvstate, nblocks, parallel_workers);

With the change, the lazy_space_alloc takes care of initializing the
parallel vacuum, can we write something related to that in the comments.


+ initprog_val[2] = dead_tuples->max_dead_tuples;

dead_tuples variable may need rename for better reading?



+ if (lvshared->indstats[idx].updated)
+ result = &(lvshared->indstats[idx].stats);
+ else
+ copy_result = true;


I don't see a need for copy_result variable, how about directly using
the updated flag to decide whether to copy or not? Once the result is
copied update the flag.


+use Test::More tests => 34;

I don't find any new tetst are added in this patch.

I am thinking of performance penalty if we use the parallel option of
vacuum on a small sized table? 

Regards,
Haribabu Kommi
Fujitsu Australia

pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: Alternative to \copy in psql modelled after \g
Next
From: Surafel Temesgen
Date:
Subject: Re: pg_dump multi VALUES INSERT