Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers

From Claudio Freire
Subject Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Date
Msg-id CAGTBQpbS2jDH1X8UYdqNpLUveyTsH8rioGhiGx1DhpiJh6jy8A@mail.gmail.com
In response to Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers

Sorry for the delay, I had extended vacations that kept me away from
my test rigs, and afterward testing took, literally, a few weeks.

I built a more thorough test script that produced some interesting
results. The results are attached.

For now, to the review comments:

On Thu, Apr 27, 2017 at 4:25 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've read this patch again and here are some review comments.
>
> + * Lookup in that structure proceeds sequentially in the list of segments,
> + * and with a binary search within each segment. Since segment's size grows
> + * exponentially, this retains O(log N) lookup complexity (2 log N to be
> + * precise).
>
> IIUC we now do binary search even over the list of segments.

Right
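
To give an idea of the shape of that lookup, here is a small standalone
sketch (plain ints stand in for ItemPointerData, and every name in it is
purely illustrative, not taken from the patch): each segment records its
largest value, a lower-bound binary search over those boundaries picks
the segment, and a second binary search runs inside it.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    int        *items;         /* sorted values in this segment */
    int         num_items;
    int         last_item;     /* largest value, i.e. items[num_items - 1] */
} Segment;

static int
cmp_int(const void *a, const void *b)
{
    int         ia = *(const int *) a;
    int         ib = *(const int *) b;

    return (ia > ib) - (ia < ib);
}

/* Index of the first segment whose last_item >= key (lower bound). */
static int
find_segment(const Segment *segs, int nsegs, int key)
{
    int         lo = 0;
    int         hi = nsegs;

    while (lo < hi)
    {
        int         mid = lo + (hi - lo) / 2;

        if (segs[mid].last_item < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;                 /* == nsegs if key exceeds every segment */
}

static bool
lookup(const Segment *segs, int nsegs, int key)
{
    int         iseg = find_segment(segs, nsegs, key);

    if (iseg >= nsegs)
        return false;
    return bsearch(&key, segs[iseg].items, segs[iseg].num_items,
                   sizeof(int), cmp_int) != NULL;
}

int
main(void)
{
    int         a[] = {1, 3, 5};
    int         b[] = {8, 13, 21, 34};     /* later segments grow larger */
    Segment     segs[] = {{a, 3, 5}, {b, 4, 34}};

    printf("%d %d\n", lookup(segs, 2, 13), lookup(segs, 2, 6));    /* 1 0 */
    return 0;
}

Both levels are logarithmic, which is where the "2 log N" in the comment
comes from; the patch's vac_itemptr_binsrch plays a role similar to
find_segment here, searching over the segments' last_dead_tuple fields.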

>
> -----
>
> We often fetch a particular dead tuple segment. How about providing a
> macro for easier understanding?
> For example,
>
>  #define GetDeadTuplsSegment(lvrelstats, seg) \
>   (&(lvrelstats)->dead_tuples.dt_segments[(seg)])
>
> -----
>
> +       if (vacrelstats->dead_tuples.num_segs == 0)
> +               return;
> +
>
> +       /* If uninitialized, we have no tuples to delete from the indexes */
> +       if (vacrelstats->dead_tuples.num_segs == 0)
> +       {
> +               return;
> +       }
>
> +       if (vacrelstats->dead_tuples.num_segs == 0)
> +               return false;
> +

Ok

> As I listed, there is code to check if dead tuple is initialized
> already in some places where doing actual vacuum.
> I guess that it should not happen that we attempt to vacuum a
> table/index page while not having any dead tuple. Is it better to have
> Assert or ereport instead?

I'm not sure. Having a non-empty dead tuples array is not necessary to
be able to honor the contract in the docstring. Most of those functions
clean up the heap/index of dead tuples given the array of dead tuples,
which is a no-op for an empty array.

True, the code that calls those functions doesn't bother calling them
when the array is known to be empty, but there's no compelling reason to
enforce that at the interface. Doing so could cause subtle bugs rather
than catch them (in the form of unexpected assertion failures, if some
caller forgot to check the dead tuples array for emptiness).
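
To make that concrete, here is a trivial standalone sketch of the idea
(the types and names are made up for illustration, it is not the patch
code): the early return makes the contract hold for an empty array
without any help from the caller.

#include <stdbool.h>
#include <stdio.h>

/* Made-up stand-in for the patch's dead tuples structure. */
typedef struct
{
    int         num_segs;      /* 0 until the first dead tuple is recorded */
    /* ... the segment array would live here ... */
} FakeDeadTuplesArray;

/*
 * "Is this TID among the recorded dead tuples?"  With no segments
 * allocated there are no dead tuples at all, so answering "no" is
 * correct behavior, not a can't-happen condition worth an Assert.
 */
static bool
tid_is_reaped(const FakeDeadTuplesArray *dead_tuples)
{
    if (dead_tuples->num_segs == 0)
        return false;          /* empty array: trivially not reaped */

    /* otherwise: the two-level binary search over the segments */
    return true;
}

int
main(void)
{
    FakeDeadTuplesArray empty = {0};

    printf("%d\n", tid_is_reaped(&empty));      /* prints 0 */
    return 0;
}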

If you're worried about the possibility that some bug fails to record
dead tuples in the array, and thus makes VACUUM silently ineffective, I
added a test for that case instead. This should be a better approach,
since it's more likely to catch unexpected failure modes than an assert.

> @@ -1915,2 +2002,2 @@ count_nondeletable_pages(Relation onerel,
> LVRelStats *vacrelstats)
> -                       BlockNumber     prefetchStart;
> -                       BlockNumber     pblkno;
> +                       BlockNumber prefetchStart;
> +                       BlockNumber pblkno;
>
> I think that it's a unnecessary change.

Yep. But funnily enough, that's how it is in master now.

>
> -----
>
> +       /* Search for the segment likely to contain the item pointer */
> +       iseg = vac_itemptr_binsrch(
> +               (void *) itemptr,
> +               (void *)
> &(vacrelstats->dead_tuples.dt_segments->last_dead_tuple),
> +               vacrelstats->dead_tuples.last_seg + 1,
> +               sizeof(DeadTuplesSegment));
> +
>
> I think that we can change the above to;
>
> +       /* Search for the segment likely to contain the item pointer */
> +       iseg = vac_itemptr_binsrch(
> +               (void *) itemptr,
> +               (void *) &(seg->last_dead_tuple),
> +               vacrelstats->dead_tuples.last_seg + 1,
> +               sizeof(DeadTuplesSegment));
>
> We set "seg = vacrelstats->dead_tuples.dt_segments" just before this.

Right

Attached is a current version of both patches, rebased since we're at it.

I'm also attaching the output from the latest benchmark runs, in raw
(tar.bz2) and digested (bench_report) forms, along with the script used
to run them (vacuumbench.sh) and the one used to produce the reports
(vacuum_bench_report.sh).

Those results are from before the changes requested in the review. While
I don't expect any difference, I'll re-run some of them just in case,
and try to investigate the slowdown. But that will take forever: each
run takes about a week on my test rig, and I don't have enough hardware
to parallelize the tests. I will also run a test on a snapshot of a
particularly troublesome production database we have; that should be
interesting.

The benchmarks show a consistent improvement at scale 400, which may be
related to the search implementation being better somehow, and a
slowdown at scale 4000 in some variants. I believe the slowdown is due
to those variants having highly clustered indexes. While the "shuf"
(shuffled) variants were intended to be the opposite of that, I suspect
I somehow failed to get the desired outcome, so I'll be double-checking
that.

In any case, the slowdown only materializes when vacuuming with a large
maintenance_work_mem setting, which is something that shouldn't happen
unintentionally.


