Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem - Mailing list pgsql-hackers
From: Claudio Freire
Subject: Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Msg-id: CAGTBQpbS2jDH1X8UYdqNpLUveyTsH8rioGhiGx1DhpiJh6jy8A@mail.gmail.com
In response to: Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-hackers
Sorry for the delay. I had extended vacations that kept me away from my test rigs, and afterward testing took, literally, a few weeks. I built a more thorough test script that produced some interesting results. Will attach the results. For now, to the review comments:

On Thu, Apr 27, 2017 at 4:25 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've read this patch again and here are some review comments.
>
> + * Lookup in that structure proceeds sequentially in the list of segments,
> + * and with a binary search within each segment. Since segment's size grows
> + * exponentially, this retains O(log N) lookup complexity (2 log N to be
> + * precise).
>
> IIUC we now do binary search even over the list of segments.

Right.

> -----
>
> We often fetch a particular dead tuple segment. How about providing a
> macro for easier understanding?
> For example,
>
> #define GetDeadTuplesSegment(lvrelstats, seg) \
>     (&(lvrelstats)->dead_tuples.dt_segments[(seg)])
>
> -----
>
> +    if (vacrelstats->dead_tuples.num_segs == 0)
> +        return;
> +
>
> +    /* If uninitialized, we have no tuples to delete from the indexes */
> +    if (vacrelstats->dead_tuples.num_segs == 0)
> +    {
> +        return;
> +    }
>
> +    if (vacrelstats->dead_tuples.num_segs == 0)
> +        return false;
> +

Ok.

> As I listed, there is code to check if dead tuples are initialized
> already in some places where doing actual vacuum.
> I guess that it should not happen that we attempt to vacuum a
> table/index page while not having any dead tuple. Is it better to have
> Assert or ereport instead?

I'm not sure. Having a non-empty dead tuples array is not necessary to
honor the contract in the docstring. Most of those functions clean up
the heap/index of dead tuples given the array of dead tuples, which is
a no-op for an empty array. The code that calls those functions doesn't
bother calling them when the array is known to be empty, true, but
there's no compelling reason to enforce that at the interface.
Doing so could cause subtle bugs rather than catch them (in the form of
unexpected assertion failures, if some caller forgot to check the dead
tuples array for emptiness).

If you're worried about the possibility that some bug fails to record
dead tuples in the array, and thus makes VACUUM silently ineffective, I
instead added a test for that case. That should be a better approach,
since it's more likely to catch unexpected failure modes than an assert.

> @@ -1915,2 +2002,2 @@ count_nondeletable_pages(Relation onerel,
> LVRelStats *vacrelstats)
> -    BlockNumber prefetchStart;
> -    BlockNumber pblkno;
> +    BlockNumber prefetchStart;
> +    BlockNumber pblkno;
>
> I think that it's an unnecessary change.

Yep. But funnily enough, that's how it is now in master.

> -----
>
> +    /* Search for the segment likely to contain the item pointer */
> +    iseg = vac_itemptr_binsrch(
> +        (void *) itemptr,
> +        (void *) &(vacrelstats->dead_tuples.dt_segments->last_dead_tuple),
> +        vacrelstats->dead_tuples.last_seg + 1,
> +        sizeof(DeadTuplesSegment));
> +
>
> I think that we can change the above to;
>
> +    /* Search for the segment likely to contain the item pointer */
> +    iseg = vac_itemptr_binsrch(
> +        (void *) itemptr,
> +        (void *) &(seg->last_dead_tuple),
> +        vacrelstats->dead_tuples.last_seg + 1,
> +        sizeof(DeadTuplesSegment));
>
> We set "seg = vacrelstats->dead_tuples.dt_segments" just before this.

Right.

Attached is a current version of both patches, rebased since we're at
it. I'm also attaching the output from the latest benchmark runs, in
raw (tar.bz2) and digested (bench_report) forms, the script used to run
them (vacuumbench.sh), and the one used to produce the reports
(vacuum_bench_report.sh). Those are from before the changes in this
review.

While I don't expect any change, I'll re-run some of them just in case,
and try to investigate the slowdown. But that will take forever: each
run takes about a week on my test rig, and I don't have enough hardware
to parallelize the tests.
I will also run a test on a snapshot of a particularly troublesome
production database we have; that should be interesting.

The benchmarks show a consistent improvement at scale 400, which may be
related to the search implementation being better somehow, and a
slowdown at scale 4000 in some variants. I believe this is due to those
variants having highly clustered indexes. While the "shuf" (shuffled)
variants were intended to be the opposite of that, I suspect I somehow
failed to get the desired outcome, so I'll be double-checking that.

In any case, the slowdown only materializes when vacuuming with a large
maintenance_work_mem (mwm) setting, which is something that shouldn't
happen unintentionally.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers