Re: automatic analyze: readahead - add "IO read time" log message - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: automatic analyze: readahead - add "IO read time" log message
Date
Msg-id 20210205212234.GT27507@tamriel.snowman.net
Whole thread Raw
In response to Re: automatic analyze: readahead - add "IO read time" log message  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: automatic analyze: readahead - add "IO read time" log message
List pgsql-hackers
Greetings,

* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
> On 13/01/2021 23:17, Stephen Frost wrote:
> >Would be great to get a review / comments from others as to if there's
> >any concerns.  I'll admit that it seems reasonably straight-forward to
> >me, but hey, I wrote most of it, so that's not really a fair
> >assessment... ;)
>
> Look good overall. A few minor comments:

Thanks a lot for the review!

> The patch consists of two part: add stats to the log for auto-analyze, and
> implement prefetching. They seem like independent features, consider
> splitting into two patches.

Yeah, that's a good point.  I had anticipated that there would be
overlap but in the end there really wasn't.  Done in the attached.

> It's a bit weird that you get more stats in the log for
> autovacuum/autoanalyze than you get with VACUUM/ANALYZE VERBOSE. Not really
> this patch's fault though.

Agreed.

> This conflicts with the patch at https://commitfest.postgresql.org/31/2907/,
> to refactor the table AM analyze API. That's OK, it's straightforward to
> resolve regardless of which patch is committed first.

Agreed.

> >    /* Outer loop over blocks to sample */
> >    while (BlockSampler_HasMore(&bs))
> >    {
> >#ifdef USE_PREFETCH
> >        BlockNumber prefetch_targblock = InvalidBlockNumber;
> >#endif
> >        BlockNumber targblock = BlockSampler_Next(&bs);
> >
> >#ifdef USE_PREFETCH
> >
> >        /*
> >         * Make sure that every time the main BlockSampler is moved forward
> >         * that our prefetch BlockSampler also gets moved forward, so that we
> >         * always stay out ahead.
> >         */
> >        if (BlockSampler_HasMore(&prefetch_bs))
> >            prefetch_targblock = BlockSampler_Next(&prefetch_bs);
> >#endif
> >
> >        vacuum_delay_point();
> >
> >        if (!table_scan_analyze_next_block(scan, targblock, vac_strategy))
> >            continue;
> >
> >#ifdef USE_PREFETCH
> >
> >        /*
> >         * When pre-fetching, after we get a block, tell the kernel about the
> >         * next one we will want, if there's any left.
> >         */
> >        if (effective_io_concurrency && prefetch_targblock != InvalidBlockNumber)
> >            PrefetchBuffer(scan->rs_rd, MAIN_FORKNUM, prefetch_targblock);
> >#endif
> >
> >        while (table_scan_analyze_next_tuple(scan, OldestXmin, &liverows, &deadrows, slot))
> >        {
> >            ...
> >        }
> >
> >        pgstat_progress_update_param(PROGRESS_ANALYZE_BLOCKS_DONE,
> >                                     ++blksdone);
> >    }
>
> If effective_io_concurrency == 0, this calls BlockSampler_Next(&prefetch_bs)
> anyway, which is a waste of cycles.

Good point, fixed.

> If table_scan_analyze_next_block() returns false, we skip the
> PrefetchBuffer() call. That seem wrong.

Agreed, fixed.

> Is there any potential harm from calling PrefetchBuffer() on a page that
> table_scan_analyze_next_block() later deems as unsuitable for smapling and
> returns false? That's theoretical at the moment, because
> heapam_scan_analyze_next_block() always returns true. (The tableam ANALYZE
> API refactor patch will make this moot, as it moves this logic into the
> tableam's implementation, so the implementation can do whatever make sense
> for the particular AM.)

I can't see any potential harm and it seems pretty likely that if an
heapam_scan_analyze_next_block()-equivilant were to decide that a block
isn't appropriate to analyze it'd have to do so after reading that block
anyway, making the prefetch still useful.

Perhaps there'll be a case in the future where a given AM would know
based on just the block number that it isn't useful to analyze, in which
case it'd make sense to adjust the code to skip that block for both
Prefetching and actually reading, but I don't think that would be too
hard to do.  Doesn't seem sensible to invent that in advance of actually
having that case though- it's certainly not the case for heap AM today,
at least, as you say.

Unless there's anything else on this, I'll commit these sometime next
week.

Thanks again for the review!

Stephen

Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: new heapcheck contrib module
Next
From: Tom Lane
Date:
Subject: Re: making update/delete of inheritance trees scale better