Re: Streaming read-ready sequential scan code - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Streaming read-ready sequential scan code |
Date | |
Msg-id | CAAKRu_arLQjAknvDOa+DQkOPcZRtS257dETFLgrhEwAVdv+WBQ@mail.gmail.com Whole thread Raw |
In response to | Re: Streaming read-ready sequential scan code (Thomas Munro <thomas.munro@gmail.com>) |
Responses |
Re: Streaming read-ready sequential scan code
|
List | pgsql-hackers |
On Thu, Apr 4, 2024 at 7:45 AM Thomas Munro <thomas.munro@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 8:02 PM David Rowley <dgrowleyml@gmail.com> wrote: > > 3a4a3537a > > latency average = 34.497 ms > > latency average = 34.538 ms > > > > 3a4a3537a + read_stream_for_seqscans.patch > > latency average = 40.923 ms > > latency average = 41.415 ms > > > > i.e. no meaningful change from the refactor, but a regression from a > > cached workload that changes the page often without doing much work in > > between with the read stread patch. > > I ran Heikki's test except I ran the "insert" 4 times to get a table > of 4376MB according to \d+. On my random cloud ARM server (SB=8GB, > huge pages, parallelism disabled), I see a speedup 1290ms -> 1046ms > when the data is in LInux cache and PG is not prewarmed, roughly as he > reported. Good. > > If I pg_prewarm first, I see that slowdown 260ms -> 294ms. Trying > things out to see what works, I got that down to 243ms (ie beat > master) by inserting a memory prefetch: > > --- a/src/backend/storage/aio/read_stream.c > +++ b/src/backend/storage/aio/read_stream.c > @@ -757,6 +757,8 @@ read_stream_next_buffer(ReadStream *stream, void > **per_buffer_data) > /* Prepare for the next call. */ > read_stream_look_ahead(stream, false); > > + __builtin_prefetch(BufferGetPage(stream->buffers[stream->oldest_buffer_index])); > > Maybe that's a solution to a different problem that just happens to > more than make up the difference in this case, and it may be > questionable whether that cache line will survive long enough to help > you, but this one-tuple-per-page test likes it... Hmm, to get a more > realistic table than the one-tuple-per-page on, I tried doubling a > tenk1 table until it reached 2759MB and then I got a post-prewarm > regression 702ms -> 721ms, and again I can beat master by memory > prefetching: 689ms. > > Annoyingly, with the same table I see no difference between the actual > pg_prewarm('giga') time: around 155ms for both. pg_prewarm is able to > use the 'fast path' I made pretty much just to be able to minimise > regression in that (probably stupid) all-cached tes that doesn't even > look at the page contents. Unfortunately seq scan can't use it > because it has per-buffer data, which is one of the things it can't do > (because of a space management issue). Maybe I should try to find a > way to fix that. So, sequential scan does not have per-buffer data. I did some logging and the reason most fully-in-SB sequential scans don't use the fast path is because read_stream->pending_read_nblocks is always 0. When when read_stream->distance stays 1 (expected for all-in-SB as it is initialized to 1 and we don't want distance to ramp up), read_stream_look_ahead() never increments read_stream->pending_read_nblocks because it sets it to 1 the first time it is called and then the conditions of the while loop are not met again while (stream->ios_in_progress < stream->max_ios && stream->pinned_buffers + stream->pending_read_nblocks < stream->distance) distance is 1, pending_read_nblocks is 1, thus we only loop once and don't increment pending_read_nblocks. prewarm is only able to use the fast path because it passes READ_STREAM_FULL and thus initializes read_stream->distance to a higher initial value. I added some logging to see if any of the sequential scans in the regression suite used the fast path. The one example I see of the fast path being used is a temp table IO stats test in src/test/regress/sql/stats.sql. I didn't check exactly what conditions led it to do this. But we probably want seq scans which are all in SB to use the fast path. - Melanie
pgsql-hackers by date: