Re: Streaming read-ready sequential scan code - Mailing list pgsql-hackers
From | Melanie Plageman |
---|---|
Subject | Re: Streaming read-ready sequential scan code |
Date | |
Msg-id | 20240401195848.qdlgt6u4dlyikm5k@liskov Whole thread Raw |
In response to | Re: Streaming read-ready sequential scan code (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Streaming read-ready sequential scan code
|
List | pgsql-hackers |
On Wed, Mar 27, 2024 at 08:47:03PM -0400, Melanie Plageman wrote: > On Fri, Mar 8, 2024 at 4:56 PM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > > > On Sat, Mar 02, 2024 at 06:07:48PM -0500, Melanie Plageman wrote: > > > On Wed, Feb 28, 2024 at 12:30 PM Melanie Plageman > > > <melanieplageman@gmail.com> wrote: > > > > > > > > On Mon, Feb 26, 2024 at 03:56:57PM -0500, Melanie Plageman wrote: > > > > > On Mon, Feb 19, 2024 at 6:05 PM Melanie Plageman > > > > > <melanieplageman@gmail.com> wrote: > > > > > > > > > > > > On Mon, Jan 29, 2024 at 4:17 PM Melanie Plageman > > > > > > <melanieplageman@gmail.com> wrote: > > > > > > > > > > > > > > There is an outstanding question about where to allocate the > > > > > > > PgStreamingRead object for sequential scans > > > > > > > > > > > > I've written three alternative implementations of the actual streaming > > > > > > read user for sequential scan which handle the question of where to > > > > > > allocate the streaming read object and how to handle changing scan > > > > > > direction in different ways. > > > > > > > > > > > > Option A) https://github.com/melanieplageman/postgres/tree/seqscan_pgsr_initscan_allocation > > > > > > - Allocates the streaming read object in initscan(). Since we do not > > > > > > know the scan direction at this time, if the scan ends up not being a > > > > > > forwards scan, the streaming read object must later be freed -- so > > > > > > this will sometimes allocate a streaming read object it never uses. > > > > > > - Only supports ForwardScanDirection and once the scan direction > > > > > > changes, streaming is never supported again -- even if we return to > > > > > > ForwardScanDirection > > > > > > - Must maintain a "fallback" codepath that does not use the streaming read API > > > > > > > > > > Attached is a version of this patch which implements a "reset" > > > > > function for the streaming read API which should be cheaper than the > > > > > full pg_streaming_read_free() on rescan. This can easily be ported to > > > > > work on any of my proposed implementations (A/B/C). I implemented it > > > > > on A as an example. > > > > > > > > Attached is the latest version of this patchset -- rebased in light of > > > > Thomas' updatees to the streaming read API [1]. I have chosen the > > > > approach I think we should go with. It is a hybrid of my previously > > > > proposed approaches. > > > > > > While investigating some performance concerns, Andres pointed out that > > > the members I add to HeapScanDescData in this patch push rs_cindex and > > > rs_ntuples to another cacheline and introduce a 4-byte hole. Attached > > > v4's HeapScanDescData is as well-packed as on master and its members > > > are reordered so that rs_cindex and rs_ntuples are back on the second > > > cacheline of the struct's data. > > > > I did some additional profiling and realized that dropping the > > unlikely() from the places we check rs_inited frequently was negatively > > impacting performance. v5 adds those back and also makes a few other > > very minor cleanups. > > > > Note that this patch set has a not yet released version of Thomas > > Munro's Streaming Read API with a new ramp-up logic which seems to fix a > > performance issue I saw with my test case when all of the sequential > > scan's blocks are in shared buffers. Once he sends the official new > > version, I will rebase this and point to his explanation in that thread. > > Attached v6 has the version of the streaming read API mentioned here > [1]. This resolved the fully-in-shared-buffers regressions > investigated in that thread by Andres, Bilal, and Thomas. Attached v7 has version 14 of the streaming read API as well as a few small tweaks to comments and code. I noticed that 0001 in the set posed a small regression from master for a sequential scan of a relation already in shared buffers. While investigating this, I saw that heapfetchbuf() was still not being inlined (compiled at -O2) and when I promoted heapfetchbuf() from static inline to static pg_attribute_always_inline, most of the very small regression I saw went away. I don't know if I squashed the issue entirely, though. - Melanie
Attachment
pgsql-hackers by date: