Re: Use streaming read API in ANALYZE - Mailing list pgsql-hackers

From Mats Kindahl
Subject Re: Use streaming read API in ANALYZE
Date
Msg-id CA+14426ro2tZ-3mjROxWMMwnb8o=psTfoQSuBWtGz8FEtSouHA@mail.gmail.com
In response to Re: Use streaming read API in ANALYZE  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Tue, Sep 10, 2024 at 6:04 AM Thomas Munro <thomas.munro@gmail.com> wrote:
On Tue, Sep 10, 2024 at 10:27 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Mats, what do you think about
> this?  (I haven't tried to preserve the prefetching behaviour, which
> probably didn't actually work too well for you in v16 anyway, at a guess;
> I'm just looking for the absolute simplest thing we can do to resolve
> this API mismatch.)  Timescale could then continue to use its v16
> coding to handle the two-relations-in-a-trenchcoat problem, and we
> could continue discussing how to make v18 better.

> . o O { Spitballing here: if we add that tiny function I showed to get
> you unstuck for v17, then later in v18, if we add a multi-relation
> ReadStream constructor/callback (I have a patch somewhere, I want to
> propose that as it is needed for streaming recovery), you could
> construct a new ReadStream of your own that is daisy-chained from that
> one.  You could keep using your N + M block numbering scheme if you
> want to, and the callback of the new stream could decode the block
> numbers and redirect to the appropriate relation + real block number.
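
For illustration only, the decoding step mentioned above could look roughly like the sketch below. It is standalone C, not PostgreSQL or Timescale code: `DecodedBlock` and `decode_combined_block` are hypothetical names, `BlockNumber` is redefined locally so the snippet compiles on its own, and it assumes the first relation owns combined block numbers 0..N-1 while the second owns N..N+M-1.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for PostgreSQL's BlockNumber, so the sketch is standalone. */
typedef uint32_t BlockNumber;

/* Hypothetical result type: which relation, and which block inside it. */
typedef struct DecodedBlock
{
    int         rel_index;  /* 0 = first relation, 1 = second relation */
    BlockNumber blkno;      /* real block number within that relation */
} DecodedBlock;

/*
 * Map a combined block number in [0, n + m) back to a relation and a real
 * block number.  n and m are the block counts of the first and second
 * relation (an assumption about how the N + M scheme is laid out).
 */
static DecodedBlock
decode_combined_block(BlockNumber combined, BlockNumber n, BlockNumber m)
{
    DecodedBlock result;

    /* Widen before adding so the range check itself cannot overflow. */
    assert((uint64_t) combined < (uint64_t) n + (uint64_t) m);

    if (combined < n)
    {
        result.rel_index = 0;
        result.blkno = combined;
    }
    else
    {
        result.rel_index = 1;
        result.blkno = combined - n;
    }
    return result;
}
```

A callback on the outer stream would call something like this for each block number it receives and then issue the read against the relation it selects.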

I think it is good to keep changes to the RC as small as possible, so I agree with this approach. Looking at the patch, I think it will work, but I'll experiment with it to make sure.

Just asking: is there any particular reason why you do not want to *add* new functions for opaque objects within a major release? After all, that is why they were made opaque in the first place, and adding new functions would not break any existing code, not even at the ABI level.
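
To spell out the reasoning, here is a minimal standalone sketch of the opaque-object pattern. All names (`DemoStream` and the `demo_stream_*` functions) are hypothetical and have nothing to do with the real ReadStream code. Callers only ever hold a pointer to an incomplete type, so a function added later cannot change anything that already-compiled callers depend on.

```c
#include <assert.h>
#include <stdlib.h>

/* --- What a public header would expose: an incomplete (opaque) type. --- */
typedef struct DemoStream DemoStream;

DemoStream *demo_stream_begin(unsigned nblocks);
int         demo_stream_next(DemoStream *s, unsigned *blkno);
void        demo_stream_end(DemoStream *s);

/* Added in a later release: old callers never call it and are unaffected. */
unsigned    demo_stream_remaining(const DemoStream *s);

/* --- Private implementation: the layout is invisible to callers. --- */
struct DemoStream
{
    unsigned    next;       /* next block number to hand out */
    unsigned    nblocks;    /* total number of blocks in the stream */
};

DemoStream *
demo_stream_begin(unsigned nblocks)
{
    DemoStream *s = malloc(sizeof(DemoStream));

    assert(s != NULL);
    s->next = 0;
    s->nblocks = nblocks;
    return s;
}

/* Returns 1 and stores the block number, or 0 when the stream is exhausted. */
int
demo_stream_next(DemoStream *s, unsigned *blkno)
{
    if (s->next >= s->nblocks)
        return 0;
    *blkno = s->next++;
    return 1;
}

void
demo_stream_end(DemoStream *s)
{
    free(s);
}

/* The new accessor reads private fields without exposing them. */
unsigned
demo_stream_remaining(const DemoStream *s)
{
    return s->nblocks - s->next;
}
```

Since the struct size and field offsets never appear in caller code, adding `demo_stream_remaining` leaves every existing symbol and call site unchanged.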
 
> That way you'd get I/O concurrency for both relations (for now just
> read-ahead advice, but see Andres's AIO v2 thread).  That'd
> essentially be a more supported version of the 'access the struct
> internals' idea (or at least my understanding of what you had in
> mind), through daisy-chained streams.  A little weird maybe, and maybe
> the redesign work will result in something completely
> different/better... just a thought... }

I'll take a look at the thread. I really think the ReadStream abstraction is a good step in the right direction.
--
Best wishes,
Mats Kindahl, Timescale
