Re: Table AM Interface Enhancements - Mailing list pgsql-hackers

From Pavel Borisov
Subject Re: Table AM Interface Enhancements
Msg-id CALT9ZEEYUKgkkrb3034iJxCk8nu_xxnrm+154upKkzutWXSaBQ@mail.gmail.com
In response to Re: Table AM Interface Enhancements  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Table AM Interface Enhancements
List pgsql-hackers


On Mon, 15 Apr 2024 at 19:36, Robert Haas <robertmhaas@gmail.com> wrote:
On Sat, Apr 13, 2024 at 5:28 AM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> Yes, I think so.  Table AM API deals with TIDs and block numbers, but
> doesn't force on what they actually mean.  For example, in ZedStore
> [1], data is stored on per-column B-trees, where TID used in table AM
> is just a logical key of that B-trees.  Similarly, blockNumber is a
> range for B-trees.
>
> c6fc50cb4028 and 041b96802ef are putting to acquire_sample_rows() an
> assumption that we are sampling physical blocks as they are stored in
> data files.  That couldn't anymore be some "logical" block numbers
> with meaning only table AM implementation knows.  That was pointed out
> by Andres [2].  I'm not sure if ZedStore is alive, but there could be
> other table AM implementations like this, or other implementations in
> development, etc.  Anyway, I don't feel good about narrowing the API,
> which is there from pg12.

I spent some time looking at this. I think it's valid to complain
about the tighter coupling, but c6fc50cb4028 is there starting in v14,
so I don't think I understand why the situation after 041b96802ef is
materially worse than what we've had for the last few releases. I
think it is worse in the sense that, before, you could dodge the
problem without defining USE_PREFETCH, and now you can't, but I don't
think we can regard nonphysical block numbers as a supported scenario
on that basis.

But maybe I'm not correctly understanding the situation?

Hi, Robert!

As I understand it, the downside of 041b96802ef is that it lifts the read_stream* machinery from being heap-specific up to the level of acquire_sample_rows(), which is not supposed to be tied to heap. It also changes the *_analyze_next_block() function signature to take a ReadStream explicitly.
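
To make the coupling concrete, here is a minimal sketch in C. Everything in it is a toy stand-in invented for illustration (ToyScan, ToyReadStream, the callback typedefs, both sampler functions); none of it is the actual PostgreSQL definition. It only contrasts two callback shapes: one where the generic sampler passes an opaque block number that the AM is free to interpret, and one where the sampler owns a stream of blocks and hands it to the AM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; NOT the PostgreSQL types. */
typedef uint32_t BlockNumber;
typedef struct ToyScan { int rows_seen; } ToyScan;
typedef struct ToyReadStream { BlockNumber next; BlockNumber nblocks; } ToyReadStream;

/* Shape 1 (hypothetical): the sampler passes a block number; the AM alone
 * decides whether it is a physical block or a logical key range. */
typedef bool (*next_block_by_number)(ToyScan *scan, BlockNumber blockno);

/* Shape 2 (hypothetical): the sampler builds a block stream itself, so the
 * AM must accept the sampler's idea of what a block is. */
typedef bool (*next_block_by_stream)(ToyScan *scan, ToyReadStream *stream);

/* An AM with "logical" block numbers: each number names a key range. */
bool logical_am_next_block(ToyScan *scan, BlockNumber blockno)
{
    (void) blockno;          /* would be mapped to a key range by the AM */
    scan->rows_seen += 10;   /* pretend each range holds 10 rows */
    return true;
}

/* A heap-like AM that consumes blocks from the stream it is given. */
bool heap_like_next_block(ToyScan *scan, ToyReadStream *stream)
{
    if (stream->next >= stream->nblocks)
        return false;        /* stream exhausted */
    stream->next++;
    scan->rows_seen += 10;
    return true;
}

/* Generic sampler, shape 1: knows nothing about storage layout. */
int sample_by_number(next_block_by_number cb, BlockNumber nblocks)
{
    ToyScan scan = {0};
    assert(cb != NULL);
    for (BlockNumber b = 0; b < nblocks; b++)
        if (!cb(&scan, b))
            break;
    return scan.rows_seen;
}

/* Generic sampler, shape 2: creates the stream, so it is tied to one
 * notion of block access. */
int sample_by_stream(next_block_by_stream cb, BlockNumber nblocks)
{
    ToyScan scan = {0};
    ToyReadStream stream = {0, nblocks};
    assert(cb != NULL);
    while (cb(&scan, &stream))
        ;
    return scan.rows_seen;
}
```

In this sketch, sample_by_number() works with logical_am_next_block() unchanged, while sample_by_stream() can only drive a callback that understands the stream the sampler constructed. That is the kind of dependency I mean, under the simplifying assumptions above.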

Regards,
Pavel.
