On Mon, Apr 15, 2024 at 11:17 PM Andres Freund <andres@anarazel.de> wrote:
> On 2024-04-15 16:02:00 -0400, Robert Haas wrote:
> > Do you want that patch applied, not applied, or applied with some set of
> > modifications?
>
> I think we should apply Alexander's proposed revert and then separately
> discuss what we should do about 041b96802ef.
Taking a closer look at acquire_sample_rows(), I think it would be good if the table AM implementation took care of block-level (or whatever-level) sampling, so that acquire_sample_rows() just fetches tuples one by one from the table AM implementation without caring about blocks. Possibly table_beginscan_analyze() could take the target number of tuples as an argument, and those tuples would then just be fetched with table_scan_analyze_next_tuple(). What do you think?
Hi, Alexander!
I like the idea of splitting abstraction levels for:
1. acquirefuncs (FDW or physical table)
2. new specific row fetch functions (similar to the existing _scan_analyze_next_tuple()), which could be AM-specific.
Then scan_analyze_next_block(), or any other iteration algorithm, would be contained inside the table AM implementation of _scan_analyze_next_tuple().
So, initialization of the scan state would happen inside the table AM implementation of _beginscan_analyze(). The scan state (BlockSamplerData or other state that could be custom to the AM) could be passed from _beginscan_analyze() to _scan_analyze_next_tuple() through some opaque AM-specific data structure. If so, we may also need an AM-specific table_endscan_analyze() to clean it up.
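
To make the layering concrete, here is a rough standalone C sketch of the shape I have in mind. It is not PostgreSQL code: every Demo*/demo_* name is made up for illustration, tuples are reduced to plain integer ids, and the fixed block stride is only a stand-in for the random block selection that BlockSampler does today. The point is just that the caller sees an opaque scan state and a begin / next-tuple / end trio, while all block-level logic stays inside the AM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration only; not PostgreSQL APIs. */
typedef struct DemoAnalyzeScan DemoAnalyzeScan;	/* opaque to the caller */

typedef struct DemoTableAm
{
	DemoAnalyzeScan *(*beginscan_analyze) (int nblocks, int tuples_per_block,
										   int targrows);
	bool		(*scan_analyze_next_tuple) (DemoAnalyzeScan *scan, int *tuple);
	void		(*endscan_analyze) (DemoAnalyzeScan *scan);
} DemoTableAm;

/* AM-private scan state: the caller never looks inside this struct. */
struct DemoAnalyzeScan
{
	int			nblocks;
	int			tuples_per_block;
	int			stride;			/* distance between visited blocks */
	int			cur_block;
	int			cur_offset;
	int			remaining;		/* tuples still to be returned */
};

static DemoAnalyzeScan *
demo_beginscan_analyze(int nblocks, int tuples_per_block, int targrows)
{
	DemoAnalyzeScan *scan = malloc(sizeof(DemoAnalyzeScan));
	int			blocks_needed;

	assert(scan != NULL);
	assert(nblocks >= 0 && tuples_per_block > 0 && targrows >= 0);

	/* Simplification: spread the needed blocks evenly, no randomness. */
	blocks_needed = (targrows + tuples_per_block - 1) / tuples_per_block;
	scan->nblocks = nblocks;
	scan->tuples_per_block = tuples_per_block;
	scan->stride = (blocks_needed > 0 && nblocks / blocks_needed > 1)
		? nblocks / blocks_needed : 1;
	scan->cur_block = 0;
	scan->cur_offset = 0;
	scan->remaining = targrows;
	return scan;
}

/* Block iteration is hidden here; the caller only ever asks for a tuple. */
static bool
demo_scan_analyze_next_tuple(DemoAnalyzeScan *scan, int *tuple)
{
	if (scan->remaining <= 0 || scan->cur_block >= scan->nblocks)
		return false;

	*tuple = scan->cur_block * scan->tuples_per_block + scan->cur_offset;
	scan->remaining--;

	if (++scan->cur_offset == scan->tuples_per_block)
	{
		scan->cur_offset = 0;
		scan->cur_block += scan->stride;
	}
	return true;
}

static void
demo_endscan_analyze(DemoAnalyzeScan *scan)
{
	free(scan);
}

const DemoTableAm demo_am = {
	demo_beginscan_analyze,
	demo_scan_analyze_next_tuple,
	demo_endscan_analyze
};

/* Caller side: fetches tuples one by one and knows nothing about blocks. */
int
demo_acquire_sample_rows(const DemoTableAm *am, int nblocks,
						 int tuples_per_block, int *rows, int targrows)
{
	DemoAnalyzeScan *scan = am->beginscan_analyze(nblocks, tuples_per_block,
												  targrows);
	int			numrows = 0;

	while (numrows < targrows &&
		   am->scan_analyze_next_tuple(scan, &rows[numrows]))
		numrows++;

	am->endscan_analyze(scan);
	return numrows;
}
```

With this split, an AM that has no notion of blocks at all could supply its own three callbacks and its own state struct, and the caller loop would stay unchanged.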