RFC: Table access methods and scans - Mailing list pgsql-hackers

From	Mats Kindahl
Subject	RFC: Table access methods and scans
Date	March 31, 2021 23:10:22
Msg-id	CA+144262LYLjcLJHvGMKJNaUsPU83Z76F-O4WHS03muWZ9nFwg@mail.gmail.com
Responses	Re: RFC: Table access methods and scans
List	pgsql-hackers

Hi all,

I started looking into how table scans are handled for table access
methods and have discovered a few things that I find odd. I cannot
find any material regarding why this particular choice was made (if
anybody has pointers, I would be very grateful).

I am quite new to PostgreSQL so forgive me if my understanding of the
code below is wrong and please clarify what I have misunderstood.

I noted that `scan_begin` accepts a `ScanKey` and my *guess* was that
the intention for adding this to the interface was to support primary
indexes for table access methods (the comment is a little vague, but
it seems to point to that). However, looking at where `scan_begin` is
called from, I see that it is called from the following methods in
`tableam.h`:

- `table_beginscan` is always called using zero scan keys and NULL.
- `table_beginscan_strat` is mostly called with zero keys and NULL,
  with the exception of `systable_beginscan`, which is only used for
  system tables and does pass scan keys.
- `table_beginscan_bm` is only called with zero keys and NULL.
- `table_beginscan_sampling` is only called with zero keys and NULL.
- `table_beginscan_tid` calls `scan_begin` with zero keys and NULL.
- `table_beginscan_analyze` calls `scan_begin` with zero keys and NULL.
- `table_beginscan_catalog` is called with more than one key, but
  AFAICT this is only for catalog tables.
- `table_beginscan_parallel` calls `scan_begin` with zero keys and NULL.

I draw the conclusion that the scan keys only matter to a table
access method in the odd case where it is used for system tables or
catalog tables, so for all practical purposes the scan key cannot be
used to implement a primary index for general tables.

As an example of how passing scan information down is useful, I
noticed the work by Heikki and Ashwin [1], where they return a
`TableScanDesc` that contains information about what columns to scan,
which looks very useful. Since the function `table_beginscan` in
`src/include/access/tableam.h` accepts a `ScanKey` as input, this is
(AFAICT) what Heikki and Ashwin were exploiting to create a
specialized scan for a columnar store.

Another example of where this can be useful is to optimize access
during a sequential scan when you can handle some specific scans very
efficiently and can "skip ahead" many tuples if you know what is being
looked for instead of filtering "late". Two examples of where this
could be useful are:

- An access method that reads data from a remote system and doesn't want
to transfer all tuples unless necessary.
- Some sort of log-structured storage with Bloom filters that allows
you to quickly skip suites that do not have a key.
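
To make the second case concrete, here is a minimal standalone sketch
of what "skipping ahead" could look like. Everything in it is
hypothetical and not PostgreSQL code: the `Segment` struct, the two
hash functions, and `scan_for_key` are stand-ins I made up for
illustration. The only point is that a scan that knows the key up
front can consult a per-segment Bloom filter and never read segments
that definitely do not contain it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOOM_BITS 256

/* Hypothetical storage unit: a run of keys plus a Bloom filter over them. */
typedef struct Segment
{
    uint64_t       bloom[BLOOM_BITS / 64];
    const int32_t *keys;
    size_t         nkeys;
} Segment;

/* Two simple illustrative hash functions mapping a key to a bit position. */
static uint32_t hash1(int32_t key)
{
    uint32_t x = (uint32_t) key;
    return (x * 2654435761u) % BLOOM_BITS;
}

static uint32_t hash2(int32_t key)
{
    uint32_t x = (uint32_t) key;
    x ^= x >> 15;
    x *= 0x85ebca6bu;
    x ^= x >> 13;
    return x % BLOOM_BITS;
}

static void bloom_set(uint64_t *bits, uint32_t pos)
{
    bits[pos / 64] |= (uint64_t) 1 << (pos % 64);
}

static bool bloom_test(const uint64_t *bits, uint32_t pos)
{
    return ((bits[pos / 64] >> (pos % 64)) & 1) != 0;
}

/* Attach the keys to the segment and build its filter. */
static void segment_init(Segment *seg, const int32_t *keys, size_t nkeys)
{
    memset(seg->bloom, 0, sizeof(seg->bloom));
    seg->keys = keys;
    seg->nkeys = nkeys;
    for (size_t i = 0; i < nkeys; i++)
    {
        bloom_set(seg->bloom, hash1(keys[i]));
        bloom_set(seg->bloom, hash2(keys[i]));
    }
}

/* False means "definitely absent"; true means "possibly present". */
static bool bloom_may_contain(const Segment *seg, int32_t key)
{
    return bloom_test(seg->bloom, hash1(key)) &&
           bloom_test(seg->bloom, hash2(key));
}

/*
 * Count tuples equal to key. Segments whose filter rules the key out are
 * skipped entirely; *segments_read reports how many were actually scanned.
 */
static size_t scan_for_key(const Segment *segs, size_t nsegs, int32_t key,
                           size_t *segments_read)
{
    size_t matches = 0;

    assert(segments_read != NULL);
    *segments_read = 0;
    for (size_t s = 0; s < nsegs; s++)
    {
        if (!bloom_may_contain(&segs[s], key))
            continue;           /* skip ahead without reading the segment */
        (*segments_read)++;
        for (size_t i = 0; i < segs[s].nkeys; i++)
            if (segs[s].keys[i] == key)
                matches++;
    }
    return matches;
}
```

Without knowing the key, the access method has to hand every tuple in
every segment to the executor and let it filter afterwards, which is
the "late" filtering mentioned above.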

Interestingly enough, `ScanKey` is generated for `IndexScan` and I
think that the same approach could be used for sequential scans: pick
out the quals that can be used for filtering and offer them to the
table access method through the `scan_begin` callback.
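
A rough standalone sketch of the idea, to show the shape of the
interface I have in mind. None of this is PostgreSQL code: the
`DemoScanKey`, `DemoTuple`, and `DemoScan` types and the `demo_*`
functions are simplified stand-ins I invented (the real `ScanKeyData`
carries more than this). The sketch assumes the planner has already
picked out simple "column op constant" quals and passes them when the
scan begins, so the access method can filter tuples itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NATTS 2

/* Hypothetical comparison strategies a qual can be reduced to. */
typedef enum DemoStrategy { DEMO_LT, DEMO_EQ, DEMO_GT } DemoStrategy;

/* Hypothetical, simplified stand-in for a scan key. */
typedef struct DemoScanKey
{
    int          attno;      /* 1-based column number */
    DemoStrategy strategy;   /* comparison to apply */
    int32_t      argument;   /* constant to compare against */
} DemoScanKey;

typedef struct DemoTuple
{
    int32_t values[DEMO_NATTS];
} DemoTuple;

/* Scan state: remembers the keys that were offered at scan start. */
typedef struct DemoScan
{
    const DemoTuple   *tuples;
    size_t             ntuples;
    size_t             next;
    int                nkeys;
    const DemoScanKey *keys;
} DemoScan;

/* Begin a scan; nkeys == 0 with keys == NULL means "no pushed-down quals". */
static DemoScan demo_scan_begin(const DemoTuple *tuples, size_t ntuples,
                                int nkeys, const DemoScanKey *keys)
{
    DemoScan scan = { tuples, ntuples, 0, nkeys, keys };

    assert(nkeys == 0 || keys != NULL);
    return scan;
}

static bool demo_key_matches(const DemoScanKey *key, const DemoTuple *tuple)
{
    int32_t value;

    assert(key->attno >= 1 && key->attno <= DEMO_NATTS);
    value = tuple->values[key->attno - 1];
    switch (key->strategy)
    {
        case DEMO_LT: return value < key->argument;
        case DEMO_EQ: return value == key->argument;
        case DEMO_GT: return value > key->argument;
    }
    return false;
}

/* Return the next tuple satisfying all keys, or NULL at end of scan. */
static const DemoTuple *demo_scan_getnext(DemoScan *scan)
{
    while (scan->next < scan->ntuples)
    {
        const DemoTuple *tuple = &scan->tuples[scan->next++];
        bool             ok = true;

        for (int i = 0; i < scan->nkeys && ok; i++)
            ok = demo_key_matches(&scan->keys[i], tuple);
        if (ok)
            return tuple;
    }
    return NULL;
}
```

With zero keys this behaves like today's sequential scan and returns
every tuple; with keys, the filtering happens inside the access method,
which is then free to replace the linear loop with whatever structure
it has (a primary index, block summaries, a remote predicate).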

Thoughts around this?

Best wishes,
Mats Kindahl

[1] https://www.postgresql-archive.org/Zedstore-compressed-in-core-columnar-storage-tp6081536.html
