Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From: Melanie Plageman
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Msg-id: CAAKRu_Y8rtavnZCSTk2DrYjKMygoSYuVKdPzbV5WviLzzeUKUA@mail.gmail.com
In response to: Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>)
List: pgsql-hackers
On Mon, Feb 10, 2025 at 1:02 PM Melanie Plageman <melanieplageman@gmail.com> wrote:
>
> It'll be hard to look into all of these, so I think I'll focus on
> trying to reproduce something with eic=1 that I can reproduce on my
> machine. So far, I can reproduce a regression with the following and
> the data file attached.
>
> # initdb and get set up with shared_buffers 1GB
> psql -c "create table bitmap_scan_test (a bigint, b bigint, c text)
> with (fillfactor = 25)"
> psql -c "copy bitmap_scan_test from '/tmp/bitmap_scan_test.data'"
> psql -c "create index on bitmap_scan_test (a)"
> psql -c "vacuum analyze"
> psql -c "checkpoint"
>
> pg_ctl stop
> echo 3 | sudo tee /proc/sys/vm/drop_caches
> pg_ctl start
> psql -c "SET max_parallel_workers_per_gather = 4;" \
>      -c "SET effective_io_concurrency = 1;" \
>      -c "SET parallel_setup_cost = 0;" \
>      -c "SET parallel_tuple_cost = 0;" \
>      -c "SET enable_seqscan = off;" \
>      -c "SET enable_indexscan = off;" \
>      -c "SET work_mem = 65536;"
>
> psql -c "EXPLAIN SELECT * FROM bitmap_scan_test WHERE (a BETWEEN -33
> AND 10015) OFFSET 1000000;"
> psql -c "SELECT * FROM bitmap_scan_test WHERE (a BETWEEN -33 AND
> 10015) OFFSET 1000000;"

I think I figured out why there is a regression. On master, parallel
bitmap heap scans seem to end up cheating effective_io_concurrency.

What you expect to see with effective_io_concurrency == 1 is a single
pread followed by a single fadvise: we can prefetch up to one block
before reading the next block. This is what you see on both the patch
and master with a serial bitmap heap scan. It is also what you see with
the patch if you strace a participating parallel bitmap heap scan
process. On master, however, you do not see this 1-1 interleaving for
parallel bitmap heap scan. On master we typically issue many fadvises
in a row followed by a few preads in a row.
For example:

fadvise64
fadvise64
fadvise64
fadvise64
pread64
fadvise64
fadvise64
pread64
pread64
fadvise64

On master, while executing this query, the leader did more than 2000
runs of > 1 fadvise or pread in a row. With the patch, there are
essentially none.

On master, parallel bitmap heap scans' prefetching behavior is
controlled by shared pstate members: prefetch_target and
prefetch_pages. Prefetching is supposed to be allowed only up to
prefetch_target -- which is capped at effective_io_concurrency. But
incrementing and decrementing these variables is not based on whether
or not the process actually did a read or a prefetch -- only on the
values of those shared memory variables. I think what is happening, due
to quirks of CPU scheduling, is that some of the processes end up
issuing more consecutive reads and prefetches while another process
increments and decrements those shared values in a way that makes this
possible. This effectively increases effective_io_concurrency for
parallel bitmap heap scans on master. The patch can't really compete
because it interleaves every read with an fadvise -- preventing
readahead.

I don't really know what to do about this. The behavior of master's
parallel bitmap heap scan can be emulated with the patch by increasing
effective_io_concurrency. But, IIRC we didn't want to do that for some
reason? Not only does effective_io_concurrency == 1 negatively affect
readahead, it also prevents read combining regardless of
io_combine_limit.

- Melanie
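[Editorial note: the run counts quoted above can be reproduced from an strace log. The helper below is a hypothetical sketch, not something from the thread; it assumes a log captured with something like `strace -p <backend_pid> -e trace=pread64,fadvise64 -o trace.log` and counts the maximal runs of the same syscall that are longer than one call.]

```python
import re

# Hypothetical helper (not from the thread): count runs of > 1
# consecutive identical syscall in strace output restricted to
# pread64/fadvise64. Lines that don't start with either syscall
# (signals, resumption markers, etc.) are ignored.
_SYSCALL = re.compile(r"^(pread64|fadvise64)\(")

def count_long_runs(lines):
    runs = 0
    prev = None
    run_len = 0
    for line in lines:
        m = _SYSCALL.match(line)
        if not m:
            continue
        call = m.group(1)
        if call == prev:
            run_len += 1
        else:
            if run_len > 1:
                runs += 1
            prev, run_len = call, 1
    if run_len > 1:
        runs += 1
    return runs

# The example sequence quoted in the mail has three runs longer than
# one call: fadvise64 x4, fadvise64 x2, and pread64 x2.
example = (["fadvise64(33, 0, 8192, POSIX_FADV_WILLNEED) = 0"] * 4
           + ["pread64(33, \"...\", 8192, 0) = 8192"]
           + ["fadvise64(33, 8192, 8192, POSIX_FADV_WILLNEED) = 0"] * 2
           + ["pread64(33, \"...\", 8192, 8192) = 8192"] * 2
           + ["fadvise64(33, 16384, 8192, POSIX_FADV_WILLNEED) = 0"])
print(count_long_runs(example))
```

A serial scan or a patched worker, which strictly alternates pread and fadvise, would score zero here.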
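[Editorial note: the shared-counter effect described in the mail can be illustrated with a toy model. The code below is a deliberately simplified, hypothetical simulation -- these names only loosely mirror PostgreSQL's pstate members, and the real prefetch logic in nodeBitmapHeapscan.c is more involved. The point it demonstrates: when the prefetch cap is checked only against a shared counter, with no regard to which worker issued the I/O, individual workers can emit consecutive fadvises even though the counter never exceeds a target of 1.]

```python
import random

def simulate(workers=2, blocks=40, prefetch_target=1, seed=0):
    """Toy model: workers share one prefetch counter capped at
    prefetch_target; eligibility to prefetch depends only on the shared
    counter, not on the worker's own recent I/O."""
    rng = random.Random(seed)
    prefetch_pages = 0                      # shared: outstanding prefetches
    trace = {w: [] for w in range(workers)}
    reads_left = [blocks // workers for _ in range(workers)]
    while any(reads_left):
        w = rng.randrange(workers)          # arbitrary scheduling, as on a CPU
        if reads_left[w] == 0:
            continue
        if prefetch_pages < prefetch_target:
            prefetch_pages += 1             # this worker issues a prefetch
            trace[w].append("fadvise")
        else:
            prefetch_pages -= 1             # a read "consumes" one prefetch
            reads_left[w] -= 1
            trace[w].append("pread")
    return trace

def max_run(events, kind):
    run = best = 0
    for e in events:
        run = run + 1 if e == kind else 0
        best = max(best, run)
    return best

trace = simulate()
for w, events in trace.items():
    # Per-worker runs of consecutive fadvises are commonly > 1 here,
    # even though prefetch_target == 1.
    print(w, max_run(events, "fadvise"))
```

Note that globally the model still alternates fadvise/pread 1-1, so the cap is honored in aggregate; it is each process's kernel-visible pattern that becomes bursty, which is what the per-process straces show and what lets the kernel do readahead on master.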