Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans) - Mailing list pgsql-hackers
From | Peter Geoghegan
Subject | Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Date |
Msg-id | CAH2-WzmebSkeKPGw7TEaNw9=Qx-X8fAnFw916Fd2V8VVqYqqaQ@mail.gmail.com
In response to | Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans) (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses | Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
List | pgsql-hackers
On Wed, Nov 27, 2024 at 8:36 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> > I think we should divide them because by dividing the total buffer
> > usage by the number of loops, user finds the average buffer
> > consumption per loop. This gives them a clearer picture of the
> > resource intensity per basic unit of work.
>
> I disagree; I think the whole "dividing by number of loops and
> rounding up to integer" was the wrong choice for tuple count, as that
> makes it difficult if not impossible to determine the actual produced
> count when it's less than the number of loops. Data is lost in the
> rounding/processing, and I don't want to have lost that data.

I think that you're definitely right about this. I changed my mind (or changed it back to my original position) recently, when I noticed how bad the problem was with parallel index scans: nloops generally comes from the number of workers (including the leader) for parallel scans, and so it wasn't that hard to see "Index Searches: 0" with the latest version (the version that started to divide by nloops). Obviously, that behavior is completely ridiculous. Let's not do that.

The precedent to follow here is "Heap Fetches: N" (in the context of index-only scans), which also doesn't divide by nloops -- likely because the same sorts of issues arise with heap fetches.

> Same applies for ~scans~ searches: If we do an index search, we should
> show it in the count as total sum, not partial processed value. If a
> user is interested in per-loopcount values, then they can derive that
> value from the data they're presented with; but that isn't true when
> we present only the divided-and-rounded value.

I recently came across a good example of how showing "Index Searches: N" is likely to be useful in the context of nested loop joins. The example comes from the recently committed ORs-to-SAOP join transformation work (commit 627d6341). If I run the test case (taken from src/test/regress/sql/join.sql) with EXPLAIN ANALYZE, the output confirms that the optimization added by that commit works particularly well:

pg@regression:5432 [205457]=# explain (analyze, costs off, SUMMARY off)
select count(*) from tenk1 t1, tenk1 t2
where t2.thousand = t1.tenthous or t2.thousand = t1.unique1 or t2.thousand = t1.unique2;
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                    QUERY PLAN                                                     │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ Aggregate (actual time=13.761..13.762 rows=1 loops=1)                                                             │
│   Buffers: shared hit=24201                                                                                       │
│   ->  Nested Loop (actual time=0.011..12.928 rows=20000 loops=1)                                                  │
│         Buffers: shared hit=24201                                                                                 │
│         ->  Seq Scan on tenk1 t1 (actual time=0.004..0.507 rows=10000 loops=1)                                    │
│               Buffers: shared hit=345                                                                             │
│         ->  Index Only Scan using tenk1_thous_tenthous on tenk1 t2 (actual time=0.001..0.001 rows=2 loops=10000) │
│               Index Cond: (thousand = ANY (ARRAY[t1.tenthous, t1.unique1, t1.unique2]))                           │
│               Index Searches: 11885                                                                               │
│               Heap Fetches: 0                                                                                     │
│               Buffers: shared hit=23856                                                                           │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(11 rows)

As you can see, there are 10,000 executions of the inner index-only scan here, which has a SAOP qual whose array will always have 3 array elements. That means that the best possible case is 10,000 index searches, and the worst possible case is 30,000 index searches. We actually see "Index Searches: 11885" -- not bad!
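To make the information-loss argument concrete, here is a small standalone C sketch. It is purely illustrative: it is not the real explain.c code, the exact rounding rule used by the earlier patch versions may differ, and the helper name plus the 3-searches-over-8-loops "parallel scan" case are made up for the example. It just compares a divided-and-rounded display against printing the raw total:

/*
 * Illustrative only -- not the actual explain.c logic, and the precise
 * rounding rule in the patch versions discussed above may differ.
 * Shows how "divide by nloops, round to integer" hides the counter's
 * real value, while printing the raw total does not.
 */
#include <math.h>
#include <stdio.h>

static void
show_searches(double total_searches, double nloops)
{
    printf("nloops=%.0f: divided-and-rounded=%.0f, raw total=%.0f\n",
           nloops, rint(total_searches / nloops), total_searches);
}

int
main(void)
{
    show_searches(11885, 10000);    /* divided display: 1 -- the 11885 is lost */
    show_searches(3, 8);            /* divided display: 0 -- e.g. a hypothetical
                                     * parallel scan with a leader plus 7 workers */
    return 0;
}

With the plan above, that scheme collapses 11885 searches over 10000 loops into a rounded single digit, and none of the per-execution detail discussed below survives.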
The main factor that gets us pretty close to that best possible case is a certain kind of redundancy: many individual inner index scans have duplicate array elements, allowing nbtree preprocessing to shrink the array as it is sorted and deduplicated -- the array used during many individual inner scan executions has as few as one or two array elements.

Another contributing factor is the prevalence of "out of bounds" array elements: many individual SAOP arrays/inner scans have 2 array elements that are both greater than 1,000. That'll allow nbtree to get away with needing only one index search for all out-of-bounds array elements. That is, it allows nbtree to determine that all out-of-bounds elements can't possibly have any matches using only one index search (a search that lands on the rightmost leaf page, where no matches for any out-of-bounds element will be found).

Of course, this can only be surmised from the EXPLAIN ANALYZE output shown because I went back to not dividing by nloops within explain.c. A huge amount of useful information would be lost in cases like this if we divided by nloops. So, again, let's not do it that way.

It'd be just as easy to surmise what's going on here if the inner index scan happened to be a plain index scan. That would make the "Buffers" output include heap buffer hits, which would usually make it impossible to infer how many of the "Buffers hit" came from the index structure. My analysis didn't rely on "Buffers" at all, though (only on "Index Searches: 11885" + "loops=10000"), so everything I pointed out would be just as readily apparent.

--
Peter Geoghegan