Thread: Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Robert Haas
Date:
On Tue, Aug 27, 2024 at 11:16 AM Peter Geoghegan <pg@bowt.ie> wrote: > On Thu, Aug 15, 2024 at 3:22 PM Peter Geoghegan <pg@bowt.ie> wrote: > > Attached patch has EXPLAIN ANALYZE display the total number of > > primitive index scans for all 3 kinds of index scan node. > > Attached is v2, which fixes bitrot. > > v2 also uses new terminology. EXPLAIN ANALYZE will now show "Index > Searches: N", not "Primitive Index Scans: N". Although there is > limited precedent for using the primitive scan terminology, I think > that it's a bit unwieldy. I do like "Index Searches" better than "Primitive Index Scans." But I think Matthias had some good points about this being btree-specific. I'm not sure whether he was completely correct, but you seemed to just dismiss his argument and say "well, that can't be done," which doesn't seem convincing to me at all. If, for non-btree indexes, the number of index searches will always be the same as the loop count, then surely there is some way to avoid cluttering the output for non-btree indexes with output that can never be of any use. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Peter Geoghegan
Date:
On Tue, Aug 27, 2024 at 1:45 PM Robert Haas <robertmhaas@gmail.com> wrote: > I do like "Index Searches" better than "Primitive Index Scans." > > But I think Matthias had some good points about this being > btree-specific. It's not B-Tree specific -- not really. Any index scan that can at least non-natively support ScalarArrayOps (i.e. SAOP scans that the executor manages using ExecIndexEvalArrayKeys() + bitmap scans) will show information that is exactly equivalent to what B-Tree will show, given a similar ScalarArrayOps query. There is at best one limited sense in which the information shown is B-Tree specific: it tends to be more interesting in the case of B-Tree index scans. You cannot trivially derive the number based on the number of array keys for B-Tree scans, since nbtree is now clever about not needlessly searching the index anew. It's quite possible that other index AMs will in the future be enhanced in about the same way as nbtree was in commit 5bf748b86b, at which point even this will no longer apply. (Tom speculated about adding something like that to GiST recently). > I'm not sure whether he was completely correct, but > you seemed to just dismiss his argument and say "well, that can't be > done," which doesn't seem convincing to me at all. To be clear, any variation that you can think of *can* be done without much difficulty. I thought that Matthias was unclear about what he even wanted, is all. The problem isn't that there aren't any alternatives. The problem, if any, is that there are a huge number of slightly different alternatives. There are hopelessly subjective questions about what the best trade-off between redundancy and consistency is. I'm absolutely not set on doing things in exactly the way I've laid out. What do you think should be done? Note that the number of loops matters here, in addition to the number of SAOP primitive scans/searches. If you want to suppress the information shown in the typical "nsearches == 1" case, what does that mean for the less common "nsearches == 0" case? > If, for non-btree > indexes, the number of index searches will always be the same as the > loop count, then surely there is some way to avoid cluttering the > output for non-btree indexes with output that can never be of any use. Even if we assume that a given index/index AM will never use SAOPs, it's still possible to show more than one "Index Search" per executor node execution. For example, when an index scan node is the inner side of a nestloop join. I see value in making it obvious to users when and how pg_stat_all_indexes.idx_scan advances. Being able to easily relate it to EXPLAIN ANALYZE output is useful, independent of whether or not SAOPs happen to be used. That's probably the single best argument in favor of showing "Index Searches: N" unconditionally. But I'm certainly not going to refuse to budge over that. -- Peter Geoghegan
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes: > I see value in making it obvious to users when and how > pg_stat_all_indexes.idx_scan advances. Being able to easily relate it > to EXPLAIN ANALYZE output is useful, independent of whether or not > SAOPs happen to be used. That's probably the single best argument in > favor of showing "Index Searches: N" unconditionally. But I'm > certainly not going to refuse to budge over that. TBH, I'm afraid that this patch basically is exposing numbers that nobody but Peter Geoghegan and maybe two or three other hackers will understand, and even fewer people will find useful (since the how-many-primitive-scans behavior is not something users have any control over, IIUC). I doubt that "it lines up with pg_stat_all_indexes.idx_scan" is enough to justify the additional clutter in EXPLAIN. Maybe we should be going the other direction and trying to make pg_stat_all_indexes count in a less detailed but less surprising way, ie once per indexscan plan node invocation. regards, tom lane
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Peter Geoghegan
Date:
On Tue, Aug 27, 2024 at 3:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > TBH, I'm afraid that this patch basically is exposing numbers that > nobody but Peter Geoghegan and maybe two or three other hackers > will understand, and even fewer people will find useful (since the > how-many-primitive-scans behavior is not something users have any > control over, IIUC). You can make about the same argument against showing "Buffers". It's not really something that you can address directly, either. It's helpful only in the context of a specific problem. > I doubt that "it lines up with > pg_stat_all_indexes.idx_scan" is enough to justify the additional > clutter in EXPLAIN. The scheme laid out in the patch is just a starting point for discussion. I just think that it's particularly important that we have this for skip scan -- that's the part that I feel strongly about. With skip scan in place, every scan of the kind we'd currently call a "full index scan" will be eligible to skip. Whether and to what extent we actually skip is determined at runtime. We really need some way of determining how much skipping has taken place. (There are many disadvantages to having a dedicated skip scan index path, which I can go into if you want.) > Maybe we should be going the other direction > and trying to make pg_stat_all_indexes count in a less detailed but > less surprising way, ie once per indexscan plan node invocation. Is that less surprising, though? I think that it's more surprising. -- Peter Geoghegan
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Robert Haas
Date:
On Tue, Aug 27, 2024 at 3:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Geoghegan <pg@bowt.ie> writes: > > I see value in making it obvious to users when and how > > pg_stat_all_indexes.idx_scan advances. Being able to easily relate it > > to EXPLAIN ANALYZE output is useful, independent of whether or not > > SAOPs happen to be used. That's probably the single best argument in > > favor of showing "Index Searches: N" unconditionally. But I'm > > certainly not going to refuse to budge over that. > > TBH, I'm afraid that this patch basically is exposing numbers that > nobody but Peter Geoghegan and maybe two or three other hackers > will understand, and even fewer people will find useful (since the > how-many-primitive-scans behavior is not something users have any > control over, IIUC). I doubt that "it lines up with > pg_stat_all_indexes.idx_scan" is enough to justify the additional > clutter in EXPLAIN. Maybe we should be going the other direction > and trying to make pg_stat_all_indexes count in a less detailed but > less surprising way, ie once per indexscan plan node invocation. I kind of had that reaction too initially, but I think that was mostly because "Primitive Index Scans" seemed extremely unclear. I think "Index Searches" is pretty comprehensible, honestly. Why shouldn't someone be able to figure out what that means? Might make sense to restrict this to VERBOSE mode, too. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Peter Geoghegan
Date:
On Wed, Aug 28, 2024 at 9:25 AM Robert Haas <robertmhaas@gmail.com> wrote: > Might make sense to restrict this to VERBOSE mode, too. If we have to make the new output appear selectively, I'd prefer to do it this way. There are lots of small problems with selectively displaying less/no information based on rules applied against the number of index searches/loops/whatever. While that general approach works quite well in the case of the "Buffers" instrumentation, it won't really work here. After all, the base case is that there is one index search per index scan node -- not zero searches. -- Peter Geoghegan
Re: Showing primitive index scan count in EXPLAIN ANALYZE (for skip scan and SAOP scans)
From
Peter Geoghegan
Date:
On Wed, Aug 28, 2024 at 9:25 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Aug 27, 2024 at 3:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > I kind of had that reaction too initially, but I think that was mostly > because "Primitive Index Scans" seemed extremely unclear. I think > "Index Searches" is pretty comprehensible, honestly. Why shouldn't > someone be able to figure out what that means? Worth noting that Lukas Fittl made a point of prominently highlighting the issue with how this works when he explained the Postgres 17 nbtree work: https://pganalyze.com/blog/5mins-postgres-17-faster-btree-index-scans And no, I wasn't asked to give any input to the blog post. Lukas has a general interest in making the system easier to understand for ordinary users. Presumably that's why he zeroed in on this one aspect of the work. It's far from an esoteric implementation detail. -- Peter Geoghegan