Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan |
Date | |
Msg-id | CAH2-WzngJFKNQ2kN2Ae-Mu_DkKdxebLJhmAPcqMO_Mj5zxU_rA@mail.gmail.com Whole thread Raw |
In response to | Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
List | pgsql-hackers |
On Wed, Mar 6, 2024 at 4:51 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > To clarify, what I mean here is that merging the changes of both the > SAOPs changes and the removal of arrayKeyData seems to increase the > patch size and increases the maximum complexity of each component > patch's review. Removing arrayKeyData probably makes the patch very slightly smaller, actually. But even if it's really the other way around, I'd still like to get rid of it as part of the same commit as everything else. It just makes sense that way. > Separate patches may make this more reviewable, or not, but no comment > was given on why it is better to merge the changes into a single > patch. Fair enough. Here's why: The original SAOP design (commit 9e8da0f7, "Teach btree to handle ScalarArrayOpExpr quals natively") added a layer of indirection between scan->keyData (input scan keys) and so->keyData (output scan keys): it added another scan key array, so->arrayKeyData. There was array-specific preprocessing in _bt_preprocess_array_keys, that happened before the first primitive index scan even began -- that transformed our true input scan keys (scan->keyData) into a copy of the array with limited amounts of array-specific preprocessing already performed (so->arrayKeyData). This made a certain amount of sense at the time, because _bt_preprocess_keys was intended to be called once per primitive index scan. Kind of like the inner side of a nested loop join's inner index scan, where we call _bt_preprocess_keys once per inner-side scan/btrescan call. (Actually, Tom's design has us call _bt_preprocess once per primitive index scan per btrescan call, which might matter in those rare cases where the inner side of a nestloop join had SAOP quals.) What I now propose to do is to just call _bt_preprocess_keys once per btrescan (actually, it's still called once per primitive index scan, but all calls after the first are just no-ops after v12 of the patch). This makes sense because SAOP array constants aren't like nestloop joins with an inner index scans, in one important way: we really can see everything up-front. We can see all of the array elements, and operate on whole arrays as necessary during preprocessing (e.g., performing the array merging thing I added to _bt_preprocess_array_keys). It's not like the next array element is only visible to prepocessing only after the outer side of a nestloop join runs, and next calls btrescan -- so why treat it like that? Conceptually, "WHERE a = 1" is almost the same thing as "WHERE a = any('{1,2}')", so why not use essentially the same approach to preprocessing in both cases? We don't need to copy the input keys into so->arrayKeyData, because the indirection (which is a bit like a fake nested loop join) doesn't buy us anything. v13 of the patch doesn't quite 100% eliminate so->arrayKeyData. While it removes the arrayKeyData field from the BTScanOpaqueData struct, we still use a temporary array (accessed via a pointer that's just a local variable) that's also called arrayKeyData. And that also stores array-preprocessing-only copies of the input scan keys. That happens within _bt_preprocess_keys. So we do "still need arrayKeyData", but we only need it for a brief period at the start of the index scan. It just doesn't make any sense to keep it around for longer than that, in a world where _bt_preprocess_keys "operates directly on arrays". That only made sense (a bit of sense) back when _bt_preprocess_keys was subordinate to _bt_preprocess_array_keys, but it's kinda the other way around now. We could probably even get rid of this remaining limited form of arrayKeyData, but that doesn't seem like it would add much. -- Peter Geoghegan
pgsql-hackers by date: