Re: Adding skip scan (including MDAM style range skip scan) to nbtree - Mailing list pgsql-hackers
From | Aleksander Alekseev |
---|---|
Subject | Re: Adding skip scan (including MDAM style range skip scan) to nbtree |
Date | |
Msg-id | CAJ7c6TP2N9Dv06NkbFXYKOTaZuQcdiSf6PSKjbAPizzV3p1pfA@mail.gmail.com |
In response to | Re: Adding skip scan (including MDAM style range skip scan) to nbtree (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: Adding skip scan (including MDAM style range skip scan) to nbtree |
List | pgsql-hackers |
Hi Peter,

> It looks like the queries you posted have a kind of adversarial
> quality to them, as if they were designed to confuse the
> implementation. Was it intentional?

To some extent. I merely wrote several queries that I would expect to benefit from skip scans. Since I didn't look at the queries you used, there was a chance that I would hit something interesting.

> Attached v2 fixes this bug. The problem was that the skip support
> function used by the "char" opclass assumed signed char comparisons,
> even though the authoritative B-Tree comparator (support function 1)
> uses unsigned comparisons (via uint8 casting). A simple oversight. Your
> test cases will work with this v2, provided you use "char" (instead of
> unadorned char) in the create table statements.

Thanks for v2.

> If you change your table definition to CREATE TABLE test1(c "char", n
> bigint), then your example queries can use the optimization. This
> makes a huge difference.

You are right, it does. Test1 takes 33.7 ms now (53 ms before the patch, x1.57).

The Test3 I showed before contained an error in the table definition (Postgres can't do `n bigint, s text DEFAULT 'text_value' || n`). Here is the corrected test:

```
CREATE TABLE test3(c "char", n bigint, s text);
CREATE INDEX test3_idx ON test3 USING btree(c,n) INCLUDE(s);

INSERT INTO test3
  SELECT chr(ascii('a') + random(0,2)) AS c,
         random(0, 1_000_000_000) AS n,
         'text_value_' || random(0, 1_000_000_000) AS s
  FROM generate_series(0, 1_000_000);

EXPLAIN ANALYZE SELECT s FROM test3 WHERE n < 10_000;
```

It runs fast (< 1 ms) and uses the index, as expected.

Test2 with "char" doesn't seem to benefit from the patch anymore (pretty sure it did in v1). It always chooses a Parallel Seq Scan, even if I change the condition to `WHERE n > 999_995_000` or `WHERE n = 999_997_362`. Is this expected behavior?

I also tried Test4 and Test5.

In Test4 I was curious whether skip scans work properly with functional indexes:

```
CREATE TABLE test4(d date, n bigint);
CREATE INDEX test4_idx ON test4 USING btree(extract(year from d),n);

INSERT INTO test4
  SELECT ('2024-' || random(1,12) || '-' || random(1,28)) :: date AS d,
         random(0, 1_000_000_000) AS n
  FROM generate_series(0, 1_000_000);

EXPLAIN ANALYZE SELECT COUNT(*) FROM test4 WHERE n > 900_000_000;
```

The query uses an Index Scan; however, the performance is worse than with the Seq Scan chosen before the patch. It doesn't matter whether I use a '>' or an '=' condition.

Test5 checks how skip scans work with partial indexes:

```
CREATE TABLE test5(c "char", n bigint);
CREATE INDEX test5_idx ON test5 USING btree(c, n) WHERE n > 900_000_000;

INSERT INTO test5
  SELECT chr(ascii('a') + random(0,2)) AS c,
         random(0, 1_000_000_000) AS n
  FROM generate_series(0, 1_000_000);

EXPLAIN ANALYZE SELECT COUNT(*) FROM test5 WHERE n > 950_000_000;
```

It runs fast and chooses an Index Only Scan. But then I discovered that without the patch Postgres also uses an Index Only Scan for this query. I didn't know it could do this - what is the name of this technique? The query takes 17.6 ms with the patch, 21 ms without the patch. Not a huge win but still.

That's all I have for now.

--
Best regards,
Aleksander Alekseev
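
For a comparison like the one in Test4, where the Index Scan is reported to be slower than the Seq Scan, both plans can be timed in the same session by discouraging index-based plans for one transaction. This is only a sketch: it assumes the test4 table and test4_idx index from Test4 already exist, and it is not part of the original test set. The `enable_*` planner settings do not forbid a plan type outright, they only make the planner strongly prefer the alternatives.

```sql
-- Sketch only: assumes the test4 table and test4_idx index from Test4 exist.
BEGIN;

-- Plan the planner picks on its own.
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM test4 WHERE n > 900_000_000;

-- Discourage index-based plans for this transaction only,
-- so that the sequential scan alternative can be timed.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*) FROM test4 WHERE n > 900_000_000;

-- SET LOCAL values are discarded when the transaction ends.
ROLLBACK;
```

Comparing the estimated costs and the actual times of the two outputs may help tell whether a slow plan comes from the scan itself or from how the planner costs it.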