Thread: Index creation

Index creation

From
Дмитрий Иванов
Date:
Good afternoon.
I have a query parser question. If there are two kinds of queries using an indexed field. In this case, one view is limited to this field, the second one uses a number of fields included in the index by the include directive. It makes sense to have two indexes, lightweight and containing include. Or will the plan rely on the nearest suitable index without considering its weight?
--
Regards, Dmitry!

Re: Index creation

From
"David G. Johnston"
Date:
On Sun, Jun 19, 2022 at 12:06 PM Дмитрий Иванов <firstdismay@gmail.com> wrote:
Good afternoon.
I have a query parser question. If there are two kinds of queries using an indexed field. In this case, one view is limited to this field, the second one uses a number of fields included in the index by the include directive. It makes sense to have two indexes, lightweight and containing include. Or will the plan rely on the nearest suitable index without considering its weight?


The system should avoid the larger sized index unless it will sufficiently benefit from the Index Only Scan that such a larger covering index is supposed to facilitate.

David J.

Re: Index creation

From
Дмитрий Иванов
Date:
Your statement seems obvious to me. But what I see doesn't seem like a conscious choice. It turns out that it is better to have a lighter general-purpose index than to strive to create a target covering index for a certain kind of operation.

DROP INDEX IF EXISTS bpd.index_class_prop_id_prop_inherit;
CREATE INDEX IF NOT EXISTS index_class_prop_id_prop_inherit
    ON bpd.class_prop USING btree
    (id_prop_inherit ASC NULLS LAST, timestamp_class_inherit ASC NULLS LAST)
    TABLESPACE pg_default;



DROP INDEX IF EXISTS bpd.index_class_prop_id_prop_inherit_covering;
CREATE INDEX IF NOT EXISTS index_class_prop_id_prop_inherit_covering
    ON bpd.class_prop USING btree
    (id_prop_inherit ASC NULLS LAST, timestamp_class_inherit ASC NULLS LAST)
    INCLUDE(id, id_class, inheritance)
    TABLESPACE pg_default;

Uchet=# SELECT relid, indexrelid, schemaname, relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
Uchet-#
Uchet-#     FROM bpd.cfg_v_stat_all_indexes WHERE indexrelname LIKE 'index_class_prop_id_prop_inherit%';
 relid | indexrelid | schemaname |  relname   |               indexrelname                | idx_scan | idx_tup_read | idx_tup_fetch
-------+------------+------------+------------+-------------------------------------------+----------+--------------+---------------
 17572 |      40036 | bpd        | class_prop | index_class_prop_id_prop_inherit          |        0 |            0 |             0
 17572 |      40037 | bpd        | class_prop | index_class_prop_id_prop_inherit_covering |     7026 |         7026 |             0
(2 rows)


DROP INDEX IF EXISTS bpd.index_class_prop_id_prop_inherit_covering;

CREATE INDEX IF NOT EXISTS index_class_prop_id_prop_inherit_covering
    ON bpd.class_prop USING btree
    (id_prop_inherit ASC NULLS LAST, timestamp_class_inherit ASC NULLS LAST)
    INCLUDE(id, id_class, inheritance)
    TABLESPACE pg_default;

DROP INDEX IF EXISTS bpd.index_class_prop_id_prop_inherit;

CREATE INDEX IF NOT EXISTS index_class_prop_id_prop_inherit
    ON bpd.class_prop USING btree
    (id_prop_inherit ASC NULLS LAST, timestamp_class_inherit ASC NULLS LAST)
    TABLESPACE pg_default;

Uchet=# SELECT relid, indexrelid, schemaname, relname, indexrelname, idx_scan, idx_tup_read, idx_tup_fetch FROM bpd.cfg_v_stat_all_indexes WHERE indexrelname LIK
E 'index_class_prop_id_prop_inherit%';
 relid | indexrelid | schemaname |  relname   |               indexrelname                | idx_scan | idx_tup_read | idx_tup_fetch
-------+------------+------------+------------+-------------------------------------------+----------+--------------+---------------
 17572 |      40049 | bpd        | class_prop | index_class_prop_id_prop_inherit          |     6356 |         6356 |             0
 17572 |      40048 | bpd        | class_prop | index_class_prop_id_prop_inherit_covering |        0 |            0 |             0
(2 rows)
--
Regards, Dmitry!


пн, 20 июн. 2022 г. в 00:08, David G. Johnston <david.g.johnston@gmail.com>:
On Sun, Jun 19, 2022 at 12:06 PM Дмитрий Иванов <firstdismay@gmail.com> wrote:
Good afternoon.
I have a query parser question. If there are two kinds of queries using an indexed field. In this case, one view is limited to this field, the second one uses a number of fields included in the index by the include directive. It makes sense to have two indexes, lightweight and containing include. Or will the plan rely on the nearest suitable index without considering its weight?


The system should avoid the larger sized index unless it will sufficiently benefit from the Index Only Scan that such a larger covering index is supposed to facilitate.

David J.

Re: Index creation

From
Jeff Janes
Date:
On Mon, Jun 20, 2022 at 1:17 AM Дмитрий Иванов <firstdismay@gmail.com> wrote:
Your statement seems obvious to me. But what I see doesn't seem like a conscious choice. It turns out that it is better to have a lighter general-purpose index than to strive to create a target covering index for a certain kind of operation.

If both indexes are expected to be hit only once in the query and return only one row, their expected costs will be the same.  In this case, the tie is broken arbitrarily, and that often means the most-recently created index will get chosen.

As the expected number of leaf page accesses in a given query goes up, the smaller index will start to look less expensive.

Cheers,

Jeff 

Re: Index creation

From
Дмитрий Иванов
Date:
Yes, you are right. The presented index usage data is caused by recursive queries, which check the integrity of hierarchical structures from the bottom up. Your explanation has clarified what is going on. Thank you.
My experiments with indexes are caused by the appearance of significant variance (1-180ms) in these operations, which appeared when I switched from version 12 to 14, which increased the checking time by ~250% with the existing implementation and ~40% after I rewrote the functions to run as dynamic SQL. This decision was due to the obvious correlation between the level of variance and the primary dataset obtained when the non-dynamic function was first called. I don't think my communication experience will allow me to properly describe the problem, but the information I received was useful. Thank you.
--
Regards, Dmitry!


пн, 20 июн. 2022 г. в 23:23, Jeff Janes <jeff.janes@gmail.com>:
On Mon, Jun 20, 2022 at 1:17 AM Дмитрий Иванов <firstdismay@gmail.com> wrote:
Your statement seems obvious to me. But what I see doesn't seem like a conscious choice. It turns out that it is better to have a lighter general-purpose index than to strive to create a target covering index for a certain kind of operation.

If both indexes are expected to be hit only once in the query and return only one row, their expected costs will be the same.  In this case, the tie is broken arbitrarily, and that often means the most-recently created index will get chosen.

As the expected number of leaf page accesses in a given query goes up, the smaller index will start to look less expensive.

Cheers,

Jeff