Re: Feature request for adoptive indexes - Mailing list pgsql-hackers

From Hayk Manukyan
Subject Re: Feature request for adoptive indexes
Date
Msg-id CAF+kZOFhNJ5NwvOQksGrCKEeMuWTxg+LEF4mYwSDtYB3J-UkBA@mail.gmail.com
Whole thread Raw
In response to Re: Feature request for adoptive indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Feature request for adoptive indexes  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
ok. here is the deal if I have the following index with 6 column

CREATE INDEX ON job_nlp_year_scan (job, nlp, year, scan_id, issue_flag, sequence);

I need to specify all 6 columns in where clause in order to fully use this index.
It will not be efficient in cases when I have 4 condition in where clause also I should follow the order of columns.
In case of INCLUDE the 3 columns just will be in index but will not be structured as index so it will have affect only if In select I will have that 6 columns nothing more.

In my case I have table with ~15 columns
In my application  I have to do a lot of queries with following where clauses 

1. where  job = <something> and nlp = <something> and year = <something> and SCAN_ID = <something>
2. where  job = <something> and nlp = <something> and year = <something> and ISSUE_FLAG = <something>
3. where  job = <something> and nlp = <something> and year = <something> and SEQUENCE = <something>

I don't want to index just on  job, nlp, year because for each  job, nlp, year I have approximately 5000-7000 rows ,
overall table have ~50m rows so it is partitioned by job as well.  So if I build 3 separate indexes it will be huge resource.
So I am thinking of having one index which will be job, nlp, year and the 4-th layer will be other columns not just included but also in B-tree structure. 
To visualize it will be something like this:
image.png
The red part is ordinary index with nested b-trees ant the yellow part is adaptive part so depends on
where clause optimizer can decide which direction (leaf, b-tree whatever) to chose.
In this case I will have one index and will manage red part only once for all three cases.
Those it make sense ? 
If you need more discussion we can have short call I will try to explain you in more detailed way. 

best regards 

пн, 25 окт. 2021 г. в 19:33, Tomas Vondra <tomas.vondra@enterprisedb.com>:
Hi,

On 10/25/21 16:07, Hayk Manukyan wrote:
> Hi everyone. I want to do some feature request regarding indexes, as far as
> I know this kind of functionality doesn't exists in Postgres. Here is my
> problem :
> I need to create following indexes:
>          Create index job_nlp_year_scan on ingest_scans_stageing
> (`job`,`nlp`,`year`,`scan_id`);
>          Create index job_nlp_year_issue_flag on ingest_scans_stageing
> (`job`,`nlp`,`year`,`issue_flag`);
>          Create index job_nlp_year_sequence on ingest_scans_stageing
> (`job`,`nlp`,`year`,`sequence`);
> As you can see the first 3 columns are the same (job, nlp, year). so if I
> create 3 different indexes db should manage same job_nlp_year structure 3
> times.
> The Data Structure that I think which can be efficient in this kind of
> scenarios is to have 'Adaptive Index'  which will be something like
> Create index job_nlp_year on ingest_scans_stageing
> (`job`,`nlp`,`year`,(`issue_flag`,`scan_id`, `sequence`));
> And depend on query it will use or job_nlp_year_scan  or
> job_nlp_year_issue_flag , or job_nlp_year_sequence ( job, nlp, year and one
> of ( `issue_flag` , `scan_id` ,  `sequence` )
> For more description please feel free to refer me

It's not very clear what exactly would the "adaptive index" do, except
that it'd have all three columns. Clearly, the three columns can't be
considered for ordering etc. but need to be in the index somehow. So why
wouldn't it be enough to either to create an index with all six columns?

CREATE INDEX ON job_nlp_year_scan (job, nlp, year, scan_id, issue_flag,
sequence);

or possibly with the columns just "included" in the index:

CREATE INDEX ON job_nlp_year_scan (job, nlp, year) INCLUDE (scan_id,
issue_flag, sequence);

If this does not work, you either need to explain more clearly what
exactly the adaptive indexes does, or show queries that can't benefit
from these existing features.


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Ronan Dunklau
Date:
Subject: Re: pg_receivewal starting position
Next
From: Greg Nancarrow
Date:
Subject: Re: row filtering for logical replication