Re: Feature request for adoptive indexes - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Feature request for adoptive indexes
Date
Msg-id 39a8239e-b25d-d9ce-4935-4fb7fe10be7e@enterprisedb.com
Whole thread Raw
In response to Re: Feature request for adoptive indexes  (Hayk Manukyan <manukyantt@gmail.com>)
Responses Re: Feature request for adoptive indexes  (Pavel Borisov <pashkin.elfe@gmail.com>)
List pgsql-hackers

On 10/29/21 15:32, Hayk Manukyan wrote:
> Hi all
> First of all thank you all for fast and rich responses, that is really nice.
> I don't have that deep knowledge of how postgres  works under the hood 
> so I will try to explain more user side.
> I want to refer for some points mentioned above.
>   - First INCLUDE statement mostly eliminates the necessity to refer to 
> a clustered index or table to get columns that do not exist in the 
> index. So filtering upon columns in INCLUDE statement will not be 
> performant. It can give some very little performance if we include 
> additional columns but it is not in level to compare with indexed one. I 
> believe this not for this case
> - Tomas Vondra's Assumption that adaptive should be something between 
> this two
> 1) (job, nlp, year, sequence)
> 2) (job, nlp, year, scan_id, issue_flag, sequence)
> is completely valid. I have made fairly small demo with this index 
> comparison and as I can see the difference is noticeable. Here is git 
> repo and results 
> <https://github.com/HaykManukyanAvetiky/index_comparition/blob/main/results.md> , 
> I had no much time to do significant one sorry for that ))

I find those results entirely unconvincing, or maybe even suspicious.

I used the script to create the objects, and the index sizes are:

                        Name                   |  Size
     ------------------------------------------+---------
      job_nlp_year_scan_id_issue_flag_sequence | 1985 MB
      job_nlp_year_sequence                    | 1985 MB

So there's no actual difference, most likely due to alignment making up 
for the two smalling columns.

And if I randomize the queries instead of running them with the same 
parameters over and over (see the attached scripts), then an average of 
10 runs, each 60s long, the results are (after a proper warmup)

  pgbench -n -f q4.sql -T 60

  4 columns: 106 ms
  6 columns: 109 ms

So there's like 3% difference between the two cases, and even that might 
be just noise. This is consistent with the two indexes being about the 
same size.

This is on machine with i5-2500k CPU and 8GB of RAM, which is just 
enough to keep everything in RAM. It seems somewhat strange that your 
machine does this in 10ms, i.e. 10x faster. Seems strange.


I'm not sure what is the point of the second query, considering it's not 
even using an index but parallel seqscan.


Anyway, this still fails to demonstrate any material difference between 
the two indexes, and consequently any potential benefit of the proposed 
new index type.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)
Next
From: Jeff Davis
Date:
Subject: Re: Delegating superuser tasks to new security roles (Was: Granting control of SUSET gucs to non-superusers)