Re: WIP: Covering + unique indexes. - Mailing list pgsql-hackers

From Anastasia Lubennikova
Subject Re: WIP: Covering + unique indexes.
Date
Msg-id f90aa60a-b67f-95b5-d9f5-f5d8ced178c6@postgrespro.ru
Whole thread Raw
In response to Re: WIP: Covering + unique indexes.  (Andrey Borodin <amborodin@acm.org>)
Responses Re: WIP: Covering + unique indexes.
Re: WIP: Covering + unique indexes.
List pgsql-hackers
14.08.2016 20:11, Andrey Borodin:
> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:       tested, failed
> Spec compliant:           tested, passed
> Documentation:            tested, passed
>
> Hi hackers!
>
> I've read the patch and here is my code review.
>
> ==========PURPOSE============
> I've used this feature from time to time with MS SQL. From my experience INCLUDE is a 'sugar on top' feature.
> Some MS SQL classes do not even mention INCLUDE despite it's there from 2005 (though classes do not mention lots of
importantthings, so it's not kind of valuable indicator). 
> But those who use it, use it whenever possible. For example, system view with recommended indices rarely list one
withoutINCLUDE columns. 
> So, this feature is very important from perspective of converting MS SQL DBAs to PostgreSQL. This is how I see it.

Thank you for the review, I hope this feature will be useful for many
people.

> ========SUGGESTIONS==========
> 0. Index build is broken. This script
https://github.com/x4m/pggistopt/blob/8ad65d2e305e98c836388a07909af5983dba9c73/test.sqlSEGFAULTs and may cause
situationwhen you cannot insert anything into table (I think drop of index would help, but didn't tested this) 

Thank you for reporting. That was a bug caused by high key truncation,
that occurs
when index has more than 3 levels.
Fixed. See attached file.

> 1. I think MS SQL syntax INCLUDE instead of INCLUDING would be better (for a purpose listed above)

I've chosen this particular name to avoid using of new keyword. We
already have INCLUDING
in postgres in a context of inheritance that will never intersect with
covering indexes.
I'm sure it won't be a big problem of migration from MsSQL.

> 2. Empty line added in ruleutils.c. Is it for a reason?

No, just a missed line.
Fixed.

> 3. Now we have indnatts and indnkeyatts instead of indnatts. I think it is worth considering renaming indnatts to
somethingdifferent from old name. Someone somewhere could still suppose it's a number of keys. 

I agree that naming became vague after this patch.
I've already suggested to replace "indkeys[]" with more specific name, and
AFAIR there was no reaction. So I didn't do that.
But I don't sure about your suggestion regarding indnatts. Old queries
(and old indexes)
can still use it correctly. I don't see a reason to break compatibility
for all users.
Those who will use this new feature, should ensure that their queries to
pg_index
behave as expected.

> ========PERFORMANCE==========
> Due to suggestion number 0 I could not measure performance of index build. Index crashes when there's more than 1.1
millionof rows in a table. 
> Performance test script is here
https://github.com/x4m/pggistopt/blob/f206b4395baa15a2fa42897eeb27bd555619119a/test.sql
> Test scenario is following:
> 1. Create table, then create index, then add data.
> 2. Make a query touching data in INCLUDING columns.
> This scenario is tested against table with:
> A. Table with index, that do not contain touched columns, just PK.
> B. Index with all columns in index.
> C. Index with PK in keys and INCLUDING all other columns.
>
> Tests were executed 5 times on Ubuntu VM under Hyper-V i5 2500 CPU, 16 Gb of RAM, SSD disk.
> Time to insert 10M rows:
> A. AVG 110 seconds STD 4.8
> B. AVG 121 seconds STD 2.0
> C. AVG 111 seconds STD 5.7
> Inserts to INCLUDING index is almost as fast as inserts to index without extra columns.
>
> Time to run SELECT query:
> A. AVG 2864 ms STD 794
> B. AVG 2329 ms STD 84
> C. AVG 2293 ms STD 58
> Selects with INCLUDING columns is almost as fast as with full index.
>
> Index size (deterministic measure, STD = 0)
> A. 317 MB
> B. 509 MB
> C. 399 MB
> Index size is in the middle between full index and minimal index.
>
> I think this numbers agree with expectation from the feature.
>
> ========CONCLUSION==========
> This patch brings useful and important feature. Build shall be repaired; other my suggestions are only suggestions.
>
>
>
> Best regards, Andrey Borodin, Octonica & Ural Federal University.
>
> The new status of this patch is: Waiting on Author
>

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why --backup-and-modify-in-place in perltidy config?
Next
From: amul sul
Date:
Subject: Re: Bug in to_timestamp().