Re: PATCH: index-only scans with partial indexes - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: PATCH: index-only scans with partial indexes
Date
Msg-id 20160309.172949.84135555.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: PATCH: index-only scans with partial indexes  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: PATCH: index-only scans with partial indexes
Re: PATCH: index-only scans with partial indexes
Re: PATCH: index-only scans with partial indexes
List pgsql-hackers
Hello, thank you for the comments. The new v8 patch is attched.

At Tue, 08 Mar 2016 18:08:55 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in <21567.1457478535@sss.pgh.pa.us>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > Hello, This is a (maybe) committer-ready patch of a Tomas
> > Vondra's project.
> 
> I think this needs quite a bit of work yet.  A few comments:

Not a few at all.

> * If we're going to pay the price of identifying implied restriction
> conditions in check_partial_indexes(), we should at least recoup some
> of that investment by not doing it again in create_indexscan_plan().

Moved a part of the check from create_indexscan_plan into
check_partial_indexes. 
I noticed that we should avoid to exlude
clauses that contain mutable functions so I added that. But I
don't understand the reason for the following condition to refuse
clause pruning.

| rel->relid == root->parse->resultRelation


> * create_indexscan_plan() has this comment about what it's doing:
>      * We can also discard quals that are implied by a partial index's
>      * predicate, but only in a plain SELECT; when scanning a target relation
>      * of UPDATE/DELETE/SELECT FOR UPDATE, we must leave such quals in the
>      * plan so that they'll be properly rechecked by EvalPlanQual testing.
> I believe that that problem applies for this optimization as well,
> and thus that you can only remove implied quals in plain SELECT.
> At least, if there's some reason why that problem does *not* apply,
> there had darn well better be a comment explaining why it's safe.

This is done in check_partial_indexes using parse_rowmark.

The problem haven't realized with the previous patch because it
(accidentially) used rel->baserestirictinfo, not
index->indrinfos for scan_clauses in create_scan_plan.

But the way how create_scan_plan gets scan_clauses seems
bad. I haven't have any clean idea to deal with it.

> * Adding indexrinfos to IndexPath seems unnecessary, since that struct
> already has the "index" pointer --- you can just get the field out of the
> IndexOptInfo when you need it.  If you insist on having the extra field,
> this patch is short of the threshold for correctness on adding fields to
> paths.  It missed _outIndexPath for instance.

Sorry for the stupid code. I don't insist to keep it. Removed.

> * The additional #include in costsize.c has no apparent reason.

Thank you for pointing out. Removed.

> * The changes in cost_index() really ought to come with some change
> in the associated comments.

I tried to add a comment but it doesn't seem clear.

> * Personally I'd not change the signature of
> match_restriction_clauses_to_index; that's just code churn that
> may have to get undone someday.

No problem, reverted.

> * The block comment in check_index_only needs more thought than this,
> as the phrase "The same is true" is now invalid; the "same" it refers
> to *isn't* the same anymore.

Maybe I took this "the same" wrongly. Tried to fix it but I'm not
confident on the result.

> * I'm not too thrilled with injecting the initialization of
> index->indrinfos into the initial loop in check_partial_indexes().
> If it stays there, I'd certainly expect the comment ahead of the
> loop to be changed to have something to do with reality.  But can't
> we find some more-appropriate place to initialize it?  Like maybe
> where the IndexOptInfo is first created?  I would not really expect
> check_partial_indexes() to have side-effects on non-partial indexes.

Mmm. That is quote right in general. IndexOptInfo is created in
get_relation_info() but baserestrictinfo has not been fixed at
the point. It is fixed as late as set_append_rel_size, almost
just before set_rel_size, and just before the
check_partial_indexes. But initializing indrinfos as a
side-effect of check_partial_indexes is not good as you pointed.
But it is called in two ways, set_tablesample_rel_size and
set_plain_rel_size. So the only possible position of that other
than check_partial_indexes is set_rel_size.


> * I think the double loop in check_partial_indexes() is too cute by half.
> I'd be inclined to just build the replacement list unconditionally while
> we do the predicate_implied_by() tests.  Those are expensive enough that
> saving one lappend per implication-test is a useless optimization,
> especially if it requires code as contorted and bug-prone as this.

Ok, I removed the too cute part and added comment mentioning the
reason for the unconditional replacement.

> * The comment added to IndexOptInfo is not very adequate, and not spelled
> correctly either.  There's a block comment you should be adding a para to
> (probably take the text you added for struct IndexPath).  

I understand that you are mentioning here.

+  List     *indrinfos;    /* baseristrict info which are not implied by
+                           * indpred */

I rewritten to make sense, maybe.

> And again,
> there is more work to do to add a field to such a struct, eg outfuncs.c.
> Usually a good way to find all the places to touch is to grep for some of
> the existing field names in the struct.

Sorry, I just forgot of that. (In spite that I myself give such
kind of comments..) Yeah, I love find-grep on emacs.

By the way, I found this comment in copyfuncs.c but I couldn't
find the "subsidiary structs".

|  * We don't support copying RelOptInfo, IndexOptInfo, or Path nodes.
|  * There are some subsidiary structs that are useful to copy, though.

Finally, all I added for this was one line in _outIndexOptInfo.

> * I don't much care for the field name "indrinfos"; it's neither very
> readable nor descriptive.  Don't have a better suggestion right now
> though.

I agree with you. I didn't like the name so I rethought that. I
followed the seeming rule that prefixing with 'ind' to the field
name, but it is not for index, but for the parent relation. So I
renamed it as "baserestrictinfo" in this version.

> * Not sure if new regression test cases would be appropriate.  The changes
> in the existing cases seem a bit unfortunate actually; I'm afraid that
> this may be defeating the original intent of those tests.

Only aggregates.out is modifed in this patch. The comment for the
test says that,

> --
> -- Test cases that should be optimized into indexscans instead of
> -- the generic aggregate implementation.
...
> -- try it on an inheritance tree
...
> explain (costs off)
>   select min(f1), max(f1) from minmaxtest;

and 

> -- DISTINCT doesn't do anything useful here, but it shouldn't fail
> explain (costs off)
>   select distinct min(f1), max(f1) from minmaxtest;

Utterly no problem from the point of the comment. Although this
patch removes "Index Cond"s for the index minmaxtest3i, it is
simplly caused by a index predicate on the index, which is the
very result of this patch.

> I'm setting this back to Waiting on Author.

Attached the new version v8.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Minor documentation tweak to GetForeignPlan documentation
Next
From: Etsuro Fujita
Date:
Subject: Re: Optimization for updating foreign tables in Postgres FDW