Re: Next Steps with Hash Indexes - Mailing list pgsql-hackers

From Dilip Kumar
Subject Re: Next Steps with Hash Indexes
Date
Msg-id CAFiTN-vYxT1-F-cPqvWL=u0sTa1H8jnoZ8hkVAcPGmxBqQFG3w@mail.gmail.com
Whole thread Raw
In response to Re: Next Steps with Hash Indexes  (Sadhuprasad Patro <b.sadhu@gmail.com>)
Responses Re: Next Steps with Hash Indexes
List pgsql-hackers
On Thu, Sep 23, 2021 at 10:04 AM Sadhuprasad Patro <b.sadhu@gmail.com> wrote:
>
> > > One more thing to consider is that it seems that the planner requires
> > > a condition for the first column of an index before considering an
> > > indexscan plan. See Tom's email [1] in this regard. I think it would
> > > be better to see what kind of work is involved there if you want to
> > > explore a single hash value for all columns idea.
> > >
> > > [1] - https://www.postgresql.org/message-id/29263.1506483172%40sss.pgh.pa.us
> >
> > About this point, I will analyze further and update.
> >
>
> I have checked the planner code, there does not seem to be any
> complicated changes needed to cover if we take up a single hash value
> for all columns... Below are the major part of changes needed:
>
> In build_index_paths(), there is a check like, "if (index_clauses ==
> NIL && !index->amoptionalkey)", which helps to figure out if the
> leading column has any clause or not. This needs to be moved out of
> the loop and check for clauses on all key columns.
> With this we need to add a "amallcolumncluse" field to Index
> structure, which will be set to TRUE for HASH index and FALSE in other
> cases.

Right we can add an AM level option and based on that we can decide
whether to select the index scan if conditions are not given for all
the key columns.  And changes don't look that complicated.

>
> And to get the multi-column hash index selected, we may set
> enable_hashjoin =off, to avoid any condition become join condition,
> saw similar behaviors in other DBs as well...

This may be related to Tom's point that, if some of the quals are
removed due to optimization or converted to join quals, then now, even
if the user has given qual on all the key columns the index scan will
not be selected because we will be forcing that the hash index can
only be selected if it has quals on all the key attributes?

I don't think suggesting enable_hashjoin =off is a solution, this can
happen with merge join or the nested loop join with materialized node,
in all such cases join filter can not be pushed down to the inner node
because the outer node will not start to scan until we
materialize/sort/hash the inner node.  But yeah if we test this
behavior in other databases also and if it appeared that this is how
the hash index is being used then maybe this behavior can be
documented.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Amul Sul
Date:
Subject: Re: Deduplicate code updating ControleFile's DBState.
Next
From: Amit Kapila
Date:
Subject: Re: logical replication restrictions