Re: Get more from indices. - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Get more from indices.
Date
Msg-id 5212.1397599817@sss.pgh.pa.us
Whole thread Raw
In response to Re: Get more from indices.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Get more from indices.
List pgsql-hackers
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> [ pathkey_and_uniqueindx_v10_20130411.patch ]

I thought some more about this patch, and realized that it's more or less
morally equivalent to allowing references to ungrouped variables when the
query has a GROUP BY clause listing all the columns of the primary key.
In that case the parser is effectively pretending that the GROUP BY list
contains additional implicit entries that are functionally dependent on
the entries that are actually there.  In this patch, what we want to do
is recognize that trailing entries in an ORDER BY list are semantically
no-ops and can be ignored because they are functionally dependent on
earlier entries.

Now, the reason that the parser restricts the functional dependency
deduction to a primary key is that it wants to be able to identify a
constraint OID that the query is dependent on to be semantically valid.
In this case, we don't need such an OID, so just finding any old unique
index on not-null columns is good enough.  (If someone drops the index,
the optimization might become incorrect, but that would force replanning
anyway.)

However, this way of thinking about it shows that the patch is missing
possible optimizations.  If we have "ORDER BY a, b, c" and (a,b) is the
primary key, then including c in the ORDER BY list is semantically
redundant, *whether or not we use an indexscan on the pkey index at all*.
More: if we have "ORDER BY a, b, c" and the primary key is (b,a), we
can still discard c from the sort requirement, even though the pkey
index as such isn't helpful for producing the required order.

So hacking up the pathkeys attributed to the indexscan is the wrong thing.
Rather, what we should be looking to do is decide that c is a useless
pathkey and remove it from the query_pathkeys, much as we'd do if we found
"c = constant" in WHERE.  That would allow optimization of other query
plans besides scan-the-pkey-index plans.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Question about optimising (Postgres_)FDW
Next
From: Andreas 'ads' Scherbaum
Date:
Subject: Patch: iff -> if