[HACKERS] Improving DISTINCT with LooseScan node - Mailing list pgsql-hackers

From Dmitriy Sarafannikov
Subject [HACKERS] Improving DISTINCT with LooseScan node
Date
Msg-id 110251505670189@web3o.yandex.ru
Whole thread Raw
Responses Re: [HACKERS] Improving DISTINCT with LooseScan node
Re: [HACKERS] Improving DISTINCT with LooseScan node
List pgsql-hackers
Hi hackers,
 
Everybody knows, that we have unefficient execution of query like "SELECT DISTINCT id from mytable"
if table has many-many rows and only several unique id values. Query plan looks like Unique + IndexScan.
 
I have tried to implement this feature in new type of node called Loose Scan.
This node must appears in plan together with IndexScan or IndexOnlyScan just like Unique node in this case.
But instead of filtering rows with equal values, LooseScan must retreive first row from IndexScan,
then remember and return this. With all subsequent calls LooseScan must initiate calling index_rescan via ExecReScan
with search value that > or < (depending on scan direction) of previous value.
Total cost of this path must be something like total_cost = n_distinct_values * subpath->startup_cost
What do you think about this idea?
 
I was able to create new LooseScan node, but i ran into difficulties with changing scan keys.
I looked (for example) on the ExecReScanIndexOnlyScan function and I find it difficult to understand
how i can reach the ioss_ScanKeys through the state of executor. Can you help me with this?
 
I also looked on the Nested Loop node, which as i think must change scan keys, but i didn't become clear for me.
The only thought that came to my head, that maybe i incorrectly create this subplan.
I create it just like create_upper_unique_plan, and create subplan for IndexScan via create_plan_recurse.
Can you tell me where to look or maybe somewhere there are examples?
 
Thanks
 
Regards,
Dmitriy Sarafannikov

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: [HACKERS] Range Merge Join v1
Next
From: David Fetter
Date:
Subject: Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation