Re: [WIP PATCH] Index scan offset optimisation using visibility map - Mailing list pgsql-hackers

From Michail Nikolaev
Subject Re: [WIP PATCH] Index scan offset optimisation using visibility map
Date
Msg-id CANtu0oiNi5haJyktWQ8bn80pJT3uAo_R9FvRODJNo491-iNU+Q@mail.gmail.com
Whole thread Raw
In response to Re: [WIP PATCH] Index scan offset optimisation using visibility map  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: [WIP PATCH] Index scan offset optimisation using visibility map  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
Hello.

Thanks a lot for your feedback. I'll try to update patch in few days (currently stuck at small performance regression in unknown place).

Regarding issue with delete: yes, it is valid point, but record removing should clear visibility buffer - and tuple will be fetched from heap to test its existance. In such case expression are not evaluated at all. Not sure for delete and query in same transaction - I'll check. 
Also, need to recheck possible issues with EvalPlanQual.

PS.



сб, 14 июл. 2018 г. в 0:17, Heikki Linnakangas <hlinnaka@iki.fi>:
On 21/05/18 18:43, Michail Nikolaev wrote:
> Hello everyone.
> This letter related to “Extended support for index-only-scan” from my
> previous message in the thread.
>
> WIP version of the patch is ready for a while now and I think it is time to
> resume the work on the feature. BTW, I found a small article about Oracle
> vs Postgres focusing this issue -
> https://blog.dbi-services.com/postgres-vs-oracle-access-paths-viii/
>
> Current WIP version of the patch is located here -
> https://github.com/postgres/postgres/compare/88ba0ae2aa4aaba8ea0d85c0ff81cc46912d9308...michail-nikolaev:index_only_fetch,
> passing all checks. In addition, patch includes small optimization for
> caching of amcostestimate results.

Please submit an actual path, extracted e.g. with "git format-patch
-n1", rather than a link to an external site. That is a requirement for
archival purposes, so that people reading the email archives later on
can see what was being discussed. (And that link doesn't return a proper
diff, anyway.)

> For now, I decide to name the plan as “Index Only Fetch Scan”. Therefore:
> * In case of “Index Scan” – we touch the index and heap for EVERY tuple we
> need to test
> * For “Index Only Scan” – we touch the index for every tuple and NEVER
> touch the heap
> * For “Index Only Fetch Scan” – we touch the index for every tuple and
> touch the heap for those tuples we need to RETURN ONLY.

Hmm. IIRC there was some discussion on doing that, when index-only scans
were implemented. It's not generally OK to evaluate expressions based on
data that has already been deleted from the table. As an example of the
kind of problems you might get:

Imagine that a user does "DELETE * FROM table WHERE div = 0; SELECT *
FROM table WHERE 100 / div < 10". Would you expect the query to throw a
"division by zero error"? If there was an index on 'div', you might
evaluate the "100 / div" expression based on values from the index,
which still includes entries for the just-deleted tuples with div = 0.
They would be filtered out later, after performing the visibility
checks, but it's too late if you already threw an error.

Now, maybe there's some way around that, but I don't know what. Without
some kind of a solution, this won't work.

- Heikki

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Fix some error handling for read() and errno
Next
From: Ashutosh Bapat
Date:
Subject: Re: partition pruning doesn't work with IS NULL clause in multikeyrange partition case