Re: [WIP PATCH] Index scan offset optimisation using visibility map - Mailing list pgsql-hackers

From Michail Nikolaev
Subject Re: [WIP PATCH] Index scan offset optimisation using visibility map
Date
Msg-id CANtu0ohuBMXimT-qriqZMcrch+4qdJ1HxnW8BKBqcdER39M3Qw@mail.gmail.com
Whole thread Raw
In response to Re: [WIP PATCH] Index scan offset optimisation using visibility map  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [WIP PATCH] Index scan offset optimisation using visibility map  (Michail Nikolaev <michail.nikolaev@gmail.com>)
List pgsql-hackers
Hello.

Tom, thanks a lot for your thorough review.

> What you've done to
> IndexNext() is a complete disaster from a modularity standpoint: it now
> knows all about the interactions between index_getnext, index_getnext_tid,
> and index_fetch_heap.

I was looking into the current IndexOnlyNext implementation as a starting point - and it knows about index_getnext_tid and index_fetch_heap already.
At the same time I was trying to keep patch non-invasive.
Patched IndexNext now only knowns about index_getnext_tid and index_fetch_heap - the same as IndexOnlyNext.
But yes, it probably could be done better.

> I'm not sure about a nicer way to refactor that, but I'd suggest that
> maybe you need an additional function in indexam.c that hides all this
> knowledge about the internal behavior of an IndexScanDesc.

I'll try to split index_getnext into two functions. A new one will do everything index_getnext does except index_fetch_heap.

> Or that is, it knows too much and still not enough,
> because it's flat out wrong for the case that xs_continue_hot is set.
> You can't call index_getnext_tid when that's still true from last time.

Oh.. Yes, clear error here.

< The PredicateLockPage call also troubles me quite a bit, not only from
< a modularity standpoint but because that implies a somewhat-user-visible
< behavioral change when this optimization activates. People who are using
< serializable mode are not going to find it to be an improvement if their
< queries fail and need retries a lot more often than they did before.
< I don't know if that problem is bad enough that we should disable skipping
< when serializable mode is active, but it's something to think about.

Current IndexOnlyScan already does that. And I think a user should expect such a change in serializable mode.

> You haven't documented the behavior required for tuple-skipping in any
> meaningful fashion, particularly not the expectation that the child plan
> node will still return tuples that just need not contain any valid
> content.

Only nodeLimit could receive such tuples and they are immediately discarded. I'll add some comment to it.

> is a gross hack and probably wrong. You could use ExecStoreAllNullTuple,
> perhaps.

Oh, nice, missed it.

> I'm disturbed by the fact that you've not taught the planner about the
> potential cost saving from this, so that it won't have any particular
> reason to pick a regular indexscan over some other plan type when this
> optimization applies. Maybe there's no practical way to do that, or maybe
> it wouldn't really matter in practice; I've not looked into it. But not
> doing anything feels like a hack.

I was trying to do it. But current planner architecture does not provide a nice way to achive it due to the way it handles limit and offset.
So, I think it is better to to be avoided for now.

> Setting this back to Waiting on Author.

I'll try to make the required changes in a few days.

Thanks.

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: PATCH: Unlogged tables re-initialization tests
Next
From: Tom Lane
Date:
Subject: Re: PATCH: Configurable file mode mask