Re: Index Skip Scan - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Index Skip Scan
Date
Msg-id CAMkU=1z0TzA8sBEoNvV9YB_W-u4Mf75b7zUiZvuAmoQqVTx1gg@mail.gmail.com
In response to Re: Index Skip Scan  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Index Skip Scan
List pgsql-hackers
On Thu, Jan 31, 2019 at 1:32 AM Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
Hello.

At Wed, 30 Jan 2019 18:19:05 +0100, Dmitry Dolgov <9erthalion6@gmail.com> wrote in <CA+q6zcVP18wYiO=aa+fz3GuncuTF52q1sufB7ise37TJPSDK1w@mail.gmail.com>
> A bit of adjustment after nodes/relation -> nodes/pathnodes.

I had a look at this.

The name "index skip scan" refers to a different feature from the
one with that name in other products, which means "index scan
with a postfix key (mainly of a multi-column key) that scans
ignoring the prefix part". As Thomas suggested, I'd suggest that
we call it "index hop scan". (I can accept Hopscotch, too :p)

I think that what we have proposed here is just an incomplete implementation of what other products call a skip scan, not a fundamentally different thing.  They don't ignore the prefix part; they use it in a way that cancels itself out, giving the same answer but faster.  I think they would also use this skip method to get distinct values if that is what is requested.  But they would go beyond that and also use it to do something similar to the plan we get with this:

Set up:
pgbench -i -s50
create index on pgbench_accounts (bid, aid);
alter table pgbench_accounts drop constraint pgbench_accounts_pkey ;

Query: 
explain analyze with t as (select distinct bid from pgbench_accounts ) 
  select pgbench_accounts.* from pgbench_accounts join t using (bid) where aid=5;

If we accept this patch, I hope it will be expanded in the future to give performance similar to that of the above query even when the query is written in its more natural form:

explain analyze select * from pgbench_accounts where aid=5;

(which currently takes 200ms, rather than the 0.9ms taken for the one benefiting from skip scan)
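
To make the idea concrete, here is a minimal Python sketch of the access pattern the CTE query above emulates. It is not PostgreSQL code and not what the patch implements. It assumes the (bid, aid) index can be modelled as a sorted list of tuples, and the names distinct_prefixes, skip_scan and toy_index are hypothetical, chosen only for illustration. The first function hops from one distinct bid to the next without stepping through the duplicates; the second does one probe per distinct bid to find the requested aid.

```python
from bisect import bisect_left, bisect_right

def distinct_prefixes(index):
    """Yield each distinct leading-column value, hopping over its duplicates."""
    pos = 0
    while pos < len(index):
        bid = index[pos][0]
        yield bid
        # Jump past every entry sharing this bid instead of stepping through them.
        pos = bisect_right(index, (bid, float("inf")), lo=pos)

def skip_scan(index, aid):
    """Return entries matching aid, using one index probe per distinct bid."""
    matches = []
    for bid in distinct_prefixes(index):
        pos = bisect_left(index, (bid, aid))
        if pos < len(index) and index[pos] == (bid, aid):
            matches.append(index[pos])
    return matches

# Toy stand-in for the pgbench_accounts index: 3 branches, 10 accounts each.
toy_index = [((aid - 1) // 10 + 1, aid) for aid in range(1, 31)]
print(skip_scan(toy_index, 5))
```

With few distinct bid values, the number of probes grows with the number of branches rather than the number of rows, which is where the gap between the two timings above would come from.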

I don't think we should give it a different name, just because our initial implementation is incomplete.

Or do you think our implementation of one feature does not really get us closer to implementing the other?

Cheers,

Jeff
