Re: ML-based indexing ("The Case for Learned Index Structures", a paper from Google) - Mailing list pgsql-hackers

From	Stefan Keller
Subject	Re: ML-based indexing ("The Case for Learned Index Structures", a paper from Google)
Date	April 21, 2021 08:52:19
Msg-id	CAFcOn29OBRezBGVHTZJkhpAGQJ9ZADyR_NhBYBtsgrTTW8rF=g@mail.gmail.com
In response to	Re: ML-based indexing ("The Case for Learned Index Structures", a paper from Google) (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: ML-based indexing ("The Case for Learned Index Structures", a paper from Google)
List	pgsql-hackers

On Tue, Apr 20, 2021 at 23:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There's enough support these days that you can build a new index
> type as an extension, without touching the core code at all.

Thanks. I'm ramping up knowledge about extending PG with C++.

I'd still like to understand, in principle, what an index has to
do with concurrency control, so that I can separate
concerns/responsibilities in the code.

On Tue, Apr 20, 2021 at 23:51, Peter Geoghegan <pg@bowt.ie> wrote:
> It's easy for me to be a skeptic

Isn't being a skeptic a requirement for all of us as db engineers :-)

> but much less uncertainty about the outcome of alternative projects of comparable difficulty

Oh. As mentioned above I'm trying to get an overview of indices. So,
if you have hints about other new indexes (like PGM, VODKA for
text/ts, or Hippo), I'm interested.

 ~Stefan

On Tue, Apr 20, 2021 at 23:51, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Tue, Apr 20, 2021 at 2:29 PM Stefan Keller <sfkeller@gmail.com> wrote:
> > Just for the record: A learned index has no more foreknowledge about
> > the dataset than other indices.
>
> Maybe. ML models are famously prone to over-interpreting training
> data. In any case I am simply not competent to assess how true this
> is.
>
> > I'd give learned indexes at least a chance to provide a
> > proof-of-concept. And I want to learn more about the requirements to
> > be accepted as a new index (before undergoing months of code
> > sprints).
>
> I have everything to gain and nothing to lose by giving them a chance
> -- I'm not required to do anything to give them a chance, after all. I
> just want to be clear that I'm a skeptic now rather than later. I'm
> not the one making a big investment of my time here.
>
> > As you may have seen, the "Stonebraker paper" I cited [1] is also
> > sceptical, requiring full parity on features (like "concurrency control,
> > recovery, non main memory, and multi-user settings")! Non main memory
> > code I understand.
> > => But index read/write operations and multi-user settings are part of
> > a separate piece of software (the transaction manager), aren't they?
>
> It's easy for me to be a skeptic -- again, what do I have to lose by
> freely expressing my opinion? Mostly I'm just saying that I wouldn't
> work on this because ISTM that there is significant uncertainty about
> the outcome, but much less uncertainty about the outcome of
> alternative projects of comparable difficulty. That's fundamentally
> how I assess what to work on. There is plenty of uncertainty on my end
> -- but that's beside the point.
>
> --
> Peter Geoghegan
