Re: Google Summer of Code: question about GiST API advancement project - Mailing list pgsql-hackers

From Andrey Borodin
Subject Re: Google Summer of Code: question about GiST API advancement project
Date
Msg-id 07514C22-17F0-4E6F-AA06-6F3B92C8B6F7@yandex-team.ru
In response to Google Summer of Code: question about GiST API advancement project  (GUO Rui <ruig2@uci.edu>)
Responses Re: Google Summer of Code: question about GiST API advancement project
List pgsql-hackers
Hi!

> On 31 March 2019, at 14:58, GUO Rui <ruig2@uci.edu> wrote:
>
> I'm Rui Guo, a PhD student focusing on database at the University of California, Irvine. I'm interested in the "GiST
> API advancement" project for the Google Summer of Code 2019 which is listed at
> https://wiki.postgresql.org/wiki/GSoC_2019#GiST_API_advancement_.282019.29.
>
> I'm still reading about RR*-tree, GiST and the PostgreSQL source code to have a better idea on my proposal.
> Meanwhile, I have a very basic and simple question:
>
> Since the chooseSubtree() algorithm in both R*-tree and RR*-tree are heuristic and somehow greedy (e.g. pick the MBB
> that needs to enlarge the least), is it possible to apply machine learning algorithm to improve it? The only related
> reference I got is to use deep learning in database join operation (https://arxiv.org/abs/1808.03196). Is it not
> suitable to use machine learning here or someone already did?

If you are interested in ML and DBs, you should definitely look into [0]. You do not have to base your proposal on
mentor ideas; you can use your own. Implementing learned indexes seems reasonable.

RR*-tree algorithms are heuristic in some specific parts, but in general they are designed to optimize very clear
metrics. Generally, ML algorithms tend to compose a much bigger pile of heuristics and solve less mathematically clear
tasks than splitting subtrees or choosing a subtree for insertion.
R*-tree algorithms are heuristic only to be faster.
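
For illustration, here is a minimal sketch of the greedy rule mentioned in the question ("pick the MBB that needs to enlarge the least"). It is the classic least-area-enlargement criterion with ties broken by smaller area. It is not the full R*-tree or RR*-tree ChooseSubtree, and it is not PostgreSQL's GiST code. The names Rect, enlargement and choose_subtree are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Rect:
    """Axis-aligned 2D minimum bounding box (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return (self.xmax - self.xmin) * (self.ymax - self.ymin)

    def union(self, other: "Rect") -> "Rect":
        # Smallest rectangle that covers both self and other.
        return Rect(
            min(self.xmin, other.xmin),
            min(self.ymin, other.ymin),
            max(self.xmax, other.xmax),
            max(self.ymax, other.ymax),
        )


def enlargement(mbb: Rect, new: Rect) -> float:
    """Extra area mbb would need in order to also cover new."""
    return mbb.union(new).area() - mbb.area()


def choose_subtree(children: Sequence[Rect], new: Rect) -> int:
    """Return the index of the child MBB needing the least enlargement.

    Ties are broken in favour of the child with the smaller area.
    """
    return min(
        range(len(children)),
        key=lambda i: (enlargement(children[i], new), children[i].area()),
    )
```

The metric being minimized here is explicit (added area), which is the sense in which the choice is a cheap approximation of a well-defined objective rather than an open-ended guess.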

Best regards, Andrey Borodin.

[0] https://arxiv.org/pdf/1712.01208.pdf

