Home > mailing lists

Re: LSM tree for Postgres - Mailing list pgsql-hackers

From	Dmitry Dolgov
Subject	Re: LSM tree for Postgres
Date	August 5, 2020 11:09:38
Msg-id	20200805110938.hwgxtwyzo5wvy6sc@localhost Whole thread
In response to	LSM tree for Postgres (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses	Re: LSM tree for Postgres
List	pgsql-hackers

Tree view

> On Tue, Aug 04, 2020 at 11:22:13AM +0300, Konstantin Knizhnik wrote:
>
> Then I think about implementing ideas of LSM using standard Postgres
> nbtree.
>
> We need two indexes: one small for fast inserts and another - big
> (main) index. This top index is small enough to fit in memory so
> inserts in this index are very fast.  Periodically we will merge data
> from top index to base index and truncate the top index. To prevent
> blocking of inserts in the table while we are merging indexes we can
> add ... on more index, which will be used during merge.
>
> So final architecture of Lsm3 is the following: two top indexes used
> in cyclic way and one main index. When top index reaches some
> threshold value we initiate merge with main index, done by bgworker
> and switch to another top index.  As far as merging indexes is done in
> background, it doesn't  affect insert speed.  Unfortunately Postgres
> Index AM has not bulk insert operation, so we have to perform normal
> inserts.  But inserted data is already sorted by key which should
> improve access locality and partly solve random reads problem for base
> index.
>
> Certainly to perform search in Lsm3 we have to make lookups in all
> three indexes and merge search results.

Thanks for sharing this! In fact this reminds me more of partitioned
b-trees [1] (and more older [2]) rather than LSM as it is (although
could be that the former was influenced by the latter). What could be
interesting is that quite often in these and many other whitepapers
(e.g. [3]) to address the lookup overhead the design includes bloom
filters in one or another way to avoid searching not relevant part of an
index. Tomas mentioned them in this thread as well (in the different
context), probably the design suggested here could also benefit from it?

[1]: Riegger Christian, Vincon Tobias, Petrov Ilia. Write-optimized
indexing with partitioned b-trees. (2017). 296-300. 10.1145/3151759.3151814.
[2]: Graefe Goetz. Write-Optimized B-Trees. (2004). 672-683.
10.1016/B978-012088469-8/50060-7.
[3]: Huanchen Zhang, David G. Andersen, Andrew Pavlo, Michael Kaminsky,
Lin Ma, and Rui Shen. Reducing the Storage Overhead of Main-Memory OLTP
Databases with Hybrid Indexes. (2016). 1567–1581. 10.1145/2882903.2915222.

pgsql-hackers by date:

From: Masahiko Sawada
Date: 05 August 2020, 09:56:35
Subject: Re: Add MAIN_RELATION_CLEANUP and SECONDARY_RELATION_CLEANUP options to VACUUM

From: Robert Haas
Date: 05 August 2020, 11:19:15
Subject: Re: display offset along with block number in vacuum errors

Re: LSM tree for Postgres - Mailing list pgsql-hackers

Previous

Next