Re: LSM tree for Postgres - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: LSM tree for Postgres
Date
Msg-id CAPpHfdvaY3jCADz6nX_sODw2=CC5Y99k3COsdH0=NkCVGKisUA@mail.gmail.com
Whole thread Raw
In response to Re: LSM tree for Postgres  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: LSM tree for Postgres
List pgsql-hackers
ср, 5 авг. 2020 г., 09:13 Konstantin Knizhnik <k.knizhnik@postgrespro.ru>:
> Concerning degrade of basic index - B-Tree itself is balanced tree. Yes,
> insertion of random keys can cause split of B-Tree page.
> In the worst case half of B-Tree page will be empty. So B-Tree size will
> be two times larger than ideal tree.
> It may cause degrade up to two times. But that is all. There should not
> be infinite degrade of speed tending to zero.

My concerns are not just about space utilization.  My main concern is
about the order of the pages.  After the first merge the base index
will be filled in key order.  So physical page ordering perfectly
matches their logical ordering.  After the second merge some pages of
base index splits, and new pages are added to the end of the index.
Splits also happen in key order.  So, now physical and logical
orderings match within two extents corresponding to first and second
merges, but not within the whole tree.  While there are only few such
extents, disk page reads may in fact be mostly sequential, thanks to
OS cache and readahead.  But finally, after many merges, we can end up
with mostly random page reads.  For instance, leveldb doesn't have a
problem of ordering degradation, because it stores levels in sorted
files.

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: Pierre Giraud
Date:
Subject: [PG13] Planning (time + buffers) data structure in explain plan (format text)
Next
From: Julien Rouhaud
Date:
Subject: Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)