Re: Fast insertion indexes: why no developments - Mailing list pgsql-hackers

From Merlin Moncure
Subject Re: Fast insertion indexes: why no developments
Date
Msg-id CAHyXU0yQPgCHF6siBhnmyLGUDx=k_k66Y6okEVaEw1PF29Y+LQ@mail.gmail.com
Whole thread Raw
In response to Fast insertion indexes: why no developments  (Leonardo Francalanci <m_lists@yahoo.it>)
Responses Re: Fast insertion indexes: why no developments  (Leonardo Francalanci <m_lists@yahoo.it>)
List pgsql-hackers
On Tue, Oct 29, 2013 at 2:53 AM, Leonardo Francalanci <m_lists@yahoo.it> wrote:
> Hi,
>
>
> I don't see much interest in insert-efficient indexes. These are the ones I've found:
>
> - LSM-tree (used by Cassandra and SQLite4?)
> - Y-Tree
(http://www.bossconsulting.com/oracle_dba/white_papers/DW%20in%20oracle/P23%20(ytree%20index%20structure%20for%20DWs).pdf
)
> - Fractal indexes (TokuDB, patented)
>
> While I understand that b*trees are still the best compromise in insertion/search speed, disk size, concurrency, and
morein general in OLTP workloads, they are useless when it comes to insertion in big data tables (>50M rows) of random
values(not ordered values). 
>
> I would like to know if the lack of development in this area (not only in Postgresql, but in databases in general) is
dueto: 
>
> 1) complex implementation
> 2) poor search performance
> 3) poor concurrency performance
> 4) not interesting for most users
> 5) something else???
>
> I thought this was going to change due to the fast-insertion speeds needs of "Social Applications", but only TokuDB
seemsto be the only "successful" player in the area (I don't know how much of it is due to good marketing). Most other
DBtechnology claims faster insertion speed (MongoDB and the like...) but in the end they rely on the old b*tree +
shardinginstead of using different indexing mechanisms (with the exception of Cassandra). 

Another point to add: I don't really see btree as a barrier to
performance for most of the problems I face.  The real barriers to
database performance are storage, contention, and query planning.
Postgres btreee indexes are pretty fast and for stuff like bulk
insertions there are some optimization techniques available (such as
sharding or create index concurrently).

Stuff I'd like to see in terms of postgres indexing:
*) faster wal logged hash index
*) composite gist/gin
*) faster gist/gin (to the extent that it's possible).

merlin



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Fast insertion indexes: why no developments
Next
From: Andres Freund
Date:
Subject: Re: CLUSTER FREEZE