Re: Fast insertion indexes: why no developments - Mailing list pgsql-hackers

From Gavin Flower
Subject Re: Fast insertion indexes: why no developments
Msg-id 5277FB99.6040704@archidevsys.co.nz
In response to Re: Fast insertion indexes: why no developments  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Fast insertion indexes: why no developments  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 05/11/13 05:35, Robert Haas wrote:
> On Mon, Nov 4, 2013 at 11:32 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I think doing this outside of s_b will make stuff rather hard for
>> physical replication and crash recovery since we either will need to
>> flush the whole buffer at checkpoints - which is hard since the
>> checkpointer doesn't work inside individual databases - or we need to
>> persist the in-memory buffer across restart which also sucks.
> You might be right, but I think part of the value of LSM-trees is that
> the in-memory portion of the data structure is supposed to be able to
> be optimized for in-memory storage rather than on disk storage.  It
> may be that block-structuring that data bleeds away much of the
> performance benefit.  Of course, I'm talking out of my rear end here:
> I don't really have a clue how these algorithms are supposed to work.
>
How about having a 'TRANSIENT INDEX' that only exists in memory, so
there is no requirement to write it to disk or to replicate it directly?
This type of index would be very fast and easier to implement.  Recovery
would involve rebuilding the index, and sharing would involve recreating
it on a slave.  Probably not appropriate for a primary index, but it may
be okay for secondary indexes used to speed up specific queries.
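
To make the idea concrete, here is a rough sketch in Python (not
PostgreSQL code, and not a proposal for the actual implementation).
The names Table and TransientIndex are made up for illustration.  The
table stands in for durable storage, while the index lives only in
memory and is simply rebuilt from the table after a restart or on a
replica.

```python
import bisect


class Table:
    """Stand-in for a durable heap; assume these rows survive a crash."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)
        return len(self.rows) - 1  # position doubles as a row id


class TransientIndex:
    """Hypothetical memory-only secondary index: never written or replicated."""

    def __init__(self, table, column):
        self.table = table
        self.column = column
        self.entries = []  # sorted list of (key, row_id) pairs
        self.rebuild()

    def rebuild(self):
        # Recovery (or creation on a replica) is just a rescan of the table.
        self.entries = sorted(
            (row[self.column], rid) for rid, row in enumerate(self.table.rows)
        )

    def add(self, row_id):
        # Index maintenance on insert touches memory only.
        key = self.table.rows[row_id][self.column]
        bisect.insort(self.entries, (key, row_id))

    def lookup(self, key):
        # Row ids are >= 0, so (key, -1) sorts before every entry for key.
        pos = bisect.bisect_left(self.entries, (key, -1))
        result = []
        for k, rid in self.entries[pos:]:
            if k != key:
                break
            result.append(rid)
        return result
```

After a simulated crash the old index object is just thrown away and a
new TransientIndex(table, column) is created, which rescans the table.
The cost of that rescan is the price paid for skipping disk writes and
replication of the index itself.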

This could be useful in some situations now, and allow time to get
experience in how best to implement the basic concept.  Then a more
robust solution using WAL etc. can be developed later.

I suspect that such a TRANSIENT INDEX would still be useful even when a
more robust in-memory index method was available, as I expect it would
be faster to set up than a robust in-memory index.  That might be good
when you need to have one or more indexes for a short period of time, or
when the index is so small that recreating it requires very little time
(total elapsed time might even be less than for a disk-backed one?).


Cheers,
Gavin


