Home > mailing lists

Re: Fast insertion indexes: why no developments - Mailing list pgsql-hackers

From	Leonardo Francalanci
Subject	Re: Fast insertion indexes: why no developments
Date	November 5, 2013 14:18:48
Msg-id	1383650322574-5776982.post@n5.nabble.com Whole thread Raw
In response to	Re: Fast insertion indexes: why no developments (Simon Riggs <simon@2ndQuadrant.com>)
List	pgsql-hackers

Tree view

Simon Riggs wrote
> On 5 November 2013 09:57, Leonardo Francalanci <

> m_lists@

> > wrote:
>> While I do believe in testing (since "In theory there is no difference
>> between theory and practice. In practice there is"), I would like to know
>> the "properties" of the minmax index before trying it.
>> What is it supposed to be good at? What are the pros/cons? We can't ask
>> all
>> the users to just "try" the index and see if it works for them.
> 
> No, but then all users aren't suggesting we need a new index type are
> they?
> 
> I think its reasonable for you to spend time checking whether what you
> require will be met, and if not, what precise query doesn't it help,
> so we can better design any future new-index.

I don't understand the parallel "We can't ask all the users to just try the
index and see if it works for them" with "all users aren't suggesting we
need a new index type". Anyway:

I'm not suggesting we need a new index type. Please read my first post: I'm
asking info, fearing that there's just a lot of marketing/hype in those
indexes.

What do you mean by "spend time checking whether what you require will be
met"? Met by what, minmax indexes?

Simon Riggs wrote
>> As I said, my understanding is that is very efficient (both in insertion
>> and
>> in searching) when data is somehow ordered in the table. But maybe I got
>> it
>> wrong...
> 
>> Anyway, the sql scenario is simple: a table with 4 bigint indexes; data
>> in
>> the fields is a random bigint in the range 1-10000000. Insertion is 5-10K
>> rows/second. One search every 1-5 seconds, by one of the indexed fields
>> (only equality, no range). There's also an index on a timestamp field,
>> but
>> that's not random so it doesn't give that many problems (it's actually
>> the
>> one where I wanted to try the minmax).
> 
> Without meaning to pick on you, imprecise analysis yields unhelpful
> new features. The clearer we are about what we are trying to solve the
> more likely we are to solve it well. 30 seconds analysis on what is
> needed is not sufficient to justify an additional man year of
> development, especially if a man year of work has already been done
> and the testing of the latest feature is now at hand.

I've never said that my analysis justifies a man year of work. As I said,
I'm actually not confident at all that even if we had those "cool" indexes
they would work on my scenario (marketing aside, there's not that much
data/tests out there on those indexes). I just wanted to know if the matter
was discussed in the past / getting more info.

At the same time, I'm reluctant to try a new index hoping it will work in my
case just because it's new and a man year of work has already been done.
Again: what's this minmax index supposed to be good at?
If it's indexing data in mostly-sequential order, it won't help me. From
what I got (maybe I got it wrong?) the index stores min/max values of
sequence of pages. In my case I guess that those min/max values would be
close to 1 (min) /10000000 (max), because I insert data in random order. So
any query will scan the entire table anyway. Am I wrong?

Simon Riggs wrote
> The requests from the indexes field, are they ordered? 

Mmmh, I don't think I understand the question... an operator searches for
calls made by a user, so he searches in random order... 

Simon Riggs wrote
> are they
> limited? Or you really want ALL calls? What is the tolerance on those?

I want all the calls made by the user. But there won't be many calls for 1
user.
And most users will never be queried (as I said, one "calls by person"
search every 1-5 seconds, so a very small percentage of calls will ever be
queried/retrieved)

--
View this message in context:
http://postgresql.1045698.n5.nabble.com/Fast-insertion-indexes-why-no-developments-tp5776227p5776982.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

pgsql-hackers by date:

From: Simon Riggs
Date: 05 November 2013, 13:42:31
Subject: Re: Fast insertion indexes: why no developments

From: Soroosh Sardari
Date: 05 November 2013, 14:21:52
Subject: Crash recovery

Re: Fast insertion indexes: why no developments - Mailing list pgsql-hackers

Previous

Next