Re: Fast insertion indexes: why no developments - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Fast insertion indexes: why no developments
Date
Msg-id CAMkU=1zE3T+kswVEaS6=UqOTzD7rm7cBpSdFnFHP82AhOq__AA@mail.gmail.com
In response to Re: Fast insertion indexes: why no developments  (Leonardo Francalanci <m_lists@yahoo.it>)
Responses Re: Fast insertion indexes: why no developments  (Leonardo Francalanci <m_lists@yahoo.it>)
List pgsql-hackers
On Wed, Oct 30, 2013 at 3:35 AM, Leonardo Francalanci <m_lists@yahoo.it> wrote:
>> Presumably the data you are inserting isn't actually random. Please
>> describe the use case you are considering in more detail and some view
>> on how frequent that is, with some examples. Once we understand the
>> use case and agree it is important, we might solve problems.


> Collecting call data for mobile network operators (and no, I don't work for the NSA...)
> Easily 5000-10000 inserts per second. Indexes on timestamp and ID (not a problem, always increasing, so no btree issues) and on called #, calling #, IMSI, IMEI. The last four are obviously random, drawn from millions of possible values.
> After the first few million records, the disks can't keep up with the amount of random writing in the indexes.
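
For concreteness, a minimal sketch of the kind of table being described (the table, column, and index definitions are hypothetical, inferred only from the description above):

CREATE TABLE cdr (
    id          bigserial PRIMARY KEY,   -- always increasing, btree-friendly
    call_time   timestamptz NOT NULL,    -- always increasing, btree-friendly
    called_no   text NOT NULL,           -- effectively random, millions of distinct values
    calling_no  text NOT NULL,
    imsi        text NOT NULL,
    imei        text NOT NULL
);

CREATE INDEX ON cdr (call_time);
-- The four indexes below are the ones that generate scattered writes,
-- because consecutive inserts land on unrelated leaf pages.
CREATE INDEX ON cdr (called_no);
CREATE INDEX ON cdr (calling_no);
CREATE INDEX ON cdr (imsi);
CREATE INDEX ON cdr (imei);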

So, like, 3 minutes' worth?  How much RAM and shared_buffers do you have?  The index insertions should be fast until the size of the active part of the indexes being inserted into exceeds shared_buffers by some amount (how much depends on how much dirty data the kernel is willing to allow in the page cache before it starts suffering anxiety about it).  If you have enough shared_buffers to make that last for 15 minutes, then you shouldn't have a problem inserting with live indexes.
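
A rough way to check where a given setup stands (a sketch; "cdr" is the hypothetical table from the sketch above, and the Linux settings are named only to show which knobs are involved, not as recommended values):

-- Compare the total size of the table's indexes against shared_buffers.
SHOW shared_buffers;
SELECT pg_size_pretty(pg_indexes_size('cdr'));

-- On Linux, how much dirty data the kernel tolerates in the page cache is
-- governed by settings such as these (shell command, not SQL):
--   sysctl vm.dirty_background_ratio vm.dirty_ratio

Once the random-key indexes grow well past what shared_buffers (plus the kernel's dirty-page allowance) can hold, each insert has to read and dirty leaf pages scattered across the whole index, which is when the random-write behaviour described above shows up.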
 
Cheers,

Jeff
