Alvaro Herrera <alvherre@commandprompt.com> writes:
> The problem is, most likely, in updating the indexes. Heap inserts
> should always take more or less the same time, but index insertion
> requires walking down the index structure for each insert, and the
> path to walk gets longer the more data you have.
It's worse than that: his test case inserts randomly ordered keys, which
means that there's no locality of access during the index updates. Once
the indexes get bigger than RAM, update speed goes into the toilet,
because the working set of index pages that need to be touched also
is bigger than RAM. That effect is going to be present in *any*
standard-design database, not just Postgres.
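
To put some flesh on that, here's a toy comparison (the table names
and row counts are invented for illustration, not taken from the
original test case); once the index outgrows RAM, the random-key case
is the one that collapses:

    CREATE TABLE t_seq  (k bigint, payload text);
    CREATE INDEX t_seq_k_idx  ON t_seq (k);
    CREATE TABLE t_rand (k bigint, payload text);
    CREATE INDEX t_rand_k_idx ON t_rand (k);

    -- Sequential keys: every insert descends to the same rightmost
    -- leaf pages, so the hot part of the index stays cached.
    INSERT INTO t_seq
        SELECT g, 'x' FROM generate_series(1, 10000000) g;

    -- Random keys: each insert lands on an arbitrary leaf page, so
    -- the working set is effectively the whole index.
    INSERT INTO t_rand
        SELECT (random() * 1e15)::bigint, 'x'
        FROM generate_series(1, 10000000) g;

The heap insertion work is essentially identical in both cases; it's
the index maintenance that behaves so differently.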
It's possible that performance in a real-world situation would be
better if the incoming data stream isn't so random, but it's hard to
tell from the facts given.
One possibly useful trick is to partition the data by timestamp with
partition sizes chosen so that the indexes don't get out of hand.
But the partition management might be enough of a PITA to negate
any win.
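
Roughly, and with made-up table and column names, the inheritance-style
version of that looks something like this:

    -- Parent table; each child holds one month, so each child's
    -- indexes stay small enough to live in RAM.
    CREATE TABLE readings (
        ts      timestamptz NOT NULL,
        key     bigint      NOT NULL,
        payload text
    );

    CREATE TABLE readings_2006_01 (
        CHECK (ts >= '2006-01-01' AND ts < '2006-02-01')
    ) INHERITS (readings);
    CREATE INDEX readings_2006_01_key_idx ON readings_2006_01 (key);

    -- Inserts have to be routed to the right child (by the
    -- application, or by a trigger/rule), and somebody has to
    -- remember to create next month's child ahead of time; that's
    -- the management overhead.
    INSERT INTO readings_2006_01 VALUES ('2006-01-15 12:00', 42, '...');

With constraint_exclusion turned on, queries that constrain ts will
skip the children whose CHECK constraints rule them out, so reads
don't have to pay for the extra tables.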
regards, tom lane