Re: very very slow inserts into very large table - Mailing list pgsql-performance

From Craig Ringer
Subject Re: very very slow inserts into very large table
Date
Msg-id 5004DC70.7000808@ringerc.id.au
Whole thread Raw
In response to Re: very very slow inserts into very large table  (Jon Nelson <jnelson+pgsql@jamponi.net>)
Responses Re: very very slow inserts into very large table  (Ants Aasma <ants@cybertec.at>)
List pgsql-performance
On 07/17/2012 01:56 AM, Jon Nelson wrote:
> What is the greater lesson to take away, here? If you are working with
> data that is larger (substantially larger) than available memory, is
> the architecture and design of postgresql such that the only real
> approach is some type of data partitioning? It is not my intent to
> insult or even disparage my favorite software, but it took less time
> to *build* the indices for 550GB of data than it would have to insert
> 1/20th as much. That doesn't seem right.

To perform reasonably well, Pg would need to be able to defer index
updates when bulk-loading data in a single statement (or even
transaction), then apply them when the statement finished or transaction
committed. Doing this at a transaction level would mean you'd need a way
to mark indexes as 'lazily updated' and have Pg avoid using them once
they'd been dirtied within a transaction. No such support currently
exists, and it'd be non-trivial to implement, especially since people
loading huge amounts of data often want to do it with multiple
concurrent sessions. You'd need some kind of 'DISABLE INDEX' and 'ENABLE
INDEX' commands plus a transactional backing table of pending index updates.

Not simple.


Right now, Pg is trying to keep the index consistent the whole time.
That involves moving a heck of a lot of data around - repeatedly.

Setting a lower FILLFACTOR on your indexes can give Pg some breathing
room here, but only a limited amount, and at the cost of reduced scan
efficiency.

--
Craig Ringer

pgsql-performance by date:

Previous
From: Claudio Freire
Date:
Subject: Re: very very slow inserts into very large table
Next
From: Satoshi Nagayasu
Date:
Subject: Re: very very slow inserts into very large table