Thread: ambulkinsert
Do we need a new API (ambulkinsert) to support optimized bulk insertion into indexes?

Browsing the mailing lists I see people trying to improve bulk loading into indexes. One approach is to side-step WAL, but others have looked at alternative indexing methods.

In my application, insertion speed is more important than query speed. There are known index methods that optimize insertion speed, but none are available for PostgreSQL, AFAIK. Worse, there are no APIs to support bulk insertion into indexes with COPY.

In the mailing lists:

* A request for non-WAL bulk loading of an index
  http://archives.postgresql.org/pgsql-general/2008-08/msg00035.php
* A proposal to integrate an existing non-WAL, non-indexing loading tool
  http://archives.postgresql.org/pgsql-hackers/2008-02/msg00811.php
  http://pgbulkload.projects.postgresql.org/
* Special-purpose indexes with bulk loading features
  http://archives.postgresql.org/pgsql-hackers/2007-07/msg00918.php

B-tree alternatives for bulk insertion:

* A benchmark paper (compares six methods)
  http://www.springerlink.com/content/e0495h4744462rk7/
* A survey paper
  http://citeseer.ist.psu.edu/vitter00external.html
* A randomized alternative
  http://arxiv.org/abs/cs?papernum=0404028

PostgreSQL APIs include ambulkdelete but not ambulkinsert:
http://www.postgresql.org/docs/8.3/static/index-functions.html

--Steve
"Steve Mitchell" <mitchell@intertrust.com> wrote: > Do we need a new API (ambulkinsert) to support optimized bulk insertion > into indexes? Do you have a concrete image of ambulkinsert? I think we need to bring out the bottleneck of current insertion method first, before writing codes. Finding keys or splitting leaf pages? Do you have any idea here? If we will come to a decision, I'd like to cooperate with you; I'll port useful parts of pgbulkload into core. Regards, --- ITAGAKI Takahiro NTT Open Source Software Center