Re: build multiple indexes in single table pass? - Mailing list pgsql-hackers

From ITAGAKI Takahiro
Subject Re: build multiple indexes in single table pass?
Date
Msg-id 20080402102030.9504.52131E4D@oss.ntt.co.jp
In response to Re: build multiple indexes in single table pass?  (Toru SHIMOGAKI <shimogaki.toru@oss.ntt.co.jp>)
List pgsql-hackers
Toru SHIMOGAKI <shimogaki.toru@oss.ntt.co.jp> wrote:

> Andrew Dunstan wrote:
> > we could get a performance gain from building multiple indexes from a 
> > single sequential pass over the base table?
> 
> It is already implemented in pg_bulkload 
> (http://pgbulkload.projects.postgresql.org/).

I think there are two ways to implement multiple index creation:
  1. Add multiple indexes AFTER data loading.
  2. Define multiple indexes BEFORE data loading.
 

pg_bulkload uses the 2nd way, but the TODO item seems to target
the 1st, right? -- Both are useful, though.

| Allow multiple indexes to be created concurrently, ideally via a
| single heap scan, and have pg_restore use it

In either case, we probably need to rework the ambuild interface.
I'm thinking of inverting the control of the heap sequential scan:
for now the seq scan is done inside ambuild, but in the new method
it would be driven by an external loop.

Define a new IndexBuilder interface, something like:

  interface IndexBuilder
  {
      addTuple(IndexTuple tuple);
      finishBuild();
  }

and make ambuild() return an IndexBuilder instance implemented by each AM.
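
To make the inverted control flow concrete, here is a minimal sketch in Python rather than PostgreSQL's C. It is only an illustration of the idea, not the actual ambuild API: SortedIndexBuilder, build_indexes, and the (row, tid) arguments are hypothetical names, and for brevity each builder extracts its own key from the heap row instead of receiving a ready-made IndexTuple.

```python
from abc import ABC, abstractmethod


class IndexBuilder(ABC):
    """Per-index build state, modelled on the interface proposed above."""

    @abstractmethod
    def add_tuple(self, row, tid):
        """Accept one heap row and its position (a stand-in for a TID)."""

    @abstractmethod
    def finish_build(self):
        """Complete the build and return the finished index."""


class SortedIndexBuilder(IndexBuilder):
    """Hypothetical stand-in for an AM: collect entries, sort at the end."""

    def __init__(self, key_func):
        self.key_func = key_func  # assumption: extracts the indexed column
        self.entries = []

    def add_tuple(self, row, tid):
        self.entries.append((self.key_func(row), tid))

    def finish_build(self):
        self.entries.sort()
        return self.entries


def build_indexes(heap, builders):
    """External loop: a single sequential pass feeds every builder."""
    for tid, row in enumerate(heap):
        for builder in builders:
            builder.add_tuple(row, tid)
    return [builder.finish_build() for builder in builders]


heap = [
    {"id": 3, "name": "c"},
    {"id": 1, "name": "z"},
    {"id": 2, "name": "a"},
]
# iter() yields a one-shot iterator, so the heap can only be scanned once.
by_id, by_name = build_indexes(
    iter(heap),
    [
        SortedIndexBuilder(lambda r: r["id"]),
        SortedIndexBuilder(lambda r: r["name"]),
    ],
)
```

The scan loop lives outside the builders, so adding another index means adding another builder to the list, not another pass over the table.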

However, this approach cannot use multiple CPUs if all indexes are built in
one process. A coarser-grained method, such as synchronized scans, might be
better for Postgres, as already pointed out.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



