Thread: CREATE INDEX as bottleneck
Hello, in the last years, we have successfully manage to cope with our data growth using partitioning and splitting large aggregation tasks on multiple threads. The partitioning is done logically by our applicationn server, thus avoiding trigger overhead. There are a few places in our data flow where we have to wait for index creation before being able to distribute the process on multiple threads again. With the expected growth, create index will probably become a severe bottleneck for us. Is there any chance to see major improvement on it in a middle future ? I guess the question is naive, but why can't posgres use multiple threads for large sort operation ? best regards, Marc Mamin
On Thu, Nov 11, 2010 at 02:41:12PM +0100, Marc Mamin wrote: > Hello, > > in the last years, we have successfully manage to cope with our data > growth > using partitioning and splitting large aggregation tasks on multiple > threads. > The partitioning is done logically by our applicationn server, thus > avoiding trigger overhead. > > There are a few places in our data flow where we have to wait for index > creation before being able to distribute the process on multiple threads > again. > > With the expected growth, create index will probably become a severe > bottleneck for us. > > Is there any chance to see major improvement on it in a middle future ? > I guess the question is naive, but why can't posgres use multiple > threads for large sort operation ? > > > best regards, > > Marc Mamin > There has been a recent discussion on the hackers mailing list on using the infrastructure that is already in place to lauch autovacuum processes to launch other helper processes. Something like this could be used to offload the sort process to a much more parallelize version that could take advantage of multiple I/O streams and CPU cores. Many things are possible given the appropriate resources: funding, coding and development cycles... Regards, Ken
On Thu, Nov 11, 2010 at 06:41, Marc Mamin <M.Mamin@intershop.de> wrote: > There are a few places in our data flow where we have to wait for index > creation before being able to distribute the process on multiple threads > again. Would CREATE INDEX CONCURRENTLY help here?
No, CONCURRENTLY is to improve table availability during index creation, but it degrades the performances. best regards, Marc Mamin -----Original Message----- From: Alex Hunsaker [mailto:badalex@gmail.com] Sent: Donnerstag, 11. November 2010 19:55 To: Marc Mamin Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] CREATE INDEX as bottleneck On Thu, Nov 11, 2010 at 06:41, Marc Mamin <M.Mamin@intershop.de> wrote: > There are a few places in our data flow where we have to wait for index > creation before being able to distribute the process on multiple threads > again. Would CREATE INDEX CONCURRENTLY help here?