Thread: CREATE INDEX as bottleneck

CREATE INDEX as bottleneck

From
"Marc Mamin"
Date:
Hello,

in the last years, we have successfully manage to cope with our data
growth
using partitioning and splitting large aggregation tasks on multiple
threads.
The partitioning is done logically by our applicationn server, thus
avoiding trigger overhead.

There are a few places in our data flow where we have to wait for index
creation before being able to distribute the process on multiple threads
again.

With the expected growth, create index will probably become a severe
bottleneck for us.

Is there any chance to see major improvement on it in a middle future ?
I guess the question is naive, but why can't posgres use multiple
threads for large sort operation ?


best regards,

Marc Mamin

Re: CREATE INDEX as bottleneck

From
Kenneth Marshall
Date:
On Thu, Nov 11, 2010 at 02:41:12PM +0100, Marc Mamin wrote:
> Hello,
>
> in the last years, we have successfully manage to cope with our data
> growth
> using partitioning and splitting large aggregation tasks on multiple
> threads.
> The partitioning is done logically by our applicationn server, thus
> avoiding trigger overhead.
>
> There are a few places in our data flow where we have to wait for index
> creation before being able to distribute the process on multiple threads
> again.
>
> With the expected growth, create index will probably become a severe
> bottleneck for us.
>
> Is there any chance to see major improvement on it in a middle future ?
> I guess the question is naive, but why can't posgres use multiple
> threads for large sort operation ?
>
>
> best regards,
>
> Marc Mamin
>

There has been a recent discussion on the hackers mailing list on
using the infrastructure that is already in place to lauch autovacuum
processes to launch other helper processes. Something like this could
be used to offload the sort process to a much more parallelize version
that could take advantage of multiple I/O streams and CPU cores. Many
things are possible given the appropriate resources: funding, coding
and development cycles...

Regards,
Ken

Re: CREATE INDEX as bottleneck

From
Alex Hunsaker
Date:
On Thu, Nov 11, 2010 at 06:41, Marc Mamin <M.Mamin@intershop.de> wrote:
> There are a few places in our data flow where we have to wait for index
> creation before being able to distribute the process on multiple threads
> again.

Would CREATE INDEX CONCURRENTLY help here?

Re: CREATE INDEX as bottleneck

From
"Marc Mamin"
Date:
No, CONCURRENTLY is to improve table availability during index creation, but it degrades the performances.

best regards,

Marc Mamin


-----Original Message-----
From: Alex Hunsaker [mailto:badalex@gmail.com] 
Sent: Donnerstag, 11. November 2010 19:55
To: Marc Mamin
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] CREATE INDEX as bottleneck

On Thu, Nov 11, 2010 at 06:41, Marc Mamin <M.Mamin@intershop.de> wrote:
> There are a few places in our data flow where we have to wait for index
> creation before being able to distribute the process on multiple threads
> again.

Would CREATE INDEX CONCURRENTLY help here?