Re: Index creation running now for 14 hours - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: Index creation running now for 14 hours
Date
Msg-id 55DE3F53.3060606@2ndquadrant.com
Whole thread Raw
In response to Re: Index creation running now for 14 hours  (Tory M Blue <tmblue@gmail.com>)
Responses Re: Index creation running now for 14 hours
List pgsql-performance
Hi,

On 08/26/2015 11:53 PM, Tory M Blue wrote:
>
>
> On Wed, Aug 26, 2015 at 2:45 PM, Qingqing Zhou
> <zhouqq.postgres@gmail.com <mailto:zhouqq.postgres@gmail.com>> wrote:
>
>     On Wed, Aug 26, 2015 at 1:26 PM, Tory M Blue <tmblue@gmail.com
>     <mailto:tmblue@gmail.com>> wrote:
>      >
>      > Right now the 100% cpu process which is this index is only using
>     3.5GB
>      > and has been for the last 15 hours
>      >
>
>     If 100% cpu, you can do 'sudo perf top' to see what the CPU is busy
>     about.
>
>     Regards,
>     Qingqing
>
>
>
> I appreciate the attempted help, but I know what it's doing, it's
> creating indexes for the last 14+ hours.

Sure, but what exactly was it doing? 'perf top' might give us a hint
which function is consuming most of the time, for example.

 > I've killed it now,  as it was
> about to run my machine out of disk space, stopped it at 97% full, could
> not go any longer.

Which suggests it's using a lot of temp files.

Indexes are built by reading all the necessary data from the table (just
the columns), sorted and then an index is built using the sorted data
(because it can be done very efficiently - much faster than when simply
inserting the tuples into the btree index).

The fact that you ran out of disk space probably means that you don't
have enough space for the sort (it clearly does not fit into
maintenance_work_mem), and there's no way around that - you need enough
disk space.

> I will now clean up the table a bit but will still have 500million rows
> with 6 indexes on it. I will create the indexes after the data is laid
> down vs during,  so it doesn't block my other table replications. I will
> then fire off my index creations in parallel  for my other tables so I
> can actually use the hardware the DB is sitting on.

That's a very bad idea, because each of the index builds will require
disk space for the sort, and you're even more likely to run out of disk
space.

>
> But I guess the answer is, no real way to tell what the box is doing
> when it's creating an index. Yes there was a lock, no I could not find a
> way to see how it's progressing so there was no way for me to gauge when
> it would be done.

Had it been waiting on a lock, it wouldn't consume 100% of CPU.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-performance by date:

Previous
From: Tory M Blue
Date:
Subject: Re: Index creation running now for 14 hours
Next
From: Tomas Vondra
Date:
Subject: Re: Index creation running now for 14 hours