Re: Index creation running now for 14 hours - Mailing list pgsql-performance

From Peter Geoghegan
Subject Re: Index creation running now for 14 hours
Date
Msg-id CAEYLb_U7FUWV49L-oj6HzhxqpsO8CsFgb-+UY64qt9a3OrDVTw@mail.gmail.com
Whole thread Raw
In response to Re: Index creation running now for 14 hours  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-performance
On Wed, Aug 26, 2015 at 3:36 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:
>> But I guess the answer is, no real way to tell what the box is doing
>> when it's creating an index. Yes there was a lock, no I could not find a
>> way to see how it's progressing so there was no way for me to gauge when
>> it would be done.
>
>
> Had it been waiting on a lock, it wouldn't consume 100% of CPU.

When things are going out to disk anyway, you're often better off with
a lower maintenance_work_mem (or work_mem). It's actually kind of
bogus than run size is dictated by these settings. Reducing it will
tend to make tuplesort's maintenance of the heap invariant
inexpensive, while not really making the merge phase more painful. I
would try 128MB of maintenance_work_mem. That could be significantly
faster. Check how the I/O load on the system compares with a higher
maintenance_work_mem setting. Often, this will make the sort less CPU
bound, which is good here.

I am currently working on making this a lot better in Postgres 9.6.
Also, note that text and numeric sorts will be much faster in 9.5.

Of course, as Tomas says, if you don't have the disk space to do the
sort, you're not going to be able to complete it. That much is very
clear.

If you're really worried about these costs, I suggest enabling
trace_sort locally, and monitoring the progress of this sort in the
logs.

--
Regards,
Peter Geoghegan


pgsql-performance by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Index creation running now for 14 hours
Next
From: Henrik Thostrup Jensen
Date:
Subject: Re: Gist indexing performance with cidr types