Re: Online index builds - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Online index builds
Date
Msg-id 87ejwmxskp.fsf@stark.xeocode.com
Whole thread Raw
In response to Re: Online index builds  (Hannu Krosing <hannu@skype.net>)
Responses Re: Online index builds  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: Online index builds  (Agent M <agentm@themactionfaction.com>)
Re: Online index builds  (Simon Riggs <simon@2ndquadrant.com>)
Re: Online index builds  (Hannu Krosing <hannu@skype.net>)
Re: Online index builds  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
Hannu Krosing <hannu@skype.net> writes:

> Maybe we can show progress indicators in status line (either
> pg_stat_activity.current_query or commandline shown in ps), like 
> 
> WAITING TO START PHASE 1 - WAITING FOR TRANSACTION XXX TO COMPLETE 
> 
> or 
> 
> INSERTING INDEX ENTRY N OF M
> 
> changing every few seconds.

Hm. That would be very interesting. I'll say that one of the things that
impressed me very much with Postgres moving from Oracle was the focus on
usability. Progress indicators would be excellent for a lot of operations.

That said I'm not sure how much I can do here. For a substantial index we
should expect most of the time will be spent in the tuplesort. It's hard to
see how to get any sort of progress indicator out of there and as long as we
can't it's hard to see the point of getting one during the heap scan or any of
the other i/o operations.

I think it does make sense to put something in current_query indicating when
it's waiting for transactions to end and when it's past that point. That's
something the DBA should be aware of.

> And why not make t possible to add a verbosity level there as well:
> 'CREATE INDEX ... VEBOSE 3'. And to VACUUM VERBOSE as well 
> 
> At level 3 all status changes could also be sent to client as well.

Wouldn't you just control this with log_min_messages? It seems unnecessary to
clutter the grammar for every command with "verbose" options.

> Another related thing - throttling
> ----------------------------------
> 
> Did you do any work on using vacuum_cost_* GUC vars to throttle the
> build process if desired ? 

Actually no. While there is consensus that will be necessary I'm not sure I
can do it with this patch. The problem is that most of the real heavy lifting
here is done inside tuplesort. Even aside from that most of what's left is
inside bulkdelete(*) and the code that handles regular index builds.

So I think we'll need some global thinking about what options Postgres needs
to control throttling in general. And probably someone needs to write a
separate patch that adds all the hooks to the various places in a single go.
Trying to throttle just one operation at a time when a lot of the code that
implements these operations is shared will have us running in circles.

(*) Hm. Come to think of it I wonder if the vacuum_cost parameters are already
kicking in for this phase. That would be a bit strange since it's the fastest
of the three scans.

-- 
greg



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [PATCHES] PostmasterHandl_patch of win32
Next
From: "Hiroshi Saito"
Date:
Subject: Re: [PATCHES] PostmasterHandl_patch of win32