Re: Online index builds - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Online index builds
Date
Msg-id 1153003941.3031.42.camel@localhost.localdomain
Whole thread Raw
In response to Re: Online index builds  (Greg Stark <gsstark@mit.edu>)
Responses Re: Online index builds  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
Ühel kenal päeval, N, 2006-07-13 kell 01:07, kirjutas Greg Stark:
> Simon Riggs <simon@2ndquadrant.com> writes:
> 
> > On Wed, 2006-07-12 at 12:09 -0400, Greg Stark wrote:
> > > no regression tests yet.
> > 
> > We'll need some performance tests that show that lock-hold time is
> > *actually* reduced, given the shenanigans needed to get there.
> 
> I'm not sure what you mean by "lock-hold time". Online index builds
> effectively take *no* locks in the user-visible sense that regular index
> builds do. Other transactions can insert, update, delete continuously
> throughout the entire process.
> 
> The only locks that are taken are
> 
> 1) a ShareUpdateExclusiveLock which blocks vacuum from running on the table
>    being indexed. This is taken by both phase 1 and phase 2. (Actually I had
>    the wrong lock in the patch I emailed in one place. Fixed in my source tree
>    here)
> 
> 2) An ExclusiveLock that is taken momentarily and immediately released. Even
>    if that can never be acquired due to a busy system it can eventually
>    proceed anyways as long as there are no long-running transactions that are
>    refusing to commit.
> 
> That said we do need some performance tests to get an idea how long phase 2
> takes for large tables. The additional index and heap scan and tid sort could
> take a substantial amount of time though never as long as the original index
> build done in phase 1.

Maybe we can show progress indicators in status line (either
pg_stat_activity.current_query or commandline shown in ps), like 

WAITING TO START PHASE 1 - WAITING FOR TRANSACTION XXX TO COMPLETE 

or 

INSERTING INDEX ENTRY N OF M

changing every few seconds.

that could give the sysadmin some idea of what is going on without too
much verbosity on console.

there could of course be a VERBOSE mode, which acts similar to VACUUM
VERBOSE.

And why not make t possible to add a verbosity level there as well:
'CREATE INDEX ... VEBOSE 3'. And to VACUUM VERBOSE as well 

At level 3 all status changes could also be sent to client as well.

> What's worse is that in some cases the merge could potentially be doing a lot
> of retail index inserts. I have no good intuition for how long those will take
> relative to the wholesale index build method, especially since for some index
> methods like GIN retail inserts are extremely expensive.
> 
> So for indexes that don't have a lot of records that need to be inserted
> individually what I expect -- and what I put in the docs -- is something under
> 100% time penalty for an online index build. In fact I expect it to be more
> like 50% though it depends on how wide the original index. For ones that do
> have lots of records mutated for phase 2 all bets are off.

the only quarantee seems to be, that if there are still some resources
left, the index build will eventualy complete.

but showing progress will let the DBA to make the decision to abort the
build if he sees that it takes "too long".


Another related thing - throttling
----------------------------------

Did you do any work on using vacuum_cost_* GUC vars to throttle the
build process if desired ? 

I guess that the initial seqscans are probably cheap enough ( or at
least they are no worse than if someone just did "select * ..."
concurrently), but the index merge can probably still not be as light on
OLTP queries as desirable.

In reality on OLTP dbs even SELECT COUNT(*) can sometimes be a
"mainenance operation" and thus the goal may not be to complete as fast
as possible, but to be as light as possible on concurrent OLTP queries.

Eventually it would nice to have special optimiser rules for any
"maintenance" queries and DDL ops as defined by DBA, but making use of
vacuum_cost_* vars for generic throttling would be a good first cut ;


> 
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com




pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: automatic system info tool?
Next
From: Josh Berkus
Date:
Subject: Re: automatic system info tool?