Home > mailing lists

monitoring CREATE INDEX [CONCURRENTLY] - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	monitoring CREATE INDEX [CONCURRENTLY]
Date	December 20, 2018 22:00:22
Msg-id	20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql Whole thread Raw
Responses	Re: monitoring CREATE INDEX [CONCURRENTLY] Re: monitoring CREATE INDEX [CONCURRENTLY] Re: monitoring CREATE INDEX [CONCURRENTLY]
List	pgsql-hackers

Tree view

Monitoring progress of CREATE INDEX [CONCURRENTLY] is sure to be welcome,
so here's a proposal.

There are three distinct interesting cases.  One is straight CREATE
INDEX of a standalone table; then we have CREATE INDEX CONCURRENTLY;
finally, CREATE INDEX on a partitioned table.  Note that there's no
CONCURRENTLY for partitioned tables.

A non-concurrent build is a very straightforward: we call create_index,
which does index_build, done.  See below for how to report for
index_build, which is the interesting part.  I propose not to report
anything else than that for non-concurrent build.  There's some
preparatory work that's identical than for CIC (see below).  Like
VACUUM, it seems a bit pointless to report an initial phase that's
almost immediate, so I propose we just don't report anything until the
actual index building starts.

CREATE INDEX CONCURRENTLY does these things first, which we would not
report (this is just like VACUUM, which only starts reporting once it
starts scanning blocks):

a. lock rel.  No metrics to report.
b. other prep; includes lots of catalog access.  Unlikely to lock, but
   not impossible.  No metrics to report.
c. create_index.  CIC skips index_build here, so there's no reason to
   report it either.

We would start reporting at this point, with these phases:

1. WaitForLockers 1.  Report how many xacts do we need to wait for, how
   many are done.
2. index_build.  See below.
3. WaitForLockers 2.  Report how many xacts do we need to wait for, how
   many are done.
4. validate_index.  Scans the whole rel again.  Report number of blocks
   scanned.
5. wait for virtual XIDs.  Like WaitForLockers: report how many xacts we
   need to wait for, how many are done.

We're done.

(Alternatively, we could have an initial "prep" phase for a/b/c for the
concurrent case and a/b for non-concurrent.  I'm just not sure it's
useful.)

index_build
-----------

The actual index building is an AM-specific undertaking, and we report
its progress separately from the AM-agnostic code.  That is, each AM has
freedom to define its own list of phases and counters, separate from the
generic code.  This avoids the need to provide a new AM method or invoke
callbacks.  So when you see that CREATE_INDEX_PHASE is either "index
build" you'll have a separate BTREE_CREATE_PHASE value set to either
"scanning heap" or "sorting" or "building upper layers"; equivalently
for other AMs.

Partitioned indexes
-------------------

For partitioned indexes, we only have the index build phase, but we
repeat it for each partition.  In addition to the index_build metrics
described above, we should report how many partitions we need to handle
in total and how many partitions are already done.  (I'm avoiding
getting in the trouble of reporting *which* partition we're currently
handling and have already handled.)

Thoughts?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

From: Robert Haas
Date: 20 December 2018, 21:38:51
Subject: Re: ATTACH/DETACH PARTITION CONCURRENTLY

From: Tom Lane
Date: 20 December 2018, 22:21:07
Subject: Re: Tid scan improvements

monitoring CREATE INDEX [CONCURRENTLY] - Mailing list pgsql-hackers

Previous

Next