monitoring CREATE INDEX [CONCURRENTLY] - Mailing list pgsql-hackers
From | Alvaro Herrera |
---|---|
Subject | monitoring CREATE INDEX [CONCURRENTLY] |
Date | |
Msg-id | 20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql Whole thread Raw |
Responses |
Re: monitoring CREATE INDEX [CONCURRENTLY]
Re: monitoring CREATE INDEX [CONCURRENTLY] Re: monitoring CREATE INDEX [CONCURRENTLY] |
List | pgsql-hackers |
Monitoring progress of CREATE INDEX [CONCURRENTLY] is sure to be welcome, so here's a proposal. There are three distinct interesting cases. One is straight CREATE INDEX of a standalone table; then we have CREATE INDEX CONCURRENTLY; finally, CREATE INDEX on a partitioned table. Note that there's no CONCURRENTLY for partitioned tables. A non-concurrent build is a very straightforward: we call create_index, which does index_build, done. See below for how to report for index_build, which is the interesting part. I propose not to report anything else than that for non-concurrent build. There's some preparatory work that's identical than for CIC (see below). Like VACUUM, it seems a bit pointless to report an initial phase that's almost immediate, so I propose we just don't report anything until the actual index building starts. CREATE INDEX CONCURRENTLY does these things first, which we would not report (this is just like VACUUM, which only starts reporting once it starts scanning blocks): a. lock rel. No metrics to report. b. other prep; includes lots of catalog access. Unlikely to lock, but not impossible. No metrics to report. c. create_index. CIC skips index_build here, so there's no reason to report it either. We would start reporting at this point, with these phases: 1. WaitForLockers 1. Report how many xacts do we need to wait for, how many are done. 2. index_build. See below. 3. WaitForLockers 2. Report how many xacts do we need to wait for, how many are done. 4. validate_index. Scans the whole rel again. Report number of blocks scanned. 5. wait for virtual XIDs. Like WaitForLockers: report how many xacts we need to wait for, how many are done. We're done. (Alternatively, we could have an initial "prep" phase for a/b/c for the concurrent case and a/b for non-concurrent. I'm just not sure it's useful.) index_build ----------- The actual index building is an AM-specific undertaking, and we report its progress separately from the AM-agnostic code. That is, each AM has freedom to define its own list of phases and counters, separate from the generic code. This avoids the need to provide a new AM method or invoke callbacks. So when you see that CREATE_INDEX_PHASE is either "index build" you'll have a separate BTREE_CREATE_PHASE value set to either "scanning heap" or "sorting" or "building upper layers"; equivalently for other AMs. Partitioned indexes ------------------- For partitioned indexes, we only have the index build phase, but we repeat it for each partition. In addition to the index_build metrics described above, we should report how many partitions we need to handle in total and how many partitions are already done. (I'm avoiding getting in the trouble of reporting *which* partition we're currently handling and have already handled.) Thoughts? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: