Re: pg_stat_progress_create_index vs. parallel index builds - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: pg_stat_progress_create_index vs. parallel index builds |
Date | |
Msg-id | CAEze2Wgm-NnZe3rOnwjYTVriS8xsVhzzVBCGj34h06cDNuaTig@mail.gmail.com Whole thread Raw |
In response to | Re: pg_stat_progress_create_index vs. parallel index builds (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: pg_stat_progress_create_index vs. parallel index builds
|
List | pgsql-hackers |
On Wed, 2 Jun 2021 at 15:23, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > > > On 6/2/21 3:03 PM, Matthias van de Meent wrote: > > On Wed, 2 Jun 2021 at 13:57, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > >> > >> Hi, > >> > >> While experimenting with parallel index builds, I've noticed a somewhat > >> strange behavior of pg_stat_progress_create_index when a btree index is > >> built with parallel workers - some of the phases seem to be missing. > >> > >> In serial (no parallelism) mode, the progress is roughly this (it's > >> always the first/last timestamp of each phase): > >> > >> | command | phase > >> -------------+--------------+---------------------------------------- > >> 12:56:01 AM | CREATE INDEX | building index: scanning table > >> ... > >> 01:06:22 AM | CREATE INDEX | building index: scanning table > >> 01:06:23 AM | CREATE INDEX | building index: sorting live tuples > >> ... > >> 01:13:10 AM | CREATE INDEX | building index: sorting live tuples > >> 01:13:11 AM | CREATE INDEX | building index: loading tuples in tree > >> ... > >> 01:24:02 AM | CREATE INDEX | building index: loading tuples in tree > >> > >> So it goes through three phases: > >> > >> 1) scanning tuples > >> 2) sorting live tuples > >> 3) loading tuples in tree > >> > >> But with parallel build index build, it changes to: > >> > >> | command | phase > >> -------------+--------------+---------------------------------------- > >> 11:40:48 AM | CREATE INDEX | building index: scanning table > >> ... > >> 11:47:24 AM | CREATE INDEX | building index: scanning table (scan > >> complete) > >> 11:56:22 AM | CREATE INDEX | building index: scanning table > >> 11:56:23 AM | CREATE INDEX | building index: loading tuples in tree > >> ... > >> 12:05:33 PM | CREATE INDEX | building index: loading tuples in tree > >> > >> That is, the "sorting live tuples" phase disappeared, and instead it > >> seems to be counted in the "scanning table" one, as if there was an > >> update of the phase missing. > > > >> I've only tried this on master, but I assume it behaves like this in the > >> older releases too. I wonder if this is intentional - it sure is a bit > >> misleading. > > > > This was a suprise to me as well. According to documentation in > > sortsupport.h (line 125-129) the parallel workers produce pre-sorted > > segments during the scanning phase, which are subsequently merged by > > the leader. This might mean that the 'sorting' phase is already > > finished during the 'scanning' phase by waiting for the parallel > > workers; I haven't looked further if this is the case and whether it > > could be changed to also produce the sorting metrics, but seeing as it > > is part of the parallel workers API of tuplesort, I think fixing it in > > current releases is going to be difficult. > > > > Maybe. Perhaps it's more complicated to decide when to switch between > phases with parallel workers. Still, the table scan is done after ~8 > minutes (based on blocks_total vs. blocks_done), yet we keep that phase > for another ~9 minutes. It seems this is where the workers do the sort, > so "sorting live tuples" seems like a more natural phase for this. After looking at it a bit more, it seems like a solution was actually easier than I'd expected. PFA a prototype (unvalidated, but check-world -ed) patch that would add these subphases of progress reporting, which can be backpatched down to 12. Do note that this is a partial fix, as it only fixes it when the leader participates; but I don't think that limitation is too much of a problem because only on builds which explicitly define the non-standard DISABLE_LEADER_PARTICIPATION this will happen, and in such cases the progress reporting for the loading phase will fail as well. With regards, Matthias van de Meent
Attachment
pgsql-hackers by date: