Home > mailing lists

[HACKERS] CLUSTER command progress monitor - Mailing list pgsql-hackers

From	Tatsuro Yamada
Subject	[HACKERS] CLUSTER command progress monitor
Date	August 31, 2017 08:12:02
Msg-id	59A77072.3090401@lab.ntt.co.jp Whole thread Raw
Responses	Re: [HACKERS] CLUSTER command progress monitor (Thomas Munro <thomas.munro@enterprisedb.com>) Re: [HACKERS] CLUSTER command progress monitor (Michael Paquier <michael.paquier@gmail.com>) Re: [HACKERS] CLUSTER command progress monitor (Robert Haas <robertmhaas@gmail.com>) Re: [HACKERS] CLUSTER command progress monitor (Jeff Janes <jeff.janes@gmail.com>)
List	pgsql-hackers

Tree view

Hi,

Following is a proposal for reporting the progress of CLUSTER command:

It seems that the following could be the phases of CLUSTER processing:

  1. scanning heap
  2. sort tuples
  3. writing new heap
  4. scan heap and write new heap
  5. swapping relation files
  6. rebuild index
  7. performing final cleanup

These phases are based on Rahila's presentation at PGCon 2017
 (https://www.pgcon.org/2017/schedule/attachments/472_Progress%20Measurement%20PostgreSQL.pdf)
and I added some phases on it.

CLUSTER command may use Index Scan or Seq Scan when scanning the heap.
Depending on which one is chosen, the command will proceed in the
following sequence of phases:

  * Seq Scan
    1. scanning heap
    2. sort tuples
    3. writing new heap
    5. swapping relation files
    6. rebuild index
    7. performing final cleanup

  * Index Scan
    4. scan heap and write new heap
    5. swapping relation files
    6. rebuild index
    7. performing final cleanup

The view provides the information of CLUSTER command progress details as follows
postgres=# \d pg_stat_progress_cluster
           View "pg_catalog.pg_stat_progress_cluster"
       Column        |  Type   | Collation | Nullable | Default
---------------------+---------+-----------+----------+---------
 pid                 | integer |           |          |
 datid               | oid     |           |          |
 datname             | name    |           |          |
 relid               | oid     |           |          |
 phase               | text    |           |          |
 scan_method         | text    |           |          |
 scan_index_relid    | bigint  |           |          |
 heap_tuples_total   | bigint  |           |          |
 heap_tuples_scanned | bigint  |           |          |


Then I have questions.

  * Should we have separate views for them?  Or should both be covered by the
    same view with some indication of which command (CLUSTER or VACUUM FULL)
    is actually running?
    I mean this progress monitor could be covering not only CLUSTER command but also
    VACUUM FULL command.

  * I chose tuples as scan heap's counter (heap_tuples_scanned) since it's not
    easy to get current blocks from Index Scan. Is it Ok?


I'll add this patch to CF2017-09.
Any comments or suggestion are welcome.

Regards,
Tatsuro Yamada
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

progress_monitor_cluster_v1.patch

pgsql-hackers by date:

From: "Bossart, Nathan"
Date: 31 August 2017, 07:52:13
Subject: Re: [HACKERS] [Proposal] Allow users to specify multiple tables inVACUUM commands

From: Peter Eisentraut
Date: 31 August 2017, 08:23:01
Subject: [HACKERS] document and use SPI_result_code_string()

[HACKERS] CLUSTER command progress monitor - Mailing list pgsql-hackers

Attachment

Previous

Next