2.4. Analyzing and Vacuuming #
Shardman databases require periodic maintenance, known as vacuuming.
2.4.1. Autovacuum Daemon #
For many installations, it is sufficient to let vacuuming be performed by the autovacuum daemon. As in PostgreSQL installations, the autovacuum daemon will issue ANALYZE
commands automatically whenever the content of a table has changed sufficiently. The ANALYZE
command transfers statistics for foreign partitions from remote nodes.
Rebalance process can move or copy data between cluster nodes. After this operation, all transferred objects are analyzed automatically. As usual, local statistics are gathered, and remote statistics are fetched from foreign servers.
2.4.2. Analyzing the Whole Database #
When running for the entire database, the ANALYZE
command calculates statistics for local partitions of sharded tables on the current node and for global tables. It also transfers statistics for foreign partitions available on remote nodes. However, note that remote nodes may lack the statistics.
Still, running ANALYZE
on each node may not give you full cluster-wide statistics, as they may have not been calculated on some nodes. Also, global tables are analyzed multiple times (for each node).
Instead, use the shardman.global_analyze() function to perform cluster-wide analysis. First, this function executes ANALYZE
on each node for all local partitions of sharded tables, then sends these statistics to other nodes. Next, it selects one node per a global table and runs ANALYZE
of this table on the selected node. Gathered statistics are broadcast to all other nodes in the cluster.
2.4.3. Analyzing Global Tables #
When ANALYZE
is run on a global table, only statistics for the corresponding local table on the current node are updated.
2.4.4. Analyzing Sharded Tables #
When ANALYZE
is run on a sharded table, statistics on its local partitions are updated, statistics for foreign partitions are transferred from remote nodes, if remote nodes have it. If statistics are missing on remote nodes, they are not calculated. If you run ANALYZE
for the local table, only its statistics are updated.
When ANALYZE
is run on a foreign table directly and a remote node does not have any statistics for the corresponding local table, this local table is analyzed remotely. Then statistics are transferred from the remote node.
2.4.5. Vacuuming #
The VACUUM
command running for the whole database performs vacuuming on the current node.
You can broadcast this command to perform cluster-wide vacuuming on all nodes. Set the shardman.broadcast_ddl configuration parameter to on
, run VACUUM
, and set shardman.broadcast_ddl
to off
.
When VACUUM
is run on a sharded or global table, the statement is broadcast automatically. On all nodes, the command vacuums all table partitions of the sharded table and all local tables of the global table.