33.3. Using Compression

Compression can only be enabled for separate tablespaces. To compress a tablespace, enable the compression option when creating it. For example:

    postgres=# CREATE TABLESPACE zfs LOCATION '/var/data/cfs' WITH (compression=true);
  

All tables created in this tablespace will be compressed using zstd, which is the default compression library.

Instead of a boolean value, you can explicitly specify the library to use for compression. The possible values are zstd, default (the same as zstd), pglz, zlib, and lz4. For example, to use zlib, create the tablespace as follows:

    postgres=# CREATE TABLESPACE zfs1 LOCATION '/var/data/cfs1' WITH (compression='zlib');
  

Once set, the tablespace compression option cannot be altered, so you cannot compress or decompress an already existing tablespace. While user relations are always compressed in CFS, system relations are not compressed if the numeric part of their filename (the value returned by pg_relation_filenode(), see Table 9.93 for more details) is less than 16384.
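
For example, assuming a hypothetical user table named documents stored in the compressed tablespace, you can check the numeric part of its filename as follows:

    postgres=# SELECT pg_relation_filenode('documents');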

If you would like to compress all tables created within the current session, you can make the compressed tablespace your default tablespace, as explained in Section 22.6.
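
For example, to make the zfs tablespace created above the default tablespace for the current session:

    postgres=# SET default_tablespace = zfs;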

Note

For compressed tablespaces, pg_checksums and pg_basebackup will not verify checksums even if they are enabled.

To configure CFS, use the configuration parameters listed in Section 19.15. By default, CFS launches one background worker performing garbage collection. The garbage collector traverses the tablespace directory, locating map files and checking the percent of garbage in each file. When the percent of garbage in a file exceeds the cfs_gc_threshold value, the file is defragmented. The file is locked for the duration of defragmentation, preventing any access to this part of the relation. To avoid getting stuck if defragmentation of a file fails, CFS waits cfs_gc_respond_time seconds for the file to be released from the lock; if it is not released within this time, a warning is written to the log. When defragmentation is completed, garbage collection waits cfs_gc_delay milliseconds and continues the directory traversal. After the traversal is complete, GC waits cfs_gc_period milliseconds and starts a new iteration. If there is more than one GC worker, the workers split the work based on the hash of the file inode.
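
For illustration, the GC settings could be adjusted as follows; the values shown are arbitrary, and some parameters, such as cfs_gc_workers, may require a server restart rather than a configuration reload to take effect:

    postgres=# ALTER SYSTEM SET cfs_gc_workers = 2;
    postgres=# ALTER SYSTEM SET cfs_gc_threshold = 40;
    postgres=# SELECT pg_reload_conf();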

Several functions allow you to calculate the disk space used by different objects in CFS. For example:

  • pg_relation_size() computes the disk space used by the specified relation in CFS.

  • pg_total_relation_size() computes the total disk space used by the specified table in CFS, including all indexes and TOAST data.

  • pg_indexes_size() computes the total disk space used by indexes attached to the specified table in CFS.

  • pg_database_size() computes the disk space used by the specified database in CFS.

See Section 9.27.7 for more details.
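
For example, to report the total on-disk size of a hypothetical documents table, including indexes and TOAST data, in a human-readable form:

    postgres=# SELECT pg_size_pretty(pg_total_relation_size('documents'));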

CFS provides several functions to manually control CFS garbage collection and get information on CFS state and activity. For the full list of functions, see Section 9.27.11.

To initiate garbage collection manually, use the cfs_start_gc(n_workers) function. This function returns the number of workers that were actually started. Note that if the cfs_gc_workers parameter is non-zero, GC is performed in the background, so the cfs_start_gc() function does nothing and returns 0.

Like the automatic garbage collection, the cfs_start_gc(n_workers) function only processes relations in which the percent of garbage blocks exceeds the cfs_gc_threshold value. To defragment a relation with a smaller percent of garbage, you can temporarily set this parameter to a smaller value for your current session before calling this function.
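
For example, assuming cfs_gc_workers is set to zero, you could lower the threshold for the current session and start a single GC worker manually (the values shown are illustrative):

    postgres=# SET cfs_gc_threshold = 10;
    postgres=# SELECT cfs_start_gc(1);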

You can estimate the effect of table compression using the cfs_estimate(relation) function. This function takes the first ten blocks of the relation, tries to compress them, and returns the average compression ratio. For example, if the returned value is 7.8, the compressed table occupies about eight times less space than the original table.
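
For example, for a hypothetical documents table:

    postgres=# SELECT cfs_estimate('documents');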

The cfs_compression_ratio(relation) function allows you to check how precise the estimate produced by cfs_estimate(relation) was. It returns the actual compression ratio for all segments of the compressed relation. The compression ratio is the total virtual size of all relation segments (the number of blocks multiplied by 8 kB) divided by the total physical size of the segment files.

As mentioned before, CFS always appends updated blocks to the end of the compressed file, so the physical size of the file can be greater than the amount of space actually used in it. In other words, the CFS file becomes fragmented, and defragmentation is periodically performed by the CFS garbage collector. The cfs_fragmentation(relation) function returns the average fragmentation of the relation files. It is calculated as the sum of the physical sizes of the files minus the sum of their used sizes, divided by the sum of the physical sizes of the files.
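
For example, to compare the actual compression ratio and the current fragmentation of the same hypothetical documents table:

    postgres=# SELECT cfs_compression_ratio('documents'), cfs_fragmentation('documents');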

To perform defragmentation for a particular compressed relation, use the cfs_gc_relation(relation) function. It returns the number of processed segments of the relation. Just like garbage collection performed in the background, this function only processes segments in which the percent of garbage blocks exceeds the cfs_gc_threshold value.
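
For example, to defragment the hypothetical documents table:

    postgres=# SELECT cfs_gc_relation('documents');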

Several functions allow you to monitor garbage collection activity:

  • cfs_gc_activity_scanned_files returns the number of files scanned by GC.

  • cfs_gc_activity_processed_files returns the number of files compacted by GC.

  • cfs_gc_activity_processed_pages returns the number of pages transferred by GC during file defragmentation.

  • cfs_gc_activity_processed_bytes returns the total size of the transferred pages.

All these functions calculate their values since the system start.
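
For example, assuming these functions take no arguments, the current counters can be queried in a single statement:

    postgres=# SELECT cfs_gc_activity_scanned_files(), cfs_gc_activity_processed_files(), cfs_gc_activity_processed_pages(), cfs_gc_activity_processed_bytes();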