19.4. Resource Consumption
19.4.1. Memory
shared_buffers
(integer)
Sets the amount of memory the database server uses for shared memory buffers. The default is typically 128 megabytes (128MB), but might be less if your kernel settings will not support it (as determined during initdb). This setting must be at least 128 kilobytes. (Non-default values of BLCKSZ change the minimum.) However, settings significantly higher than the minimum are usually needed for good performance. This parameter can only be set at server start.
If you have a dedicated database server with 1GB or more of RAM, a reasonable starting value for shared_buffers is 25% of the memory in your system. There are some workloads where even larger settings for shared_buffers are effective, but because Postgres Pro also relies on the operating system cache, it is unlikely that an allocation of more than 40% of RAM to shared_buffers will work better than a smaller amount. Larger settings for shared_buffers usually require a corresponding increase in max_wal_size, in order to spread out the process of writing large quantities of new or changed data over a longer period of time.
On systems with less than 1GB of RAM, a smaller percentage of RAM is appropriate, so as to leave adequate space for the operating system.
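The sizing guidance above can be written as a small calculation. This is a minimal sketch, not part of Postgres Pro: the function names are hypothetical, and the 15% figure used for machines with less than 1GB of RAM is an assumption, since the text only says that "a smaller percentage" is appropriate there.

```python
def suggest_shared_buffers_mb(total_ram_mb, small_fraction=0.15):
    """Return a starting value for shared_buffers, in megabytes.

    Assumes a dedicated database server. small_fraction is an
    illustrative assumption for machines with less than 1GB of RAM.
    """
    if total_ram_mb >= 1024:
        fraction = 0.25  # rule of thumb from the text: 25% of RAM
    else:
        fraction = small_fraction
    return int(total_ram_mb * fraction)


def exceeds_useful_range(shared_buffers_mb, total_ram_mb):
    """True if the setting is above 40% of RAM, where gains are unlikely."""
    return shared_buffers_mb > 0.40 * total_ram_mb


# A dedicated server with 16GB of RAM
print(f"shared_buffers = {suggest_shared_buffers_mb(16384)}MB")
```

The result is only a starting point; the workload and the value of max_wal_size still need to be taken into account.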
huge_pages
(enum)
Enables/disables the use of huge memory pages. Valid values are try (the default), on, and off.
At present, this feature is supported only on Linux. The setting is ignored on other systems when set to try.
The use of huge pages results in smaller page tables and less CPU time spent on memory management, increasing performance. For more details, see Section 18.4.5.
With huge_pages set to try, the server will try to use huge pages, but fall back to using normal allocation if that fails. With on, failure to use huge pages will prevent the server from starting up. With off, huge pages will not be used.
temp_buffers
(integer)
Sets the maximum number of temporary buffers used by each database session. These are session-local buffers used only for access to temporary tables. The default is eight megabytes (8MB). The setting can be changed within individual sessions, but only before the first use of temporary tables within the session; subsequent attempts to change the value will have no effect on that session.
A session will allocate temporary buffers as needed up to the limit given by temp_buffers. The cost of setting a large value in sessions that do not actually need many temporary buffers is only a buffer descriptor, or about 64 bytes, per increment in temp_buffers. However, if a buffer is actually used, an additional 8192 bytes will be consumed for it (or in general, BLCKSZ bytes).
max_prepared_transactions
(integer)
Sets the maximum number of transactions that can be in the “prepared” state simultaneously (see PREPARE TRANSACTION). Setting this parameter to zero (which is the default) disables the prepared-transaction feature. This parameter can only be set at server start.
If you are not planning to use prepared transactions, this parameter should be set to zero to prevent accidental creation of prepared transactions. If you are using prepared transactions, you will probably want max_prepared_transactions to be at least as large as max_connections, so that every session can have a prepared transaction pending.
When running a standby server, you must set this parameter to the same or higher value than on the master server. Otherwise, queries will not be allowed in the standby server.
max_autonomous_transactions
(integer)
Sets the maximum number of autonomous transactions that can be used simultaneously in all sessions of a Postgres Pro instance. If this value is exceeded, an error occurs.
Default: 100
work_mem
(integer)
Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to four megabytes (4MB). Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.
maintenance_work_mem
(integer)
Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.
Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high. It may be useful to control for this by separately setting autovacuum_work_mem.
Note that for the collection of dead tuple identifiers, VACUUM is only able to utilize up to a maximum of 1GB of memory.
replacement_sort_tuples
(integer)
When the number of tuples to be sorted is smaller than this number, a sort will produce its first output run using replacement selection rather than quicksort. This may be useful in memory-constrained environments where tuples that are input into larger sort operations have a strong physical-to-logical correlation. Note that this does not include input tuples with an inverse correlation. It is possible for the replacement selection algorithm to generate one long run that requires no merging, where use of the default strategy would result in many runs that must be merged to produce a final sorted output. This may allow sort operations to complete sooner.
The default is 150,000 tuples. Note that higher values are typically not much more effective, and may be counter-productive, since the priority queue is sensitive to the size of available CPU cache, whereas the default strategy sorts runs using a cache oblivious algorithm. This property allows the default sort strategy to automatically and transparently make effective use of available CPU cache.
Setting maintenance_work_mem to its default value usually prevents utility command external sorts (e.g., sorts used by CREATE INDEX to build B-Tree indexes) from ever using replacement selection sort, unless the input tuples are quite wide.
autovacuum_work_mem
(integer)
Specifies the maximum amount of memory to be used by each autovacuum worker process. It defaults to -1, indicating that the value of maintenance_work_mem should be used instead. The setting has no effect on the behavior of VACUUM when run in other contexts. This parameter can only be set in the postgresql.conf file or on the server command line.
For the collection of dead tuple identifiers, autovacuum is only able to utilize up to a maximum of 1GB of memory, so setting autovacuum_work_mem to a value higher than that has no effect on the number of dead tuples that autovacuum can collect while scanning a table.
max_stack_depth
(integer)
Specifies the maximum safe depth of the server's execution stack. The ideal setting for this parameter is the actual stack size limit enforced by the kernel (as set by ulimit -s or local equivalent), less a safety margin of a megabyte or so. The safety margin is needed because the stack depth is not checked in every routine in the server, but only in key potentially-recursive routines such as expression evaluation. The default setting is two megabytes (2MB), which is conservatively small and unlikely to risk crashes. However, it might be too small to allow execution of complex functions. Only superusers can change this setting.
Setting max_stack_depth higher than the actual kernel limit will mean that a runaway recursive function can crash an individual backend process. On platforms where Postgres Pro can determine the kernel limit, the server will not allow this variable to be set to an unsafe value. However, not all platforms provide the information, so caution is recommended in selecting a value.
dynamic_shared_memory_type
(enum)
Specifies the dynamic shared memory implementation that the server should use. Possible values are posix (for POSIX shared memory allocated using shm_open), sysv (for System V shared memory allocated via shmget), windows (for Windows shared memory), mmap (to simulate shared memory using memory-mapped files stored in the data directory), and none (to disable this feature). Not all values are supported on all platforms; the first supported option is the default for that platform. The use of the mmap option, which is not the default on any platform, is generally discouraged because the operating system may write modified pages back to disk repeatedly, increasing system I/O load; however, it may be useful for debugging, when the pg_dynshmem directory is stored on a RAM disk, or when other shared memory facilities are not available.
plan_cache_lru_size
(integer)
Specifies the maximum number of prepared statements for which to keep query trees and generic plans in memory. Having such a limit is useful for sessions with multiple prepared statements that may otherwise use too much memory. If this limit is reached, Postgres Pro Enterprise evicts the query tree and generic plan of the least recently used statement from memory, keeping only the corresponding query text. If this statement is called again later, it has to be parsed and analyzed again. To keep plans in memory for all prepared statements as long as possible, set this parameter to 0.
Default: 64
19.4.2. Disk
temp_file_limit
(integer)
Specifies the maximum amount of disk space that a process can use for temporary files, such as sort and hash temporary files, or the storage file for a held cursor. A transaction attempting to exceed this limit will be canceled. The value is specified in kilobytes, and -1 (the default) means no limit. Only superusers can change this setting.
This setting constrains the total space used at any instant by all temporary files used by a given Postgres Pro process. It should be noted that disk space used for explicit temporary tables, as opposed to temporary files used behind-the-scenes in query execution, does not count against this limit.
19.4.3. Kernel Resource Usage
max_files_per_process
(integer)
Sets the maximum number of simultaneously open files allowed to each server subprocess. The default is one thousand files. If the kernel is enforcing a safe per-process limit, you don't need to worry about this setting. But on some platforms (notably, most BSD systems), the kernel will allow individual processes to open many more files than the system can actually support if many processes all try to open that many files. If you find yourself seeing “Too many open files” failures, try reducing this setting. This parameter can only be set at server start.
19.4.4. Cost-based Vacuum Delay
During the execution of VACUUM and ANALYZE commands, the system maintains an internal counter that keeps track of the estimated cost of the various I/O operations that are performed. When the accumulated cost reaches a limit (specified by vacuum_cost_limit), the process performing the operation will sleep for a short period of time, as specified by vacuum_cost_delay. Then it will reset the counter and continue execution.
The intent of this feature is to allow administrators to reduce the I/O impact of these commands on concurrent database activity. There are many situations where it is not important that maintenance commands like VACUUM and ANALYZE finish quickly; however, it is usually very important that these commands do not significantly interfere with the ability of the system to perform other database operations. Cost-based vacuum delay provides a way for administrators to achieve this.
This feature is disabled by default for manually issued VACUUM commands. To enable it, set the vacuum_cost_delay variable to a nonzero value.
vacuum_cost_delay
(integer)
The length of time, in milliseconds, that the process will sleep when the cost limit has been exceeded. The default value is zero, which disables the cost-based vacuum delay feature. Positive values enable cost-based vacuuming. Note that on many systems, the effective resolution of sleep delays is 10 milliseconds; setting vacuum_cost_delay to a value that is not a multiple of 10 might have the same results as setting it to the next higher multiple of 10.
When using cost-based vacuuming, appropriate values for vacuum_cost_delay are usually quite small, perhaps 10 or 20 milliseconds. Adjusting vacuum's resource consumption is best done by changing the other vacuum cost parameters.
vacuum_cost_page_hit
(integer)
The estimated cost for vacuuming a buffer found in the shared buffer cache. It represents the cost to lock the buffer pool, look up the shared hash table, and scan the content of the page. The default value is one.
vacuum_cost_page_miss
(integer)
The estimated cost for vacuuming a buffer that has to be read from disk. This represents the effort to lock the buffer pool, look up the shared hash table, read the desired block in from the disk, and scan its content. The default value is 10.
vacuum_cost_page_dirty
(integer)
The estimated cost charged when vacuum modifies a block that was previously clean. It represents the extra I/O required to flush the dirty block out to disk again. The default value is 20.
vacuum_cost_limit
(integer)
The accumulated cost that will cause the vacuuming process to sleep. The default value is 200.
Note
There are certain operations that hold critical locks and should therefore complete as quickly as possible. Cost-based vacuum delays do not occur during such operations. Therefore it is possible that the cost accumulates far higher than the specified limit. To avoid uselessly long delays in such cases, the actual delay is calculated as vacuum_cost_delay * accumulated_balance / vacuum_cost_limit with a maximum of vacuum_cost_delay * 4.
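The delay rule in the note above can be checked with a short calculation. This is a minimal sketch of the formula as stated in the text, not the server's actual code; the function name `vacuum_sleep_ms` is hypothetical, and the 20 millisecond delay in the example is just one of the typical values mentioned earlier.

```python
def vacuum_sleep_ms(accumulated_balance, vacuum_cost_delay, vacuum_cost_limit=200):
    """Return the sleep time in milliseconds for a given accumulated cost.

    Implements: vacuum_cost_delay * accumulated_balance / vacuum_cost_limit,
    capped at vacuum_cost_delay * 4.
    """
    if vacuum_cost_delay <= 0:
        return 0.0  # a zero delay disables cost-based vacuum delay
    if accumulated_balance < vacuum_cost_limit:
        return 0.0  # the limit has not been reached, so no sleep yet
    delay = vacuum_cost_delay * accumulated_balance / vacuum_cost_limit
    return min(delay, vacuum_cost_delay * 4)


# Balance exactly at the limit, moderately above it, and far above it
for balance in (200, 500, 5000):
    print(balance, vacuum_sleep_ms(balance, vacuum_cost_delay=20))
```

With a 20 millisecond delay and the default limit of 200, the sleep never exceeds 80 milliseconds, however far the balance overshoots the limit.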
19.4.5. Background Writer
There is a separate server process called the background writer, whose function is to issue writes of “dirty” (new or modified) shared buffers. When the number of clean shared buffers appears to be insufficient, the background writer writes some dirty buffers to the file system and marks them as clean. This reduces the likelihood that server processes handling user queries will be unable to find clean buffers and have to write dirty buffers themselves. However, the background writer does cause a net overall increase in I/O load, because while a repeatedly-dirtied page might otherwise be written only once per checkpoint interval, the background writer might write it several times as it is dirtied in the same interval. The parameters discussed in this subsection can be used to tune the behavior for local needs.
bgwriter_delay
(integer)
Specifies the delay between activity rounds for the background writer. In each round the writer issues writes for some number of dirty buffers (controllable by the following parameters). It then sleeps for bgwriter_delay milliseconds, and repeats. When there are no dirty buffers in the buffer pool, though, it goes into a longer sleep regardless of bgwriter_delay. The default value is 200 milliseconds (200ms). Note that on many systems, the effective resolution of sleep delays is 10 milliseconds; setting bgwriter_delay to a value that is not a multiple of 10 might have the same results as setting it to the next higher multiple of 10. This parameter can only be set in the postgresql.conf file or on the server command line.
bgwriter_lru_maxpages
(integer)
In each round, no more than this many buffers will be written by the background writer. Setting this to zero disables background writing. (Note that checkpoints, which are managed by a separate, dedicated auxiliary process, are unaffected.) The default value is 100 buffers. This parameter can only be set in the postgresql.conf file or on the server command line.
bgwriter_lru_multiplier
(floating point)
The number of dirty buffers written in each round is based on the number of new buffers that have been needed by server processes during recent rounds. The average recent need is multiplied by bgwriter_lru_multiplier to arrive at an estimate of the number of buffers that will be needed during the next round. Dirty buffers are written until there are that many clean, reusable buffers available. (However, no more than bgwriter_lru_maxpages buffers will be written per round.) Thus, a setting of 1.0 represents a “just in time” policy of writing exactly the number of buffers predicted to be needed. Larger values provide some cushion against spikes in demand, while smaller values intentionally leave writes to be done by server processes. The default is 2.0. This parameter can only be set in the postgresql.conf file or on the server command line.
bgwriter_flush_after
(integer)
Whenever more than bgwriter_flush_after bytes have been written by the background writer, attempt to force the OS to issue these writes to the underlying storage. Doing so will limit the amount of dirty data in the kernel's page cache, reducing the likelihood of stalls when an fsync is issued at the end of a checkpoint, or when the OS writes data back in larger batches in the background. Often that will result in greatly reduced transaction latency, but there also are some cases, especially with workloads that are bigger than shared_buffers, but smaller than the OS's page cache, where performance might degrade. This setting may have no effect on some platforms. The valid range is between 0, which disables forced writeback, and 2MB. The default is 512kB on Linux, 0 elsewhere. (If BLCKSZ is not 8kB, the default and maximum values scale proportionally to it.) This parameter can only be set in the postgresql.conf file or on the server command line.
Smaller values of bgwriter_lru_maxpages and bgwriter_lru_multiplier reduce the extra I/O load caused by the background writer, but make it more likely that server processes will have to issue writes for themselves, delaying interactive queries.
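To get a feel for how these parameters interact, the following sketch models one background writer round and the resulting upper bound on write throughput. It is a deliberately simplified model of the description above, not the server's algorithm: the function names are hypothetical, it assumes that every missing clean buffer costs exactly one write, and it ignores the time spent performing the writes.

```python
import math


def buffers_to_write(avg_recent_need, clean_available,
                     lru_multiplier=2.0, lru_maxpages=100):
    """Estimate how many dirty buffers are written in one round."""
    target = avg_recent_need * lru_multiplier      # predicted need for next round
    shortfall = max(0.0, target - clean_available)  # clean buffers still missing
    return min(lru_maxpages, math.ceil(shortfall))  # capped per round


def max_write_rate_kb_per_s(lru_maxpages=100, delay_ms=200, block_size=8192):
    """Upper bound on background writer throughput, in kB per second."""
    rounds_per_second = 1000 / delay_ms
    return lru_maxpages * rounds_per_second * block_size / 1024


print(buffers_to_write(avg_recent_need=30, clean_available=10))
print(max_write_rate_kb_per_s())
```

With the default settings the bound works out to 4000 kB per second, which shows why a busy system may need a larger bgwriter_lru_maxpages or a shorter bgwriter_delay to keep up.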
19.4.6. Asynchronous Behavior
effective_io_concurrency
(integer)
Sets the number of concurrent disk I/O operations that Postgres Pro expects can be executed simultaneously. Raising this value will increase the number of I/O operations that any individual Postgres Pro session attempts to initiate in parallel. The allowed range is 1 to 1000, or zero to disable issuance of asynchronous I/O requests. Currently, this setting only affects bitmap heap scans.
For magnetic drives, a good starting point for this setting is the number of separate drives comprising a RAID 0 stripe or RAID 1 mirror being used for the database. (For RAID 5 the parity drive should not be counted.) However, if the database is often busy with multiple queries issued in concurrent sessions, lower values may be sufficient to keep the disk array busy. A value higher than needed to keep the disks busy will only result in extra CPU overhead. SSDs and other memory-based storage can often process many concurrent requests, so the best value might be in the hundreds.
Asynchronous I/O depends on an effective posix_fadvise function, which some operating systems lack. If the function is not present then setting this parameter to anything but zero will result in an error. On some operating systems (e.g., Solaris), the function is present but does not actually do anything.
The default is 1 on supported systems, otherwise 0. This value can be overridden for tables in a particular tablespace by setting the tablespace parameter of the same name (see ALTER TABLESPACE).
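The rule of thumb for magnetic drives can be sketched as follows. The function name `starting_io_concurrency` is hypothetical, and only the RAID levels mentioned above are handled; other layouts would need their own judgment.

```python
def starting_io_concurrency(drive_count, raid_level):
    """Suggest a starting effective_io_concurrency for magnetic drives."""
    if raid_level in (0, 1):
        drives = drive_count          # every drive in the stripe or mirror counts
    elif raid_level == 5:
        drives = drive_count - 1      # do not count the parity drive
    else:
        raise ValueError("only RAID 0, 1, and 5 are covered by this sketch")
    # Keep the result inside the allowed range of 1 to 1000.
    return max(1, min(drives, 1000))


print(starting_io_concurrency(4, raid_level=0))
print(starting_io_concurrency(4, raid_level=5))
```

Treat the result as an initial value only: a server that is usually busy with many concurrent sessions may do just as well with a lower setting, and SSDs often benefit from a much higher one.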
max_worker_processes
(integer)
Sets the maximum number of background processes that the system can support. This parameter can only be set at server start. The default is 8.
When running a standby server, you must set this parameter to the same or higher value than on the master server. Otherwise, queries will not be allowed in the standby server.
When changing this value, consider also adjusting max_parallel_workers and max_parallel_workers_per_gather.
max_parallel_workers_per_gather
(integer)
Sets the maximum number of workers that can be started by a single Gather or Gather Merge node. Parallel workers are taken from the pool of processes established by max_worker_processes, limited by max_parallel_workers. Note that the requested number of workers may not actually be available at run time. If this occurs, the plan will run with fewer workers than expected, which may be inefficient. The default value is 2. Setting this value to 0 disables parallel query execution.
Note that parallel queries may consume very substantially more resources than non-parallel queries, because each worker process is a completely separate process which has roughly the same impact on the system as an additional user session. This should be taken into account when choosing a value for this setting, as well as when configuring other settings that control resource utilization, such as work_mem. Resource limits such as work_mem are applied individually to each worker, which means the total utilization may be much higher across all processes than it would normally be for any single process. For example, a parallel query using 4 workers may use up to 5 times as much CPU time, memory, I/O bandwidth, and so forth as a query which uses no workers at all.
For more information on parallel query, see Chapter 15.
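The "5 times" figure follows from counting the leader process together with its workers. A minimal sketch of that arithmetic is shown below; the function name and the `operations_per_process` parameter are hypothetical, and the result is a rough worst-case estimate rather than a limit enforced by the server.

```python
def worst_case_work_mem_mb(work_mem_mb, workers, operations_per_process=1):
    """Estimate peak sort/hash memory for one parallel query, in megabytes.

    work_mem applies to each sort or hash operation in each process,
    so the leader and every worker can each use it in full.
    """
    processes = workers + 1  # the leader plus its parallel workers
    return work_mem_mb * processes * operations_per_process


# Default work_mem of 4MB with 4 workers, one sort per process
print(worst_case_work_mem_mb(4, workers=4))
# The same query with three sort or hash operations in each process
print(worst_case_work_mem_mb(4, workers=4, operations_per_process=3))
```

Multiplying the result by the number of sessions that may run such queries concurrently gives a rough ceiling to compare against available RAM.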
max_parallel_workers
(integer)
Sets the maximum number of workers that the system can support for parallel queries. The default value is 8. When increasing or decreasing this value, consider also adjusting max_parallel_workers_per_gather. Also, note that a setting for this value which is higher than max_worker_processes will have no effect, since parallel workers are taken from the pool of worker processes established by that setting.
backend_flush_after
(integer)
Whenever more than backend_flush_after bytes have been written by a single backend, attempt to force the OS to issue these writes to the underlying storage. Doing so will limit the amount of dirty data in the kernel's page cache, reducing the likelihood of stalls when an fsync is issued at the end of a checkpoint, or when the OS writes data back in larger batches in the background. Often that will result in greatly reduced transaction latency, but there also are some cases, especially with workloads that are bigger than shared_buffers, but smaller than the OS's page cache, where performance might degrade. This setting may have no effect on some platforms. The valid range is between 0, which disables forced writeback, and 2MB. The default is 0, i.e., no forced writeback. (If BLCKSZ is not 8kB, the maximum value scales proportionally to it.)
old_snapshot_threshold
(integer)
Sets the minimum time that a snapshot can be used without risk of a snapshot too old error occurring when using the snapshot. This parameter can only be set at server start.
Beyond the threshold, old data may be vacuumed away. This can help prevent bloat in the face of snapshots which remain in use for a long time. To prevent incorrect results due to cleanup of data which would otherwise be visible to the snapshot, an error is generated when the snapshot is older than this threshold and the snapshot is used to read a page which has been modified since the snapshot was built.
A value of -1 disables this feature, and is the default. Useful values for production work probably range from a small number of hours to a few days. The setting will be coerced to a granularity of minutes, and small numbers (such as 0 or 1min) are only allowed because they may sometimes be useful for testing. While a setting as high as 60d is allowed, please note that in many workloads extreme bloat or page-level transaction ID wraparound may occur in much shorter time frames.
When this feature is enabled, freed space at the end of a relation cannot be released to the operating system, since that could remove information needed to detect the snapshot too old condition. All space allocated to a relation remains associated with that relation for reuse only within that relation unless explicitly freed (for example, with VACUUM FULL).
This setting does not attempt to guarantee that an error will be generated under any particular circumstances. In fact, if the correct results can be generated from (for example) a cursor which has materialized a result set, no error will be generated even if the underlying rows in the referenced table have been vacuumed away. Some tables cannot safely be vacuumed early, and so will not be affected by this setting, such as system catalogs. For such tables this setting will neither reduce bloat nor create a possibility of a snapshot too old error on scanning.
19.4.7. Prioritization
usage_tracking_interval
(integer)
Sets the time interval, in seconds, for calculating usage statistics. Based on these statistics, Postgres Pro Enterprise can control resource usage for each session in accordance with the prioritization policy configured by the session_cpu_weight, session_ioread_weight, and session_iowrite_weight parameters.
When set to a positive value, this parameter starts a background worker that collects statistics on CPU time, the number of local and shared blocks read by the backends, and the number of local and shared blocks dirtied by the backends. Avoid setting this parameter to a small value, as frequent statistics collection can cause overhead.
The default value is zero, which disables statistics collection and resource prioritization.
This parameter can only be set in the postgresql.conf file or on the server command line.
session_cpu_weight
(integer)
Sets CPU usage weight for the current session. Possible values are 1, 2, 4, and 8. The higher the value, the more resources the session can use as compared to sessions with lower weights.
The usage_tracking_interval parameter must be set to a positive value for this setting to take effect. Resource usage is planned based on the statistics collected for the previous interval defined by usage_tracking_interval. If all sessions have the same weight, Postgres Pro Enterprise does not prioritize resource usage.
Default: 4
session_ioread_weight
(integer)
Sets the weight for reading local and shared blocks for the current session. Possible values are 1, 2, 4, and 8. The higher the value, the more resources the session can use as compared to sessions with lower weights.
The usage_tracking_interval parameter must be set to a positive value for this setting to take effect. Resource usage is planned based on the statistics collected for the previous interval defined by usage_tracking_interval. If all sessions have the same weight, Postgres Pro Enterprise does not prioritize resource usage.
Default: 4
session_iowrite_weight
(integer)
Sets the weight for writing to local and shared blocks for the current session. Possible values are 1, 2, 4, and 8. The higher the value, the more resources the session can use as compared to sessions with lower weights.
The usage_tracking_interval parameter must be set to a positive value for this setting to take effect. Resource usage is planned based on the statistics collected for the previous interval defined by usage_tracking_interval. If all sessions have the same weight, Postgres Pro Enterprise does not prioritize resource usage.
Default: 4