F.3. aqo
The aqo module is a Postgres Pro Enterprise extension for cost-based query optimization. Using machine learning methods, more precisely, a modification of the k-NN algorithm, aqo improves cardinality estimation, which can optimize execution plans and, consequently, speed up query execution.
The aqo module can collect statistics on all the executed queries, excluding the queries that access system relations. The collected statistics is classified by query class. If the queries differ in their constants only, they belong to the same class. For each query class, aqo stores the cardinality quality, planning time, execution time, and execution statistics for machine learning. Based on this data, aqo builds a new query plan and uses it for the next query of the same class. aqo test runs have shown significant performance improvements for complex queries.
Important
Query optimization using the aqo module is not supported on standby.
aqo saves all the learning data (aqo_data), queries (aqo_query_texts), query settings (aqo_queries), and query execution statistics (aqo_query_stat) to files. When aqo starts, it loads this data to shared memory. You can access aqo data through functions and views.
Warning
Be aware that aqo may not work correctly right after extension upgrades that change its kernel and after Postgres Pro upgrades. Therefore, after each Postgres Pro upgrade, call aqo_reset()
and run DROP EXTENSION aqo
.
F.3.1. Installation and Setup
The aqo extension is included into Postgres Pro Enterprise. Once you have Postgres Pro Enterprise installed, complete the following steps to enable aqo:
Add
aqo
to the shared_preload_libraries parameter in thepostgresql.conf
file:shared_preload_libraries = 'aqo'
The
aqo
library must be preloaded at the server startup, since adaptive query optimization needs to be enabled per cluster.Create the aqo extension using the following query:
CREATE EXTENSION aqo;
Once the extension is created, you can start optimizing queries.
To disable aqo at the cluster level, run:
DROP EXTENSION aqo;
Note
aqo retains its internal state when the module is recreated by DROP EXTENSION
-> CREATE EXTENSION
.
To remove all the data from the aqo storage, including the collected statistics, call aqo_reset()
.
F.3.1.1. Configuration
By default, aqo does not affect query performance. To enable adaptive query optimization for your database, add the aqo.mode
variable to your postgresql.conf
file and reload the cluster. Depending on your database usage model, you can choose between the following modes:
intelligent
— this mode auto-tunes your queries based on statistics collected per query class.forced
— this mode collects statistics for all queries altogether without any classification.controlled
— this mode uses the default planner for all new queries, but continues using the previously specified planning settings for already known query classes, if any.learn
— this mode collects statistics on all the executed queries and updates the data for query classes.frozen
— this mode reads the collected statistics for already known query classes but does not collect any new data. You can use this mode to reduce the impact of aqo on query planning and execution.disabled
— this mode disables aqo for all queries, even for the known query classes. You can use this mode to temporarily disable aqo without losing the collected statistics and configuration.
To dynamically change the aqo settings in your current session, run the following command:
SET aqo.mode = 'mode
';
where mode
is the name of the operation mode to use.
Important
The intelligent
mode of aqo may not work well if the queries in your workload are of multiple different classes. In this case, you can try resetting the mode to controlled
.
F.3.2. Usage
F.3.2.1. Choosing Operation Mode for Query Optimization
If you often run queries of the same class, for example, your application limits the number of possible query classes, you can use the intelligent
mode to improve planning for these queries. In this mode, aqo analyzes each query execution and stores statistics. Statistics on queries of different classes is stored separately. If performance is not improved after 50 iterations, the aqo extension falls back to the default query planner.
Note
You can view the current query plan using the standard Postgres Pro EXPLAIN
command with the ANALYZE
option. For details, see the Section 14.1.
Since the intelligent
mode tries to learn separately for different query classes, aqo may fail to provide performance improvements if the classes of the queries in the workload are constantly changing. For such dynamic workloads, reset the aqo extension to the controlled
mode, or try using the forced
mode.
In the forced
mode, aqo does not classify the collected statistics by query classes and tries to optimize all queries together. This mode can help you optimize workloads with multiple different query classes, and it consumes less memory than the intelligent mode. However, since the forced
mode lacks intelligent tuning, performance may decrease for some queries. If you see performance issues in this mode, switch aqo to the controlled
mode.
In the controlled
mode, aqo does not collect statistics for new query classes, so they will not be optimized. For known query classes, aqo will continue collecting statistics and using optimized planning algorithms.
The learn
mode collects statistics from all the executed queries and updates the data for query classes. This mode is similar to the intelligent
mode, except that it does not provide intelligent tuning.
If you want to reduce the impact of aqo on query planning and execution, you can use it in the frozen
mode. In this mode, aqo only reads the collected statistics for already known query classes but does not collect any new data.
If you want to fully disable aqo, you can switch aqo to the disabled
mode. In this case, the default planner is used for all queries, but the collected statistics and aqo settings are saved and can be used in the future.
F.3.2.2. Fine-Tuning aqo
You must have superuser rights to access aqo views and configure advanced query settings.
When run in the intelligent
mode, aqo assigns a unique hash value to each query class to separate the collected statistics. If you switch to the forced
mode, the statistics for all untracked query classes is stored in a common query class with hash 0. You can view all the processed query classes and their corresponding hash values in the aqo_query_texts
view:
SELECT * FROM aqo_query_texts;
Each query class has an associated separate space, called feature space, in which the statistics for this query class is collected. Each feature space has associated feature subspaces, where the information about selectivity and cardinality for each query plan node is collected.
Each query class has its own optimization settings. These settings are shown in the aqo_queries
view:
SELECT * FROM aqo_queries;
For each query class, the following settings are available:
queryid
stores the query ID that uniquely identifies the query class.learn_aqo
enables statistics collection for this query class.use_aqo
enables aqo cardinality prediction for the next execution of this query class.fspace_hash
is a unique identifier of the feature space in which the statistics for this query class is collected. By default,fspace_hash
is equal toqueryid
.auto_tuning
shows whether aqo tries to change other settings for the given query. By default, auto-tuning is enabled in theintelligent
mode.
You can manually change these settings to adjust optimization for a particular query class. For example:
-- Add a new query class into the aqo_queries view: SET aqo.mode='intelligent'; SELECT * FROM a, b WHERE a.id=b.id; SET aqo.mode='controlled'; -- Disable auto_tuning, enable both learn_aqo and use_aqo -- for this query class: SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1, LATERAL aqo_queries_update(q1.queryid, NULL, true, true, false) AS q2 WHERE queryid = (SELECT queryid FROM aqo_query_texts WHERE query_text LIKE 'SELECT * FROM a, b WHERE a.id=b.id;'); -- Run EXPLAIN ANALYZE until the plan changes: EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.id=b.id; EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.id=b.id; -- Disable learning to stop statistics collection -- and use the optimized plan: SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1, LATERAL aqo_queries_update(q1.queryid, NULL, false, true, false) AS q2 WHERE queryid = (SELECT queryid FROM aqo_query_texts WHERE query_text LIKE 'SELECT * FROM a, b WHERE a.id=b.id;');
To stop intelligent tuning for a particular query class, disable the auto_tuning
setting:
SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1,
LATERAL aqo_queries_update(q1.queryid, NULL, true, true, false) AS q2
WHERE queryid = 'hash
');
where hash
is the hash value for this query class. As a result, aqo disables automatic changing of the learn_aqo
and use_aqo
settings.
To disable further learning for a particular query class, use the following command:
SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1,
LATERAL aqo_queries_update(q1.queryid, NULL, false, true, false) AS q2
WHERE queryid = 'hash
');
where hash
is the hash value for this query class.
To fully disable aqo for all queries and use the default Postgres Pro query planner, run:
SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1, LATERAL aqo_queries_update(q1.queryid, NULL, false, false, false) AS q2 WHERE queryid IN (SELECT queryid FROM aqo_query_texts);
F.3.3. Reference
F.3.3.1. Configuration Parameters
aqo.mode
Defines the aqo operation mode. Possible values are listed in Table F.2.
Table F.2.
aqo.mode
OptionsOption Description intelligent
Auto-tunes your queries based on statistics collected per query class. forced
Collects statistics for all queries altogether without any classification. controlled
Default. Uses the default planner for all new queries, but can reuse the collected statistics for already known query classes, if any. learn
Collects statistics on all the executed queries and updates the data for query classes. frozen
Reads the collected statistics for already known query classes but does not collect any new data in order to reduce the impact of aqo on query planning and execution. disabled
Fully disables aqo for all queries. The collected statistics and aqo settings are saved and can be used in the future. aqo.show_hash
Show a hash value that is computed from a query tree and uniquely identifies the class of queries or class of plan nodes. Starting with Postgres Pro 14, aqo uses the native query ID to identify a query class for consistency with other extensions, such as pg_stat_statements. So, the query ID can be taken from the
Query Identifier
field inEXPLAIN ANALYZE
output of a query.aqo.show_details
Add some details to
EXPLAIN
output of a query, such as prediction or feature-subspace hash, and show some additional aqo-specific on-screen information.aqo.join_threshold
Ignore queries that contain smaller number of joins.
aqo.statement_timeout
Defines the initial value of the smart statement timeout, which is needed to limit the execution time when manually training aqo on special queries with a poor cardinality forecast. aqo can dynamically change the value of the smart statement timeout during this training. Equals zero by default.
aqo.force_collect_stat
Gather statistics on query executions even in the
disabled
mode. Although no predictions are made, some overhead will be added.aqo.dsm_size_max
Defines the maximum size of dynamic shared memory that aqo can allocate to store learning data.
aqo.fs_max_items
Defines the maximum number of feature spaces that aqo can operate with.
aqo.fss_max_items
Defines the maximum number of feature subspaces that aqo can operate with.
aqo.wide_search
Enables searching neighbors with the same feature subspace among different query classes.
aqo.querytext_max_size
Defines the maximum size of the query in the aqo_query_texts view.
aqo.min_neighbors_for_predicting
Defines the minimum number of neighbors needed for cardinality prediction. If there are fewer of them, aqo will not make prediction.
aqo.predict_with_few_neighbors
Enables aqo to make predictions with fewer neighbors than were found.
F.3.3.2. Views
F.3.3.2.1. aqo_query_texts
The aqo_query_texts
view classifies all the query classes processed by aqo. For each query class, the view shows the text of the first analyzed query of this class.
Table F.3. aqo_query_texts
View
Column Name | Description |
---|---|
queryid | Stores the query ID that uniquely identifies the query class. |
query_text | Provides the text of the first analyzed query of the given class. |
F.3.3.2.2. aqo_queries
The aqo_queries
view shows optimization settings for different query classes.
Table F.4. aqo_queries
View
Setting | Description |
---|---|
queryid | Stores the query ID that uniquely identifies the query class. |
learn_aqo | Enables statistics collection for this query class. |
use_aqo | Enables aqo cardinality prediction for the next execution of this query class. If cost estimation model is incomplete, this may slow down query execution. |
fspace_hash | Provides a unique identifier of the separate space in which the statistics for this query class is collected. By default, fspace_hash is equal to queryid . You can change this setting to a different queryid to optimize different query classes together. It may decrease the amount of memory for models and even improve query execution performance. However, changing this setting may cause unexpected aqo behavior, so make sure to use it only if you know what you are doing. |
auto_tuning | Shows whether aqo tries to tune other settings for the given query. By default, auto-tuning is only enabled in the intelligent mode. |
smart_timeout | Shows the value of smart statement timeout for this query class. |
count_increase_timeout | Shows how many times smart statement timeout increased for this query class. |
F.3.3.2.3. aqo_data
The aqo_data
view shows machine learning data for cardinality estimation refinement. To forget all the collected statistics for a particular query class, you can delete all rows from aqo_data
with the corresponding fs
.
Table F.5. aqo_data
View
Data | Description |
---|---|
fs | Feature-space hash. |
fss | Feature-subspace hash. |
nfeatures | Feature-subspace size for the query plan node. |
features | Logarithm of the selectivity which the cardinality prediction is based on. |
targets | Cardinality logarithm for the query plan node. |
reliability | Equals:
|
oids | List of IDs of tables that were involved in prediction for this node. |
F.3.3.2.4. aqo_query_stat
The aqo_query_stat
view shows statistics on query execution, by query class. The aqo extension uses this data when the auto_tuning
option is enabled for a particular query class.
Table F.6. aqo_query_stat
View
Data | Description |
---|---|
execution_time_with_aqo | Execution time for queries run with aqo enabled. |
execution_time_without_aqo | Execution time for queries run with aqo disabled. |
planning_time_with_aqo | Planning time for queries run with aqo enabled. |
planning_time_without_aqo | Planning time for queries run with aqo disabled. |
cardinality_error_with_aqo | Cardinality estimation error in the selected query plans with aqo enabled. |
cardinality_error_without_aqo | Cardinality estimation error in the selected query plans with aqo disabled. |
executions_with_aqo | Number of queries run with aqo enabled. |
executions_without_aqo | Number of queries run with aqo disabled. |
F.3.3.3. Functions
aqo adds several functions to Postgres Pro catalog.
F.3.3.3.1. Storage Management Functions
Important
Functions aqo_queries_update
, aqo_query_texts_update
, aqo_query_stat_update
, and aqo_data_update
modify aqo views. Therefore, call these functions only if you understand the logic of adaptive query optimization.
aqo_cleanup
() →setof integer
Removes data related to query classes that are linked (may be partially) with removed relations. Returns the number of removed feature spaces (classes) and feature subspaces. Insensitive to removing other objects.
aqo_enable_class
(queryid
bigint
) →void
Sets
learn_aqo
,use_aqo
andauto_tuning
to true for a given query class.aqo_disable_class
(queryid
bigint
) →void
Sets
learn_aqo
,use_aqo
andauto_tuning
to false for a given query class.aqo_drop_class
(queryid
bigint
) →integer
Removes all data related to a given query class from the aqo storage. Returns the number of records removed from the aqo storage.
aqo_reset
() →bigint
Removes data from the aqo storage: machine learning data, query texts, statistics and query class preferences. Returns the number of records removed from the aqo storage.
aqo_queries_update
(queryid
bigint
,fs
bigint
,learn_aqo
boolean
,use_aqo
boolean
,auto_tuning
boolean
) →boolean
Assigns new values to the following settings in the aqo_queries view for a given query class:
fspace_hash
,learn_aqo
,use_aqo
andauto_tuning
. NULL value means “leave as is”.aqo_query_texts_update
(queryid
bigint
,query_text
text
) →boolean
Updates or inserts a record in the aqo_query_texts view for a given
queryid
.aqo_query_stat_update
(queryid
bigint
,execution_time_with_aqo
double precision[]
,execution_time_without_aqo
double precision[]
,planning_time_with_aqo
double precision[]
,planning_time_without_aqo
double precision[]
,cardinality_error_with_aqo
double precision[]
,cardinality_error_without_aqo
double precision[]
,executions_with_aqo
bigint[]
,executions_without_aqo
bigint[]
) →boolean
Updates or inserts a record in the aqo_query_stat view for a given
queryid
.aqo_data_update
(fs
bigint
,fss
integer
,nfeatures
integer
,features
double precision[][]
,targets
double precision[]
,reliability
double precision[]
,oids
oid[]
) →boolean
Updates or inserts a record in the aqo_data view for given
fs
andfss
.
F.3.3.3.2. Memory Management Functions
aqo_memory_usage
() →setof record
Displays sizes of aqo memory contexts and hash tables.
F.3.3.3.3. Analytics Functions
aqo_cardinality_error
(controlled
boolean
) →setof record
Shows the cardinality error for each query class. If
controlled
is true, shows the error of the last execution with aqo enabled. Ifcontrolled
is false, returns the average cardinality error for all logged executions with aqo disabled.aqo_execution_time
(controlled
boolean
) →setof record
Shows the execution time for each query class. If
controlled
is true, shows the execution time of the last execution with aqo enabled. Ifcontrolled
is false, returns the average execution time for all logged executions with aqo disabled.
F.3.4. Author
Oleg Ivanov