F.4. aqo
The aqo module is a Postgres Pro Enterprise extension for cost-based query optimization. Using machine learning methods, more precisely, a modification of the k-NN algorithm, aqo improves cardinality estimation, which can optimize execution plans and, consequently, speed up query execution.
The aqo module can collect statistics on all the executed queries, excluding the queries that access system relations. The collected statistics is classified by query class. If the queries differ in their constants only, they belong to the same class. For each query class, aqo stores the cardinality quality, planning time, execution time, and execution statistics for machine learning. Based on this data, aqo builds a new query plan and uses it for the next query of the same class. aqo test runs have shown significant performance improvements for complex queries.
Important
Query optimization using the aqo module is not supported on standby.
aqo saves all the learning data (aqo_data), queries (aqo_query_texts), query settings (aqo_queries), and query execution statistics (aqo_query_stat) to files. When aqo starts, it loads this data to shared memory. You can access aqo data through functions and views.
Warning
Be aware that aqo may not work correctly right after extension upgrades that change its kernel and after Postgres Pro upgrades. Therefore, after each Postgres Pro upgrade, call aqo_reset() and run DROP EXTENSION aqo. However, after Postgres Pro minor release upgrades to versions to 13.11/14.8/15.3 or higher a call to aqo_reset() is not needed as aqo is reset automatically if required.
After a minor release upgrade, also run ALTER EXTENSION aqo UPDATE and keep in mind that aqo downgrade is impossible.
In the event of an automatic aqo reset or manually calling aqo_reset(), all the machine learning data gets lost, and new learning will be needed for future aqo use. Therefore, if some data, such as query texts, may help in aqo learning, back up this data in advance.
F.4.1. Installation and Setup
The aqo extension is included into Postgres Pro Enterprise. Once you have Postgres Pro Enterprise installed, complete the following steps to enable aqo:
Add
aqoto the shared_preload_libraries parameter in thepostgresql.conffile:shared_preload_libraries = 'aqo'
The
aqolibrary must be preloaded at the server startup, since adaptive query optimization needs to be enabled per cluster.Create the aqo extension using the following query:
CREATE EXTENSION aqo;
Once the extension is created, you can start optimizing queries.
The command
DROP EXTENSION aqo;
will only remove aqo interface at the cluster level. aqo will still be running on all the backends while it is listed in shared_preload_libraries and at the server restart, will continue working in the operation mode specified in postgresql.conf. Besides, aqo will retain its internal state after consequent execution of DROP EXTENSION -> CREATE EXTENSION.
To remove all the data from the aqo storage, including the collected statistics, call aqo_reset():
SELECT aqo_reset();
To actually disable aqo at the cluster level, do the following:
ALTER SYSTEM SET aqo.mode = 'disabled'; SELECT pg_reload_conf(); DROP EXTENSION aqo;
If you do not want aqo to be loaded at the server restart, remove the line
shared_preload_libraries = 'aqo'
from the postgresql.conf file.
F.4.1.1. Configuration
With the default operation mode (controlled), aqo does not affect query performance. Depending on your database usage model, you should choose between the following modes:
intelligent— this mode auto-tunes your queries based on statistics collected per query class. See the description of theauto_tuningflag of the aqo_queries view for more details.forced— this mode collects statistics for all queries altogether without any classification.controlled— this mode uses the default planner for all new queries, but continues using the previously specified planning settings for already known query classes, if any.learn— this mode collects statistics on all the executed queries and updates the data for query classes without auto-tuning queries.frozen— this mode reads the collected statistics for already known query classes but does not collect any new data. You can use this mode to reduce the impact of aqo on query planning and execution.disabled— this mode disables aqo for all queries, even for the known query classes. The collected statistics and aqo settings are saved and can be used in the future. You can use this mode to temporarily disable aqo without losing the collected statistics and configuration.
To dynamically change the aqo mode in your current session, run the following command:
ALTER SYSTEM SET aqo.mode = 'mode';
SELECT pg_reload_conf();
where mode is the name of the operation mode to use.
F.4.2. Usage
F.4.2.1. Choosing Operation Mode for Query Optimization
If you often run queries of the same class, for example, your application limits the number of possible query classes, you can use the intelligent mode to improve planning for these queries. In this mode, aqo analyzes each query execution and stores statistics. Statistics on queries of different classes is stored separately. If performance is not improved after 50 iterations, the aqo extension falls back to the default query planner.
Note
You can view the current query plan using the standard Postgres Pro EXPLAIN command with the ANALYZE option. For details, see the Section 14.1.
Since the intelligent mode tries to learn separately for different query classes, aqo may fail to provide performance improvements if queries in the workload are of multiple different classes or if the classes of the queries in the workload are constantly changing. For such workloads, reset the aqo extension to the controlled mode, or try using the forced mode.
In the forced mode, aqo does not classify the collected statistics by query classes and tries to optimize all queries together. Therefore, aqo_query_texts, aqo_queries and aqo_query_stat views do not get updated. Machine learning data collected in any other modes is inapplicable for the forced mode and vice versa. This mode can help you optimize workloads with multiple different query classes, and it consumes less memory than the intelligent mode. However, since the forced mode lacks intelligent tuning, performance may decrease for some queries. If you see performance issues in this mode, switch aqo to the controlled mode.
In the controlled mode, aqo does not collect statistics for new query classes, so they will not be optimized. For known query classes, aqo will continue collecting statistics and using optimized planning algorithms. So use the controlled mode only after aqo learned in the learn or intelligent mode. As there are no query classes in the forced mode, switching from it to the controlled mode actually means disabling aqo.
The learn mode collects statistics from all the executed queries and updates the data for query classes. This mode is similar to the intelligent mode, except that it does not provide intelligent tuning.
If you want to reduce the impact of aqo on query planning and execution, you can use it in the frozen mode. In this mode, aqo only reads the collected statistics for already known query classes but does not collect any new data.
Switching to the disabled mode is the only way to actually disable aqo without losing the statistics and aqo settings, which are saved and can be used in the future. Queries in this mode will be executed as though there were no aqo at all.
F.4.2.2. Fine-Tuning aqo
You must have superuser rights to access aqo views and configure advanced query settings.
When run in the intelligent or learn mode, aqo assigns a unique hash value to each query class to separate the collected statistics. If you switch to the forced mode, the statistics for all untracked query classes is stored in a common query class with hash 0. You can view all the processed query classes and their corresponding hash values in the aqo_query_texts view:
SELECT * FROM aqo_query_texts;
To find out the class, that is, hash, of a query and aqo mode, enable aqo.show_hash and aqo.show_details environment variables and execute the query. The output will contain something like this:
... Planning Time: 23.538 ms ... Execution Time: 249813.875 ms ... Using aqo: true ... AQO mode: LEARN ... Query hash: -2439501042637610315
Each query class has an associated separate space, called feature space, in which the statistics for this query class is collected. Each feature space has associated feature subspaces, where the information about selectivity and cardinality for each query plan node is collected.
Each query class has its own optimization settings. These settings are shown in the aqo_queries view:
SELECT * FROM aqo_queries;
The settings available are listed in the aqo_queries View table.
You can manually change these settings to adjust optimization for a particular query class. For example:
-- Add a new query class into the aqo_queries view: SET aqo.mode='intelligent'; SELECT * FROM a, b WHERE a.id=b.id; SET aqo.mode='controlled'; -- Disable auto_tuning, enable both learn_aqo and use_aqo -- for this query class: SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1, LATERAL aqo_queries_update(q1.queryid, NULL, true, true, false) AS q2 WHERE queryid = (SELECT queryid FROM aqo_query_texts WHERE query_text LIKE 'SELECT * FROM a, b WHERE a.id=b.id;'); -- Run EXPLAIN ANALYZE until the plan changes: EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.id=b.id; EXPLAIN ANALYZE SELECT * FROM a, b WHERE a.id=b.id; -- Disable learning to stop statistics collection -- and use the optimized plan: SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1, LATERAL aqo_queries_update(q1.queryid, NULL, false, true, false) AS q2 WHERE queryid = (SELECT queryid FROM aqo_query_texts WHERE query_text LIKE 'SELECT * FROM a, b WHERE a.id=b.id;');
To stop intelligent tuning for a particular query class, disable the auto_tuning setting:
SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1,
LATERAL aqo_queries_update(q1.queryid, NULL, true, true, false) AS q2
WHERE queryid = 'hash');
where hash is the hash value for this query class. As a result, aqo disables automatic change of the learn_aqo and use_aqo settings.
To disable further learning for a particular query class, use the following command:
SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1,
LATERAL aqo_queries_update(q1.queryid, NULL, false, true, false) AS q2
WHERE queryid = 'hash');
where hash is the hash value for this query class.
To fully disable aqo for all queries and use the default Postgres Pro query planner, run:
SELECT count(*) FROM (SELECT queryid FROM aqo_queries) AS q1, LATERAL aqo_queries_update(q1.queryid, NULL, false, false, false) AS q2 WHERE queryid IN (SELECT queryid FROM aqo_query_texts);
F.4.3. Reference
F.4.3.1. Configuration Parameters
aqo.mode(text)Defines the aqo operation mode. Possible values are listed in Section F.4.1.1.
Default:
controlled.aqo.show_hash(boolean)Show a hash value that is computed from a query tree and uniquely identifies the class of queries or class of plan nodes. Starting with Postgres Pro 14, aqo uses the native query ID to identify a query class for consistency with other extensions, such as pg_stat_statements. So, the query ID can be taken from the
Query Identifierfield inEXPLAIN ANALYZEoutput of a query.Default:
off.aqo.show_details(boolean)Add some details to
EXPLAINoutput of a query, such as the prediction or feature-subspace hash, and show some additional aqo-specific on-screen information.Default:
off.aqo.join_threshold(integer)Ignore queries that contain smaller number of joins, which means that statistics for such queries will not be collected.
Default:
3.aqo.statement_timeout(integer)Defines the initial value of the smart statement timeout, in milliseconds, which is needed to limit the execution time when manually training aqo on special queries with a poor cardinality forecast. aqo can dynamically change the value of the smart statement timeout during this training. When the cardinality estimation error on nodes exceeds 0.1, the value of
aqo.statement_timeoutis automatically incremented exponentially, but remains not greater than statement_timeout.Default:
0.aqo.force_collect_stat(boolean)Gather statistics on query executions even in the
disabledmode. Although no predictions are made, some overhead will be added.Default:
off.aqo.dsm_size_max(integer)Defines the maximum size of dynamic shared memory, in MB, that aqo can allocate to store learning data. When this number is exceeded, an attempt to load the aqo_data view will fail with “out of memory” error.
Default:
100.aqo.fs_max_items(integer)Defines the maximum number of feature spaces that aqo can operate with. When this number is exceeded, learning on new query classes will no longer occur, and they will not appear in the views accordingly.
Default:
10000.aqo.fss_max_items(integer)Defines the maximum number of feature subspaces that aqo can operate with. When this number is exceeded, the selectivity and cardinality for new query plan nodes will no longer be collected, and new feature subspaces will not appear in the aqo_data view accordingly.
Default:
100000.aqo.wide_search(boolean)Enables searching neighbors with the same feature subspace among different query classes.
Default:
off.aqo.querytext_max_size(integer)Defines the maximum size of the query in the aqo_query_texts view.
Default:
1000.aqo.min_neighbors_for_predicting(integer)Defines the minimum number of neighbors needed for the cardinality prediction. If there are fewer of them, aqo will not make any prediction.
Default:
3.aqo.predict_with_few_neighbors(boolean)Enables aqo to make predictions with fewer neighbors than were found.
Default:
on.
F.4.3.2. Views
F.4.3.2.1. aqo_query_texts
The aqo_query_texts view classifies all the query classes processed by aqo. For each query class, the view shows the text of the first analyzed query of this class. The number of rows is limited by aqo.fs_max_items.
Table F.8. aqo_query_texts View
| Column Name | Description |
|---|---|
queryid | Stores the query ID, that is, the feature-space hash, that uniquely identifies the query class. |
query_text | Provides the text of the first analyzed query of the given class. |
F.4.3.2.2. aqo_queries
The aqo_queries view shows optimization settings for different query classes. The number of rows is limited by aqo.fs_max_items.
Table F.9. aqo_queries View
| Setting | Description |
|---|---|
queryid | Stores the query ID that uniquely identifies the query class. |
learn_aqo | Enables statistics collection for this query class. |
use_aqo | Enables the aqo cardinality prediction for the next execution of this query class. If the cost estimation model is inaccurate, this may slow down query execution. |
fspace_hash | Provides a unique identifier of the separate space in which the statistics for this query class is collected. By default, fspace_hash is equal to queryid. You can change this setting to a different queryid to optimize different query classes together. It may decrease the amount of memory for models and even improve query execution performance. However, changing this setting may cause unexpected aqo behavior, so make sure to use it only if you know what you are doing. |
auto_tuning | Shows whether aqo can dynamically change In more detail, when For queries with |
smart_timeout | Shows the value of smart statement timeout for this query class. |
count_increase_timeout | Shows how many times the smart statement timeout increased for this query class. |
F.4.3.2.3. aqo_data
The aqo_data view shows machine learning data for cardinality estimation refinement. To forget all the collected statistics for a particular query class, you can delete all rows from aqo_data with the corresponding fs. The number of rows is limited by aqo.fss_max_items.
Table F.10. aqo_data View
| Data | Description |
|---|---|
fs | Feature-space hash. |
fss | Feature-subspace hash. |
nfeatures | Feature-subspace size for the query plan node. |
features | Logarithm of the selectivity which the cardinality prediction is based on. |
targets | Cardinality logarithm for the query plan node. |
reliability | Equals:
|
oids | List of IDs of tables that were involved in the prediction for this node. |
F.4.3.2.4. aqo_query_stat
The aqo_query_stat view shows statistics on query execution, by query class. The aqo extension uses this data when the auto_tuning option is enabled for a particular query class.
Table F.11. aqo_query_stat View
| Data | Description |
|---|---|
execution_time_with_aqo | Execution time for queries run with aqo enabled. |
execution_time_without_aqo | Execution time for queries run with aqo disabled. |
planning_time_with_aqo | Planning time for queries run with aqo enabled. |
planning_time_without_aqo | Planning time for queries run with aqo disabled. |
cardinality_error_with_aqo | Cardinality estimation error in the selected query plans with aqo enabled. |
cardinality_error_without_aqo | Cardinality estimation error in the selected query plans with aqo disabled. |
executions_with_aqo | Number of queries run with aqo enabled. |
executions_without_aqo | Number of queries run with aqo disabled. |
F.4.3.3. Functions
aqo adds several functions to Postgres Pro catalog.
F.4.3.3.1. Storage Management Functions
Important
Functions aqo_queries_update, aqo_query_texts_update, aqo_query_stat_update, and aqo_data_update modify data files underlying aqo views. Therefore, call these functions only if you understand the logic of adaptive query optimization.
aqo_cleanup() →setof integerRemoves data related to query classes that are linked (may be partially) with removed relations. Returns the number of removed feature spaces (classes) and feature subspaces. Insensitive to removing other objects.
aqo_enable_class(queryidbigint) →voidSets
learn_aqo,use_aqoandauto_tuning(only in theintelligentmode) to true for a given query class.aqo_disable_class(queryidbigint) →voidSets
learn_aqo,use_aqoandauto_tuning(only in theintelligentmode) to false for a given query class.aqo_drop_class(queryidbigint) →integerRemoves all data related to a given query class from the aqo storage. Returns the number of records removed from the aqo storage.
aqo_reset() →bigintRemoves data from the aqo storage: machine learning data, query texts, statistics and query class preferences. Returns the number of records removed from the aqo storage.
aqo_queries_update(queryidbigint,fsbigint,learn_aqoboolean,use_aqoboolean,auto_tuningboolean) →booleanAssigns new values to the following settings in the aqo_queries view for a given query class:
fspace_hash,learn_aqo,use_aqoandauto_tuning. NULL value means “leave as is”.aqo_query_texts_update(queryidbigint,query_texttext) →booleanUpdates or inserts a record in a data file underlying the aqo_query_texts view for a given
queryid.aqo_query_stat_update(queryidbigint,execution_time_with_aqodouble precision[],execution_time_without_aqodouble precision[],planning_time_with_aqodouble precision[],planning_time_without_aqodouble precision[],cardinality_error_with_aqodouble precision[],cardinality_error_without_aqodouble precision[],executions_with_aqobigint[],executions_without_aqobigint[]) →booleanUpdates or inserts a record in a data file underlying the aqo_query_stat view for a given
queryid.aqo_data_update(fsbigint,fssinteger,nfeaturesinteger,featuresdouble precision[][],targetsdouble precision[],reliabilitydouble precision[],oidsoid[]) →booleanUpdates or inserts a record in a data file underlying the aqo_data view for given
fsandfss.
F.4.3.3.2. Memory Management Functions
aqo_memory_usage() →setof recordDisplays sizes of aqo memory contexts and hash tables.
F.4.3.3.3. Analytics Functions
aqo_cardinality_error(controlledboolean) →setof recordShows the cardinality error for each query class. If
controlledis true, shows the error of the last execution with aqo enabled. Ifcontrolledis false, returns the average cardinality error for all logged executions with aqo disabled.aqo_execution_time(controlledboolean) →setof recordShows the execution time for each query class. If
controlledis true, shows the execution time of the last execution with aqo enabled. Ifcontrolledis false, returns the average execution time for all logged executions with aqo disabled.
F.4.4. Examples
Example F.1. Learning on a Query
Consider optimization of a query using aqo.
Mind updating your demo database to the latest version.
When the query is executed for the first time, it is missing in tables underlying aqo views. So there is no data for predicting with aqo for each plan node, and “AQO not used” lines appear in the EXPLAIN output:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF) SELECT bp.*FROM segments s +JOIN flights f ON f.flight_id = s.flight_id +JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id +WHERE f.scheduled_departure > '2025-12-1 15:00:00+00'; + QUERY PLAN +-------------------------------------------------------------------------------------------------------------------------------- + Hash Join (cost=79793.72..237497.86 rows=1201002 width=33) (actual rows=9455.00 loops=1) + AQO not used, fss=4871603661380287993 + Hash Cond: ((s.flight_id = f.flight_id) AND (s.ticket_no = bp.ticket_no)) + Buffers: shared hit=3713 read=50331, temp read=17210 written=17210 + -> Seq Scan on segments s (cost=0.00..72572.70 rows=3941270 width=18) (actual rows=3941249.00 loops=1) + AQO not used, fss=-1745942650988724053 + Buffers: shared hit=1853 read=31307 + -> Hash (cost=52395.69..52395.69 rows=1201002 width=37) (actual rows=9455.00 loops=1) + Buckets: 131072 Batches: 16 Memory Usage: 1058kB + Buffers: shared hit=1860 read=19024, temp written=45 + -> Hash Join (cost=608.55..52395.69 rows=1201002 width=37) (actual rows=9455.00 loops=1) + AQO not used, fss=4705493075117122362 + Hash Cond: (bp.flight_id = f.flight_id) + Buffers: shared hit=1860 read=19024 + -> Seq Scan on boarding_passes bp (cost=0.00..45318.32 rows=2463832 width=33) (actual rows=2463832.00 loops=1) + AQO not used, fss=1362775811343989307 + Buffers: shared hit=1656 read=19024 + -> Hash (cost=475.98..475.98 rows=10606 width=4) (actual rows=10594.00 loops=1) + Buckets: 16384 Batches: 1 Memory Usage: 501kB + Buffers: shared hit=204 + -> Seq Scan on flights f (cost=0.00..475.98 rows=10606 width=4) (actual rows=10594.00 loops=1) + AQO not used, fss=3484507337497244877 + Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone) + Rows Removed by Filter: 11164 + Buffers: shared hit=204 + Planning: + Buffers: shared hit=53 Using aqo: true + AQO mode: AUTO AQO advanced: OFF + Query hash: 0 JOINS: 2 +(32 rows)
If there is no information on a certain node in the aqo_data view, aqo will add the appropriate record there for future learning and predictions except for nodes with fss=0 in the EXPLAIN output. As each of features and targets in the aqo_data view is a logarithm to base e, to get the actual value, raise e to this power. For example: exp(0):
fs | fss | dbid | delta_rows | nfeatures | features | targets | reliability | oids | tmpoids
+----+----------------------+-------+------------+-----------+---------------------------------------------+----------------------+-------------+---------------------+---------
0 | 3484507337497244877 | 16556 | f | 1 | {{-0.7185575545175223}} | {9.268043082104471} | {1} | {17452} |
0 | -1745942650988724053 | 16556 | f | 0 | | {15.187008236114766} | {1} | {17488} |
0 | 4705493075117122362 | 16556 | f | 2 | {{-0.7185575545175223,-9.987736784981028}} | {9.154298981092557} | {1} | {17452,17438} |
0 | 1362775811343989307 | 16556 | f | 0 | | {14.71722841949288} | {1} | {17438} |
0 | 4871603661380287993 | 16556 | f | 2 | {{-0.7185575545175223,-14.672062325711414}} | {9.154298981092557} | {1} | {17488,17452,17438} |
(5 rows)
When the query is executed for the second time, aqo recognizes the query and makes a prediction. Pay attention to the cardinality predicted by aqo and the value of aqo error (“error=0%”).
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF) SELECT bp.* FROM segments s +JOIN flights f ON f.flight_id = s.flight_id +JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id +WHERE f.scheduled_departure > '2025-12-1 15:00:00+00'; + QUERY PLAN +---------------------------------------------------------------------------------------------------------------------------------------- + Gather (cost=1608.83..38142.58 rows=9455 width=33) (actual rows=9455.00 loops=1) + Workers Planned: 2 + Workers Launched: 2 + Buffers: shared hit=27340 read=22325 + -> Nested Loop (cost=608.83..36197.08 rows=3940 width=33) (actual rows=3151.67 loops=3) + AQO: rows=9455, error=0%, fss=4871603661380287993 + Join Filter: (s.flight_id = f.flight_id) + Buffers: shared hit=27340 read=22325 + -> Hash Join (cost=608.40..34249.71 rows=3940 width=37) (actual rows=3151.67 loops=3) + AQO: rows=9455, error=0%, fss=4705493075117122362 + Hash Cond: (bp.flight_id = f.flight_id) + Buffers: shared hit=2360 read=18932 + -> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3) + AQO: rows=2463832, error=0%, fss=1362775811343989307 + Buffers: shared hit=1748 read=18932 + -> Hash (cost=475.98..475.98 rows=10594 width=4) (actual rows=10594.00 loops=3) + Buckets: 16384 Batches: 1 Memory Usage: 501kB + Buffers: shared hit=612 + -> Seq Scan on flights f (cost=0.00..475.98 rows=10594 width=4) (actual rows=10594.00 loops=3) + AQO: rows=10594, error=0%, fss=3484507337497244877 + Filter: (scheduled_departure > '2025-12-01 22:00:00+07'::timestamp with time zone) + Rows Removed by Filter: 11164 + Buffers: shared hit=612 + -> Index Only Scan using segments_pkey on segments s (cost=0.43..0.48 rows=1 width=18) (actual rows=1.00 loops=9455) + AQO not used (early terminated), fss=-5182591529139042748 + Index Cond: ((ticket_no = bp.ticket_no) AND (flight_id = bp.flight_id)) + Heap Fetches: 0 + Index Searches: 9455 + Buffers: shared hit=24980 read=3393 + Planning: + Buffers: shared hit=53 Using aqo: true + AQO mode: AUTO AQO advanced: OFF + Query hash: 0 JOINS: 2 (36 rows)
In case of an error, values of features and targets must change, but as there was no error above, they did not change.
fs | fss | nfeatures | features | targets | reliability | oids
---------------------+-------------+-----------+------------------------------------------------------------------------------------+---------------------+-------------+---------------------
-727505571757520766 | 2128507884 | 4 | {{-0.03438753143452488,-5.634789603169249,-0.3149847743198556,-8.006367567650246}} | {8.202208436436448} | {1} | {16579,16555,16563}
-727505571757520766 | -1076069505 | 0 | | {2.302585092994046} | {1} | {16555}
-727505571757520766 | -1838231581 | 0 | | {8.006367567650246} | {1} | {16563}
-727505571757520766 | -303037802 | 3 | {{-0.03438753143452488,-5.634789603169249,-0.3149847743198556}} | {8.202208436436448} | {1} | {16579,16555}
-727505571757520766 | -636613046 | 2 | {{-0.03438753143452488,-0.3149847743198556}} | {8.202208436436448} | {1} | {16579}
(6 rows)
Let's change a constant in the query, and you will notice that the prediction is made with an error:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF) SELECT bp.* +FROM segments s +JOIN flights f ON f.flight_id = s.flight_id +JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id +WHERE f.scheduled_departure > '2025-11-20 15:00:00+00'; + QUERY PLAN +---------------------------------------------------------------------------------------------------------------------------------------- + Gather (cost=1608.83..38142.58 rows=9455 width=33) (actual rows=438899.00 loops=1) + Workers Planned: 2 + Workers Launched: 2 + Buffers: shared hit=1307156 read=30841 + -> Nested Loop (cost=608.83..36197.08 rows=3940 width=33) (actual rows=146299.67 loops=3) + AQO: rows=9455, error=-4542%, fss=4871603661380287993 + Join Filter: (s.flight_id = f.flight_id) + Buffers: shared hit=1307156 read=30841 + -> Hash Join (cost=608.40..34249.71 rows=3940 width=37) (actual rows=146299.67 loops=3) + AQO: rows=9455, error=-4542%, fss=4705493075117122362 + Hash Cond: (bp.flight_id = f.flight_id) + Buffers: shared hit=1521 read=19771 + -> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3) + AQO: rows=2463832, error=0%, fss=1362775811343989307 + Buffers: shared hit=909 read=19771 + -> Hash (cost=475.98..475.98 rows=10594 width=4) (actual rows=12593.00 loops=3) + Buckets: 16384 Batches: 1 Memory Usage: 571kB + Buffers: shared hit=612 + -> Seq Scan on flights f (cost=0.00..475.98 rows=10594 width=4) (actual rows=12593.00 loops=3) + AQO: rows=10594, error=-19%, fss=3484507337497244877 + Filter: (scheduled_departure > '2025-11-20 22:00:00+07'::timestamp with time zone) + Rows Removed by Filter: 9165 + Buffers: shared hit=612 + -> Index Only Scan using segments_pkey on segments s (cost=0.43..0.48 rows=1 width=18) (actual rows=1.00 loops=438899) + AQO not used (early terminated), fss=-5182591529139042748 + Index Cond: ((ticket_no = bp.ticket_no) AND (flight_id = bp.flight_id)) + Heap Fetches: 0 + Index Searches: 438899 + Buffers: shared hit=1305635 read=11070 + Planning: + Buffers: shared hit=53 Using aqo: true + AQO mode: AUTO AQO advanced: OFF + Query hash: 0 JOINS: 2 +(36 rows)
However, instead of recalculating features and targets, aqo added new values of selectivity and cardinality for this query to aqo_data:
demo=# SELECT * FROM aqo_data;
+ fs | fss | dbid | delta_rows | nfeatures | features | targets | reliability | oids | tmpoids
+----+----------------------+-------+------------+-----------+---------------------------------------------------------------------------------------+---------------------------------------+-------------+---------------------+---------
+ 0 | 3484507337497244877 | 16556 | f | 1 | {{-0.7185575545175223},{-0.5463556163769266}} | {9.268043082104471,9.440896383005846} | {1,1} | {17452} |
+ 0 | -1745942650988724053 | 16556 | f | 0 | | {15.187008236114766} | {1} | {17488} |
+ 0 | 4705493075117122362 | 16556 | f | 2 | {{-0.7185575545175223,-9.987736784981028},{-0.5463556163769266,-9.987736784981028}} | {9.154298981092557,12.9920245972504} | {1,1} | {17452,17438} |
+ 0 | 1362775811343989307 | 16556 | f | 0 | | {14.71722841949288} | {1} | {17438} |
+ 0 | 4871603661380287993 | 16556 | f | 2 | {{-0.7185575545175223,-14.672062325711414},{-0.5463556163769266,-14.672062325711414}} | {9.154298981092557,12.9920245972504} | {1,1} | {17488,17452,17438} |
(5 rows)
Now the prediction has no error:
demo=# EXPLAIN (ANALYZE, SUMMARY OFF, TIMING OFF) SELECT bp.*+FROM segments s +JOIN flights f ON f.flight_id = s.flight_id +JOIN boarding_passes bp ON bp.ticket_no = s.ticket_no AND bp.flight_id = s.flight_id +WHERE f.scheduled_departure > '2025-11-20 15:00:00+00'; + QUERY PLAN +---------------------------------------------------------------------------------------------------------------------------------------------- + Gather (cost=39355.89..164831.92 rows=429336 width=33) (actual rows=438899.00 loops=1) + Workers Planned: 2 + Workers Launched: 2 + Buffers: shared hit=1619 read=52833, temp read=21086 written=21152 + -> Parallel Hash Join (cost=38355.89..120898.32 rows=178890 width=33) (actual rows=146299.67 loops=3) + AQO: rows=429336, error=-2%, fss=4871603661380287993 + Hash Cond: ((s.flight_id = f.flight_id) AND (s.ticket_no = bp.ticket_no)) + Buffers: shared hit=1619 read=52833, temp read=21086 written=21152 + -> Parallel Seq Scan on segments s (cost=0.00..49581.96 rows=1642187 width=18) (actual rows=1313749.67 loops=3) + AQO: rows=3941249, error=0%, fss=-1745942650988724053 + Buffers: shared read=33160 + -> Parallel Hash (cost=34274.54..34274.54 rows=178890 width=37) (actual rows=146299.67 loops=3) + Buckets: 131072 Batches: 8 Memory Usage: 4928kB + Buffers: shared hit=1619 read=19673, temp written=2812 + -> Hash Join (cost=633.24..34274.54 rows=178890 width=37) (actual rows=146299.67 loops=3) + AQO: rows=429336, error=-2%, fss=4705493075117122362 + Hash Cond: (bp.flight_id = f.flight_id) + Buffers: shared hit=1619 read=19673 + -> Parallel Seq Scan on boarding_passes bp (cost=0.00..30945.97 rows=1026597 width=33) (actual rows=821277.33 loops=3) + AQO: rows=2463832, error=0%, fss=1362775811343989307 + Buffers: shared hit=1007 read=19673 + -> Hash (cost=475.98..475.98 rows=12581 width=4) (actual rows=12593.00 loops=3) + Buckets: 16384 Batches: 1 Memory Usage: 571kB + Buffers: shared hit=612 + -> Seq Scan on flights f (cost=0.00..475.98 rows=12581 width=4) (actual rows=12593.00 loops=3) + AQO: rows=12581, error=-0%, fss=3484507337497244877 + Filter: (scheduled_departure > '2025-11-20 22:00:00+07'::timestamp with time zone) + Rows Removed by Filter: 9165 + Buffers: shared hit=612 + Planning: + Buffers: shared hit=53 Using aqo: true + AQO mode: AUTO AQO advanced: OFF + Query hash: 0 JOINS: 2 +(36 rows)
Example F.2. Using the aqo_query_stat View
The aqo_query_stats view shows statistics on the query planning time, query execution time and cardinality error. Based on this data you can make a decision whether to use aqo predictions for different query classes.
Let's query the aqo_query_stats view:
select queryid, cardinality_error_with_aqo, cardinality_error_without_aqo,execution_time_with_aqo, execution_time_without_aqo, planning_time_with_aqo, planning_time_without_aqo from aqo_query_stat \gx
-[ RECORD 1 ]-----------------+------------------------------------------------------------------------------------------------------------
queryid | 8041624334006338922
cardinality_error_with_aqo | {0.14932737556062836,0,0.507421202801325,0.00040469447777891077}
cardinality_error_without_aqo | {0.1493979460962751,0.018403615483185476}
execution_time_with_aqo | {0.004760108,0.008743075,0.006608304,0.012392751}
execution_time_without_aqo | {0.005775926,0.012730316}
planning_time_with_aqo | {0.006927997,0.004247339,0.005005022,0.004169717}
planning_time_without_aqo | {0.001783542,0.001706121}
The retrieved data is for the query from Example F.1, which was executed once without aqo for each of the parameters degree<80 and degree<90 and twice with aqo for each of these parameters. It is clear that with aqo, the cardinality error decreases to 0.0004, while the minimum cardinality error without aqo is 0.15. Besides, the execution time with aqo is lower than without it. So the conclusion is that aqo learns well on this query, and the prediction can be used for this query class.
F.4.5. Author
Oleg Ivanov