Execution time from >1s -> 80m+ when extra columns added in SELECTfor sub-query - Mailing list pgsql-performance

From A Guy Named Ryan
Subject Execution time from >1s -> 80m+ when extra columns added in SELECTfor sub-query
Date
Msg-id CAAJz3y3cUhHx7y0ffr9i1Cm4V93CwgnNDY7+uV2GzTQLrn5Q-A@mail.gmail.com
Whole thread Raw
Responses Re: Execution time from >1s -> 80m+ when extra columns added inSELECT for sub-query  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-performance
First time posting here, so please let me know what additional information you'd like.  Thanks!

## A description of what you are trying to achieve and what results you expect:

- I have a program that dynamically generates SQL queries
- I made changes to how that program generates the SELECT part of the queries (and sub-queries)
- I have a query that went from taking less than a second with the old version of the SELECT to over 2 hours to complete with the new version of the SELECT
- I'd expect both versions of the query to take the same amount of time
- The changes to the SELECT appear in a sub-query and even though the changed columns in the sub-query are ultimately ignored by the query using the sub-query

## PostgreSQL version number you are running:

- Originally discovered on 9.6 running directly on the host
  - PostgreSQL 9.6.17 on x86_64-pc-linux-gnu (Ubuntu 9.6.17-2.pgdg18.04+1), compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0, 64-bit
- Recreated the issue under 9.6 and 12.2 using Docker
  - PostgreSQL 9.6.17 on x86_64-pc-linux-musl, compiled by gcc (Alpine 9.2.0) 9.2.0, 64-bit
  - PostgreSQL 12.2 on x86_64-pc-linux-musl, compiled by gcc (Alpine 9.2.0) 9.2.0, 64-bit

## How you installed PostgreSQL:

- Originally found in PostgreSQL installed on host
  - postgresql-9.6/bionic-pgdg,now 9.6.17-2.pgdg18.04+1 amd64 [installed]
- Recreated in Docker containers for 9.6 and 12.2

## Changes made to the settings in the postgresql.conf file:  see Server Configuration for a quick way to list them all:

- Ran tests using all combinations of the following values:
  - work_mem (4MB and 4GB)
  - random_page_cost: (1, 2, 4)
- I found no difference in performance with any combination of the above

## Operating system and version:

- Ubuntu 18.04
- Linux helios 4.15.0-76-generic #86-Ubuntu SMP Fri Jan 17 17:24:28 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

## What program you're using to connect to PostgreSQL:

- psql

## Is there anything relevant or unusual in the PostgreSQL server logs?:

- Not that I've seen

## For questions about any kind of error:

- No errors

## What you were doing when the error happened / how to cause the error:

- As mentioned above, my dynamically generated SQL now includes a few extra columns in a sub-query than it used to.  The query using the sub-query ignores those columns, so I'd imagine no change in performance, but instead I see a 6000x increase in execution time.
- I created a set of testing scripts that run a total of 48 variations on a few queries and configuration options and have generated EXPLAINS for all of them.
- I've included a copy of each test query and the resulting EXPLAIN
- The files that include "\_present" are those that have the new SQL and are running very slowly
- The files the include \_null, \_deleted, \_casted represent variations on the queries that all run very quickly
- Running VACUUM and/or ANALYZE does not seem to have an effect

## Tables involved

```
experiments=# set search_path to jigsaw_temp;
SET
experiments=# \dt
                              List of relations
   Schema    |                      Name                      | Type  | Owner
-------------+------------------------------------------------+-------+-------
 jigsaw_temp | jtemp1c37l3b_baseline_windows_after_inclusion  | table | ryan
 jigsaw_temp | jtemp1c37l3b_baseline_windows_with_collections | table | ryan
(2 rows)

experiments=# \d jtemp1c37l3b_baseline_windows_after_inclusion;
    Table "jigsaw_temp.jtemp1c37l3b_baseline_windows_after_inclusion"
        Column        |       Type       | Collation | Nullable | Default
----------------------+------------------+-----------+----------+---------
 uuid                 | text             |           |          |
 person_id            | bigint           |           |          |
 criterion_id         | bigint           |           |          |
 criterion_table      | text             |           |          |
 criterion_domain     | text             |           |          |
 start_date           | date             |           |          |
 end_date             | date             |           |          |
 source_value         | text             |           |          |
 source_vocabulary_id | text             |           |          |
 drug_amount          | double precision |           |          |
 drug_amount_units    | text             |           |          |
 drug_days_supply     | integer          |           |          |
 drug_name            | text             |           |          |
 drug_quantity        | bigint           |           |          |
 window_id            | bigint           |           |          |

experiments=# \d jtemp1c37l3b_baseline_windows_with_collections
Table "jigsaw_temp.jtemp1c37l3b_baseline_windows_with_collections"
  Column   |  Type  | Collation | Nullable | Default
-----------+--------+-----------+----------+---------
 person_id | bigint |           |          |
 uuid      | text   |           |          |

experiments=# SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='jtemp1c37l3b_baseline_windows_with_collections';
                    relname                     | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
------------------------------------------------+----------+-----------+---------------+---------+----------+----------------+------------+---------------
 jtemp1c37l3b_baseline_windows_with_collections |     1433 |    138972 |             0 | r       |        2 | f              |            |      11771904
(1 row)

experiments=# SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='jtemp1c37l3b_baseline_windows_after_inclusion';
                    relname                    | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
-----------------------------------------------+----------+-----------+---------------+---------+----------+----------------+------------+---------------
 jtemp1c37l3b_baseline_windows_after_inclusion |     9187 |    505244 |             0 | r       |       15 | f              |            |      75309056
(1 row)
```

config

```
                  name                  |                 setting                  |                                                          description
----------------------------------------+------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------
 allow_system_table_mods                | off                                      | Allows modifications of the structure of system tables.
 application_name                       | psql                                     | Sets the application name to be reported in statistics and logs.
 archive_cleanup_command                |                                          | Sets the shell command that will be executed at every restart point.
 archive_command                        | (disabled)                               | Sets the shell command that will be called to archive a WAL file.
 archive_mode                           | off                                      | Allows archiving of WAL files using archive_command.
 archive_timeout                        | 0                                        | Forces a switch to the next WAL file if a new file has not been started within N seconds.
 array_nulls                            | on                                       | Enable input of NULL elements in arrays.
 authentication_timeout                 | 1min                                     | Sets the maximum allowed time to complete client authentication.
 autovacuum                             | on                                       | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor        | 0.1                                      | Number of tuple inserts, updates, or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold           | 50                                       | Minimum number of tuple inserts, updates, or deletes prior to analyze.
 autovacuum_freeze_max_age              | 200000000                                | Age at which to autovacuum a table to prevent transaction ID wraparound.
 autovacuum_max_workers                 | 3                                        | Sets the maximum number of simultaneously running autovacuum worker processes.
 autovacuum_multixact_freeze_max_age    | 400000000                                | Multixact age at which to autovacuum a table to prevent multixact wraparound.
 autovacuum_naptime                     | 1min                                     | Time to sleep between autovacuum runs.
 autovacuum_vacuum_cost_delay           | 2ms                                      | Vacuum cost delay in milliseconds, for autovacuum.
 autovacuum_vacuum_cost_limit           | -1                                       | Vacuum cost amount available before napping, for autovacuum.
 autovacuum_vacuum_scale_factor         | 0.2                                      | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
 autovacuum_vacuum_threshold            | 50                                       | Minimum number of tuple updates or deletes prior to vacuum.
 autovacuum_work_mem                    | -1                                       | Sets the maximum memory to be used by each autovacuum worker process.
 backend_flush_after                    | 0                                        | Number of pages after which previously performed writes are flushed to disk.
 backslash_quote                        | safe_encoding                            | Sets whether "\'" is allowed in string literals.
 bgwriter_delay                         | 200ms                                    | Background writer sleep time between rounds.
 bgwriter_flush_after                   | 512kB                                    | Number of pages after which previously performed writes are flushed to disk.
 bgwriter_lru_maxpages                  | 100                                      | Background writer maximum number of LRU pages to flush per round.
 bgwriter_lru_multiplier                | 2                                        | Multiple of the average buffer usage to free per round.
 block_size                             | 8192                                     | Shows the size of a disk block.
 bonjour                                | off                                      | Enables advertising the server via Bonjour.
 bonjour_name                           |                                          | Sets the Bonjour service name.
 bytea_output                           | hex                                      | Sets the output format for bytea.
 check_function_bodies                  | on                                       | Check function bodies during CREATE FUNCTION.
 checkpoint_completion_target           | 0.5                                      | Time spent flushing dirty buffers during checkpoint, as fraction of checkpoint interval.
 checkpoint_flush_after                 | 256kB                                    | Number of pages after which previously performed writes are flushed to disk.
 checkpoint_timeout                     | 5min                                     | Sets the maximum time between automatic WAL checkpoints.
 checkpoint_warning                     | 30s                                      | Enables warnings if checkpoint segments are filled more frequently than this.
 client_encoding                        | UTF8                                     | Sets the client's character set encoding.
 client_min_messages                    | notice                                   | Sets the message levels that are sent to the client.
 cluster_name                           |                                          | Sets the name of the cluster, which is included in the process title.
 commit_delay                           | 0                                        | Sets the delay in microseconds between transaction commit and flushing WAL to disk.
 commit_siblings                        | 5                                        | Sets the minimum concurrent open transactions before performing commit_delay.
 config_file                            | /var/lib/postgresql/data/postgresql.conf | Sets the server's main configuration file.
 constraint_exclusion                   | partition                                | Enables the planner to use constraints to optimize queries.
 cpu_index_tuple_cost                   | 0.005                                    | Sets the planner's estimate of the cost of processing each index entry during an index scan.
 cpu_operator_cost                      | 0.0025                                   | Sets the planner's estimate of the cost of processing each operator or function call.
 cpu_tuple_cost                         | 0.01                                     | Sets the planner's estimate of the cost of processing each tuple (row).
 cursor_tuple_fraction                  | 0.1                                      | Sets the planner's estimate of the fraction of a cursor's rows that will be retrieved.
 data_checksums                         | off                                      | Shows whether data checksums are turned on for this cluster.
 data_directory                         | /var/lib/postgresql/data                 | Sets the server's data directory.
 data_directory_mode                    | 0700                                     | Mode of the data directory.
 data_sync_retry                        | off                                      | Whether to continue running after a failure to sync data files.
 DateStyle                              | ISO, MDY                                 | Sets the display format for date and time values.
 db_user_namespace                      | off                                      | Enables per-database user names.
 deadlock_timeout                       | 1s                                       | Sets the time to wait on a lock before checking for deadlock.
 debug_assertions                       | off                                      | Shows whether the running server has assertion checks enabled.
 debug_pretty_print                     | on                                       | Indents parse and plan tree displays.
 debug_print_parse                      | off                                      | Logs each query's parse tree.
 debug_print_plan                       | off                                      | Logs each query's execution plan.
 debug_print_rewritten                  | off                                      | Logs each query's rewritten parse tree.
 default_statistics_target              | 100                                      | Sets the default statistics target.
 default_table_access_method            | heap                                     | Sets the default table access method for new tables.
 default_tablespace                     |                                          | Sets the default tablespace to create tables and indexes in.
 default_text_search_config             | pg_catalog.english                       | Sets default text search configuration.
 default_transaction_deferrable         | off                                      | Sets the default deferrable status of new transactions.
 default_transaction_isolation          | read committed                           | Sets the transaction isolation level of each new transaction.
 default_transaction_read_only          | off                                      | Sets the default read-only status of new transactions.
 dynamic_library_path                   | $libdir                                  | Sets the path for dynamically loadable modules.
 dynamic_shared_memory_type             | posix                                    | Selects the dynamic shared memory implementation used.
 effective_cache_size                   | 4GB                                      | Sets the planner's assumption about the total size of the data caches.
 effective_io_concurrency               | 1                                        | Number of simultaneous requests that can be handled efficiently by the disk subsystem.
 enable_bitmapscan                      | on                                       | Enables the planner's use of bitmap-scan plans.
 enable_gathermerge                     | on                                       | Enables the planner's use of gather merge plans.
 enable_hashagg                         | on                                       | Enables the planner's use of hashed aggregation plans.
 enable_hashjoin                        | on                                       | Enables the planner's use of hash join plans.
 enable_indexonlyscan                   | on                                       | Enables the planner's use of index-only-scan plans.
 enable_indexscan                       | on                                       | Enables the planner's use of index-scan plans.
 enable_material                        | on                                       | Enables the planner's use of materialization.
 enable_mergejoin                       | on                                       | Enables the planner's use of merge join plans.
 enable_nestloop                        | on                                       | Enables the planner's use of nested-loop join plans.
 enable_parallel_append                 | on                                       | Enables the planner's use of parallel append plans.
 enable_parallel_hash                   | on                                       | Enables the planner's use of parallel hash plans.
 enable_partition_pruning               | on                                       | Enables plan-time and run-time partition pruning.
 enable_partitionwise_aggregate         | off                                      | Enables partitionwise aggregation and grouping.
 enable_partitionwise_join              | off                                      | Enables partitionwise join.
 enable_seqscan                         | on                                       | Enables the planner's use of sequential-scan plans.
 enable_sort                            | on                                       | Enables the planner's use of explicit sort steps.
 enable_tidscan                         | on                                       | Enables the planner's use of TID scan plans.
 escape_string_warning                  | on                                       | Warn about backslash escapes in ordinary string literals.
 event_source                           | PostgreSQL                               | Sets the application name used to identify PostgreSQL messages in the event log.
 exit_on_error                          | off                                      | Terminate session on any error.
 external_pid_file                      |                                          | Writes the postmaster PID to the specified file.
 extra_float_digits                     | 1                                        | Sets the number of digits displayed for floating-point values.
 force_parallel_mode                    | off                                      | Forces use of parallel query facilities.
 from_collapse_limit                    | 8                                        | Sets the FROM-list size beyond which subqueries are not collapsed.
 fsync                                  | on                                       | Forces synchronization of updates to disk.
 full_page_writes                       | on                                       | Writes full pages to WAL when first modified after a checkpoint.
 geqo                                   | on                                       | Enables genetic query optimization.
 geqo_effort                            | 5                                        | GEQO: effort is used to set the default for other GEQO parameters.
 geqo_generations                       | 0                                        | GEQO: number of iterations of the algorithm.
 geqo_pool_size                         | 0                                        | GEQO: number of individuals in the population.
 geqo_seed                              | 0                                        | GEQO: seed for random path selection.
 geqo_selection_bias                    | 2                                        | GEQO: selective pressure within the population.
 geqo_threshold                         | 12                                       | Sets the threshold of FROM items beyond which GEQO is used.
 gin_fuzzy_search_limit                 | 0                                        | Sets the maximum allowed result for exact search by GIN.
 gin_pending_list_limit                 | 4MB                                      | Sets the maximum size of the pending list for GIN index.
 hba_file                               | /var/lib/postgresql/data/pg_hba.conf     | Sets the server's "hba" configuration file.
 hot_standby                            | on                                       | Allows connections and queries during recovery.
 hot_standby_feedback                   | off                                      | Allows feedback from a hot standby to the primary that will avoid query conflicts.
 huge_pages                             | try                                      | Use of huge pages on Linux or Windows.
 ident_file                             | /var/lib/postgresql/data/pg_ident.conf   | Sets the server's "ident" configuration file.
 idle_in_transaction_session_timeout    | 0                                        | Sets the maximum allowed duration of any idling transaction.
 ignore_checksum_failure                | off                                      | Continues processing after a checksum failure.
 ignore_system_indexes                  | off                                      | Disables reading from system indexes.
 integer_datetimes                      | on                                       | Datetimes are integer based.
 IntervalStyle                          | postgres                                 | Sets the display format for interval values.
 jit                                    | on                                       | Allow JIT compilation.
 jit_above_cost                         | 100000                                   | Perform JIT compilation if query is more expensive.
 jit_debugging_support                  | off                                      | Register JIT compiled function with debugger.
 jit_dump_bitcode                       | off                                      | Write out LLVM bitcode to facilitate JIT debugging.
 jit_expressions                        | on                                       | Allow JIT compilation of expressions.
 jit_inline_above_cost                  | 500000                                   | Perform JIT inlining if query is more expensive.
 jit_optimize_above_cost                | 500000                                   | Optimize JITed functions if query is more expensive.
 jit_profiling_support                  | off                                      | Register JIT compiled function with perf profiler.
 jit_provider                           | llvmjit                                  | JIT provider to use.
 jit_tuple_deforming                    | on                                       | Allow JIT compilation of tuple deforming.
 join_collapse_limit                    | 8                                        | Sets the FROM-list size beyond which JOIN constructs are not flattened.
 krb_caseins_users                      | off                                      | Sets whether Kerberos and GSSAPI user names should be treated as case-insensitive.
 krb_server_keyfile                     |                                          | Sets the location of the Kerberos server key file.
 lc_collate                             | en_US.utf8                               | Shows the collation order locale.
 lc_ctype                               | en_US.utf8                               | Shows the character classification and case conversion locale.
 lc_messages                            | en_US.utf8                               | Sets the language in which messages are displayed.
 lc_monetary                            | en_US.utf8                               | Sets the locale for formatting monetary amounts.
 lc_numeric                             | en_US.utf8                               | Sets the locale for formatting numbers.
 lc_time                                | en_US.utf8                               | Sets the locale for formatting date and time values.
 listen_addresses                       | *                                        | Sets the host name or IP address(es) to listen to.
 lo_compat_privileges                   | off                                      | Enables backward compatibility mode for privilege checks on large objects.
 local_preload_libraries                |                                          | Lists unprivileged shared libraries to preload into each backend.
 lock_timeout                           | 0                                        | Sets the maximum allowed duration of any wait for a lock.
 log_autovacuum_min_duration            | -1                                       | Sets the minimum execution time above which autovacuum actions will be logged.
 log_checkpoints                        | off                                      | Logs each checkpoint.
 log_connections                        | off                                      | Logs each successful connection.
 log_destination                        | stderr                                   | Sets the destination for server log output.
 log_directory                          | log                                      | Sets the destination directory for log files.
 log_disconnections                     | off                                      | Logs end of a session, including duration.
 log_duration                           | off                                      | Logs the duration of each completed SQL statement.
 log_error_verbosity                    | default                                  | Sets the verbosity of logged messages.
 log_executor_stats                     | off                                      | Writes executor performance statistics to the server log.
 log_file_mode                          | 0600                                     | Sets the file permissions for log files.
 log_filename                           | postgresql-%Y-%m-%d_%H%M%S.log           | Sets the file name pattern for log files.
 log_hostname                           | off                                      | Logs the host name in the connection logs.
 log_line_prefix                        | %m [%p]                                  | Controls information prefixed to each log line.
 log_lock_waits                         | off                                      | Logs long lock waits.
 log_min_duration_statement             | -1                                       | Sets the minimum execution time above which statements will be logged.
 log_min_error_statement                | error                                    | Causes all statements generating error at or above this level to be logged.
 log_min_messages                       | warning                                  | Sets the message levels that are logged.
 log_parser_stats                       | off                                      | Writes parser performance statistics to the server log.
 log_planner_stats                      | off                                      | Writes planner performance statistics to the server log.
 log_replication_commands               | off                                      | Logs each replication command.
 log_rotation_age                       | 1d                                       | Automatic log file rotation will occur after N minutes.
 log_rotation_size                      | 10MB                                     | Automatic log file rotation will occur after N kilobytes.
 log_statement                          | none                                     | Sets the type of statements logged.
 log_statement_stats                    | off                                      | Writes cumulative performance statistics to the server log.
 log_temp_files                         | -1                                       | Log the use of temporary files larger than this number of kilobytes.
 log_timezone                           | UTC                                      | Sets the time zone to use in log messages.
 log_transaction_sample_rate            | 0                                        | Set the fraction of transactions to log for new transactions.
 log_truncate_on_rotation               | off                                      | Truncate existing log files of same name during log rotation.
 logging_collector                      | off                                      | Start a subprocess to capture stderr output and/or csvlogs into log files.
 maintenance_work_mem                   | 64MB                                     | Sets the maximum memory to be used for maintenance operations.
 max_connections                        | 100                                      | Sets the maximum number of concurrent connections.
 max_files_per_process                  | 1000                                     | Sets the maximum number of simultaneously open files for each server process.
 max_function_args                      | 100                                      | Shows the maximum number of function arguments.
 max_identifier_length                  | 63                                       | Shows the maximum identifier length.
 max_index_keys                         | 32                                       | Shows the maximum number of index keys.
 max_locks_per_transaction              | 64                                       | Sets the maximum number of locks per transaction.
 max_logical_replication_workers        | 4                                        | Maximum number of logical replication worker processes.
 max_parallel_maintenance_workers       | 2                                        | Sets the maximum number of parallel processes per maintenance operation.
 max_parallel_workers                   | 8                                        | Sets the maximum number of parallel workers that can be active at one time.
 max_parallel_workers_per_gather        | 2                                        | Sets the maximum number of parallel processes per executor node.
 max_pred_locks_per_page                | 2                                        | Sets the maximum number of predicate-locked tuples per page.
 max_pred_locks_per_relation            | -2                                       | Sets the maximum number of predicate-locked pages and tuples per relation.
 max_pred_locks_per_transaction         | 64                                       | Sets the maximum number of predicate locks per transaction.
 max_prepared_transactions              | 0                                        | Sets the maximum number of simultaneously prepared transactions.
 max_replication_slots                  | 10                                       | Sets the maximum number of simultaneously defined replication slots.
 max_stack_depth                        | 2MB                                      | Sets the maximum stack depth, in kilobytes.
 max_standby_archive_delay              | 30s                                      | Sets the maximum delay before canceling queries when a hot standby server is processing archived WAL data.
 max_standby_streaming_delay            | 30s                                      | Sets the maximum delay before canceling queries when a hot standby server is processing streamed WAL data.
 max_sync_workers_per_subscription      | 2                                        | Maximum number of table synchronization workers per subscription.
 max_wal_senders                        | 10                                       | Sets the maximum number of simultaneously running WAL sender processes.
 max_wal_size                           | 1GB                                      | Sets the WAL size that triggers a checkpoint.
 max_worker_processes                   | 8                                        | Maximum number of concurrent worker processes.
 min_parallel_index_scan_size           | 512kB                                    | Sets the minimum amount of index data for a parallel scan.
 min_parallel_table_scan_size           | 8MB                                      | Sets the minimum amount of table data for a parallel scan.
 min_wal_size                           | 80MB                                     | Sets the minimum size to shrink the WAL to.
 old_snapshot_threshold                 | -1                                       | Time before a snapshot is too old to read pages changed after the snapshot was taken.
 operator_precedence_warning            | off                                      | Emit a warning for constructs that changed meaning since PostgreSQL 9.4.
 parallel_leader_participation          | on                                       | Controls whether Gather and Gather Merge also run subplans.
 parallel_setup_cost                    | 1000                                     | Sets the planner's estimate of the cost of starting up worker processes for parallel query.
 parallel_tuple_cost                    | 0.1                                      | Sets the planner's estimate of the cost of passing each tuple (row) from worker to master backend.
 password_encryption                    | md5                                      | Encrypt passwords.
 plan_cache_mode                        | auto                                     | Controls the planner's selection of custom or generic plan.
 port                                   | 5432                                     | Sets the TCP port the server listens on.
 post_auth_delay                        | 0                                        | Waits N seconds on connection startup after authentication.
 pre_auth_delay                         | 0                                        | Waits N seconds on connection startup before authentication.
 primary_conninfo                       |                                          | Sets the connection string to be used to connect to the sending server.
 primary_slot_name                      |                                          | Sets the name of the replication slot to use on the sending server.
 promote_trigger_file                   |                                          | Specifies a file name whose presence ends recovery in the standby.
 quote_all_identifiers                  | off                                      | When generating SQL fragments, quote all identifiers.
 random_page_cost                       | 4                                        | Sets the planner's estimate of the cost of a nonsequentially fetched disk page.
 recovery_end_command                   |                                          | Sets the shell command that will be executed once at the end of recovery.
 recovery_min_apply_delay               | 0                                        | Sets the minimum delay for applying changes during recovery.
 recovery_target                        |                                          | Set to "immediate" to end recovery as soon as a consistent state is reached.
 recovery_target_action                 | pause                                    | Sets the action to perform upon reaching the recovery target.
 recovery_target_inclusive              | on                                       | Sets whether to include or exclude transaction with recovery target.
 recovery_target_lsn                    |                                          | Sets the LSN of the write-ahead log location up to which recovery will proceed.
 recovery_target_name                   |                                          | Sets the named restore point up to which recovery will proceed.
 recovery_target_time                   |                                          | Sets the time stamp up to which recovery will proceed.
 recovery_target_timeline               | latest                                   | Specifies the timeline to recover into.
 recovery_target_xid                    |                                          | Sets the transaction ID up to which recovery will proceed.
 restart_after_crash                    | on                                       | Reinitialize server after backend crash.
 restore_command                        |                                          | Sets the shell command that will retrieve an archived WAL file.
 row_security                           | on                                       | Enable row security.
 search_path                            | "$user", public                          | Sets the schema search order for names that are not schema-qualified.
 segment_size                           | 1GB                                      | Shows the number of pages per disk file.
 seq_page_cost                          | 1                                        | Sets the planner's estimate of the cost of a sequentially fetched disk page.
 server_encoding                        | UTF8                                     | Sets the server (database) character set encoding.
 server_version                         | 12.2                                     | Shows the server version.
 server_version_num                     | 120002                                   | Shows the server version as an integer.
 session_preload_libraries              |                                          | Lists shared libraries to preload into each backend.
 session_replication_role               | origin                                   | Sets the session's behavior for triggers and rewrite rules.
 shared_buffers                         | 12GB                                     | Sets the number of shared memory buffers used by the server.
 shared_memory_type                     | mmap                                     | Selects the shared memory implementation used for the main shared memory region.
 shared_preload_libraries               |                                          | Lists shared libraries to preload into server.
 ssl                                    | off                                      | Enables SSL connections.
 ssl_ca_file                            |                                          | Location of the SSL certificate authority file.
 ssl_cert_file                          | server.crt                               | Location of the SSL server certificate file.
 ssl_ciphers                            | HIGH:MEDIUM:+3DES:!aNULL                 | Sets the list of allowed SSL ciphers.
 ssl_crl_file                           |                                          | Location of the SSL certificate revocation list file.
 ssl_dh_params_file                     |                                          | Location of the SSL DH parameters file.
 ssl_ecdh_curve                         | prime256v1                               | Sets the curve to use for ECDH.
 ssl_key_file                           | server.key                               | Location of the SSL server private key file.
 ssl_library                            | OpenSSL                                  | Name of the SSL library.
 ssl_max_protocol_version               |                                          | Sets the maximum SSL/TLS protocol version to use.
 ssl_min_protocol_version               | TLSv1                                    | Sets the minimum SSL/TLS protocol version to use.
 ssl_passphrase_command                 |                                          | Command to obtain passphrases for SSL.
 ssl_passphrase_command_supports_reload | off                                      | Also use ssl_passphrase_command during server reload.
 ssl_prefer_server_ciphers              | on                                       | Give priority to server ciphersuite order.
 standard_conforming_strings            | on                                       | Causes '...' strings to treat backslashes literally.
 statement_timeout                      | 0                                        | Sets the maximum allowed duration of any statement.
 stats_temp_directory                   | pg_stat_tmp                              | Writes temporary statistics files to the specified directory.
 superuser_reserved_connections         | 3                                        | Sets the number of connection slots reserved for superusers.
 synchronize_seqscans                   | on                                       | Enable synchronized sequential scans.
 synchronous_commit                     | on                                       | Sets the current transaction's synchronization level.
 synchronous_standby_names              |                                          | Number of synchronous standbys and list of names of potential synchronous ones.
 syslog_facility                        | local0                                   | Sets the syslog "facility" to be used when syslog enabled.
 syslog_ident                           | postgres                                 | Sets the program name used to identify PostgreSQL messages in syslog.
 syslog_sequence_numbers                | on                                       | Add sequence number to syslog messages to avoid duplicate suppression.
 syslog_split_messages                  | on                                       | Split messages sent to syslog by lines and to fit into 1024 bytes.
 tcp_keepalives_count                   | 9                                        | Maximum number of TCP keepalive retransmits.
 tcp_keepalives_idle                    | 7200                                     | Time between issuing TCP keepalives.
 tcp_keepalives_interval                | 75                                       | Time between TCP keepalive retransmits.
 tcp_user_timeout                       | 0                                        | TCP user timeout.
 temp_buffers                           | 8MB                                      | Sets the maximum number of temporary buffers used by each session.
 temp_file_limit                        | -1                                       | Limits the total size of all temporary files used by each process.
 temp_tablespaces                       |                                          | Sets the tablespace(s) to use for temporary tables and sort files.
 TimeZone                               | UTC                                      | Sets the time zone for displaying and interpreting time stamps.
 timezone_abbreviations                 | Default                                  | Selects a file of time zone abbreviations.
 trace_notify                           | off                                      | Generates debugging output for LISTEN and NOTIFY.
 trace_recovery_messages                | log                                      | Enables logging of recovery-related debugging information.
 trace_sort                             | off                                      | Emit information about resource usage in sorting.
 track_activities                       | on                                       | Collects information about executing commands.
 track_activity_query_size              | 1kB                                      | Sets the size reserved for pg_stat_activity.query, in bytes.
 track_commit_timestamp                 | off                                      | Collects transaction commit time.
 track_counts                           | on                                       | Collects statistics on database activity.
 track_functions                        | none                                     | Collects function-level statistics on database activity.
 track_io_timing                        | off                                      | Collects timing statistics for database I/O activity.
 transaction_deferrable                 | off                                      | Whether to defer a read-only serializable transaction until it can be executed with no possible serialization failures.
 transaction_isolation                  | read committed                           | Sets the current transaction's isolation level.
 transaction_read_only                  | off                                      | Sets the current transaction's read-only status.
 transform_null_equals                  | off                                      | Treats "expr=NULL" as "expr IS NULL".
 unix_socket_directories                | /var/run/postgresql                      | Sets the directories where Unix-domain sockets will be created.
 unix_socket_group                      |                                          | Sets the owning group of the Unix-domain socket.
 unix_socket_permissions                | 0777                                     | Sets the access permissions of the Unix-domain socket.
 update_process_title                   | on                                       | Updates the process title to show the active SQL command.
 vacuum_cleanup_index_scale_factor      | 0.1                                      | Number of tuple inserts prior to index cleanup as a fraction of reltuples.
 vacuum_cost_delay                      | 0                                        | Vacuum cost delay in milliseconds.
 vacuum_cost_limit                      | 200                                      | Vacuum cost amount available before napping.
 vacuum_cost_page_dirty                 | 20                                       | Vacuum cost for a page dirtied by vacuum.
 vacuum_cost_page_hit                   | 1                                        | Vacuum cost for a page found in the buffer cache.
 vacuum_cost_page_miss                  | 10                                       | Vacuum cost for a page not found in the buffer cache.
 vacuum_defer_cleanup_age               | 0                                        | Number of transactions by which VACUUM and HOT cleanup should be deferred, if any.
 vacuum_freeze_min_age                  | 50000000                                 | Minimum age at which VACUUM should freeze a table row.
 vacuum_freeze_table_age                | 150000000                                | Age at which VACUUM should scan whole table to freeze tuples.
 vacuum_multixact_freeze_min_age        | 5000000                                  | Minimum age at which VACUUM should freeze a MultiXactId in a table row.
 vacuum_multixact_freeze_table_age      | 150000000                                | Multixact age at which VACUUM should scan whole table to freeze tuples.
 wal_block_size                         | 8192                                     | Shows the block size in the write ahead log.
 wal_buffers                            | 16MB                                     | Sets the number of disk-page buffers in shared memory for WAL.
 wal_compression                        | off                                      | Compresses full-page writes written in WAL file.
 wal_consistency_checking               |                                          | Sets the WAL resource managers for which WAL consistency checks are done.
 wal_init_zero                          | on                                       | Writes zeroes to new WAL files before first use.
 wal_keep_segments                      | 0                                        | Sets the number of WAL files held for standby servers.
 wal_level                              | replica                                  | Set the level of information written to the WAL.
 wal_log_hints                          | off                                      | Writes full pages to WAL when first modified after a checkpoint, even for a non-critical modifications.
 wal_receiver_status_interval           | 10s                                      | Sets the maximum interval between WAL receiver status reports to the sending server.
 wal_receiver_timeout                   | 1min                                     | Sets the maximum wait time to receive data from the sending server.
 wal_recycle                            | on                                       | Recycles WAL files by renaming them.
 wal_retrieve_retry_interval            | 5s                                       | Sets the time to wait before retrying to retrieve WAL after a failed attempt.
 wal_segment_size                       | 16MB                                     | Shows the size of write ahead log segments.
 wal_sender_timeout                     | 1min                                     | Sets the maximum time to wait for WAL replication.
 wal_sync_method                        | fdatasync                                | Selects the method used for forcing WAL updates to disk.
 wal_writer_delay                       | 200ms                                    | Time between WAL flushes performed in the WAL writer.
 wal_writer_flush_after                 | 1MB                                      | Amount of WAL written out by WAL writer that triggers a flush.
 work_mem                               | 4MB                                      | Sets the maximum memory to be used for query workspaces.
 xmlbinary                              | base64                                   | Sets how binary values are to be encoded in XML.
 xmloption                              | content                                  | Sets whether XML data in implicit parsing and serialization operations is to be considered as documents or content fragments.
 zero_damaged_pages                     | off                                      | Continues processing past damaged page headers.
(314 rows)
```

table statistics

```
experiments=# SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE tablename ilike 'jtemp1c37l3b_%' ORDER BY tablename, 1 DESC;
 frac_mcv  |                   tablename                    |       attname        | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
-----------+------------------------------------------------+----------------------+-----------+-----------+------------+-------+--------+-------------
           | jtemp1c37l3b_baseline_windows_after_inclusion  | drug_amount_units    | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | drug_days_supply     | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | drug_name            | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | window_id            | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | uuid                 | f         |         0 |         -1 |       |    101 |           1
           | jtemp1c37l3b_baseline_windows_after_inclusion  | person_id            | f         |         0 |         -1 |       |    101 |  0.38026863
           | jtemp1c37l3b_baseline_windows_after_inclusion  | criterion_id         | f         |         0 |         -1 |       |    101 |  0.38026673
           | jtemp1c37l3b_baseline_windows_after_inclusion  | drug_quantity        | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | source_value         | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | source_vocabulary_id | f         |         1 |          0 |       |        |
           | jtemp1c37l3b_baseline_windows_after_inclusion  | drug_amount          | f         |         1 |          0 |       |        |
         1 | jtemp1c37l3b_baseline_windows_after_inclusion  | criterion_table      | f         |         0 |          1 |     1 |        |           1
         1 | jtemp1c37l3b_baseline_windows_after_inclusion  | criterion_domain     | f         |         0 |          1 |     1 |        |           1
         1 | jtemp1c37l3b_baseline_windows_after_inclusion  | end_date             | f         |         0 |          1 |     1 |        |           1
 0.2788666 | jtemp1c37l3b_baseline_windows_after_inclusion  | start_date           | f         |         0 |       4729 |   100 |    101 |  0.16305907
           | jtemp1c37l3b_baseline_windows_with_collections | person_id            | f         |         0 |         -1 |       |    101 |           1
           | jtemp1c37l3b_baseline_windows_with_collections | uuid                 | f         |         0 |         -1 |       |    101 |  0.37703142
(17 rows)
```

table schemas

```
--
-- PostgreSQL database dump
--

-- Dumped from database version 12.2
-- Dumped by pg_dump version 12.2

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: jigsaw_temp; Type: SCHEMA; Schema: -; Owner: -
--

CREATE SCHEMA jigsaw_temp;


SET default_tablespace = '';

SET default_table_access_method = heap;

--
-- Name: jtemp1c37l3b_baseline_windows_after_inclusion; Type: TABLE; Schema: jigsaw_temp; Owner: -
--

CREATE TABLE jigsaw_temp.jtemp1c37l3b_baseline_windows_after_inclusion (
    uuid text,
    person_id bigint,
    criterion_id bigint,
    criterion_table text,
    criterion_domain text,
    start_date date,
    end_date date,
    source_value text,
    source_vocabulary_id text,
    drug_amount double precision,
    drug_amount_units text,
    drug_days_supply integer,
    drug_name text,
    drug_quantity bigint,
    window_id bigint
);


--
-- Name: jtemp1c37l3b_baseline_windows_with_collections; Type: TABLE; Schema: jigsaw_temp; Owner: -
--

CREATE TABLE jigsaw_temp.jtemp1c37l3b_baseline_windows_with_collections (
    person_id bigint,
    uuid text
);


--
-- PostgreSQL database dump complete
--

```

Explains

<a href="https://explain.depesz.com/s/HGDc">HGDc : version_9_6_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/f31D">f31D : version_12_2_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/UUPA">UUPA : version_12_2_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/P9DF">P9DF : version_12_2_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/YIU8">YIU8 : version_12_2_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/4woa">4woa : version_12_2_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/VC2b">VC2b : version_12_2_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/mQeT4">mQeT4 : version_12_2_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/BPTA">BPTA : version_12_2_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/K5Af">K5Af : version_12_2_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/kFhX">kFhX : version_12_2_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/6tNy">6tNy : version_12_2_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/OXDs">OXDs : version_12_2_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/jU6F">jU6F : version_12_2_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/layi">layi : version_12_2_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/m98L">m98L : version_12_2_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/iLSa">iLSa : version_12_2_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/8IK7">8IK7 : version_12_2_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/cHr1V">cHr1V : version_12_2_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/W5fF">W5fF : version_12_2_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/SQ9U">SQ9U : version_12_2_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/W2Qy">W2Qy : version_12_2_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/QNuX">QNuX : version_12_2_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/1yOO">1yOO : version_12_2_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/B3TY">B3TY : version_12_2_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/kZ8d">kZ8d : version_9_6_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/Axup">Axup : version_9_6_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/x2J7">x2J7 : version_9_6_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/OX6f">OX6f : version_9_6_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/k3He">k3He : version_9_6_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/BBhy">BBhy : version_9_6_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/5yaz">5yaz : version_9_6_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/7yiv">7yiv : version_9_6_work_mem_4MB_random_page_cost_2_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/E1cp">E1cp : version_9_6_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/nlQp">nlQp : version_9_6_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/b76Y">b76Y : version_9_6_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/V07e">V07e : version_9_6_work_mem_4MB_random_page_cost_1_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/EHHN">EHHN : version_9_6_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/w4dV">w4dV : version_9_6_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/TnAd">TnAd : version_9_6_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/Dh0pU">Dh0pU : version_9_6_work_mem_4GB_random_page_cost_4_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/jYZE">jYZE : version_9_6_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/9pex">9pex : version_9_6_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/feMn">feMn : version_9_6_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_deleted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/BtSV">BtSV : version_9_6_work_mem_4GB_random_page_cost_2_effective_cache_size_36GB_query_file_casted_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/rtxW">rtxW : version_9_6_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/ltmx">ltmx : version_9_6_work_mem_4GB_random_page_cost_1_effective_cache_size_36GB_query_file_nulls_sql | explain.depesz.com</a><br/>
<a href="https://explain.depesz.com/s/Xfca">Xfca : version_9_6_work_mem_4MB_random_page_cost_4_effective_cache_size_36GB_query_file_present_sql | explain.depesz.com</a><br/>

Example of quick query:

```
SET work_mem = '4GB';
SET random_page_cost = '1';
SET effective_cache_size = '36GB';
EXPLAIN (ANALYZE,BUFFERS,SETTINGS)  SELECT
                *
            FROM (
                SELECT
                    "l"."person_id" AS "person_id",
                    "l"."criterion_id" AS "criterion_id",
                    "l"."criterion_table" AS "criterion_table",
                    "l"."criterion_domain" AS "criterion_domain",
                    "l"."start_date" AS "start_date",
                    "l"."end_date" AS "end_date",
                    "l"."source_value" AS "source_value",
                    "l"."source_vocabulary_id" AS "source_vocabulary_id",
                    "l"."drug_amount" AS "drug_amount",
                    "l"."drug_amount_units" AS "drug_amount_units",
                    "l"."drug_days_supply" AS "drug_days_supply",
                    "l"."drug_name" AS "drug_name",
                    "l"."drug_quantity" AS "drug_quantity",
                    "l"."uuid" AS "uuid",
                    "l"."window_id" AS "window_id"
                FROM (
                    SELECT
                        "l"."person_id" AS "person_id",
                        "l"."criterion_id" AS "criterion_id",
                        "l"."criterion_table" AS "criterion_table",
                        "l"."criterion_domain" AS "criterion_domain",
                        "l"."start_date" AS "start_date",
                        "l"."end_date" AS "end_date",
                        "l"."source_value" AS "source_value",
                        "l"."source_vocabulary_id" AS "source_vocabulary_id",
                        "l"."drug_amount" AS "drug_amount",
                        "l"."drug_amount_units" AS "drug_amount_units",
                        "l"."drug_days_supply" AS "drug_days_supply",
                        "l"."drug_name" AS "drug_name",
                        "l"."drug_quantity" AS "drug_quantity",
                        "l"."uuid" AS "uuid",
                        "l"."window_id" AS "window_id"
                    FROM (
                        SELECT
                            "person_id" AS "person_id",
                            "criterion_id" AS "criterion_id",
                            "criterion_table" AS "criterion_table",
                            "criterion_domain" AS "criterion_domain",
                            "start_date" AS "start_date",
                            "end_date" AS "end_date",
                            "source_value" AS "source_value",
                            "source_vocabulary_id" AS "source_vocabulary_id",
                            "drug_amount" AS "drug_amount",
                            "drug_amount_units" AS "drug_amount_units",
                            "drug_days_supply" AS "drug_days_supply",
                            "drug_name" AS "drug_name",
                            "drug_quantity" AS "drug_quantity",
                            "uuid" AS "uuid",
                            "window_id" AS "window_id"
                        FROM
                            "jigsaw_temp"."jtemp1c37l3b_baseline_windows_after_inclusion") AS "l") AS "l"
                WHERE (EXISTS (
                        SELECT
                            1
                        FROM (
                            SELECT
                                "r"."uuid" AS "uuid"
                            FROM (
                                SELECT
                                    "uuid" AS "uuid",
                                    CAST(NULL AS float) AS "drug_amount",
                                    CAST(NULL AS text) AS "drug_amount_units",
                                    CAST(NULL AS bigint) AS "drug_days_supply",
                                    CAST(NULL AS text) AS "drug_name",
                                    CAST(NULL AS float) AS "drug_quantity",
                                    CAST(NULL AS integer) AS "window_id",
                                    "person_id" AS "person_id",
                                    CAST(NULL as bigint) AS "criterion_id",
                                    CAST(NULL as text) AS "criterion_table",
                                    CAST(NULL as date) AS "start_date",
                                    CAST(NULL as date) AS "end_date"
                                FROM
                                    "jigsaw_temp"."jtemp1c37l3b_baseline_windows_with_collections") AS "r"
                                GROUP BY "r"."uuid") AS "r"

                        WHERE ("l"."uuid" = "r"."uuid")))) AS "match_2"
```

Quick EXPLAIN:

```
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=7686.74..23252.46 rows=138972 width=265) (actual time=147.773..407.372 rows=138972 loops=1)
   Hash Cond: (jtemp1c37l3b_baseline_windows_after_inclusion.uuid = jtemp1c37l3b_baseline_windows_with_collections.uuid)
   Buffers: shared hit=10620
   ->  Seq Scan on jtemp1c37l3b_baseline_windows_after_inclusion  (cost=0.00..14239.44 rows=505244 width=265) (actual time=0.020..59.875 rows=505244 loops=1)
         Buffers: shared hit=9187
   ->  Hash  (cost=5949.59..5949.59 rows=138972 width=46) (actual time=146.796..146.797 rows=138972 loops=1)
         Buckets: 262144  Batches: 1  Memory Usage: 12711kB
         Buffers: shared hit=1433
         ->  HashAggregate  (cost=3170.15..4559.87 rows=138972 width=46) (actual time=74.334..103.543 rows=138972 loops=1)
               Group Key: jtemp1c37l3b_baseline_windows_with_collections.uuid
               Buffers: shared hit=1433
               ->  Seq Scan on jtemp1c37l3b_baseline_windows_with_collections  (cost=0.00..2822.72 rows=138972 width=46) (actual time=0.011..16.294 rows=138972 loops=1)
                     Buffers: shared hit=1433
 Settings: effective_cache_size = '36GB', random_page_cost = '1', work_mem = '4GB'
 Planning Time: 0.597 ms
 Execution Time: 418.440 ms
(16 rows)
```

Example of Slow Query:

```
SET work_mem = '4GB';
SET random_page_cost = '1';
SET effective_cache_size = '36GB';
EXPLAIN (ANALYZE,BUFFERS,SETTINGS)              SELECT
                *
            FROM (
                SELECT
                    "l"."person_id" AS "person_id",
                    "l"."criterion_id" AS "criterion_id",
                    "l"."criterion_table" AS "criterion_table",
                    "l"."criterion_domain" AS "criterion_domain",
                    "l"."start_date" AS "start_date",
                    "l"."end_date" AS "end_date",
                    "l"."source_value" AS "source_value",
                    "l"."source_vocabulary_id" AS "source_vocabulary_id",
                    "l"."drug_amount" AS "drug_amount",
                    "l"."drug_amount_units" AS "drug_amount_units",
                    "l"."drug_days_supply" AS "drug_days_supply",
                    "l"."drug_name" AS "drug_name",
                    "l"."drug_quantity" AS "drug_quantity",
                    "l"."uuid" AS "uuid",
                    "l"."window_id" AS "window_id"
                FROM (
                    SELECT
                        "l"."person_id" AS "person_id",
                        "l"."criterion_id" AS "criterion_id",
                        "l"."criterion_table" AS "criterion_table",
                        "l"."criterion_domain" AS "criterion_domain",
                        "l"."start_date" AS "start_date",
                        "l"."end_date" AS "end_date",
                        "l"."source_value" AS "source_value",
                        "l"."source_vocabulary_id" AS "source_vocabulary_id",
                        "l"."drug_amount" AS "drug_amount",
                        "l"."drug_amount_units" AS "drug_amount_units",
                        "l"."drug_days_supply" AS "drug_days_supply",
                        "l"."drug_name" AS "drug_name",
                        "l"."drug_quantity" AS "drug_quantity",
                        "l"."uuid" AS "uuid",
                        "l"."window_id" AS "window_id"
                    FROM (
                        SELECT
                            "person_id" AS "person_id",
                            "criterion_id" AS "criterion_id",
                            "criterion_table" AS "criterion_table",
                            "criterion_domain" AS "criterion_domain",
                            "start_date" AS "start_date",
                            "end_date" AS "end_date",
                            "source_value" AS "source_value",
                            "source_vocabulary_id" AS "source_vocabulary_id",
                            "drug_amount" AS "drug_amount",
                            "drug_amount_units" AS "drug_amount_units",
                            "drug_days_supply" AS "drug_days_supply",
                            "drug_name" AS "drug_name",
                            "drug_quantity" AS "drug_quantity",
                            "uuid" AS "uuid",
                            "window_id" AS "window_id"
                        FROM
                            "jigsaw_temp"."jtemp1c37l3b_baseline_windows_after_inclusion") AS "l") AS "l"
                WHERE (EXISTS (
                        SELECT
                            1
                        FROM (
                            SELECT
                                "r"."uuid" AS "uuid"
                            FROM (
                                SELECT
                                    "uuid" AS "uuid",
                                    CAST(NULL AS float) AS "drug_amount",
                                    CAST(NULL AS text) AS "drug_amount_units",
                                    CAST(NULL AS bigint) AS "drug_days_supply",
                                    CAST(NULL AS text) AS "drug_name",
                                    CAST(NULL AS float) AS "drug_quantity",
                                    CAST(NULL AS integer) AS "window_id",
                                    "person_id" AS "person_id",
                                    "criterion_id" AS "criterion_id",
                                    "criterion_table" AS "criterion_table",
                                    "criterion_domain" AS "criterion_domain",
                                    "start_date" AS "start_date",
                                    "end_date" AS "end_date"
                                FROM
                                    "jigsaw_temp"."jtemp1c37l3b_baseline_windows_with_collections") AS "r"
                                GROUP BY "r"."uuid") AS "r"

                        WHERE ("l"."uuid" = "r"."uuid")))) AS "match_2"

```

Slow Query EXPLAIN

```
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on jtemp1c37l3b_baseline_windows_after_inclusion  (cost=0.00..1601719821.59 rows=252622 width=265) (actual time=721.931..5424987.346 rows=138972 loops=1)
   Filter: (SubPlan 1)
   Rows Removed by Filter: 366272
   Buffers: shared hit=624493395
   SubPlan 1
     ->  Subquery Scan on r  (cost=0.00..3170.16 rows=1 width=0) (actual time=10.731..10.731 rows=0 loops=505244)
           Buffers: shared hit=624484208
           ->  Group  (cost=0.00..3170.15 rows=1 width=46) (actual time=10.730..10.730 rows=0 loops=505244)
                 Group Key: jtemp1c37l3b_baseline_windows_with_collections.uuid
                 Buffers: shared hit=624484208
                 ->  Seq Scan on jtemp1c37l3b_baseline_windows_with_collections  (cost=0.00..3170.15 rows=1 width=46) (actual time=10.724..10.724 rows=0 loops=505244)
                       Filter: (jtemp1c37l3b_baseline_windows_after_inclusion.uuid = uuid)
                       Rows Removed by Filter: 119859
                       Buffers: shared hit=624484208
 Settings: effective_cache_size = '36GB', random_page_cost = '1', work_mem = '4GB'
 Planning Time: 0.872 ms
 JIT:
   Functions: 12
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 3.531 ms, Inlining 113.077 ms, Optimization 408.591 ms, Emission 160.142 ms, Total 685.341 ms
 Execution Time: 5425095.601 ms
(21 rows)
```

Thanks,
Ryan
Attachment

pgsql-performance by date:

Previous
From: Ranier Vilela
Date:
Subject: Re: Plan not skipping unnecessary inner join
Next
From: Pavel Stehule
Date:
Subject: Re: Execution time from >1s -> 80m+ when extra columns added inSELECT for sub-query