Analyze results in more expensive query plan - Mailing list pgsql-performance

From Jeremy Altavilla
Subject Analyze results in more expensive query plan
Date
Msg-id CAG3Nut_TY+cn_THAW3PKE+cNUod8HbaLu+tADrT4vEGcvUWCRg@mail.gmail.com
Whole thread Raw
Responses Re: Analyze results in more expensive query plan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance

Hello,

 

We have several select statements whose performance is greatly improved by deleting some stats from pg_statistic.  With the stats present the database reaches 100% cpu at 13k queries per second. Without these stats, the same machine can handle over 29k queries per second. We were able replicate this behavior with just a single join that all these queries contain. When the stats are present the planner chooses to hash join, and without stats perform a nested loop. The plan using a hash join has a higher estimated cost, and as previously mentioned, uses more cpu.

 

The two tables involved in this query are described below; bag_type and bag. There are 6 bag_type rows and around 6 million bag rows. During this simplified scenario, no writes were occurring. Under normal circumstances rows can be inserted into bag, and no rows in these tables are updated or deleted.

 

\d bag_type

                              Table "public.bag_type"

  Column   |  Type   | Collation | Nullable |               Default

-----------+---------+-----------+----------+--------------------------------------

 id        | bigint  |           | not null | nextval('bag_type_id_seq'::regclass)

 name      | text    |           | not null |

 has_slots | boolean |           | not null |

 game      | text    |           | not null |

Indexes:

    "bag_type_pk" PRIMARY KEY, btree (id)

    "bag_name_u1" UNIQUE CONSTRAINT, btree (name, game)

Referenced by:

    TABLE "bag" CONSTRAINT "bag_fk1" FOREIGN KEY (bag_type_id) REFERENCES bag_type(id)

 

\d bag

                              Table "public.bag"

   Column    |  Type  | Collation | Nullable |             Default

-------------+--------+-----------+----------+---------------------------------

 id          | bigint |           | not null | nextval('bag_id_seq'::regclass)

 owner_id    | uuid   |           | not null |

 bag_type_id | bigint |           | not null |

Indexes:

    "bag_pk" PRIMARY KEY, btree (id)

    "bag_owner_type_u1" UNIQUE CONSTRAINT, btree (owner_id, bag_type_id)

Foreign-key constraints:

    "bag_fk1" FOREIGN KEY (bag_type_id) REFERENCES bag_type(id)

Referenced by:

    TABLE "item" CONSTRAINT "item_fk1" FOREIGN KEY (bag_id) REFERENCES bag(id)

       

The pared down query joins the two tables.

 

EXPLAIN (ANALYZE, BUFFERS)

SELECT 1

FROM bag

INNER JOIN bag_type ON bag.bag_type_id = bag_type.id

WHERE owner_id = '00000000-0000-0000-0000-000000076100'

AND game = 'test_alpha'

AND name = ANY(ARRAY['item','wallet','buildingFixed']);

 

With stats on the bag_type table present, the planner uses a hash join. I noticed that the estimate of the index scan of bag_owner_type_u1 is too high at 8 rows. No owner can have more than 6 bags, so 8 should be logically impossible. Also, given 3 bag_types and a specific owner, there can't be more than 3 rows due to the bag_owner_type_u1 index.

 

ANALYZE bag_type;

 

https://explain.depesz.com/s/zcI (Slower, hash join)

 

If I remove the stats on the bag_type table, the planner estimates 1 row and uses a nested loop.

 

DELETE FROM pg_statistic s

USING pg_class c

WHERE c.oid = s.starelid

AND c.relname = 'bag_type';

 

https://explain.depesz.com/s/yBuEo (nested loop)

 

Below are various stats and configuration options, in case they are helpful. I've tried reindexing everything, clustering the tables and ran vacuum full as well. I've tried increasing the default statistics target (this actually made performance much worse). I’ve tested this on fresh volumes with synthetic data, as well as on replicas of prod data.  I’ve also tested this on different ec2 instance types (r4.16xl and c4.8xl). In all cases the bag_type stats resulted in worse performance. I was hoping someone would be able to give advice on how to improve these queries that doesn’t involve deleting stats.

 

Thanks

--Jeremy

 

 

SELECT version();

                                                         version

-------------------------------------------------------------------------------------------------------------------------

PostgreSQL 10.7 (Debian 10.7-1.pgdg80+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 4.9.2-10+deb8u2) 4.9.2, 64-bit

(1 row)

 

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='bag_type' OR relname = 'bag';

relname  | relpages |  reltuples  | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size

----------+----------+-------------+---------------+---------+----------+----------------+------------+---------------

 bag      |    44115 | 5.99964e+06 |             0 | r       |        3 | f              |            |     361390080

 bag_type |        1 |           6 |             0 | r       |        4 | f              |            |         16384

(2 rows)

 

SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='bag_type_id' AND tablename='bag' ORDER BY 1 DESC;

frac_mcv | tablename |   attname   | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation

----------+-----------+-------------+-----------+-----------+------------+-------+--------+-------------

        1 | bag       | bag_type_id | f         |         0 |          6 |     6 |        |    0.167682

(1 row)

 

SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='owner_id' AND tablename='bag' ORDER BY 1 DESC;

  frac_mcv  | tablename | attname  | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation

------------+-----------+----------+-----------+-----------+------------+-------+--------+-------------

 0.00680001 | bag       | owner_id | f         |         0 |  -0.123982 |   100 |    101 |    0.994306

(1 row)

 

SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='name' AND tablename='bag_type' ORDER BY 1 DESC;

frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation

----------+-----------+---------+-----------+-----------+------------+-------+--------+-------------

          | bag_type  | name    | f         |         0 |         -1 |       |      6 |   -0.428571

(1 row)

 

SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='game' AND tablename='bag_type' ORDER BY 1 DESC;

frac_mcv | tablename | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation

----------+-----------+---------+-----------+-----------+------------+-------+--------+-------------

        1 | bag_type  | game    | f         |         0 |  -0.166667 |     1 |        |           1

(1 row)

 

SELECT name, current_setting(name), SOURCE

FROM pg_settings

WHERE SOURCE NOT IN ('default', 'override');

                name                 |               current_setting               |       source

-------------------------------------+---------------------------------------------+--------------------

 application_name                    | psql                                        | client

 archive_command                     | /wal-e-shim wal-push %p                     | configuration file

 archive_mode                        | on                                          | configuration file

 archive_timeout                     | 1min                                        | configuration file

 autovacuum                          | on                                          | configuration file

 autovacuum_max_workers              | 6                                           | configuration file

 autovacuum_vacuum_scale_factor      | 0                                           | configuration file

 autovacuum_vacuum_threshold         | 10000                                       | configuration file

 autovacuum_work_mem                 | -1                                          | configuration file

 checkpoint_completion_target        | 0.9                                         | configuration file

 checkpoint_timeout                  | 30min                                       | configuration file

 checkpoint_warning                  | 30s                                         | configuration file

 client_encoding                     | SQL_ASCII                                   | client

 DateStyle                           | ISO, MDY                                    | configuration file

 dynamic_shared_memory_type          | posix                                       | configuration file

 effective_cache_size                | 42432000kB                                  | configuration file

 fsync                               | on                                          | configuration file

 full_page_writes                    | on                                          | configuration file

 huge_pages                          | try                                         | configuration file

 idle_in_transaction_session_timeout | 10min                                       | configuration file

 lc_messages                         | C                                           | configuration file

 lc_monetary                         | C                                           | configuration file

 lc_numeric                          | C                                           | configuration file

 lc_time                             | C                                           | configuration file

 listen_addresses                    | *                                           | configuration file

 log_autovacuum_min_duration         | 0                                           | configuration file

 log_checkpoints                     | on                                          | configuration file

 log_destination                     | stderr                                      | configuration file

 log_line_prefix                     | %t [%p-%l] %q%u@%d                          | configuration file

 log_lock_waits                      | on                                          | configuration file

 log_min_duration_statement          | 1s                                          | configuration file

 log_temp_files                      | 0                                           | configuration file

 log_timezone                        | UTC                                         | configuration file

 maintenance_work_mem                | 3536000kB                                   | configuration file

 max_connections                     | 400                                         | configuration file

 max_prepared_transactions           | 100                                         | configuration file

 max_stack_depth                     | 2MB                                         | configuration file

 max_wal_senders                     | 5                                           | configuration file

 max_wal_size                        | 34GB                                        | configuration file

 pg_partman_bgw.dbname               | redacted                                    | configuration file

 pg_partman_bgw.interval             | 3600                                        | configuration file

 pg_partman_bgw.role                 | postgres                                    | configuration file

 pg_stat_statements.track            | all                                         | configuration file

 port                                | 5432                                        | command line

 random_page_cost                    | 1.1                                         | configuration file

 shared_buffers                      | 14144000kB                                  | configuration file

 shared_preload_libraries            | plpgsql, pg_partman_bgw, pg_stat_statements | configuration file

 stats_temp_directory                | /var/run/postgresql/pg_stat_tmp             | configuration file

 superuser_reserved_connections      | 5                                           | configuration file

 synchronous_commit                  | on                                          | configuration file

 TimeZone                            | UTC                                         | configuration file

 unix_socket_directories             | /var/run/postgresql                         | configuration file

 unix_socket_group                   | postgres                                    | configuration file

 unix_socket_permissions             | 0700                                        | configuration file

 wal_keep_segments                   | 64                                          | configuration file

 wal_level                           | replica                                     | configuration file

 wal_sync_method                     | fsync                                       | configuration file

 work_mem                            | 141440kB                                    | configuration file

(58 rows)

 

pgsql-performance by date:

Previous
From: Vitaly Baranovsky
Date:
Subject: The wrong (?) query plan for queries with remote (postgres_fdw) tables
Next
From: "Naik, Sameer"
Date:
Subject: RE: Re: Re: Generic Plans for Prepared Statement are 158155 times slowerthan Custom Plans