Slow join - Mailing list pgsql-performance

From Roman Kushnir
Subject Slow join
Date
Msg-id 0B4D190F-CC48-4763-918D-4D4A8DAE5B0B@ad2games.com
Whole thread Raw
Responses Re: Slow join
Re: Slow join
List pgsql-performance
Hello,

The following basic inner join is taking too much time for me. (I’m using count(videos.id) instead of count(*) because my actual query looks different, but I simplified it here to the essence).
I’ve tried following random people's suggestions and adjusting the random_page_cost(decreasing it from 4 to 1.1) without a stable improvement. Any hints on what is wrong here? Thank you.

The query

SELECT COUNT(videos.id) FROM videos JOIN accounts ON accounts.channel_id = videos.channel_id;

The accounts table has 744 rows, videos table has 2.2M rows, the join produces 135k rows.

Running on Amazon RDS, with default 10.1 parameters

                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit


Structure and statistics of the tables involved

=> \d videos
                                              Table "public.videos"
         Column         |            Type             | Collation | Nullable |                      Default
------------------------+-----------------------------+-----------+----------+---------------------------------------------------
 id                     | bigint                      |           | not null | nextval('videos_id_seq'::regclass)
 vendor_id              | character varying           |           | not null |
 channel_id             | bigint                      |           |          |
 published_at           | timestamp without time zone |           |          |
 title                  | text                        |           |          |
 description            | text                        |           |          |
 thumbnails             | jsonb                       |           |          |
 tags                   | character varying[]         |           |          |
 category_id            | character varying           |           |          |
 default_language       | character varying           |           |          |
 default_audio_language | character varying           |           |          |
 duration               | integer                     |           |          |
 stereoscopic           | boolean                     |           |          |
 hd                     | boolean                     |           |          |
 captioned              | boolean                     |           |          |
 licensed               | boolean                     |           |          |
 projection             | character varying           |           |          |
 privacy_status         | character varying           |           |          |
 license                | character varying           |           |          |
 embeddable             | boolean                     |           |          |
 terminated_at          | timestamp without time zone |           |          |
 created_at             | timestamp without time zone |           | not null |
 updated_at             | timestamp without time zone |           | not null |
 featured_game_id       | bigint                      |           |          |
Indexes:
    "videos_pkey" PRIMARY KEY, btree (id)
    "index_videos_on_vendor_id" UNIQUE, btree (vendor_id)
    "index_videos_on_channel_id" btree (channel_id)
    "index_videos_on_featured_game_id" btree (featured_game_id)
Foreign-key constraints:
    "fk_rails_257f68ae55" FOREIGN KEY (channel_id) REFERENCES channels(id)
    "fk_rails_ce1b3e10b0" FOREIGN KEY (featured_game_id) REFERENCES games(id)
Referenced by:
    TABLE "video_fetch_statuses" CONSTRAINT "fk_rails_3bfdf013b8" FOREIGN KEY (video_id) REFERENCES videos(id)
    TABLE "video_daily_facts" CONSTRAINT "fk_rails_dc0eca9ebb" FOREIGN KEY (video_id) REFERENCES videos(id)


=> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='videos’;

        relname        | relpages |  reltuples  | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
-----------------------+----------+-------------+---------------+---------+----------+----------------+------------+---------------
 videos                |   471495 | 2.25694e+06 |        471389 | r       |       24 | f              |            |    4447764480


=> SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='channel_id' AND tablename='videos' ORDER BY 1 DESC;

 frac_mcv |       tablename       |    attname     | n_distinct | n_mcv | n_hist
----------+-----------------------+----------------+------------+-------+--------
   0.1704 | videos                | channel_id     |       1915 |   100 |    101



=> \d accounts
                                          Table "public.accounts"
     Column     |            Type             | Collation | Nullable |                     Default
----------------+-----------------------------+-----------+----------+--------------------------------------------------
 id             | bigint                      |           | not null | nextval('accounts_id_seq'::regclass)
 channel_id     | bigint                      |           | not null |
 refresh_token  | character varying           |           | not null |
 created_at     | timestamp without time zone |           | not null |
 updated_at     | timestamp without time zone |           | not null |
Indexes:
    "accounts_pkey" PRIMARY KEY, btree (id)
    "index_accounts_on_channel_id" UNIQUE, btree (channel_id)
    "index_accounts_on_refresh_token" UNIQUE, btree (refresh_token)
Foreign-key constraints:
    "fk_rails_11d6d9bea2" FOREIGN KEY (channel_id) REFERENCES channels(id)


=> SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='accounts’;

       relname        | relpages | reltuples | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
----------------------+----------+-----------+---------------+---------+----------+----------------+------------+---------------
 accounts             |       23 |       744 |            23 | r       |        5 | f              |            |        229376


=> SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist FROM pg_stats WHERE attname='channel_id' AND tablename='accounts' ORDER BY 1 DESC;

 frac_mcv |      tablename       |    attname     | n_distinct | n_mcv | n_hist
----------+----------------------+----------------+------------+-------+--------
          | accounts             | channel_id     |         -1 |       |    101


pgsql-performance by date:

Previous
From: Sasa Vilic
Date:
Subject: Re: Slow query when pg_trgm is in inner lopp
Next
From: Justin Pryzby
Date:
Subject: Re: Slow join