(was: CTE with JOIN of two tables is much faster than a regularquery) - Mailing list pgsql-general

From Albrecht Dreß
Subject (was: CTE with JOIN of two tables is much faster than a regularquery)
Date
Msg-id 4YW55TYN.OI42YDTK.ITCKMMOO@AJ45T7CZ.3PZZWAHB.65FILMPD
Whole thread Raw
In response to CTE with JOIN of two tables is much faster than a regular query  (<kpi6288@gmail.com>)
List pgsql-general
Am 18.08.18 11:36 schrieb(en) kpi6288@gmail.com:
[snip]
> What can I do to improve the performance of the regular query without using a CTE?

Sorry for jumping into this discussion late – I'm facing similar problems with Postgres choosing strange and
inefficientquery plans for no (for me) apparent reason.  I use the DEB packages postgresql-10, version 10.5-1.pgdg90+1,
ona Debian stretch box. 

The relevant part of the database structure is:

--8<-----------------------------------------------------------------------------------------------
mydb=> \d strings
                             Table "public.strings"
  Column |  Type  | Collation | Nullable |               Default
--------+--------+-----------+----------+--------------------------------------
  iid    | bigint |           | not null |
  sid    | bigint |           | not null | nextval('strings_sid_seq'::regclass)
  stype  | text   |           |          |
  string | text   |           |          |
Indexes:
     "strings_pkey" PRIMARY KEY, btree (iid, sid)
     "idx_strings_string_gin" gin (string gin_trgm_ops)
     "idx_stype" btree (stype)
Foreign-key constraints:
     "strings_iid_fkey" FOREIGN KEY (iid) REFERENCES items(iid) ON DELETE CASCADE

mydb=> \d items
                                    Table "public.items"
     Column     |     Type      | Collation | Nullable |              Default
---------------+---------------+-----------+----------+------------------------------------
  dbid          | bigint        |           | not null |
  iid           | bigint        |           | not null | nextval('items_iid_seq'::regclass)
  riid          | integer       |           |          |
[…more columns…]
Indexes:
     "items_pkey" PRIMARY KEY, btree (iid)
     "idx_items_riid" btree (riid)
     "items_dbid" btree (dbid)
     […more indexes…]
Referenced by:
     TABLE "strings" CONSTRAINT "strings_iid_fkey" FOREIGN KEY (iid) REFERENCES items(iid) ON DELETE CASCADE
     […more references…]
--8<-----------------------------------------------------------------------------------------------

The table “strings” contains about 2 * 10e7 active rows, “items” about 10e8.

The “instability” occurs with the following somewhat trivial query.  In the correct (IMO) case, the indexes are used:

--8<-----------------------------------------------------------------------------------------------
mydb=> EXPLAIN ANALYZE SELECT items.iid, stype, string, riid FROM items LEFT JOIN strings USING(iid) WHERE stype ~
E'^tag\\..*(?<\!\\.\\d+)$'AND dbid = 7416000; 
                                                             QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------
  Nested Loop  (cost=1.13..522716.95 rows=8 width=133) (actual time=0.078..0.715 rows=16 loops=1)
    ->  Index Scan using items_dbid on items  (cost=0.57..1377.96 rows=773 width=12) (actual time=0.021..0.038 rows=19
loops=1)
          Index Cond: (dbid = 7416000)
    ->  Index Scan using strings_pkey on strings  (cost=0.56..674.18 rows=26 width=129) (actual time=0.030..0.035
rows=1loops=19) 
          Index Cond: (iid = items.iid)
          Filter: (stype ~ '^tag\..*(?<!\.\d+)$'::text)
          Rows Removed by Filter: 3
  Planning time: 1.685 ms
  Execution time: 0.762 ms
(9 rows)
--8<-----------------------------------------------------------------------------------------------

However, seemingly at random, Postgres chooses the following plan which is (planning plus execution) ~1500 times
slower:

--8<-----------------------------------------------------------------------------------------------
mydb=> EXPLAIN ANALYZE SELECT items.iid, stype, string, riid FROM items LEFT JOIN strings USING(iid) WHERE stype ~
E'^tag\\..*(?<\!\\.\\d+)$'AND dbid = 7416000; 
                                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------
  Gather  (cost=84945.47..522033.97 rows=9 width=133) (actual time=1401.570..3868.239 rows=16 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Hash Join  (cost=83945.47..521033.07 rows=4 width=133) (actual time=2206.088..3823.982 rows=5 loops=3)
          Hash Cond: (strings.iid = items.iid)
          ->  Parallel Bitmap Heap Scan on strings  (cost=82539.52..518233.10 rows=531057 width=129) (actual
time=390.479..3795.902rows=401149 loops=3) 
                Filter: (stype ~ '^tag\..*(?<!\.\d+)$'::text)
                Rows Removed by Filter: 384802
                Heap Blocks: exact=76067
                ->  Bitmap Index Scan on idx_stype  (cost=0.00..82220.88 rows=2334832 width=0) (actual
time=340.725..340.725rows=2357863 loops=1) 
                      Index Cond: ((stype >= 'tag.'::text) AND (stype < 'tag/'::text))
          ->  Hash  (cost=1395.77..1395.77 rows=814 width=12) (actual time=0.137..0.137 rows=19 loops=3)
                Buckets: 1024  Batches: 1  Memory Usage: 9kB
                ->  Index Scan using items_dbid on items  (cost=0.57..1395.77 rows=814 width=12) (actual
time=0.072..0.126rows=19 loops=3) 
                      Index Cond: (dbid = 7416000)
  Planning time: 2.617 ms
  Execution time: 3868.303 ms
(17 rows)
--8<-----------------------------------------------------------------------------------------------

It looks as if the selection of the plan is more or less random, and does /not/ depend on the statistics state.  I.e.
running“vacuum analyze strings; vacuum analyze items;” immediately before the query does /not/ result in a reproducible
behaviour(a /very/ small number if entries may have been added or deleted between the calls in both tables, though). 

My solution for a stable (but slower than the query utilising the indexes) response time is also using a CTE.  However,
itwould be helpful to fix (or at least understand) the behaviour. 

Best,
Albrecht.
Attachment

pgsql-general by date:

Previous
From: bricklen
Date:
Subject: Re: upgrading from pg 9.3 to 10
Next
From: Stephen Frost
Date:
Subject: Re: upgrading from pg 9.3 to 10