BUG #14085: Weird generic plan in PL/pgSQL function - Mailing list pgsql-bugs
From | root@simply.name |
---|---|
Subject | BUG #14085: Weird generic plan in PL/pgSQL function |
Date | |
Msg-id | 20160414104053.22934.23398@wrigleys.postgresql.org Whole thread Raw |
List | pgsql-bugs |
The following bug has been logged on the website: Bug reference: 14085 Logged by: Vladimir Borodin Email address: root@simply.name PostgreSQL version: 9.4.6 Operating system: RHEL 6 Description: Weâve met a strange problem with not using functional GIN-index in one of the SELECT-queries used inside the PL/pgSQL function. Seems, that it happens after 5 calls of such function inside one session. All attempts to run this query from psql with exactly the same binds donât reproduce the problem (index is used). Calling such function with the same arguments in other sessions also run fast. So my guess is that it is somehow related with generic plan. Below are some details. The table looks like this (a bit simplified): pgload03g/maildb M # \dS+ mail.box Table "mail.box" Column | Type | Modifiers | Storage | Stats target | Description ---------------+--------------------------+------------------------+----------+--------------+------------- uid | bigint | not null | plain | | mid | bigint | not null | plain | | <...> tid | bigint | | plain | | lids | integer[] | not null | extended | | <...> Indexes: "pk_box" PRIMARY KEY, btree (uid, mid) <...> "i_box_uid_lids" gin (mail.ulids(uid, lids)) WITH (fastupdate=off) "i_box_uid_tid" btree (uid, tid) Check constraints: "box_lids_check" CHECK (array_ndims(lids) = 1) <...> pgload03g/maildb M # \sf mail.ulids CREATE OR REPLACE FUNCTION mail.ulids(i_uid bigint, i_lids integer[]) RETURNS bigint[] LANGUAGE sql IMMUTABLE STRICT AS $function$ SELECT array_agg((i_uid << 32) | lid) FROM unnest(i_lids) lid; $function$ pgload03g/maildb M # The best reproduction of the problem until now looks like that: pgload03g/maildb M # PREPARE has_lids (bigint, int) AS SELECT EXISTS ( SELECT 1 FROM mail.box WHERE mail.ulids(uid, lids) && mail.ulids($1::bigint, ARRAY[$2]) AND tid IS NOT NULL ); PREPARE Time: 0.398 ms pgload03g/maildb M # explain (ANALYZE, VERBOSE, BUFFERS) execute has_lids(9, 1); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Result (cost=297.41..297.42 rows=1 width=0) (actual time=0.015..0.015 rows=1 loops=1) Output: $0 Buffers: shared hit=4 InitPlan 1 (returns $0) -> Bitmap Heap Scan on mail.box (cost=293.22..120596.51 rows=28663 width=0) (actual time=0.013..0.013 rows=0 loops=1) Recheck Cond: (mail.ulids(box.uid, box.lids) && '{38654705665}'::bigint[]) Filter: (box.tid IS NOT NULL) Buffers: shared hit=4 -> Bitmap Index Scan on i_box_uid_lids (cost=0.00..286.05 rows=29574 width=0) (actual time=0.013..0.013 rows=0 loops=1) Index Cond: (mail.ulids(box.uid, box.lids) && '{38654705665}'::bigint[]) Buffers: shared hit=4 Execution time: 0.037 ms (12 rows) Time: 0.667 ms pgload03g/maildb M # explain (ANALYZE, VERBOSE, BUFFERS) execute has_lids(9, 2); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Result (cost=297.41..297.42 rows=1 width=0) (actual time=0.017..0.017 rows=1 loops=1) Output: $0 Buffers: shared hit=4 InitPlan 1 (returns $0) -> Bitmap Heap Scan on mail.box (cost=293.22..120596.51 rows=28663 width=0) (actual time=0.016..0.016 rows=0 loops=1) Recheck Cond: (mail.ulids(box.uid, box.lids) && '{38654705666}'::bigint[]) Filter: (box.tid IS NOT NULL) Buffers: shared hit=4 -> Bitmap Index Scan on i_box_uid_lids (cost=0.00..286.05 rows=29574 width=0) (actual time=0.015..0.015 rows=0 loops=1) Index Cond: (mail.ulids(box.uid, box.lids) && '{38654705666}'::bigint[]) Buffers: shared hit=4 Execution time: 0.042 ms (12 rows) Time: 0.695 ms pgload03g/maildb M # explain (ANALYZE, VERBOSE, BUFFERS) execute has_lids(9, 3); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Result (cost=297.41..297.42 rows=1 width=0) (actual time=0.014..0.015 rows=1 loops=1) Output: $0 Buffers: shared hit=4 InitPlan 1 (returns $0) -> Bitmap Heap Scan on mail.box (cost=293.22..120596.51 rows=28663 width=0) (actual time=0.014..0.014 rows=0 loops=1) Recheck Cond: (mail.ulids(box.uid, box.lids) && '{38654705667}'::bigint[]) Filter: (box.tid IS NOT NULL) Buffers: shared hit=4 -> Bitmap Index Scan on i_box_uid_lids (cost=0.00..286.05 rows=29574 width=0) (actual time=0.012..0.012 rows=0 loops=1) Index Cond: (mail.ulids(box.uid, box.lids) && '{38654705667}'::bigint[]) Buffers: shared hit=4 Execution time: 0.038 ms (12 rows) Time: 0.632 ms pgload03g/maildb M # explain (ANALYZE, VERBOSE, BUFFERS) execute has_lids(9, 4); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Result (cost=297.41..297.42 rows=1 width=0) (actual time=0.015..0.015 rows=1 loops=1) Output: $0 Buffers: shared hit=4 InitPlan 1 (returns $0) -> Bitmap Heap Scan on mail.box (cost=293.22..120596.51 rows=28663 width=0) (actual time=0.013..0.013 rows=0 loops=1) Recheck Cond: (mail.ulids(box.uid, box.lids) && '{38654705668}'::bigint[]) Filter: (box.tid IS NOT NULL) Buffers: shared hit=4 -> Bitmap Index Scan on i_box_uid_lids (cost=0.00..286.05 rows=29574 width=0) (actual time=0.012..0.012 rows=0 loops=1) Index Cond: (mail.ulids(box.uid, box.lids) && '{38654705668}'::bigint[]) Buffers: shared hit=4 Execution time: 0.036 ms (12 rows) Time: 0.802 ms pgload03g/maildb M # explain (ANALYZE, VERBOSE, BUFFERS) execute has_lids(9, 5); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Result (cost=297.41..297.42 rows=1 width=0) (actual time=0.014..0.014 rows=1 loops=1) Output: $0 Buffers: shared hit=4 InitPlan 1 (returns $0) -> Bitmap Heap Scan on mail.box (cost=293.22..120596.51 rows=28663 width=0) (actual time=0.013..0.013 rows=0 loops=1) Recheck Cond: (mail.ulids(box.uid, box.lids) && '{38654705669}'::bigint[]) Filter: (box.tid IS NOT NULL) Buffers: shared hit=4 -> Bitmap Index Scan on i_box_uid_lids (cost=0.00..286.05 rows=29574 width=0) (actual time=0.013..0.013 rows=0 loops=1) Index Cond: (mail.ulids(box.uid, box.lids) && '{38654705669}'::bigint[]) Buffers: shared hit=4 Execution time: 0.037 ms (12 rows) Time: 0.763 ms pgload03g/maildb M # explain (ANALYZE, VERBOSE, BUFFERS) execute has_lids(8, 5); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------ Result (cost=54.64..54.65 rows=1 width=0) (actual time=6523020.562..6523020.562 rows=1 loops=1) Output: $0 Buffers: shared hit=116142 read=7458134 dirtied=21 I/O Timings: read=29892.580 InitPlan 1 (returns $0) -> Seq Scan on mail.box (cost=0.00..234921342.40 rows=4299410 width=0) (actual time=6523020.560..6523020.560 rows=0 loops=1) Filter: ((box.tid IS NOT NULL) AND (mail.ulids(box.uid, box.lids) && mail.ulids($1, ARRAY[$2]))) Rows Removed by Filter: 295557368 Buffers: shared hit=116142 read=7458134 dirtied=21 I/O Timings: read=29892.580 Execution time: 6523020.634 ms (11 rows) Time: 6523021.312 ms pgload03g/maildb M # Running last query in another session also uses Bitmap Index Scan on i_box_uid_lids and completes fast. It is clear that first five times postgres used Bitmap Index Scan on i_box_uid_lids with total cost 297.41..297.42. But then it decided to use Seq Scan on mail.box with total cost 54.64..54.65. The cost of seq scan is lower (and it seems to be the reason of choosing it) but it actually takes enormous amount of time to finish this query and it is a huge miss of optimizer. It seems that the reason is that we are using SELECT EXISTS which should stop after receiving the first tuple from subnode and seq scan thinks that it would receive the first tuple immediately (cost=0.00..) but actually it wouldnât. Another thing to pay attention is that optimizer thinks to scan 4.299.410 rows but actually it scans the whole table (295.557.368 rows). It is not related to inaccurate statistics because running VACUUM ANALYZE on that table doesnât change the picture. We are probably going to rewrite the query inside the PL/pgSQL function but it seems that such behavior isnât expected for PostgreSQL and it would be cool to fix it.
pgsql-bugs by date: