Thread: Major performance degradation with joins in 15.8 or 15.7?
Hello! I have a database installation that goes back about 10 years. It's using a hardware RAID consisting of enterprise-caliber spinning drives. I upgraded the installation to PostgreSQL 15.8 a couple of months ago. Sometime after that, it was noticed that one of our application's queries suffered major performance degradation. This query "suddenly" went from taking 40-50 milliseconds to ~9 seconds, approximately 180-200 times longer. Vacuuming and analyzing the table didn't help.

The good news is that, after some research and experimentation, I was able to fix this performance degradation by setting random_page_cost = 2.0. We've always used the default values for seq_page_cost and random_page_cost (1.0 and 4.0, respectively, I think?). A lot of the advice I found online suggested these *_page_cost settings are too high?

The table in question has ~350,000 rows and only 5 text columns. It might have increased in size by about a thousand rows in the past couple of months, but the table has not grown in size significantly.

Did upgrading PostgreSQL to 15.8 cause this performance degradation? Did the page cost calculations change in 15.8 (or 15.7)? The PostgreSQL version is the only thing that has appreciably changed with this database this whole year. Or do you think the size of the table just incrementally reached some threshold that resulted in the optimizer changing its plan based on the page costs?

I was able to simplify the query in my testing down to something like this:

select * from atablename a
inner join btablename b
  on a.somekey = b.somekey and b.otherparam = 'value'
where b.otherparam2 = 'othervalue';

All of the columns are text fields and indexed. Unfortunately, I don't have the "EXPLAIN (ANALYZE, BUFFERS)" output from before I changed the random_page_cost setting. It's no longer in my scrollback buffer.

Any theories or suggestions appreciated.

Thanks,
Ed
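[Editor's note: for readers following along, here is a sketch of the ways a cost setting like this can be applied. The database name `mydb` is a placeholder; the rest uses standard PostgreSQL commands.]

```sql
-- Session-only, for experimentation; disappears when the session ends:
SET random_page_cost = 2.0;

-- Cluster-wide and persistent (written to postgresql.auto.conf):
ALTER SYSTEM SET random_page_cost = 2.0;
SELECT pg_reload_conf();

-- Or scoped to one database, leaving the rest of the cluster at the default:
ALTER DATABASE mydb SET random_page_cost = 2.0;
```

Scoping the change narrowly (database or even role level) limits the blast radius while evaluating whether the lower cost helps other queries too.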
On Fri, 8 Nov 2024 at 10:54, Ed Sabol <edwardjsabol@gmail.com> wrote:
> The good news is that, after some research and experimentation, I was able to fix this performance degradation by setting random_page_cost = 2.0. We've always used the default values for seq_page_cost and random_page_cost (1.0 and 4.0, respectively, I think?). A lot of the advice I found online suggested these *_page_cost settings are too high?

Often people lower random_page_cost when running SSDs. The default value of 4.0 was determined before SSDs were a thing.

> Did upgrading PostgreSQL to 15.8 cause this performance degradation? Did the page cost calculations change in 15.8 (or 15.7)? The PostgreSQL version is the only thing that has appreciably changed with this database this whole year.

It's impossible to say with the given information. You didn't mention which version you upgraded from to start with. We'd need to see EXPLAIN ANALYZE from before and after you changed random_page_cost.

> Or do you think the size of the table just incrementally reached some threshold that resulted in the optimizer changing its plan based on the page costs?

It's possible, but it's also impossible to know without having more information.

> All of the columns are text fields and indexed. Unfortunately, I don't have the "EXPLAIN (ANALYZE, BUFFERS)" output from before I changed the random_page_cost setting. It's no longer in my scrollback buffer.

You can set random_page_cost for just the session you're connected to and try it. SET random_page_cost = <old value>; before running EXPLAIN (ANALYZE, BUFFERS).

David
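[Editor's note: David's suggestion, spelled out as a sketch. The query is the simplified one from the first message, with its placeholder names; wrapping the experiment in a transaction with SET LOCAL guarantees the old cost never leaks out of the test.]

```sql
BEGIN;
SET LOCAL random_page_cost = 4.0;  -- the old value, transaction-scoped only

EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM atablename a
INNER JOIN btablename b
  ON a.somekey = b.somekey AND b.otherparam = 'value'
WHERE b.otherparam2 = 'othervalue';

ROLLBACK;  -- the setting change dies with the transaction
```

Running the same EXPLAIN once with each cost value gives exactly the before/after pair of plans needed to see which node choices flipped.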
On Nov 7, 2024, at 5:18 PM, David Rowley <dgrowleyml@gmail.com> wrote:
> It's impossible to say with the given information. You didn't mention
> which version you upgraded from to start with.

Sorry, 15.6 to 15.7 to 15.8, but we weren't on 15.7 for very long before 15.8.

> You can set random_page_cost for just the session you're connected to
> and try it. SET random_page_cost = <old value>; before running
> EXPLAIN (ANALYZE, BUFFERS).

Oh, I didn't think of that. Duh. Here it is with random_page_cost = 4.0:

Nested Loop  (cost=1216.49..11246.62 rows=1 width=112) (actual time=6374.844..6387.845 rows=1 loops=1)
  Join Filter: (CASE WHEN ("position"(a.name, 'zz'::text) = 1) THEN a.name ELSE ('foo_'::text || a.name) END = "*SELECT* 1".table_name)
  Rows Removed by Join Filter: 1009
  Buffers: shared hit=2227268 read=426 dirtied=1
  ->  Nested Loop  (cost=497.27..5499.27 rows=1 width=54) (actual time=23.852..84.444 rows=1010 loops=1)
        Buffers: shared hit=24458 read=426 dirtied=1
        ->  Nested Loop  (cost=496.85..5495.04 rows=1 width=76) (actual time=23.842..64.212 rows=1010 loops=1)
              Buffers: shared hit=14591 read=426
              ->  Hash Join  (cost=496.43..5490.81 rows=1 width=54) (actual time=23.823..40.587 rows=1010 loops=1)
                    Hash Cond: ((a.name || '.doc'::text) = b_1.name)
                    Buffers: shared hit=4724 read=426
                    ->  Bitmap Heap Scan on metainfo a  (cost=460.62..5400.28 rows=5471 width=38) (actual time=18.288..32.125 rows=1022 loops=1)
                          Recheck Cond: (relation = 'description'::text)
                          Filter: (type = 'table'::text)
                          Rows Removed by Filter: 37044
                          Heap Blocks: exact=3999
                          Buffers: shared hit=3614 read=425
                          ->  Bitmap Index Scan on metainfo_r  (cost=0.00..459.25 rows=39844 width=0) (actual time=17.623..17.624 rows=44615 loops=1)
                                Index Cond: (relation = 'description'::text)
                                Buffers: shared hit=3 read=37
                    ->  Hash  (cost=35.77..35.77 rows=3 width=38) (actual time=5.511..5.518 rows=1025 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 116kB
                          Buffers: shared hit=1110 read=1
                          ->  Bitmap Heap Scan on metainfo b_1  (cost=23.96..35.77 rows=3 width=38) (actual time=1.225..4.206 rows=1025 loops=1)
                                Recheck Cond: ((relation = 'located'::text) AND (type = 'document'::text))
                                Heap Blocks: exact=1103
                                Buffers: shared hit=1110 read=1
                                ->  BitmapAnd  (cost=23.96..23.96 rows=3 width=0) (actual time=0.851..0.856 rows=0 loops=1)
                                      Buffers: shared hit=7 read=1
                                      ->  Bitmap Index Scan on metainfo_r  (cost=0.00..11.86 rows=991 width=0) (actual time=0.402..0.403 rows=1214 loops=1)
                                            Index Cond: (relation = 'located'::text)
                                            Buffers: shared hit=3 read=1
                                      ->  Bitmap Index Scan on metainfo_t  (cost=0.00..11.86 rows=991 width=0) (actual time=0.343..0.344 rows=1214 loops=1)
                                            Index Cond: (type = 'document'::text)
                                            Buffers: shared hit=4
              ->  Index Scan using metainfo_n on metainfo c  (cost=0.42..4.22 rows=1 width=22) (actual time=0.016..0.020 rows=1 loops=1010)
                    Index Cond: (name = a.name)
                    Filter: (relation = 'lastUpdated'::text)
                    Rows Removed by Filter: 48
                    Buffers: shared hit=9867
        ->  Index Scan using metainfo_n on metainfo d  (cost=0.42..4.22 rows=1 width=22) (actual time=0.012..0.017 rows=1 loops=1010)
              Index Cond: (name = a.name)
              Filter: (relation = 'rowcount'::text)
              Rows Removed by Filter: 48
              Buffers: shared hit=9867 dirtied=1
  ->  Append  (cost=719.22..5747.30 rows=2 width=64) (actual time=6.222..6.228 rows=1 loops=1010)
        Buffers: shared hit=2202810
        ->  Subquery Scan on "*SELECT* 1"  (cost=719.22..3653.48 rows=1 width=64) (actual time=5.358..5.358 rows=0 loops=1010)
              Buffers: shared hit=1381680
              ->  Bitmap Heap Scan on metainfo  (cost=719.22..3653.47 rows=1 width=96) (actual time=5.356..5.356 rows=0 loops=1010)
                    Recheck Cond: ((relation = ANY ('{start_time,end_time,dataset_id_column,dataset_id_prefix,original_mission_name,defaultSearchRadius,author,tableType,bibcode,priority,regime,author,includesTypes,Mission,subject}'::text[])) AND (type = 'table'::text))
                    Filter: ((CASE relation WHEN 'Mission'::text THEN upper(value) ELSE value END = 'foo'::text) AND (CASE relation WHEN 'defaultSearchRadius'::text THEN 'default_search_radius'::text WHEN 'Mission'::text THEN 'o_name'::text WHEN 'priority'::text THEN 'table_priority'::text WHEN 'bibcode'::text THEN 'catalog_bibcode'::text WHEN 'regime'::text THEN 'frequency_regime'::text WHEN 'author'::text THEN 'table_author'::text WHEN 'tableType'::text THEN 'table_type'::text WHEN 'subject'::text THEN 'row_type'::text ELSE relation END = 'o_name'::text))
                    Rows Removed by Filter: 8253
                    Heap Blocks: exact=1277650
                    Buffers: shared hit=1381680
                    ->  BitmapAnd  (cost=719.18..719.18 rows=1361 width=0) (actual time=2.744..2.744 rows=0 loops=1010)
                          Buffers: shared hit=104030
                          ->  Bitmap Index Scan on metainfo_r  (cost=0.00..140.70 rows=9910 width=0) (actual time=0.495..0.495 rows=9896 loops=1010)
                                Index Cond: (relation = ANY ('{start_time,end_time,dataset_id_column,dataset_id_prefix,original_mission_name,defaultSearchRadius,author,tableType,bibcode,priority,regime,author,includesTypes,Mission,subject}'::text[]))
                                Buffers: shared hit=49490
                          ->  Bitmap Index Scan on metainfo_t  (cost=0.00..578.23 rows=50374 width=0) (actual time=2.198..2.198 rows=58833 loops=1010)
                                Index Cond: (type = 'table'::text)
                                Buffers: shared hit=54540
        ->  Subquery Scan on "*SELECT* 5"  (cost=10.57..2093.81 rows=1 width=64) (actual time=0.862..0.867 rows=1 loops=1010)
              Buffers: shared hit=821130
              ->  Bitmap Heap Scan on metainfo metainfo_1  (cost=10.57..2093.80 rows=1 width=96) (actual time=0.861..0.864 rows=1 loops=1010)
                    Recheck Cond: (relation = 'containedBy'::text)
                    Filter: ((substr(value, 1, 8) = 'mission:'::text) AND (upper("substring"(value, 9)) = 'foo'::text))
                    Rows Removed by Filter: 721
                    Heap Blocks: exact=818100
                    Buffers: shared hit=821130
                    ->  Bitmap Index Scan on metainfo_r  (cost=0.00..10.56 rows=819 width=0) (actual time=0.076..0.076 rows=895 loops=1010)
                          Index Cond: (relation = 'containedBy'::text)
                          Buffers: shared hit=3030
Planning:
  Buffers: shared hit=64
Planning Time: 4.449 ms
Execution Time: 6388.155 ms
(78 rows)

And here it is with random_page_cost = 2.0:

Nested Loop  (cost=3695.45..7241.35 rows=1 width=112) (actual time=35.046..49.115 rows=1 loops=1)
  Buffers: shared hit=14243
  ->  Nested Loop  (cost=3695.03..7238.89 rows=1 width=70) (actual time=35.029..49.093 rows=1 loops=1)
        Join Filter: (CASE WHEN ("position"(a.name, 'zz'::text) = 1) THEN a.name ELSE ('foo_'::text || a.name) END = "*SELECT* 1".table_name)
        Rows Removed by Join Filter: 1009
        Buffers: shared hit=14239
        ->  Nested Loop  (cost=3105.81..3627.24 rows=13 width=38) (actual time=6.688..37.136 rows=1010 loops=1)
              Join Filter: (c.name = a.name)
              Buffers: shared hit=12058
              ->  Merge Join  (cost=3105.39..3116.74 rows=53 width=44) (actual time=6.648..10.798 rows=1010 loops=1)
                    Merge Cond: (c.name = d.name)
                    Buffers: shared hit=2200
                    ->  Sort  (cost=1546.41..1549.10 rows=1076 width=22) (actual time=3.618..4.357 rows=1010 loops=1)
                          Sort Key: c.name
                          Sort Method: quicksort  Memory: 78kB
                          Buffers: shared hit=1101
                          ->  Bitmap Heap Scan on metainfo c  (cost=12.76..1492.22 rows=1076 width=22) (actual time=0.413..2.446 rows=1010 loops=1)
                                Recheck Cond: (relation = 'lastUpdated'::text)
                                Heap Blocks: exact=1097
                                Buffers: shared hit=1101
                                ->  Bitmap Index Scan on metainfo_r  (cost=0.00..12.49 rows=1076 width=0) (actual time=0.250..0.251 rows=1200 loops=1)
                                      Index Cond: (relation = 'lastUpdated'::text)
                                      Buffers: shared hit=4
                    ->  Sort  (cost=1558.98..1561.70 rows=1088 width=22) (actual time=3.022..3.820 rows=1010 loops=1)
                          Sort Key: d.name
                          Sort Method: quicksort  Memory: 78kB
                          Buffers: shared hit=1099
                          ->  Bitmap Heap Scan on metainfo d  (cost=12.85..1504.11 rows=1088 width=22) (actual time=0.384..1.886 rows=1010 loops=1)
                                Recheck Cond: (relation = 'rowcount'::text)
                                Heap Blocks: exact=1095
                                Buffers: shared hit=1099
                                ->  Bitmap Index Scan on metainfo_r  (cost=0.00..12.58 rows=1088 width=0) (actual time=0.236..0.237 rows=1200 loops=1)
                                      Index Cond: (relation = 'rowcount'::text)
                                      Buffers: shared hit=4
              ->  Index Scan using metainfo_n on metainfo a  (cost=0.42..9.62 rows=1 width=38) (actual time=0.018..0.023 rows=1 loops=1010)
                    Index Cond: (name = d.name)
                    Filter: ((relation = 'description'::text) AND (type = 'table'::text))
                    Rows Removed by Filter: 48
                    Buffers: shared hit=9858
        ->  Materialize  (cost=589.22..3611.08 rows=2 width=64) (actual time=0.009..0.009 rows=1 loops=1010)
              Buffers: shared hit=2181
              ->  Append  (cost=589.22..3611.07 rows=2 width=64) (actual time=7.847..7.870 rows=1 loops=1)
                    Buffers: shared hit=2181
                    ->  Subquery Scan on "*SELECT* 1"  (cost=589.22..2395.08 rows=1 width=64) (actual time=6.935..6.944 rows=0 loops=1)
                          Buffers: shared hit=1368
                          ->  Bitmap Heap Scan on metainfo  (cost=589.22..2395.07 rows=1 width=96) (actual time=6.933..6.940 rows=0 loops=1)
                                Recheck Cond: ((relation = ANY ('{start_time,end_time,dataset_id_column,dataset_id_prefix,original_mission_name,defaultSearchRadius,author,tableType,bibcode,priority,regime,author,includesTypes,Mission,subject}'::text[])) AND (type = 'table'::text))
                                Filter: ((CASE relation WHEN 'Mission'::text THEN upper(value) ELSE value END = 'foo'::text) AND (CASE relation WHEN 'defaultSearchRadius'::text THEN 'default_search_radius'::text WHEN 'Mission'::text THEN 'o_name'::text WHEN 'priority'::text THEN 'table_priority'::text WHEN 'bibcode'::text THEN 'catalog_bibcode'::text WHEN 'regime'::text THEN 'frequency_regime'::text WHEN 'author'::text THEN 'table_author'::text WHEN 'tableType'::text THEN 'table_type'::text WHEN 'subject'::text THEN 'row_type'::text ELSE relation END = 'o_name'::text))
                                Rows Removed by Filter: 8253
                                Heap Blocks: exact=1265
                                Buffers: shared hit=1368
                                ->  BitmapAnd  (cost=589.18..589.18 rows=1361 width=0) (actual time=3.406..3.412 rows=0 loops=1)
                                      Buffers: shared hit=103
                                      ->  Bitmap Index Scan on metainfo_r  (cost=0.00..110.70 rows=9910 width=0) (actual time=0.638..0.638 rows=9897 loops=1)
                                            Index Cond: (relation = ANY ('{start_time,end_time,dataset_id_column,dataset_id_prefix,original_mission_name,defaultSearchRadius,author,tableType,bibcode,priority,regime,author,includesTypes,Mission,subject}'::text[]))
                                            Buffers: shared hit=49
                                      ->  Bitmap Index Scan on metainfo_t  (cost=0.00..478.23 rows=50374 width=0) (actual time=2.688..2.690 rows=58842 loops=1)
                                            Index Cond: (type = 'table'::text)
                                            Buffers: shared hit=54
                    ->  Subquery Scan on "*SELECT* 5"  (cost=8.57..1215.97 rows=1 width=64) (actual time=0.909..0.919 rows=1 loops=1)
                          Buffers: shared hit=813
                          ->  Bitmap Heap Scan on metainfo metainfo_1  (cost=8.57..1215.96 rows=1 width=96) (actual time=0.907..0.914 rows=1 loops=1)
                                Recheck Cond: (relation = 'containedBy'::text)
                                Filter: ((substr(value, 1, 8) = 'mission:'::text) AND (upper("substring"(value, 9)) = 'foo'::text))
                                Rows Removed by Filter: 721
                                Heap Blocks: exact=810
                                Buffers: shared hit=813
                                ->  Bitmap Index Scan on metainfo_r  (cost=0.00..8.56 rows=819 width=0) (actual time=0.085..0.085 rows=895 loops=1)
                                      Index Cond: (relation = 'containedBy'::text)
                                      Buffers: shared hit=3
  ->  Index Scan using metainfo_n on metainfo b_1  (cost=0.42..2.44 rows=1 width=38) (actual time=0.011..0.012 rows=1 loops=1)
        Index Cond: (name = (a.name || '.doc'::text))
        Filter: ((relation = 'located'::text) AND (type = 'document'::text))
        Buffers: shared hit=4
Planning:
  Buffers: shared hit=367
Planning Time: 5.391 ms
Execution Time: 49.412 ms
(78 rows)

Thanks,
Ed
On 11/8/24 08:21, Ed Sabol wrote:
> On Nov 7, 2024, at 5:18 PM, David Rowley <dgrowleyml@gmail.com> wrote:
>> It's impossible to say with the given information. You didn't mention
>> which version you upgraded from to start with.
>
> Sorry, 15.6 to 15.7 to 15.8, but we weren't on 15.7 for very long before 15.8.
>
>> You can set random_page_cost for just the session you're connected to
>> and try it. SET random_page_cost = <old value>; before running
>> EXPLAIN (ANALYZE, BUFFERS).
>
> Oh, I didn't think of that. Duh. Here it is with random_page_cost = 4.0:

I see estimation errors in many places here. The second plan survived by using the Materialize node instead of repeating the execution of the inner subquery. Postgres didn't want Materialize in this example because of the low estimate on its outer subquery. AFAIC, by increasing the *_page_cost's value, you added extra weight to the inner subquery and shifted the decision to use materialisation. It looks like a game of chance, and it doesn't point to a problem in the balance between page read costs and the costs of other operations.

It is hard to say what you can use in general to avoid issues in queries of this type, except for some query-based Postgres extensions like AQO, but for now, you can try the following. I see a huge underestimation in a simple scan:

Bitmap Heap Scan on metainfo b_1
  (cost=23.96..35.77 rows=3 width=38)
  (actual time=1.225..4.206 rows=1025 loops=1)

It may be caused by some functional dependency in its filter:

((relation = 'located'::text) AND (type = 'document'::text))

You can create extended statistics on the columns 'relation' and 'type'. These statistics can reduce estimation errors and enable the optimiser to find a better plan without changing the cost balance.

--
regards, Andrei Lepikhov
On Nov 7, 2024, at 9:27 PM, Andrei Lepikhov <lepihov@gmail.com> wrote:
> Postgres didn't want Materialize in this example because of the low estimate on its outer subquery. AFAIC, by increasing the *_page_cost's value, you added extra weight to the inner subquery and shifted the decision to use materialisation.

Interesting, except I decreased the random_page_cost. Just clarifying.

> I see huge underestimation in the simple scan:
>
> Bitmap Heap Scan on metainfo b_1
>   (cost=23.96..35.77 rows=3 width=38)
>   (actual time=1.225..4.206 rows=1025 loops=1)
>
> It may be caused by some functional dependency in its filter:
>
> ((relation = 'located'::text) AND (type = 'document'::text))
>
> You can create extended statistics on the columns 'relation' and 'type'. These statistics can reduce estimation errors and enable the optimiser to find a better plan without changing the cost balance.

OK, this is the first I'm learning about extended statistics... I'm looking at:

https://www.postgresql.org/docs/15/sql-createstatistics.html
https://www.postgresql.org/docs/15/planner-stats.html#PLANNER-STATS-EXTENDED

What kind of extended statistics do you suggest for this? ndistinct, dependencies, or mcv?

CREATE STATISTICS tablename_stats (<statistics type?>) ON relation, type FROM tablename;
ANALYZE tablename;

Thanks,
Ed
On 11/8/24 09:45, Ed Sabol wrote:
> On Nov 7, 2024, at 9:27 PM, Andrei Lepikhov <lepihov@gmail.com> wrote:
>> Postgres didn't want Materialize in this example because of the low estimate on its outer subquery. AFAIC, by increasing the *_page_cost's value, you added extra weight to the inner subquery
>
> What kind of extended statistics do you suggest for this? ndistinct, dependencies, or mcv?
>
> CREATE STATISTICS tablename_stats (<statistics type?>) ON relation, type FROM tablename;
> ANALYZE tablename;

I'd recommend using all of them. MCV is helpful in most cases (and relatively cheap); ndistinct is actually used in Postgres now to calculate the number of groups (GROUP BY, Sort, Memoize, etc.); dependencies finds correlations between columns, usually in scan filters.

--
regards, Andrei Lepikhov
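[Editor's note: once extended statistics have been created and ANALYZE has run, the gathered data can be inspected through the pg_stats_ext system view. A sketch, using the thread's placeholder table name:]

```sql
-- Show which extended statistics objects exist for the table,
-- which kinds were built, and the resulting dependency/ndistinct data:
SELECT statistics_name, attnames, kinds, n_distinct, dependencies
FROM pg_stats_ext
WHERE tablename = 'tablename';
```

A strong functional dependency between 'relation' and 'type' should show up as a coefficient close to 1.0 in the dependencies column, which is what lets the planner stop multiplying the two selectivities as if they were independent.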
On Nov 7, 2024, at 9:54 PM, Andrei Lepikhov <lepihov@gmail.com> wrote:
> On 11/8/24 09:45, Ed Sabol wrote:
>> On Nov 7, 2024, at 9:27 PM, Andrei Lepikhov <lepihov@gmail.com> wrote:
>>> Postgres didn't want Materialize in this example because of the low estimate on its outer subquery. AFAIC, by increasing the *_page_cost's value, you added extra weight to the inner subquery
>> What kind of extended statistics do you suggest for this? ndistinct, dependencies, or mcv?
>> CREATE STATISTICS tablename_stats (<statistics type?>) ON relation, type FROM tablename;
>> ANALYZE tablename;
> I'd recommend using all of them. MCV is helpful in most cases (and relatively cheap); ndistinct is actually used in Postgres now to calculate the number of groups (GROUP BY, Sort, Memoize, etc.); dependencies finds correlations between columns, usually in scan filters.

OK, I've executed the following:

CREATE STATISTICS tablename_stats_rt_nd (ndistinct) ON relation, type FROM tablename;
CREATE STATISTICS tablename_stats_rt_mcv (mcv) ON relation, type FROM tablename;
CREATE STATISTICS tablename_stats_rt_dep (dependencies) ON relation, type FROM tablename;
CREATE STATISTICS tablename_stats_rv_nd (ndistinct) ON relation, value FROM tablename;
CREATE STATISTICS tablename_stats_rv_mcv (mcv) ON relation, value FROM tablename;
CREATE STATISTICS tablename_stats_rv_dep (dependencies) ON relation, value FROM tablename;
CREATE STATISTICS tablename_stats_nr_nd (ndistinct) ON name, relation FROM tablename;
CREATE STATISTICS tablename_stats_nr_mcv (mcv) ON name, relation FROM tablename;
CREATE STATISTICS tablename_stats_nr_dep (dependencies) ON name, relation FROM tablename;
CREATE STATISTICS tablename_stats_nt_nd (ndistinct) ON name, type FROM tablename;
CREATE STATISTICS tablename_stats_nt_mcv (mcv) ON name, type FROM tablename;
CREATE STATISTICS tablename_stats_nt_dep (dependencies) ON name, type FROM tablename;
CREATE STATISTICS tablename_stats_nv_nd (ndistinct) ON name, value FROM tablename;
CREATE STATISTICS tablename_stats_nv_mcv (mcv) ON name, value FROM tablename;
CREATE STATISTICS tablename_stats_nv_dep (dependencies) ON name, value FROM tablename;
ANALYZE tablename;

Now with random_page_cost = 4.0, the optimizer materializes, and it's fast again:

Nested Loop  (cost=1226.12..11129.87 rows=1 width=112) (actual time=30.965..31.333 rows=1 loops=1)
  Join Filter: (a.name = d.name)
  Buffers: shared hit=7447
  ->  Nested Loop  (cost=1225.70..11112.51 rows=1 width=108) (actual time=30.921..31.208 rows=1 loops=1)
        Buffers: shared hit=7418
        ->  Hash Join  (cost=1225.27..11093.62 rows=1 width=86) (actual time=30.862..31.078 rows=1 loops=1)
              Hash Cond: ((a.name || '.doc'::text) = b_1.name)
              Buffers: shared hit=7389
              ->  Nested Loop  (cost=1167.53..11019.89 rows=11 width=70) (actual time=27.143..27.347 rows=1 loops=1)
                    Join Filter: (CASE WHEN ("position"(a.name, 'zz'::text) = 1) THEN a.name ELSE ('h_'::text || a.name) END = "*SELECT* 1".table_name)
                    Rows Removed by Join Filter: 1021
                    Buffers: shared hit=6268
                    ->  Bitmap Heap Scan on tablename a  (cost=456.55..5407.28 rows=1077 width=38) (actual time=2.986..15.865 rows=1022 loops=1)
                          Recheck Cond: (relation = 'description'::text)
                          Filter: (type = 'table'::text)
                          Rows Removed by Filter: 37044
                          Heap Blocks: exact=4024
                          Buffers: shared hit=4065
                          ->  Bitmap Index Scan on tablename_r  (cost=0.00..456.29 rows=38915 width=0) (actual time=2.336..2.336 rows=44980 loops=1)
                                Index Cond: (relation = 'description'::text)
                                Buffers: shared hit=41
                    ->  Materialize  (cost=710.98..5564.15 rows=2 width=64) (actual time=0.008..0.009 rows=1 loops=1022)
                          Buffers: shared hit=2203
                          ->  Append  (cost=710.98..5564.14 rows=2 width=64) (actual time=7.519..7.548 rows=1 loops=1)
                                Buffers: shared hit=2203
                                ->  Subquery Scan on "*SELECT* 1"  (cost=710.98..3537.89 rows=1 width=64) (actual time=6.629..6.636 rows=0 loops=1)
                                      Buffers: shared hit=1380
                                      ->  Bitmap Heap Scan on tablename  (cost=710.98..3537.88 rows=1 width=96) (actual time=6.628..6.633 rows=0 loops=1)
                                            Recheck Cond: ((relation = ANY ('{start_time,end_time,dataset_id_column,dataset_id_prefix,original_mission_name,defaultSearchRadius,author,tableType,bibcode,priority,regime,author,includesTypes,Mission,subject}'::text[])) AND (type = 'table'::text))
                                            Filter: ((CASE relation WHEN 'Mission'::text THEN upper(value) ELSE value END = 'foo'::text) AND (CASE relation WHEN 'defaultSearchRadius'::text THEN 'default_search_radius'::text WHEN 'Mission'::text THEN 'o_name'::text WHEN 'priority'::text THEN 'table_priority'::text WHEN 'bibcode'::text THEN 'catalog_bibcode'::text WHEN 'regime'::text THEN 'frequency_regime'::text WHEN 'author'::text THEN 'table_author'::text WHEN 'tableType'::text THEN 'table_type'::text WHEN 'subject'::text THEN 'row_type'::text ELSE relation END = 'o_name'::text))
                                            Rows Removed by Filter: 8253
                                            Heap Blocks: exact=1276
                                            Buffers: shared hit=1380
                                            ->  BitmapAnd  (cost=710.94..710.94 rows=1275 width=0) (actual time=3.346..3.350 rows=0 loops=1)
                                                  Buffers: shared hit=104
                                                  ->  Bitmap Index Scan on tablename_r  (cost=0.00..134.96 rows=9145 width=0) (actual time=0.573..0.574 rows=9998 loops=1)
                                                        Index Cond: (relation = ANY ('{start_time,end_time,dataset_id_column,dataset_id_prefix,original_mission_name,defaultSearchRadius,author,tableType,bibcode,priority,regime,author,includesTypes,Mission,subject}'::text[]))
                                                        Buffers: shared hit=49
                                                  ->  Bitmap Index Scan on tablename_t  (cost=0.00..575.73 rows=49507 width=0) (actual time=2.693..2.693 rows=59373 loops=1)
                                                        Index Cond: (type = 'table'::text)
                                                        Buffers: shared hit=55
                                ->  Subquery Scan on "*SELECT* 5"  (cost=10.28..2026.24 rows=1 width=64) (actual time=0.886..0.904 rows=1 loops=1)
                                      Buffers: shared hit=823
                                      ->  Bitmap Heap Scan on tablename tablename_1  (cost=10.28..2026.23 rows=1 width=96) (actual time=0.884..0.899 rows=1 loops=1)
                                            Recheck Cond: (relation = 'containedBy'::text)
                                            Filter: ((substr(value, 1, 8) = 'mission:'::text) AND (upper("substring"(value, 9)) = 'foo'::text))
                                            Rows Removed by Filter: 721
                                            Heap Blocks: exact=820
                                            Buffers: shared hit=823
                                            ->  Bitmap Index Scan on tablename_r  (cost=0.00..10.28 rows=781 width=0) (actual time=0.085..0.085 rows=905 loops=1)
                                                  Index Cond: (relation = 'containedBy'::text)
                                                  Buffers: shared hit=3
              ->  Hash  (cost=44.87..44.87 rows=1030 width=38) (actual time=5.334..5.342 rows=1025 loops=1)
                    Buckets: 2048  Batches: 1  Memory Usage: 124kB
                    Buffers: shared hit=1121
                    ->  Bitmap Heap Scan on tablename b_1  (cost=33.06..44.87 rows=1030 width=38) (actual time=1.157..4.018 rows=1025 loops=1)
                          Recheck Cond: ((relation = 'located'::text) AND (type = 'document'::text))
                          Heap Blocks: exact=1113
                          Buffers: shared hit=1121
                          ->  BitmapAnd  (cost=33.06..33.06 rows=3 width=0) (actual time=0.765..0.769 rows=0 loops=1)
                                Buffers: shared hit=8
                                ->  Bitmap Index Scan on tablename_r  (cost=0.00..16.15 rows=1030 width=0) (actual time=0.347..0.347 rows=1227 loops=1)
                                      Index Cond: (relation = 'located'::text)
                                      Buffers: shared hit=4
                                ->  Bitmap Index Scan on tablename_t  (cost=0.00..16.15 rows=1030 width=0) (actual time=0.314..0.315 rows=1227 loops=1)
                                      Index Cond: (type = 'document'::text)
                                      Buffers: shared hit=4
        ->  Index Scan using tablename_n on tablename c  (cost=0.42..18.88 rows=1 width=22) (actual time=0.048..0.115 rows=1 loops=1)
              Index Cond: (name = a.name)
              Filter: (relation = 'lastUpdated'::text)
              Rows Removed by Filter: 58
              Buffers: shared hit=29
  ->  Index Scan using tablename_n on tablename d  (cost=0.42..17.33 rows=1 width=22) (actual time=0.034..0.104 rows=1 loops=1)
        Index Cond: (name = c.name)
        Filter: (relation = 'rowcount'::text)
        Rows Removed by Filter: 58
        Buffers: shared hit=29
Planning:
  Buffers: shared hit=64
Planning Time: 5.086 ms
Execution Time: 32.226 ms
(81 rows)

This was a nice learning experience and I hope it will help with performance going forward, but I still think I'm going to keep random_page_cost = 2.0.
None of this really explains why this became a problem after ~10 years of it not being one, but I think the only likely reason is that the table just grew gradually over time and reached some threshold that changed the optimizer's plan very adversely.

Thanks,
Ed
On 9/11/2024 03:32, Ed Sabol wrote:
> CREATE STATISTICS tablename_stats_rt_nd (ndistinct) ON relation, type FROM tablename;

The only thing I want to note: it is enough to execute

CREATE STATISTICS tablename_stats (mcv, ndistinct, dependencies) ON relation, type FROM tablename;

and all the statistics will be built over any possible combination of (relation, type). So, you don't need to call CREATE STATISTICS more than once per column combination.

--
regards, Andrei Lepikhov
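[Editor's note: a sketch of the consolidation, using the thread's placeholder names. One statistics object per column pair carries all three kinds, and the earlier single-kind objects on the same pair become redundant.]

```sql
-- One object covering all three statistics kinds for (relation, type):
CREATE STATISTICS tablename_stats (mcv, ndistinct, dependencies)
  ON relation, type FROM tablename;

-- The per-kind objects created earlier on the same column pair
-- are then redundant and can be dropped:
DROP STATISTICS tablename_stats_rt_nd;
DROP STATISTICS tablename_stats_rt_mcv;
DROP STATISTICS tablename_stats_rt_dep;

-- Repopulate the stored statistics:
ANALYZE tablename;
```

If no kind list is given at all, CREATE STATISTICS builds every kind that applies to the column list, so the explicit (mcv, ndistinct, dependencies) list is mainly documentation of intent.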