Home > mailing lists

Re: single table - fighting a seq scan - Mailing list pgsql-general

From	Michael Lewis
Subject	Re: single table - fighting a seq scan
Date	July 14, 2020 18:45:38
Msg-id	CAHOFxGqfRa7uPkG2dEv5U2rt_ZLORVvBYGXLK7t7ZG8HGQgeeQ@mail.gmail.com Whole thread Raw
In response to	single table - fighting a seq scan (Radoslav Nedyalkov <rnedyalkov@gmail.com>)
Responses	Re: single table - fighting a seq scan
List	pgsql-general

Tree view

rows=3832
rows=3870

Your estimate changed very little when you included 100 values vs 200 values. That is interesting to me.

What does the below query give you? How many of those 200 values are found in the MCVs list? If n_distinct is low, and most of the values are NOT in the most common value list, and the fraction of the table covered by the MCVs is also low, then the planner will expect that each of the 200 values represents some deceivingly high portion of the table.

You said there are 80 million rows, yes? That seems likely that ndistinct and the MCVs list are not giving info very correlated with reality. You may want to increase the minimum table size for sequential to kick in. I cannot recall the name of that setting at the moment. You may also want to increase stats target on that column at least, analyze, and explain the query again.

SELECT
( SELECT SUM (x) FROM UNNEST (most_common_freqs) x ) frac_MCV,
tablename,
attname,
inherited,
null_frac,
n_distinct,
array_length(most_common_vals,1) n_mcv,
array_length(histogram_bounds,1) n_hist,
correlation,
*
FROM pg_stats
WHERE
schemaname = 'public'
AND tablename='te'
AND attname='current_pid';

pgsql-general by date:

From: Marc Millas
Date: 14 July 2020, 18:43:51
Subject: Re: how to "explain" some ddl

From: Radoslav Nedyalkov
Date: 14 July 2020, 19:31:39
Subject: Re: single table - fighting a seq scan

Re: single table - fighting a seq scan - Mailing list pgsql-general

Previous

Next