Thread: How to deal with analyze gathering irrelevant stats

From: Rémi Chatenay
Hi,

Thanks in advance for your help. I'm putting as much context and details as possible, but let me know if you have any questions.

What?

We are experiencing slow queries because the query planner picks a suboptimal index. It picks that index because statistics are computed during the night, when the data distribution is not the same as during the day.

Context

We have a table conversations like this:

|id|status|user_id|

and 2 indexes:

CREATE INDEX index_conversations_on_user_id_and_status ON public.conversations USING btree (user_id, status);

CREATE INDEX index_conversations_on_status ON public.conversations USING btree (status);

The slow query is the following: 

SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123

We expect the query planner to use the index_conversations_on_user_id_and_status but it sometimes uses the other one.

What's happening?

There are hundreds of conversations with a status 'in_progress' at a given time during the day but virtually none during the night.

So when ANALYZE runs during the night, PG thinks that a scan of index_conversations_on_status will return almost no rows, and so it uses that index instead of the combined one.

When ANALYZE runs during the day, PG correctly uses the right index (index_conversations_on_user_id_and_status).

[With an analyze run during the day]

Limit (cost=0.43..8.45 rows=1 width=8) (actual time=1.666..1.666 rows=0 loops=1)

-> Index Scan using index_conversations_on_user_id_and_status on conversations (cost=0.43..8.45 rows=1 width=8) (actual time=1.665..1.665 rows=0 loops=1)

Index Cond: ((user_id = 123) AND ((status)::text = 'in_progress'::text))

Filter: (id <> 1)

Planning Time: 8.642 ms

Execution Time: 1.693 ms

[With an analyze run during the night]

Limit (cost=0.43..8.46 rows=1 width=8) (actual time=272.812..272.812 rows=0 loops=1)

-> Index Scan using index_conversations_on_status on conversations (cost=0.43..8.46 rows=1 width=8) (actual time=272.812..272.812 rows=0 loops=1)

Index Cond: ((status)::text = 'in_progress'::text)

Filter: ((id <> 1) AND (user_id = 123))

Rows Removed by Filter: 559

Planning Time: 0.133 ms

Execution Time: 272.886 ms

The question

We currently run a manual weekly VACUUM ANALYZE during the night. I'm wondering what our possible solutions are. One is to manually run ANALYZE during the day, but is there a way to tell PG to run the auto-analyze at a given time of day, for example? I guess we are not the first to have data patterns that differ between when ANALYZE runs and when the query runs.

Config

Postgres version: 11

Table Metadata

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='conversations';

  relname   | relpages | reltuples  | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size

-------------+----------+------------+---------------+---------+----------+----------------+------------+---------------

conversations |   930265 | 7.3366e+06 |        902732 | r       |       16 | f              |            |    7622991872

Maintenance Setup

We have manual vacuum analyze every week during the night.

GUC Settings

Unsure what's necessary...
"autovacuum_analyze_threshold" = "50"
"autovacuum_max_workers" = "3"
"autovacuum_naptime" = "60"
"autovacuum_vacuum_threshold" = "50"

Statistics: n_distinct, MCV, histogram

SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV, tablename, attname, inherited, null_frac, n_distinct, array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1) n_hist, correlation FROM pg_stats WHERE attname='status' AND tablename='conversations' ORDER BY 1 DESC;

frac_mcv |  tablename  | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation

----------+-------------+---------+-----------+-----------+------------+-------+--------+-------------

0.999967 | conversations | status  | f         |         0 |          6 |     5 |        |    0.967121

Re: How to deal with analyze gathering irrelevant stats

From: Justin Pryzby
On Mon, Jan 11, 2021 at 04:50:12PM +0100, Rémi Chatenay wrote:
> We are experiencing some slow queries due to the query planner using an
> incorrect index. It is using an unoptimized index because the stats are
> computed during the night when the data is not the same as during the day.
> 
> CREATE INDEX index_conversations_on_user_id_and_status ON
> public.conversations USING btree (user_id, status);
> 
> CREATE INDEX index_conversations_on_status ON public.conversations USING
> btree (status)
> 
> The slow query is the following:
> 
> SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123
> 
> There are hundreds of conversations with a status 'in_progress' at a given
> time during the day but virtually none during the night.
> 
> So when the analyze is run during the night, PG then thinks that using the
> index_conversations_on_status will return almost no rows and so it uses
> this index instead of the combined one.
> 
> When the analyze is run during the day, PG correctly uses the right index
> (index_conversations_on_user_id_and_status)

> We currently run a manual weekly vacuum analyze during the night. I'm
> wondering what are our possible solutions. One is to manually run the
> analyze during the day but is there a way to tell PG to run the auto
> analyze at a given time of the day for example ? I guess we are not the
> first ones to have data patterns that differ between when the analyze is
> run and the query is run.

I think you could run manual ANALYZE during the day just for this one column:
  ANALYZE conversations (status);

If it takes too long or causes a performance issue, you could do:
  SET default_statistics_target=10;
  ANALYZE conversations (status);

You could also change to make autovacuum do this on its own, by setting:
 ALTER TABLE conversations SET (autovacuum_analyze_scale_factor=0.005);

If that works but is too slow, then maybe ALTER TABLE .. SET STATISTICS 10.
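Putting those pieces together, a sketch of what that could look like (the concrete numbers are starting points to experiment with, not tested recommendations for this table):

```sql
-- Cheap, targeted re-ANALYZE of just the skewed column, run during the day:
SET default_statistics_target = 10;
ANALYZE conversations (status);
RESET default_statistics_target;

-- Or make autovacuum re-analyze the table far more often on its own
-- (0.005 means re-analyze after roughly 0.5% of rows have changed):
ALTER TABLE conversations SET (autovacuum_analyze_scale_factor = 0.005);

-- Persistent per-column version of the lower statistics target:
ALTER TABLE conversations ALTER COLUMN status SET STATISTICS 10;
```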

-- 
Justin



Re: How to deal with analyze gathering irrelevant stats

From: Michael Lewis
What is the usage pattern of the conversations table? Is it getting many inserts during the day, or mostly updates of status?

Why have an index on the status column at all? My guess would be that there are 2-10 statuses, but many, many rows in the table for most of those statuses. Having a low-cardinality index that changes frequently seems prone to misuse by the system.

Re: How to deal with analyze gathering irrelevant stats

From: Rémi Chatenay
I'd say it's a 1 insert for 5 - 10 updates.

As for the index on the status, it's because we have a job that runs every night that deals with conversations in specific statuses.

> Having a low cardinality index that changes frequently seems prone to mis-use by the system.

What would be an alternative?

On Mon, Jan 11, 2021 at 5:48 PM Michael Lewis <mlewis@entrata.com> wrote:
> What is the usage pattern of the conversations table? Is getting many inserts during the day, or updates of status mostly?
>
> Why have an index on the status column at all? My guess would be that there are 2-10 statuses, but many many rows in the table for most of those statuses. Having a low cardinality index that changes frequently seems prone to mis-use by the system.

Re: How to deal with analyze gathering irrelevant stats

From: Michael Lewis
On Mon, Jan 11, 2021 at 9:52 AM Rémi Chatenay <remi.chatenay@doctolib.com> wrote:
> I'd say it's a 1 insert for 5 - 10 updates.
>
> As for the index on the status, it's because we have a job that runs every night that deals with conversations in specific statuses. Having a low cardinality index that changes frequently seems prone to mis-use by the system. -> What would be an alternative ?

One option would be a partial index on another field used in that query, with WHERE status IN ( list_of_uncommon_statuses_queried_nightly ).
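Concretely, such a partial index might look like this (the index name and status list are illustrative; the WHERE clause would need to match exactly what the nightly job queries):

```sql
-- Hypothetical partial index covering only the uncommon statuses the
-- nightly job cares about. It stays small, is cheap to maintain, and
-- the planner can't pick it for queries on the common statuses.
CREATE INDEX index_conversations_on_user_id_rare_statuses
    ON conversations (user_id)
    WHERE status IN ('in_progress', 'pending_review');
```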

Sequential scan may be perfectly fine for a nightly script though.