Thread: Query running slow for only one specific id. (Postgres 9.3) version

Query running slow for only one specific id. (Postgres 9.3) version

From
"Sheena, Prabhjot"
Date:

Postgresql 9.3 Version

 

Guys

          Here  is the issue that I’m facing for couple of weeks now. I have table (size  7GB)

 

If I run this query with this specific registration id it is using the wrong execution plan and takes more than a minute to complete. Total number of rows for this registration_id is only 414 in this table

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8718704208 AND response != 4;

                                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual time=86910.730..86910.731 rows=1 loops=1)

   InitPlan 1 (returns $0)

     ->  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual time=86910.725..86910.725 rows=1 loops=1)

           ->  Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

                 Index Cond: (last_update_date IS NOT NULL)

                 Filter: ((response <> 4) AND (registration_id = 8718704208::bigint))

                 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms

 

 

Same query with any other registration id will come back in milli seconds

 

 

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8688546267 AND response != 4;

                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=529.75..529.78 rows=1 width=8) (actual time=19.723..19.723 rows=1 loops=1)

   ->  Index Scan using btdt_responses_u2 on btdt_responses  (cost=0.57..529.45 rows=119 width=8) (actual time=0.097..19.689 rows=72 loops=1)

         Index Cond: (registration_id = 8688546267::bigint)

         Filter: (response <> 4)

         Rows Removed by Filter: 22

Total runtime: 19.769 ms

 

 

Please let me know what I can do to fix this issue.

 

 

Thanks

 

 

 

 

 

 

Re: Query running slow for only one specific id. (Postgres 9.3) version

From
Igor Neyman
Date:

 

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Sheena, Prabhjot
Sent: Friday, June 05, 2015 1:55 PM
To: pgsql-general@postgresql.org; pgsql-performance@postgresql.org
Subject: [PERFORM] Query running slow for only one specific id. (Postgres 9.3) version

 

Postgresql 9.3 Version

 

Guys

          Here  is the issue that I’m facing for couple of weeks now. I have table (size  7GB)

 

If I run this query with this specific registration id it is using the wrong execution plan and takes more than a minute to complete. Total number of rows for this registration_id is only 414 in this table

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8718704208 AND response != 4;

                                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual time=86910.730..86910.731 rows=1 loops=1)

   InitPlan 1 (returns $0)

     ->  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual time=86910.725..86910.725 rows=1 loops=1)

           ->  Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

                 Index Cond: (last_update_date IS NOT NULL)

                 Filter: ((response <> 4) AND (registration_id = 8718704208::bigint))

                 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms

 

 

Same query with any other registration id will come back in milli seconds

 

 

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8688546267 AND response != 4;

                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=529.75..529.78 rows=1 width=8) (actual time=19.723..19.723 rows=1 loops=1)

   ->  Index Scan using btdt_responses_u2 on btdt_responses  (cost=0.57..529.45 rows=119 width=8) (actual time=0.097..19.689 rows=72 loops=1)

         Index Cond: (registration_id = 8688546267::bigint)

         Filter: (response <> 4)

         Rows Removed by Filter: 22

Total runtime: 19.769 ms

 

 

Please let me know what I can do to fix this issue.

 

 

Thanks

 

 

Not enough info.

Table structure? Is registration_id – PK?  If not, what is the distribution of the values for this table?

When was it analyzed last time?  M.b. you need to increase statistics target for this table:

 

Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

 

It expects 2214 records while really getting only 1.

 

Regards,

Igor Neyman

 

 

 

On 06/05/2015 10:54 AM, Sheena, Prabhjot wrote:

Postgresql 9.3 Version

 

Guys

          Here  is the issue that I’m facing for couple of weeks now. I have table (size  7GB)

 

If I run this query with this specific registration id it is using the wrong execution plan and takes more than a minute to complete. Total number of rows for this registration_id is only 414 in this table

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8718704208 AND response != 4;

                                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual time=86910.730..86910.731 rows=1 loops=1)

   InitPlan 1 (returns $0)

     ->  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual time=86910.725..86910.725 rows=1 loops=1)

           ->  Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

                 Index Cond: (last_update_date IS NOT NULL)

                 Filter: ((response <> 4) AND (registration_id = 8718704208::bigint))

                 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms

 

 

Same query with any other registration id will come back in milli seconds

 

 

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8688546267 AND response != 4;

                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=529.75..529.78 rows=1 width=8) (actual time=19.723..19.723 rows=1 loops=1)

   ->  Index Scan using btdt_responses_u2 on btdt_responses  (cost=0.57..529.45 rows=119 width=8) (actual time=0.097..19.689 rows=72 loops=1)

         Index Cond: (registration_id = 8688546267::bigint)

         Filter: (response <> 4)

         Rows Removed by Filter: 22

Total runtime: 19.769 ms

 

A couple initial questions:

1. Does the result change if you analyze the table and rerun the query?

2. Are there any non-default settings for statistics collection on your database?

-Steve

Re: Query running slow for only one specific id. (Postgres 9.3) version

From
"Sheena, Prabhjot"
Date:

When I run vacuum analyze it fixes the problem but after 1 or 2 days the problem comes back

 

Here is the table structure

 

      Column      |            Type             |                              Modifiers                               | Storage | Stats target | Description

------------------+-----------------------------+----------------------------------------------------------------------+---------+--------------+-------------

response_id      | integer                     | not null default nextval('btdt_responses_response_id_seq'::regclass) | plain   |              |

registration_id  | bigint                      | not null                                                             | plain   |              |

btdt_id          | integer                     | not null                                                             | plain   |              |

response         | integer                     | not null                                                             | plain   |              |

creation_date    | timestamp without time zone | not null default now()                                               | plain   |              |

last_update_date | timestamp without time zone | not null default now()                                               | plain   |              |

Indexes:

    "btdt_responses_pkey" PRIMARY KEY, btree (response_id)

    "btdt_responses_u2" UNIQUE, btree (registration_id, btdt_id)

    "btdt_responses_n1" btree (btdt_id)

    "btdt_responses_n2" btree (btdt_id, response)

    "btdt_responses_n4" btree (creation_date)

    "btdt_responses_n5" btree (last_update_date)

    "btdt_responses_n6" btree (btdt_id, last_update_date)

Foreign-key constraints:

    "btdt_responses_btdt_id_fkey" FOREIGN KEY (btdt_id) REFERENCES btdt_items(btdt_id)

    "btdt_responses_fk1" FOREIGN KEY (btdt_id) REFERENCES btdt_items(btdt_id)

Has OIDs: no

Options: autovacuum_enabled=true, autovacuum_vacuum_scale_factor=0.02, autovacuum_analyze_scale_factor=0.02

 

Thanks

 

From: Igor Neyman [mailto:ineyman@perceptron.com]
Sent: Friday, June 5, 2015 11:06 AM
To: Sheena, Prabhjot; pgsql-general@postgresql.org; pgsql-performance@postgresql.org
Subject: RE: Query running slow for only one specific id. (Postgres 9.3) version

 

 

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Sheena, Prabhjot
Sent: Friday, June 05, 2015 1:55 PM
To: pgsql-general@postgresql.org; pgsql-performance@postgresql.org
Subject: [PERFORM] Query running slow for only one specific id. (Postgres 9.3) version

 

Postgresql 9.3 Version

 

Guys

          Here  is the issue that I’m facing for couple of weeks now. I have table (size  7GB)

 

If I run this query with this specific registration id it is using the wrong execution plan and takes more than a minute to complete. Total number of rows for this registration_id is only 414 in this table

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8718704208 AND response != 4;

                                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual time=86910.730..86910.731 rows=1 loops=1)

   InitPlan 1 (returns $0)

     ->  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual time=86910.725..86910.725 rows=1 loops=1)

           ->  Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

                 Index Cond: (last_update_date IS NOT NULL)

                 Filter: ((response <> 4) AND (registration_id = 8718704208::bigint))

                 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms

 

 

Same query with any other registration id will come back in milli seconds

 

 

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8688546267 AND response != 4;

                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=529.75..529.78 rows=1 width=8) (actual time=19.723..19.723 rows=1 loops=1)

   ->  Index Scan using btdt_responses_u2 on btdt_responses  (cost=0.57..529.45 rows=119 width=8) (actual time=0.097..19.689 rows=72 loops=1)

         Index Cond: (registration_id = 8688546267::bigint)

         Filter: (response <> 4)

         Rows Removed by Filter: 22

Total runtime: 19.769 ms

 

 

Please let me know what I can do to fix this issue.

 

 

Thanks

 

 

Not enough info.

Table structure? Is registration_id – PK?  If not, what is the distribution of the values for this table?

When was it analyzed last time?  M.b. you need to increase statistics target for this table:

 

Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

 

It expects 2214 records while really getting only 1.

 

Regards,

Igor Neyman

 

 

 

Re: Query running slow for only one specific id. (Postgres 9.3) version

From
Igor Neyman
Date:

 

 

From: Sheena, Prabhjot [mailto:Prabhjot.Singh@classmates.com]
Sent: Friday, June 05, 2015 2:38 PM
To: Igor Neyman; pgsql-general@postgresql.org; pgsql-performance@postgresql.org
Subject: RE: Query running slow for only one specific id. (Postgres 9.3) version

 

When I run vacuum analyze it fixes the problem but after 1 or 2 days the problem comes back

 

Here is the table structure

 

      Column      |            Type             |                              Modifiers                               | Storage | Stats target | Description

------------------+-----------------------------+----------------------------------------------------------------------+---------+--------------+-------------

response_id      | integer                     | not null default nextval('btdt_responses_response_id_seq'::regclass) | plain   |              |

registration_id  | bigint                      | not null                                                             | plain   |              |

btdt_id          | integer                     | not null                                                             | plain   |              |

response         | integer                     | not null                                                             | plain   |              |

creation_date    | timestamp without time zone | not null default now()                                               | plain   |              |

last_update_date | timestamp without time zone | not null default now()                                               | plain   |              |

Indexes:

    "btdt_responses_pkey" PRIMARY KEY, btree (response_id)

    "btdt_responses_u2" UNIQUE, btree (registration_id, btdt_id)

    "btdt_responses_n1" btree (btdt_id)

    "btdt_responses_n2" btree (btdt_id, response)

    "btdt_responses_n4" btree (creation_date)

    "btdt_responses_n5" btree (last_update_date)

    "btdt_responses_n6" btree (btdt_id, last_update_date)

Foreign-key constraints:

    "btdt_responses_btdt_id_fkey" FOREIGN KEY (btdt_id) REFERENCES btdt_items(btdt_id)

    "btdt_responses_fk1" FOREIGN KEY (btdt_id) REFERENCES btdt_items(btdt_id)

Has OIDs: no

Options: autovacuum_enabled=true, autovacuum_vacuum_scale_factor=0.02, autovacuum_analyze_scale_factor=0.02

 

Thanks

 

From: Igor Neyman [mailto:ineyman@perceptron.com]
Sent: Friday, June 5, 2015 11:06 AM
To: Sheena, Prabhjot; pgsql-general@postgresql.org; pgsql-performance@postgresql.org
Subject: RE: Query running slow for only one specific id. (Postgres 9.3) version

 

 

 

From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Sheena, Prabhjot
Sent: Friday, June 05, 2015 1:55 PM
To: pgsql-general@postgresql.org; pgsql-performance@postgresql.org
Subject: [PERFORM] Query running slow for only one specific id. (Postgres 9.3) version

 

Postgresql 9.3 Version

 

Guys

          Here  is the issue that I’m facing for couple of weeks now. I have table (size  7GB)

 

If I run this query with this specific registration id it is using the wrong execution plan and takes more than a minute to complete. Total number of rows for this registration_id is only 414 in this table

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8718704208 AND response != 4;

                                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual time=86910.730..86910.731 rows=1 loops=1)

   InitPlan 1 (returns $0)

     ->  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual time=86910.725..86910.725 rows=1 loops=1)

           ->  Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

                 Index Cond: (last_update_date IS NOT NULL)

                 Filter: ((response <> 4) AND (registration_id = 8718704208::bigint))

                 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms

 

 

Same query with any other registration id will come back in milli seconds

 

 

 

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8688546267 AND response != 4;

                                                                  QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------

Aggregate  (cost=529.75..529.78 rows=1 width=8) (actual time=19.723..19.723 rows=1 loops=1)

   ->  Index Scan using btdt_responses_u2 on btdt_responses  (cost=0.57..529.45 rows=119 width=8) (actual time=0.097..19.689 rows=72 loops=1)

         Index Cond: (registration_id = 8688546267::bigint)

         Filter: (response <> 4)

         Rows Removed by Filter: 22

Total runtime: 19.769 ms

 

 

Please let me know what I can do to fix this issue.

 

 

Thanks

 

 

Not enough info.

Table structure? Is registration_id – PK?  If not, what is the distribution of the values for this table?

When was it analyzed last time?  M.b. you need to increase statistics target for this table:

 

Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

 

It expects 2214 records while really getting only 1.

 

Regards,

Igor Neyman

 

 

Do you have autovacuum running?

If yes, maybe it’s not aggressive enough and you need to adjust its parameters.

 

Regards,

Igor Neyman

 

Re: Re: Query running slow for only one specific id. (Postgres 9.3) version

From
Steve Crawford
Date:
On 06/05/2015 11:38 AM, Sheena, Prabhjot wrote:

When I run vacuum analyze it fixes the problem but after 1 or 2 days the problem comes back

 

Is autovacuum running and using what settings?

(select name, setting from pg_settings where name ~ 'autovacuum' or name ~ 'statistics';)

Cheers,
Steve

P.S. The convention on the PostgreSQL mailing lists it to bottom-post, not top-post replies.
On 06/05/2015 12:28 PM, Steve Crawford wrote:
On 06/05/2015 11:38 AM, Sheena, Prabhjot wrote:

When I run vacuum analyze it fixes the problem but after 1 or 2 days the problem comes back

 

Is autovacuum running and using what settings?

(select name, setting from pg_settings where name ~ 'autovacuum'
or name ~ 'statistics';)

Cheers,
Steve

P.S. The convention on the PostgreSQL mailing lists it to bottom-post, not top-post replies.

And just to confirm, are there any table-specific overrides to the system-wide settings?

Cheers,
Steve

Re: [PERFORM] Query running slow for only one specific id. (Postgres 9.3) version

From
Matheus de Oliveira
Date:

On Fri, Jun 5, 2015 at 2:54 PM, Sheena, Prabhjot <Prabhjot.Singh@classmates.com> wrote:

explain analyze SELECT max(last_update_date) AS last_update_date FROM btdt_responses WHERE registration_id = 8718704208 AND response != 4;

                                                                                QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Result  (cost=2902.98..2903.01 rows=1 width=0) (actual time=86910.730..86910.731 rows=1 loops=1)

   InitPlan 1 (returns $0)

     ->  Limit  (cost=0.57..2902.98 rows=1 width=8) (actual time=86910.725..86910.725 rows=1 loops=1)

           ->  Index Scan Backward using btdt_responses_n5 on btdt_responses  (cost=0.57..6425932.41 rows=2214 width=8) (actual time=86910.723..86910.723 rows=1 loops=1)

                 Index Cond: (last_update_date IS NOT NULL)

                 Filter: ((response <> 4) AND (registration_id = 8718704208::bigint))

                 Rows Removed by Filter: 52145434

Total runtime: 86910.766 ms


The issue here is the "Row Removed by Filter", you are filtering out more than 52M rows, so the index is not being much effective.

What you want for this query is a composite index on (registration_id, last_update_date). And if the filter always include `response <> 4`, then you can also create a partial index with that (unless it is not very selective, then it might not be worthy it).

Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres