Re: postgresql long running query - Mailing list pgsql-admin

From liam saffioti
Subject Re: postgresql long running query
Date
Msg-id CAGHjuaaAKsrSWjyjexo8AzgWkpdo_mKtp72hr=Q9hy7oqOHjLw@mail.gmail.com
Whole thread Raw
In response to AW: postgresql long running query  ("Dischner, Anton" <Anton.Dischner@med.uni-muenchen.de>)
Responses AW: postgresql long running query
List pgsql-admin
Hi again,
I created additional index for just one column. (CREATE INDEX ON "T_CMN_SLAHISTORY" ("DefinitionId");
The query now runs faster. But I don't think that's my problem.
My query takes a very long time sometimes, not always. I could not understand the reason for this. 
I mean, why is it taking longer at that time for example?

My another example is here:

SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;

This query takes 800ms to run. But this query took 18 minutes last night. I don't understand why the query that normally always takes a short time is taking so long at this time. Also there are no locks.

The query logs are here:

2021-12-08 04:23:48.235 [local] surface postgres  508411 SELECT LOG:  duration: 1124822.358 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

2021-12-08 04:23:48.227 [local] surface postgres 524884 SELECT LOG:  duration: 525254.222 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

2021-12-08 04:23:48.235 [local] surface postgres 516992 SELECT LOG:  duration: 824718.012 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30


Finally, let me point out; the execution plan is here

EXPLAIN ANALYZE SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=34231.14..34231.22 rows=30 width=136) (actual time=837.358..837.367 rows=30 loops=1)
   ->  Sort  (cost=34231.14..34233.24 rows=840 width=136) (actual time=837.356..837.362 rows=30 loops=1)
         Sort Key: (pg_total_relation_size((c.oid)::regclass)) DESC
         Sort Method: top-N heapsort  Memory: 38kB
         ->  Hash Join  (cost=204.31..34206.33 rows=840 width=136) (actual time=70.550..837.181 rows=161 loops=1)
               Hash Cond: (c.relnamespace = n.oid)
               ->  Seq Scan on pg_class c  (cost=0.00..33995.56 rows=1664 width=72) (actual time=0.246..801.685 rows=202 loops=1)
                     Filter: ((relkind <> 'i'::"char") AND (pg_total_relation_size((oid)::regclass) > 100000000))
                     Rows Removed by Filter: 9022
               ->  Hash  (cost=166.91..166.91 rows=2992 width=68) (actual time=15.494..15.494 rows=3001 loops=1)
                     Buckets: 4096  Batches: 1  Memory Usage: 326kB
                     ->  Seq Scan on pg_namespace n  (cost=0.00..166.91 rows=2992 width=68) (actual time=0.026..14.616 rows=3001 loops=1)
                           Filter: ((nspname <> ALL ('{pg_catalog,information_schema}'::name[])) AND (nspname !~ '^pg_toast'::text))
                           Rows Removed by Filter: 2926
 Planning Time: 2.627 ms
 Execution Time: 837.660 ms
(16 rows)


Thank you so much for your help.

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 6 Ara 2021 Pzt, 11:21 tarihinde şunu yazdı:

Hi all,

 

please report your findings trying to use a new index special for DefinitionId as suggested by Edward.

 

We recently found for our data:

 

By adding indexes the running times of queries were reduced from days to several seconds.

Don’t be scared: you can always remove indexes by dropping them!

 

best,

 

Anton

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 14:12
An: Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: Re: postgresql long running query

 

Hi Anton and Samed,

Thank you for your reply. 

I use PostgreSQL 12.9 on RHEL 8.4. 

There are no DDL queries on the database in a whole day, so there is no lock state.

The pgbadger slow queries report shows this query ran for 43min 11sec. 

And table and indexes definition is:

 

 \d+ "T_CMN_SLAHISTORY"
                                                         Table "T_CMN_SLAHISTORY"
      Column      |            Type             | Collation | Nullable |             Default              | Storage  | Stats target | Description
------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
 HistoryId     | integer                     |           | not null | generated by default as identity | plain    |              |
 DefinitionId  | integer                     |           | not null |                                  | plain    |              |
 RuleName      | character varying(200)      |           | not null |                                  | extended |              |
 CreateUserId     | integer                     |           | not null |                                  | plain    |              |
 CreateUserTime   | timestamp without time zone |           | not null |                                  | plain    |              |
 LastRunningTime  | timestamp without time zone |           |          |                                  | plain    |              |
 Cost             | integer                     |           |          |                                  | plain    |              |
 UpdateUserId     | integer                     |           |          |                                  | plain    |              |
 UpdateUserTime   | timestamp without time zone |           |          |                                  | plain    |              |
 RecId          | character varying(36)       |           |          |                                  | extended |              |
 MyTimeStamp      | bytea                       |           |          |                                  | extended |              |
 IsOldRecordBatch | boolean                     |           |          |                                  | plain    |              |
 InstanceId       | bigint                      |           |          |                                  | plain    |              |
Indexes:
    "T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
    "Index-20180207-152712" btree ("RuleName", "DefinitionId")
    "Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
    "Index-20180712-192836" btree ("InstanceId")
Access method: heap

 

 

Schema  |       Name       | Type  |  Owner   |  Size  | Description
---------+------------------+-------+----------+--------+-------------
 public | T_CMN_SLAHISTORY | table | postgres | 244 MB | 

 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 3 Ara 2021 Cum, 14:39 tarihinde şunu yazdı:

Hi Liam,

 

top tuning tip for every relational database is: indexes!

 

I have adminstrated databases where data was 300 GB and created indexes 600 GB

You may think thats useless redundant but ist definitively not.

 

Rules for indexing:

 

Does it make sense to create an index for a table with one or several douzends of entries? -> YES!

Does it make sense to create an index that is optimized for certain queries? -> YES!

Does it make sense to create an index, make the query and drop the index? -> YES

 

Even with tables with billions of entries you should get answer times of less than a second!

… if you have a good index: date for example where the database engine plays ist full potental.

 

For example see: https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes

 

Which indexes do you have?

 

best,

 

A

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <
pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: postgresql long running query

 

Hello Team,

 

I have a problem with a query that consumes a long time. 

The query' execution plan is :

 

 

EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms

 

 

 

But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

 

Thank you so much.

 

pgsql-admin by date:

Previous
From: Ron
Date:
Subject: Re: PostgreSQL(v9.6.5) not releasing old/inactive IDLE connections
Next
From: Paul Smith
Date:
Subject: Re: PostgreSQL(v9.6.5) not releasing old/inactive IDLE connections