Thread: postgresql long running query

postgresql long running query

From
liam saffioti
Date:
Hello Team,

I have a problem with a query that consumes a long time. 
The query' execution plan is :


EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms



But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

Thank you so much.

AW: postgresql long running query

From
"Dischner, Anton"
Date:

Hi Liam,

 

top tuning tip for every relational database is: indexes!

 

I have adminstrated databases where data was 300 GB and created indexes 600 GB

You may think thats useless redundant but ist definitively not.

 

Rules for indexing:

 

Does it make sense to create an index for a table with one or several douzends of entries? -> YES!

Does it make sense to create an index that is optimized for certain queries? -> YES!

Does it make sense to create an index, make the query and drop the index? -> YES

 

Even with tables with billions of entries you should get answer times of less than a second!

… if you have a good index: date for example where the database engine plays ist full potental.

 

For example see: https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes

 

Which indexes do you have?

 

best,

 

A

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: postgresql long running query

 

Hello Team,

 

I have a problem with a query that consumes a long time. 

The query' execution plan is :

 

 

EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms

 

 

 

But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

 

Thank you so much.

 

Re: postgresql long running query

From
liam saffioti
Date:
Hi Anton and Samed,
Thank you for your reply. 
I use PostgreSQL 12.9 on RHEL 8.4. 
There are no DDL queries on the database in a whole day, so there is no lock state.
The pgbadger slow queries report shows this query ran for 43min 11sec. 
And table and indexes definition is:

 \d+ "T_CMN_SLAHISTORY"
                                                         Table "T_CMN_SLAHISTORY"
      Column      |            Type             | Collation | Nullable |             Default              | Storage  | Stats target | Description
------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
 HistoryId     | integer                     |           | not null | generated by default as identity | plain    |              |
 DefinitionId  | integer                     |           | not null |                                  | plain    |              |
 RuleName      | character varying(200)      |           | not null |                                  | extended |              |
 CreateUserId     | integer                     |           | not null |                                  | plain    |              |
 CreateUserTime   | timestamp without time zone |           | not null |                                  | plain    |              |
 LastRunningTime  | timestamp without time zone |           |          |                                  | plain    |              |
 Cost             | integer                     |           |          |                                  | plain    |              |
 UpdateUserId     | integer                     |           |          |                                  | plain    |              |
 UpdateUserTime   | timestamp without time zone |           |          |                                  | plain    |              |
 RecId          | character varying(36)       |           |          |                                  | extended |              |
 MyTimeStamp      | bytea                       |           |          |                                  | extended |              |
 IsOldRecordBatch | boolean                     |           |          |                                  | plain    |              |
 InstanceId       | bigint                      |           |          |                                  | plain    |              |
Indexes:
    "T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
    "Index-20180207-152712" btree ("RuleName", "DefinitionId")
    "Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
    "Index-20180712-192836" btree ("InstanceId")
Access method: heap


Schema  |       Name       | Type  |  Owner   |  Size  | Description
---------+------------------+-------+----------+--------+-------------
 public | T_CMN_SLAHISTORY | table | postgres | 244 MB | 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 3 Ara 2021 Cum, 14:39 tarihinde şunu yazdı:

Hi Liam,

 

top tuning tip for every relational database is: indexes!

 

I have adminstrated databases where data was 300 GB and created indexes 600 GB

You may think thats useless redundant but ist definitively not.

 

Rules for indexing:

 

Does it make sense to create an index for a table with one or several douzends of entries? -> YES!

Does it make sense to create an index that is optimized for certain queries? -> YES!

Does it make sense to create an index, make the query and drop the index? -> YES

 

Even with tables with billions of entries you should get answer times of less than a second!

… if you have a good index: date for example where the database engine plays ist full potental.

 

For example see: https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes

 

Which indexes do you have?

 

best,

 

A

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: postgresql long running query

 

Hello Team,

 

I have a problem with a query that consumes a long time. 

The query' execution plan is :

 

 

EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms

 

 

 

But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

 

Thank you so much.

 

AW: postgresql long running query

From
"Dischner, Anton"
Date:

Hi all,

 

please report your findings trying to use a new index special for DefinitionId as suggested by Edward.

 

We recently found for our data:

 

By adding indexes the running times of queries were reduced from days to several seconds.

Don’t be scared: you can always remove indexes by dropping them!

 

best,

 

Anton

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 14:12
An: Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: Re: postgresql long running query

 

Hi Anton and Samed,

Thank you for your reply. 

I use PostgreSQL 12.9 on RHEL 8.4. 

There are no DDL queries on the database in a whole day, so there is no lock state.

The pgbadger slow queries report shows this query ran for 43min 11sec. 

And table and indexes definition is:

 

 \d+ "T_CMN_SLAHISTORY"
                                                         Table "T_CMN_SLAHISTORY"
      Column      |            Type             | Collation | Nullable |             Default              | Storage  | Stats target | Description
------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
 HistoryId     | integer                     |           | not null | generated by default as identity | plain    |              |
 DefinitionId  | integer                     |           | not null |                                  | plain    |              |
 RuleName      | character varying(200)      |           | not null |                                  | extended |              |
 CreateUserId     | integer                     |           | not null |                                  | plain    |              |
 CreateUserTime   | timestamp without time zone |           | not null |                                  | plain    |              |
 LastRunningTime  | timestamp without time zone |           |          |                                  | plain    |              |
 Cost             | integer                     |           |          |                                  | plain    |              |
 UpdateUserId     | integer                     |           |          |                                  | plain    |              |
 UpdateUserTime   | timestamp without time zone |           |          |                                  | plain    |              |
 RecId          | character varying(36)       |           |          |                                  | extended |              |
 MyTimeStamp      | bytea                       |           |          |                                  | extended |              |
 IsOldRecordBatch | boolean                     |           |          |                                  | plain    |              |
 InstanceId       | bigint                      |           |          |                                  | plain    |              |
Indexes:
    "T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
    "Index-20180207-152712" btree ("RuleName", "DefinitionId")
    "Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
    "Index-20180712-192836" btree ("InstanceId")
Access method: heap

 

 

Schema  |       Name       | Type  |  Owner   |  Size  | Description
---------+------------------+-------+----------+--------+-------------
 public | T_CMN_SLAHISTORY | table | postgres | 244 MB | 

 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 3 Ara 2021 Cum, 14:39 tarihinde şunu yazdı:

Hi Liam,

 

top tuning tip for every relational database is: indexes!

 

I have adminstrated databases where data was 300 GB and created indexes 600 GB

You may think thats useless redundant but ist definitively not.

 

Rules for indexing:

 

Does it make sense to create an index for a table with one or several douzends of entries? -> YES!

Does it make sense to create an index that is optimized for certain queries? -> YES!

Does it make sense to create an index, make the query and drop the index? -> YES

 

Even with tables with billions of entries you should get answer times of less than a second!

… if you have a good index: date for example where the database engine plays ist full potental.

 

For example see: https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes

 

Which indexes do you have?

 

best,

 

A

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <
pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: postgresql long running query

 

Hello Team,

 

I have a problem with a query that consumes a long time. 

The query' execution plan is :

 

 

EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms

 

 

 

But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

 

Thank you so much.

 

Re: postgresql long running query

From
liam saffioti
Date:
Hi again,
I created additional index for just one column. (CREATE INDEX ON "T_CMN_SLAHISTORY" ("DefinitionId");
The query now runs faster. But I don't think that's my problem.
My query takes a very long time sometimes, not always. I could not understand the reason for this. 
I mean, why is it taking longer at that time for example?

My another example is here:

SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;

This query takes 800ms to run. But this query took 18 minutes last night. I don't understand why the query that normally always takes a short time is taking so long at this time. Also there are no locks.

The query logs are here:

2021-12-08 04:23:48.235 [local] surface postgres  508411 SELECT LOG:  duration: 1124822.358 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

2021-12-08 04:23:48.227 [local] surface postgres 524884 SELECT LOG:  duration: 525254.222 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

2021-12-08 04:23:48.235 [local] surface postgres 516992 SELECT LOG:  duration: 824718.012 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30


Finally, let me point out; the execution plan is here

EXPLAIN ANALYZE SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=34231.14..34231.22 rows=30 width=136) (actual time=837.358..837.367 rows=30 loops=1)
   ->  Sort  (cost=34231.14..34233.24 rows=840 width=136) (actual time=837.356..837.362 rows=30 loops=1)
         Sort Key: (pg_total_relation_size((c.oid)::regclass)) DESC
         Sort Method: top-N heapsort  Memory: 38kB
         ->  Hash Join  (cost=204.31..34206.33 rows=840 width=136) (actual time=70.550..837.181 rows=161 loops=1)
               Hash Cond: (c.relnamespace = n.oid)
               ->  Seq Scan on pg_class c  (cost=0.00..33995.56 rows=1664 width=72) (actual time=0.246..801.685 rows=202 loops=1)
                     Filter: ((relkind <> 'i'::"char") AND (pg_total_relation_size((oid)::regclass) > 100000000))
                     Rows Removed by Filter: 9022
               ->  Hash  (cost=166.91..166.91 rows=2992 width=68) (actual time=15.494..15.494 rows=3001 loops=1)
                     Buckets: 4096  Batches: 1  Memory Usage: 326kB
                     ->  Seq Scan on pg_namespace n  (cost=0.00..166.91 rows=2992 width=68) (actual time=0.026..14.616 rows=3001 loops=1)
                           Filter: ((nspname <> ALL ('{pg_catalog,information_schema}'::name[])) AND (nspname !~ '^pg_toast'::text))
                           Rows Removed by Filter: 2926
 Planning Time: 2.627 ms
 Execution Time: 837.660 ms
(16 rows)


Thank you so much for your help.

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 6 Ara 2021 Pzt, 11:21 tarihinde şunu yazdı:

Hi all,

 

please report your findings trying to use a new index special for DefinitionId as suggested by Edward.

 

We recently found for our data:

 

By adding indexes the running times of queries were reduced from days to several seconds.

Don’t be scared: you can always remove indexes by dropping them!

 

best,

 

Anton

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 14:12
An: Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: Re: postgresql long running query

 

Hi Anton and Samed,

Thank you for your reply. 

I use PostgreSQL 12.9 on RHEL 8.4. 

There are no DDL queries on the database in a whole day, so there is no lock state.

The pgbadger slow queries report shows this query ran for 43min 11sec. 

And table and indexes definition is:

 

 \d+ "T_CMN_SLAHISTORY"
                                                         Table "T_CMN_SLAHISTORY"
      Column      |            Type             | Collation | Nullable |             Default              | Storage  | Stats target | Description
------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
 HistoryId     | integer                     |           | not null | generated by default as identity | plain    |              |
 DefinitionId  | integer                     |           | not null |                                  | plain    |              |
 RuleName      | character varying(200)      |           | not null |                                  | extended |              |
 CreateUserId     | integer                     |           | not null |                                  | plain    |              |
 CreateUserTime   | timestamp without time zone |           | not null |                                  | plain    |              |
 LastRunningTime  | timestamp without time zone |           |          |                                  | plain    |              |
 Cost             | integer                     |           |          |                                  | plain    |              |
 UpdateUserId     | integer                     |           |          |                                  | plain    |              |
 UpdateUserTime   | timestamp without time zone |           |          |                                  | plain    |              |
 RecId          | character varying(36)       |           |          |                                  | extended |              |
 MyTimeStamp      | bytea                       |           |          |                                  | extended |              |
 IsOldRecordBatch | boolean                     |           |          |                                  | plain    |              |
 InstanceId       | bigint                      |           |          |                                  | plain    |              |
Indexes:
    "T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
    "Index-20180207-152712" btree ("RuleName", "DefinitionId")
    "Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
    "Index-20180712-192836" btree ("InstanceId")
Access method: heap

 

 

Schema  |       Name       | Type  |  Owner   |  Size  | Description
---------+------------------+-------+----------+--------+-------------
 public | T_CMN_SLAHISTORY | table | postgres | 244 MB | 

 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 3 Ara 2021 Cum, 14:39 tarihinde şunu yazdı:

Hi Liam,

 

top tuning tip for every relational database is: indexes!

 

I have adminstrated databases where data was 300 GB and created indexes 600 GB

You may think thats useless redundant but ist definitively not.

 

Rules for indexing:

 

Does it make sense to create an index for a table with one or several douzends of entries? -> YES!

Does it make sense to create an index that is optimized for certain queries? -> YES!

Does it make sense to create an index, make the query and drop the index? -> YES

 

Even with tables with billions of entries you should get answer times of less than a second!

… if you have a good index: date for example where the database engine plays ist full potental.

 

For example see: https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes

 

Which indexes do you have?

 

best,

 

A

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <
pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: postgresql long running query

 

Hello Team,

 

I have a problem with a query that consumes a long time. 

The query' execution plan is :

 

 

EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms

 

 

 

But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

 

Thank you so much.

 

AW: postgresql long running query

From
"Dischner, Anton"
Date:

Hi all,

 

if we were on Oracle i'd suggest /*+ rule */ to disable bad cost-based optimizer optimisation, but my knowlegde is very old regarding Oracle.

 

Do optimizer hints in PG exists?

 

Obviously not: https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion

 

My next guess would be: make several steps; not a single select but: create table … select … drop table;

 

In my experience "in" or "not in" are worst time killers. Try to avoid that.

Ist even worth a try to

create table ttemp … insert 'pg_catalog', 'information_schema'

Create index ... on ttemp …

WHERE nspname NOT IN (select .. from ttemp);

 

This may sound crazy but doing such experiments saved me a lot of execution time with Oracle and MySQL

 

best,

 

Anton

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Mittwoch, 8. Dezember 2021 09:35
An: Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: Re: postgresql long running query

 

Hi again,

I created additional index for just one column. (CREATE INDEX ON "T_CMN_SLAHISTORY" ("DefinitionId");

The query now runs faster. But I don't think that's my problem.

My query takes a very long time sometimes, not always. I could not understand the reason for this. 

I mean, why is it taking longer at that time for example?

 

My another example is here:

 

SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;

 

This query takes 800ms to run. But this query took 18 minutes last night. I don't understand why the query that normally always takes a short time is taking so long at this time. Also there are no locks.

 

The query logs are here:

 

2021-12-08 04:23:48.235 [local] surface postgres  508411 SELECT LOG:  duration: 1124822.358 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

 

2021-12-08 04:23:48.227 [local] surface postgres 524884 SELECT LOG:  duration: 525254.222 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

 

2021-12-08 04:23:48.235 [local] surface postgres 516992 SELECT LOG:  duration: 824718.012 ms  statement: SELECT  nspname as schema_name, relname AS table_name,     pg_total_relation_size (C .oid)  AS table_size  FROM     pg_class C  LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE    nspname NOT IN ( 'pg_catalog',  'information_schema'    ) and pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY    pg_total_relation_size (C .oid) DESC LIMIT 30

 

 

Finally, let me point out; the execution plan is here

 

EXPLAIN ANALYZE SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=34231.14..34231.22 rows=30 width=136) (actual time=837.358..837.367 rows=30 loops=1)
   ->  Sort  (cost=34231.14..34233.24 rows=840 width=136) (actual time=837.356..837.362 rows=30 loops=1)
         Sort Key: (pg_total_relation_size((c.oid)::regclass)) DESC
         Sort Method: top-N heapsort  Memory: 38kB
         ->  Hash Join  (cost=204.31..34206.33 rows=840 width=136) (actual time=70.550..837.181 rows=161 loops=1)
               Hash Cond: (c.relnamespace = n.oid)
               ->  Seq Scan on pg_class c  (cost=0.00..33995.56 rows=1664 width=72) (actual time=0.246..801.685 rows=202 loops=1)
                     Filter: ((relkind <> 'i'::"char") AND (pg_total_relation_size((oid)::regclass) > 100000000))
                     Rows Removed by Filter: 9022
               ->  Hash  (cost=166.91..166.91 rows=2992 width=68) (actual time=15.494..15.494 rows=3001 loops=1)
                     Buckets: 4096  Batches: 1  Memory Usage: 326kB
                     ->  Seq Scan on pg_namespace n  (cost=0.00..166.91 rows=2992 width=68) (actual time=0.026..14.616 rows=3001 loops=1)
                           Filter: ((nspname <> ALL ('{pg_catalog,information_schema}'::name[])) AND (nspname !~ '^pg_toast'::text))
                           Rows Removed by Filter: 2926
 Planning Time: 2.627 ms
 Execution Time: 837.660 ms
(16 rows)

 

 

Thank you so much for your help.

 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 6 Ara 2021 Pzt, 11:21 tarihinde şunu yazdı:

Hi all,

 

please report your findings trying to use a new index special for DefinitionId as suggested by Edward.

 

We recently found for our data:

 

By adding indexes the running times of queries were reduced from days to several seconds.

Don’t be scared: you can always remove indexes by dropping them!

 

best,

 

Anton

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 14:12
An: Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: Re: postgresql long running query

 

Hi Anton and Samed,

Thank you for your reply. 

I use PostgreSQL 12.9 on RHEL 8.4. 

There are no DDL queries on the database in a whole day, so there is no lock state.

The pgbadger slow queries report shows this query ran for 43min 11sec. 

And table and indexes definition is:

 

 \d+ "T_CMN_SLAHISTORY"
                                                         Table "T_CMN_SLAHISTORY"
      Column      |            Type             | Collation | Nullable |             Default              | Storage  | Stats target | Description
------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
 HistoryId     | integer                     |           | not null | generated by default as identity | plain    |              |
 DefinitionId  | integer                     |           | not null |                                  | plain    |              |
 RuleName      | character varying(200)      |           | not null |                                  | extended |              |
 CreateUserId     | integer                     |           | not null |                                  | plain    |              |
 CreateUserTime   | timestamp without time zone |           | not null |                                  | plain    |              |
 LastRunningTime  | timestamp without time zone |           |          |                                  | plain    |              |
 Cost             | integer                     |           |          |                                  | plain    |              |
 UpdateUserId     | integer                     |           |          |                                  | plain    |              |
 UpdateUserTime   | timestamp without time zone |           |          |                                  | plain    |              |
 RecId          | character varying(36)       |           |          |                                  | extended |              |
 MyTimeStamp      | bytea                       |           |          |                                  | extended |              |
 IsOldRecordBatch | boolean                     |           |          |                                  | plain    |              |
 InstanceId       | bigint                      |           |          |                                  | plain    |              |
Indexes:
    "T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
    "Index-20180207-152712" btree ("RuleName", "DefinitionId")
    "Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
    "Index-20180712-192836" btree ("InstanceId")
Access method: heap

 

 

Schema  |       Name       | Type  |  Owner   |  Size  | Description
---------+------------------+-------+----------+--------+-------------
 public | T_CMN_SLAHISTORY | table | postgres | 244 MB | 

 

Dischner, Anton <Anton.Dischner@med.uni-muenchen.de>, 3 Ara 2021 Cum, 14:39 tarihinde şunu yazdı:

Hi Liam,

 

top tuning tip for every relational database is: indexes!

 

I have adminstrated databases where data was 300 GB and created indexes 600 GB

You may think thats useless redundant but ist definitively not.

 

Rules for indexing:

 

Does it make sense to create an index for a table with one or several douzends of entries? -> YES!

Does it make sense to create an index that is optimized for certain queries? -> YES!

Does it make sense to create an index, make the query and drop the index? -> YES

 

Even with tables with billions of entries you should get answer times of less than a second!

… if you have a good index: date for example where the database engine plays ist full potental.

 

For example see: https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes

 

Which indexes do you have?

 

best,

 

A

 

Von: liam saffioti <liam.saffiotti@gmail.com>
Gesendet: Freitag, 3. Dezember 2021 12:24
An: pgsql-admin <
pgsql-admin@lists.postgresql.org>; Julien Rouhaud <rjuju123@gmail.com>
Betreff: postgresql long running query

 

Hello Team,

 

I have a problem with a query that consumes a long time. 

The query' execution plan is :

 

 

EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
                                                                        QUERY PLAN                                                                        
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on "T_CMN_SLAHISTORY" x  (cost=2144.77..35086.42 rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
   Recheck Cond: ("DefinitionId" = 302)
   Heap Blocks: exact=2175
   ->  Bitmap Index Scan on "Index-20180712-192739"  (cost=0.00..2110.02 rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
         Index Cond: ("DefinitionId" = 302)
 Planning Time: 11.132 ms
 Execution Time: 120.320 ms

 

 

 

But, the query execution time was 43min 11seconds in the morning in pgbadger report. I don't understand why the query is taking so long. Can you guide me?

 

Thank you so much.

 

Re: AW: postgresql long running query

From
Mladen Gogala
Date:
On 12/8/21 06:13, Dischner, Anton wrote:

 

Do optimizer hints in PG exists?

 

Obviously not: https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion

Yes, they do exist: http://pghintplan.osdn.jp/pg_hint_plan.html

Packages can be downloaded here: https://osdn.net/projects/pghintplan/releases/73862

There is no package for version 14, you will have to build it from the source:

https://github.com/ossc-db/pg_hint_plan

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

Re: AW: postgresql long running query

From
liam saffioti
Date:
Hi everyone,

Thank you for your help. But my query is not always slow. It works fast under normal conditions, but in a way that I don't understand it works very slow at some times.
I think there may be a different reason why a query that takes 800ms, in general, takes 18 minutes at a time.

Thanks a lot
Liam

Mladen Gogala <gogala.mladen@gmail.com>, 8 Ara 2021 Çar, 17:11 tarihinde şunu yazdı:
On 12/8/21 06:13, Dischner, Anton wrote:

 

Do optimizer hints in PG exists?

 

Obviously not: https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion

Yes, they do exist: http://pghintplan.osdn.jp/pg_hint_plan.html

Packages can be downloaded here: https://osdn.net/projects/pghintplan/releases/73862

There is no package for version 14, you will have to build it from the source:

https://github.com/ossc-db/pg_hint_plan

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

Re: AW: postgresql long running query

From
"androxkentaki ."
Date:
Can you tell us how do you run query at night?
IS it via Odbc? Or direct by pgsql script in cron?

Pozdrawiam
Andrzej Gerasimuk

From: liam saffioti <liam.saffiotti@gmail.com>
Sent: Wednesday, December 8, 2021 4:19:33 PM
To: Mladen Gogala <gogala.mladen@gmail.com>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>
Subject: Re: AW: postgresql long running query
 
Hi everyone,

Thank you for your help. But my query is not always slow. It works fast under normal conditions, but in a way that I don't understand it works very slow at some times.
I think there may be a different reason why a query that takes 800ms, in general, takes 18 minutes at a time.

Thanks a lot
Liam

Mladen Gogala <gogala.mladen@gmail.com>, 8 Ara 2021 Çar, 17:11 tarihinde şunu yazdı:
On 12/8/21 06:13, Dischner, Anton wrote:

 

Do optimizer hints in PG exists?

 

Obviously not: https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion

Yes, they do exist: http://pghintplan.osdn.jp/pg_hint_plan.html

Packages can be downloaded here: https://osdn.net/projects/pghintplan/releases/73862

There is no package for version 14, you will have to build it from the source:

https://github.com/ossc-db/pg_hint_plan

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

Re: AW: postgresql long running query

From
liam saffioti
Date:
I have a psql script that runs every 15 minutes in cron.
Thanks,
Liam

androxkentaki . <androxkentaki@gmail.com>, 8 Ara 2021 Çar, 21:57 tarihinde şunu yazdı:
Can you tell us how do you run query at night?
IS it via Odbc? Or direct by pgsql script in cron?

Pozdrawiam
Andrzej Gerasimuk

From: liam saffioti <liam.saffiotti@gmail.com>
Sent: Wednesday, December 8, 2021 4:19:33 PM
To: Mladen Gogala <gogala.mladen@gmail.com>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>
Subject: Re: AW: postgresql long running query
 
Hi everyone,

Thank you for your help. But my query is not always slow. It works fast under normal conditions, but in a way that I don't understand it works very slow at some times.
I think there may be a different reason why a query that takes 800ms, in general, takes 18 minutes at a time.

Thanks a lot
Liam

Mladen Gogala <gogala.mladen@gmail.com>, 8 Ara 2021 Çar, 17:11 tarihinde şunu yazdı:
On 12/8/21 06:13, Dischner, Anton wrote:

 

Do optimizer hints in PG exists?

 

Obviously not: https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion

Yes, they do exist: http://pghintplan.osdn.jp/pg_hint_plan.html

Packages can be downloaded here: https://osdn.net/projects/pghintplan/releases/73862

There is no package for version 14, you will have to build it from the source:

https://github.com/ossc-db/pg_hint_plan

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

Re: AW: postgresql long running query

From
androxkentaki
Date:
do You have whole SELECT SQL statement in script?

May be try to prepare function on Your database with Your SQL function and call it in cron via psql script



CREATE OR REPLACE FUNCTION public.cron_job(
integer)
    RETURNS void
    LANGUAGE 'plpgsql'
    COST 100
    VOLATILE PARALLEL UNSAFE
AS $BODY$
declare

insert into MYTABLE
SELECT nspname AS schema_name, relname AS table_name, pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN ('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) > 100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY pg_total_relation_size(C.oid) DESC LIMIT 30;


END;
$BODY$;

ALTER FUNCTION public.cron_job(integer)
    OWNER TO postgres;






and in psql:



sudo -u postgres psql --dbname MYDB -c 'SELECT public.cron_job(1);'



śr., 8 gru 2021 o 21:02 liam saffioti <liam.saffiotti@gmail.com> napisał(a):
I have a psql script that runs every 15 minutes in cron.
Thanks,
Liam

androxkentaki . <androxkentaki@gmail.com>, 8 Ara 2021 Çar, 21:57 tarihinde şunu yazdı:
Can you tell us how do you run query at night?
IS it via Odbc? Or direct by pgsql script in cron?

Pozdrawiam
Andrzej Gerasimuk

From: liam saffioti <liam.saffiotti@gmail.com>
Sent: Wednesday, December 8, 2021 4:19:33 PM
To: Mladen Gogala <gogala.mladen@gmail.com>
Cc: pgsql-admin <pgsql-admin@lists.postgresql.org>
Subject: Re: AW: postgresql long running query
 
Hi everyone,

Thank you for your help. But my query is not always slow. It works fast under normal conditions, but in a way that I don't understand it works very slow at some times.
I think there may be a different reason why a query that takes 800ms, in general, takes 18 minutes at a time.

Thanks a lot
Liam

Mladen Gogala <gogala.mladen@gmail.com>, 8 Ara 2021 Çar, 17:11 tarihinde şunu yazdı:
On 12/8/21 06:13, Dischner, Anton wrote:

 

Do optimizer hints in PG exists?

 

Obviously not: https://wiki.postgresql.org/wiki/OptimizerHintsDiscussion

Yes, they do exist: http://pghintplan.osdn.jp/pg_hint_plan.html

Packages can be downloaded here: https://osdn.net/projects/pghintplan/releases/73862

There is no package for version 14, you will have to build it from the source:

https://github.com/ossc-db/pg_hint_plan

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com


--
Z wyrazami szacunku
Andrzej Gerasimuk

Re: AW: postgresql long running query

From
Jeff Janes
Date:
On Wed, Dec 8, 2021 at 10:19 AM liam saffioti <liam.saffiotti@gmail.com> wrote:
Hi everyone,

Thank you for your help. But my query is not always slow. It works fast under normal conditions, but in a way that I don't understand it works very slow at some times.
I think there may be a different reason why a query that takes 800ms, in general, takes 18 minutes at a time.

As previously asked, do you have log_lock_waits turned on?  If so, do you see entries in the db server's log file?

Also, we really could use the plan capture of the slow execution, not of the same query except when it is not slow.  The only way of to get that reliably while also getting the original output of the query (which I assume you need) is with auto_explain, with settings like:

auto_explain.log_min_duration = '600s'
auto_explain.log_analyze=on
auto_explain.log_buffers=on
auto_explain.log_nested_statements=on
 
Then go trolling through the log file.  Note that this does cause every query run on your system to take detailed timings of each step, which can slow down the entire system, especially if you are on a system with slow user access to the clock (common with old hardware or old kernels).  But I think the extra info you get is well worth the extra load.  You could leave auto_explain.log_analyze=off globally, then turn it on only in the specific script file you are seeing the problem with.  
 
Cheers,

Jeff