Thread: HIGH IO and Less CPU utilization

HIGH IO and Less CPU utilization

From
Rambabu g
Date:
Hi All,

We have an issue with high load and I/O waits but low CPU utilization on a Postgres database. The emp table size is around 500GB, and there are very few connections.

Please suggest whether we need to change any configuration parameters at the system level or in the Postgres configuration.

postgres=# select version();

                                                 version                                                  

----------------------------------------------------------------------------------------------------------

 PostgreSQL 11.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

(1 row)


postgres=# \q


Postgres parameter settings:


shared_buffers=12GB
work_mem=128MB
effective_cache_size=48GB
maintenance_work_mem=2GB
max_connections=500


14428 | 04:45:59.712892 | active  | INSERT INTO target (empno, name) SELECT empno, '' AS name FROM (select distinct empno from emp where sname='test' and tp='EMP NAME 1' LIMIT 10) AS query;



  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                

14428 postgres  20   0   12.6g  12.2g  12.2g D   5.3 13.3   4:43.57 postgres:  postgres   postgres (59436) INSERT                                                                                       

29136 postgres  20   0   12.6g 401812 398652 D   4.7  0.4   0:01.20 postgres:  postgres   postgres (48220) SELECT                                                                                       

29119 postgres  20   0   12.6g 677704 674064 S   3.3  0.7   0:02.05 postgres:  postgres   postgres (37684) idle                                                                                        

29121 postgres  20   0   12.6g 758428 755252 S   3.0  0.8   0:02.33 postgres:  postgres   postgres (57392) idle                                                                                        

29166 postgres  20   0   12.6g 260436 257408 S   3.0  0.3   0:00.63 postgres:  postgres   postgres (59424) idle                                                                                       

29181 postgres  20   0   12.6g 179136 175860 D   2.3  0.2   0:00.18 postgres:  postgres   postgres (57092) SELECT                                                                                       

29129 postgres  20   0   12.6g 442444 439212 S   1.7  0.5   0:01.33 postgres:  postgres   postgres (36560) idle 



-bash-4.2$ cat /etc/redhat-release 

Red Hat Enterprise Linux Server release 7.9 (Maipo)

-bash-4.2$ uname

Linux

-bash-4.2$ uname -a

Linux ip.ec2.internal 3.10.0-1160.59.1.el7.x86_64 #1 SMP Wed Feb 16 12:17:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

-bash-4.2$ top


top - 17:02:52 up 1 day,  1:44,  2 users,  load average: 11.60, 22.27, 22.22

Tasks: 316 total,   1 running, 315 sleeping,   0 stopped,   0 zombie

%Cpu(s):  0.5 us,  0.5 sy,  0.0 ni, 92.0 id,  7.0 wa,  0.0 hi,  0.0 si,  0.0 st

KiB Mem : 96639952 total,   483896 free,  1693960 used, 94462096 buff/cache

KiB Swap:        0 total,        0 free,        0 used. 81408928 avail Mem 





-bash-4.2$ iostat -x

Linux 3.10.0-1160.59.1.el7.x86_64 (ip.ec2.internal)  03/29/2022  _x86_64_ (24 CPU)


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.33    0.00    0.24    7.54    0.00   91.88


Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util

nvme1n1           0.00     3.45 1042.22   29.88 41998.88  1476.75    81.10     7.61    7.10    6.62   23.70   0.40  43.19

nvme2n1           0.00     0.02    0.02    1.06     0.15   268.80   497.00     0.09   80.87    0.85   82.56   1.40   0.15

nvme0n1           0.00     0.01    0.21    0.08     4.94     7.07    81.37     0.00    6.88    0.61   22.83   0.64   0.02


-bash-4.2$ vmstat -a

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----

 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st

 1  8      0 476180 40092640 53043492    0    0  1753    73    2   14  0  0 92  8  0

-bash-4.2$ vmstat -d

disk- ------------reads------------ ------------writes----------- -----IO------

       total merged sectors      ms  total merged sectors      ms    cur    sec

nvme1n1 99492480      0 8016369922 658540488 2849690 329519 281661496 67518819      0  41210

nvme2n1   2126      0   27946    1811 101078   2312 51264208 8344670      0    144

nvme0n1  20254      6  942763   12340   7953    641 1348866  181438      0     18



-bash-4.2$ sar

Linux 3.10.0-1160.59.1.el7.x86_64 (ip.ec2.internal) 03/29/2022 _x86_64_ (24 CPU)


04:20:01 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle

04:30:01 PM     all      0.70      0.00      0.69     27.92      0.00     70.68

04:40:01 PM     all      0.71      0.00      0.70     27.76      0.00     70.84

04:50:01 PM     all      0.70      0.00      0.69     26.34      0.00     72.27

05:00:01 PM     all      0.70      0.00      0.68     27.32      0.00     71.31

05:10:01 PM     all      0.70      0.00      0.69     27.83      0.00     70.77

05:20:01 PM     all      0.70      0.00      0.69     28.16      0.00     70.45

05:30:01 PM     all      0.71      0.00      0.69     26.62      0.00     71.98

05:40:01 PM     all      0.69      0.00      0.68     25.77      0.00     72.85

Average:        all      0.70      0.00      0.69     27.21      0.00     71.40

-bash-4.2$ 

-bash-4.2$ free -g

              total        used        free      shared  buff/cache   available

Mem:             92           1           0          12          90          77

Swap:             0           0           0

-bash-4.2$ free -m

              total        used        free      shared  buff/cache   available

Mem:          94374        1721         474       12581       92178       79430

Swap:             0           0           0

-bash-4.2$ lscpu

Architecture:          x86_64

CPU op-mode(s):        32-bit, 64-bit

Byte Order:            Little Endian

CPU(s):                24

On-line CPU(s) list:   0-23

Thread(s) per core:    2

Core(s) per socket:    12

Socket(s):             1

NUMA node(s):          1

Vendor ID:             GenuineIntel

CPU family:            6

Model:                 85

Model name:            Intel(R) Xeon(R) Platinum 8252C CPU @ 3.80GHz

Stepping:              7

CPU MHz:               3799.998

BogoMIPS:              7599.99

Hypervisor vendor:     KVM

Virtualization type:   full

L1d cache:             32K

L1i cache:             32K

L2 cache:              1024K

L3 cache:              25344K

NUMA node0 CPU(s):     0-23

Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 ida arat pku ospke avx512_vnni



Thanks & Regards,

Rambabu.

Re: HIGH IO and Less CPU utilization

From
Justin Pryzby
Date:
Hi,

Thanks for providing all this info.

On Tue, Mar 29, 2022 at 11:34:18PM +0530, Rambabu g wrote:
> Hi All,
> 
> We have an issue with high load and I/O waits but low CPU utilization on a
> Postgres database. The emp table size is around 500GB, and there are very
> few connections.

What indexes are defined on this table ?
How large are they ?

> Red Hat Enterprise Linux Server release 7.9 (Maipo)
>  PostgreSQL 11.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
> 
> shared_buffers=12GB
> work_mem=128MB

> 14428 | 04:45:59.712892 | active  | INSERT INTO target (empno, name)
> SELECT empno, '' AS name FROM (select distinct  empno  from emp where sname='test'
> and tp='EMP NAME 1' LIMIT 10) AS query   ;

Is this the only problem query, or just one example?
Are your issues with loading data, querying data or both ?

> -bash-4.2$ iostat -x

It shows that you only have a few filesystems in use.
It's common to have WAL and temp_tablespaces on a separate FS.
That probably wouldn't help your performance at all, but it would help to tell
what's doing I/O.  Is there anything else running on the VM besides postgres ?
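
For example, moving temp files onto their own filesystem is just this (the mount point and tablespace name here are made up, and the directory must already exist and be owned by postgres):

CREATE TABLESPACE pgtemp LOCATION '/mnt/pgtemp';
ALTER SYSTEM SET temp_tablespaces = 'pgtemp';
SELECT pg_reload_conf();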

You can also check:
SELECT COUNT(1), wait_event FROM pg_stat_activity GROUP BY 2 ORDER BY 1 DESC;

And the pg_buffercache extension:
SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) AS all, COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b
LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;
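
Note that pg_buffercache is a contrib extension, so if it's not installed yet you'd first need:

CREATE EXTENSION IF NOT EXISTS pg_buffercache;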
 

> Hypervisor vendor:     KVM

Are KSM or THP enabled on the hypervisor ?

tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag \
     /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
                       
 

-- 
Justin



Re: HIGH IO and Less CPU utilization

From
Rambabu g
Date:
Hi Justin,

Thanks for the quick response and your help. Please go through the inputs and let me know if I need to tune anything at the OS level or change any DB parameters.


On Tue, 29 Mar 2022 at 23:54, Justin Pryzby <pryzby@telsasoft.com> wrote:
Hi,

Thanks for providing all this info.

On Tue, Mar 29, 2022 at 11:34:18PM +0530, Rambabu g wrote:
> Hi All,
>
> We have an issue with high load and I/O waits but low CPU utilization on a
> Postgres database. The emp table size is around 500GB, and there are very
> few connections.

What indexes are defined on this table ?
How large are they ?


There are three indexes defined on the table, each one is around 20 to 25GB, and the indexes are created on 

postgres=# explain select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

                                              QUERY PLAN                                              

------------------------------------------------------------------------------------------------------

 HashAggregate  (cost=71899575.17..71900816.97 rows=124179 width=9)

   Group Key: empno

   ->  Gather  (cost=1000.00..71820473.80 rows=31640550 width=9)

         Workers Planned: 2

         ->  Parallel Seq Scan on emp  (cost=0.00..68655418.80 rows=13183562 width=9)

               Filter: (((sname)::text = 'test'::text) AND ((tp)::text = 'EMP NAME 1'::text))

 
> Red Hat Enterprise Linux Server release 7.9 (Maipo)
>  PostgreSQL 11.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
>
> shared_buffers=12GB
> work_mem=128MB

> 14428 | 04:45:59.712892 | active  | INSERT INTO target (empno, name)
> SELECT empno, '' AS name FROM (select distinct  empno  from emp where sname='test'
> and tp='EMP NAME 1' LIMIT 10) AS query   ;

Is this the only problem query, or just one example?
Are your issues with loading data, querying data or both ?

> -bash-4.2$ iostat -x

It shows that you only have a few filesystems in use.
It's common to have WAL and temp_tablespaces on a separate FS.
That probably wouldn't help your performance at all, but it would help to tell
what's doing I/O.  Is there anything else running on the VM besides postgres ?

 
No, the EC2 VM is dedicated to the Postgres DB instance only.
 
You can also check:
SELECT COUNT(1), wait_event FROM pg_stat_activity GROUP BY 2 ORDER BY 1 DESC;

postgres=# SELECT COUNT(1), wait_event FROM pg_stat_activity GROUP BY 2 ORDER BY 1 DESC;

 count |     wait_event      

-------+---------------------

    70 | ClientRead

    34 | DataFileRead

     3 | 

     1 | LogicalLauncherMain

     1 | WalWriterMain

     1 | BgWriterMain

     1 | AutoVacuumMain

(7 rows)
 
And the pg_buffercache extension:
SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) as all, COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;


postgres=# SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) as all, COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;

 dirty |   all   |                      coalesce                      

-------+---------+----------------------------------------------------

   189 |  237348 | emp_status

    97 | 1214949 | emp

    77 |     259 | public_group

    75 |     432 | public_gid

    74 |     233 | public_utpu

    26 |     115 | code_evd

    15 |      55 | group

    15 |      49 | output

    14 |      77 | output_status

(9 rows)

 
> Hypervisor vendor:     KVM

Are KSM or THP enabled on the hypervisor ?

tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag                                                           


-bash-4.2$ tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag 

==> /sys/kernel/mm/ksm/run <==

0


==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==

1


==> /sys/kernel/mm/transparent_hugepage/enabled <==

[always] madvise never


==> /sys/kernel/mm/transparent_hugepage/defrag <==

[always] madvise never

 
--
Justin


Regards,
Rambabu. 

Re: HIGH IO and Less CPU utilization

From
Justin Pryzby
Date:
On Wed, Mar 30, 2022 at 12:52:05AM +0530, Rambabu g wrote:
> > What indexes are defined on this table ?
> > How large are they ?
>
> There are three indexes defined on the table, each one is around 20 to 25GB
> and the indexes are created on

Did you mean to say something else after "on" ?

Show the definition of the indexes from psql \d

> postgres=# explain select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

Is this the only query that's performing poorly ?
You should send explain (analyze,buffers) for the problematic queries.

> > > Hypervisor vendor:     KVM
> >
> > Are KSM or THP enabled on the hypervisor ?

> No, the EC2 VM is dedicated to the Postgres DB instance only.

Oh, so this is an EC2 and you cannot change the hypervisor itself.

> -bash-4.2$ tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
> /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
 
...
> ==> /sys/kernel/mm/transparent_hugepage/defrag <==
> [always] madvise never

I doubt it will help, but you could try disabling these.
It's a quick experiment anyway.
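
For example (as root; this takes effect immediately but does not persist across reboots):

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag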



Re: HIGH IO and Less CPU utilization

From
Rambabu g
Date:

Hi Justin, 

Only one query is causing the issue; I'm sharing the definitions of the indexes. Please have a look.



On Wed, 30 Mar 2022 at 01:09, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Mar 30, 2022 at 12:52:05AM +0530, Rambabu g wrote:
> > What indexes are defined on this table ?
> > How large are they ?
>
> There are three indexes defined on the table, each one is around 20 to 25GB
> and the indexes are created on

Did you mean to say something else after "on" ?

Show the definition of the indexes from psql \d

Index definitions:

postgres=#                    \d+ idx_empno

                          Index "l2.pd_activity_empi"

 Column |          Type           | Key? | Definition | Storage  | Stats target 

--------+-------------------------+------+------------+----------+--------------

 empno   | character varying(2000) | yes  | empno       | extended | 

btree, for table "emp"


postgres=#                    \d+ id_dt

                           Index "dt"

 Column |            Type             | Key? | Definition | Storage | Stats target 

--------+-----------------------------+------+------------+---------+--------------

   dt  | timestamp without time zone  | yes  | dt      | plain   | 

btree, for table "emp"


postgres=#                    \d+ idx_tp

                          Index "idx_tp"

 Column |          Type           | Key? | Definition | Storage  | Stats target 

--------+-------------------------+------+------------+----------+--------------

 tp    | character varying(2000)   | yes    | tp       | extended | 

btree, for table "emp" 



 
The query has been running for 30 min.

> postgres=# explain select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

Is this the only query that's performing poorly ?
You should send explain (analyze,buffers) for the problematic queries.


postgres=# select pid,(now()-query_start) as age,wait_event_type,wait_event,query from pg_stat_activity where state!='idle';

  pid  |       age       | wait_event_type |  wait_event   |                                                       query

-------+-----------------+-----------------+---------------+-------------------------------------------------------------------------------------------------------------------

 32154 | 00:09:56.131136 | IPC             | ExecuteGather | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   847 | 00:09:56.131136 | IO              | DataFileRead  | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   848 | 00:09:56.131136 | IO              | DataFileRead  | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   849 | 00:09:56.131136 | IO              | DataFileRead  | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   850 | 00:09:56.131136 | IO              | DataFileRead  | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   851 | 00:09:56.131136 | IO              | DataFileRead  | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   852 | 00:09:56.131136 | IO              | DataFileRead  | explain analyze select distinct  empno  from emp where sname='test' and tp='EMP NAME 1'

   645 | 00:00:00        |                 |               | select pid,(now()-query_start) as age,wait_event_type,wait_event,query from pg_stat_activity where state!='idle'




postgres=# SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) as all, COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;

 dirty |   all   |            coalesce             

-------+---------+---------------------------------

    32 |     136 | fn_deployment

    18 |     176 | fn_deployment_key

     8 |      12 | event_logs_pkey

     6 |     157 | event_logs

     1 |     355 | pg_class

     0 | 2890261 | 

     0 |  252734 | output_status

     0 |     378 | emp

     0 |     299 | 1249

(9 rows)



-bash-4.2$ sar

Linux 3.10.0-1160.59.1.el7.x86_64 (ip-10-54-145-108.ec2.internal)  03/30/2022  _x86_64_ (24 CPU)


12:00:01 AM     CPU     %user     %nice   %system   %iowait    %steal     %idle

12:10:01 AM     all      1.19      0.00      0.82     36.17      0.00     61.81

12:20:01 AM     all      0.72      0.00      0.75     35.59      0.00     62.94

12:30:01 AM     all      0.74      0.00      0.77     35.04      0.00     63.46

12:40:02 AM     all      0.74      0.00      0.76     34.65      0.00     63.85

12:50:01 AM     all      0.77      0.00      0.78     33.36      0.00     65.09

01:00:01 AM     all      0.83      0.00      0.78     27.46      0.00     70.93

01:10:01 AM     all      0.85      0.00      0.78     30.11      0.00     68.26

01:20:01 AM     all      0.70      0.00      0.61     20.46      0.00     78.24

01:30:01 AM     all      0.15      0.00      0.06      0.02      0.00     99.77

01:40:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.80

01:50:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.80

02:00:01 AM     all      0.15      0.00      0.06      0.00      0.00     99.78

02:10:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.80

02:20:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.81

02:30:01 AM     all      0.15      0.00      0.06      0.00      0.00     99.80

02:40:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.80

02:50:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.80

03:00:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.80

03:10:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.81

03:20:01 AM     all      0.14      0.00      0.05      0.00      0.00     99.81

03:30:01 AM     all      0.23      0.00      0.15      2.18      0.00     97.44

03:40:01 AM     all      1.16      0.00      0.87     22.76      0.00     75.21

03:50:01 AM     all      0.75      0.00      0.60     13.89      0.00     84.76

04:00:01 AM     all      1.13      0.00      0.87     22.75      0.00     75.26

04:10:01 AM     all      0.87      0.00      0.79     22.91      0.00     75.43

04:20:01 AM     all      0.71      0.00      0.71     22.07      0.00     76.50

Average:        all      0.50      0.00      0.41     13.81      0.00     85.28

-bash-4.2$ iostat

Linux 3.10.0-1160.59.1.el7.x86_64 (ip-.ec2.internal)  03/30/2022  _x86_64_ (24 CPU)


avg-cpu:  %user   %nice %system %iowait  %steal   %idle

           0.44    0.00    0.34   13.35    0.00   85.86


Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn

nvme1n1        1370.20     54514.54      4964.18 7297971937  664565000

nvme2n1           0.92         0.12       223.19      16085   29878260

nvme0n1           0.30         5.12         5.23     685029     699968


-bash-4.2$ iostat -d

Linux 3.10.0-1160.59.1.el7.x86_64 (ip-ec2.internal)  03/30/2022  _x86_64_ (24 CPU)


Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn

nvme1n1        1370.25     54518.06      4963.95 7298793425  664565248

nvme2n1           0.92         0.12       223.17      16085   29878260

nvme0n1           0.30         5.12         5.23     685029     699968


-bash-4.2$ free -g

              total        used        free      shared  buff/cache   available

Mem:             92                               2          90          87

Swap:                                 0


 

> > > Hypervisor vendor:     KVM
> >
> > Are KSM or THP enabled on the hypervisor ?

> No, the EC2 VM is dedicated to the Postgres DB instance only.

Oh, so this is an EC2 and you cannot change the hypervisor itself.

> -bash-4.2$ tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
...
> ==> /sys/kernel/mm/transparent_hugepage/defrag <==
> [always] madvise never
I doubt it will help, but you could try disabling these.
It's a quick experiment anyway.

Disabled THP:

-bash-4.2$  tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag

==> /sys/kernel/mm/ksm/run <==

0


==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==

1


==> /sys/kernel/mm/transparent_hugepage/enabled <==

always madvise [never]


==> /sys/kernel/mm/transparent_hugepage/defrag <==

always madvise [never]

 
Regards,
Rambabu. 

Re: HIGH IO and Less CPU utilization

From
Justin Pryzby
Date:
On Wed, Mar 30, 2022 at 10:17:38AM +0530, Rambabu g wrote:
> Hi Justin,
> 
> Only one query is causing the issue, sharing the def of indexes. Please
> have a look.
> 
> > > There are three indexes defined on the table, each one is around 20 to 25GB
> 
>  tp    | character varying(2000)   | yes    | tp       | extended |
> 
>    852 | 00:09:56.131136 | IO              | DataFileRead  | explain
> analyze select distinct  empno  from emp where sname='test' and tp='EMP
> NAME 1'

The server is doing a scan of the large table.
The tp index matches a lot of rows (13e6) which probably aren't clustered, so
it elects to scan the 500GB table each time.

Looking at this in isolation, maybe it'd be enough to create an index on
tp,empno (and maybe drop the tp index).  CREATE INDEX CONCURRENTLY if you don't
want to disrupt other queries.
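
For example (the index name here is made up):

CREATE INDEX CONCURRENTLY emp_tp_empno_idx ON emp (tp, empno);
-- once it's in place, optionally:
DROP INDEX CONCURRENTLY idx_tp;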

But this seems like something that should be solved in a better way, though;
like keeping a table with all the necessary "empno" values maintained with
"INSERT ... ON CONFLICT DO NOTHING".  Or a trigger.
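
A sketch of that idea, with made-up names (the one-time backfill still scans emp once, and you might filter it to just the sname/tp values the query needs):

CREATE TABLE emp_empnos (empno varchar(2000) PRIMARY KEY);

-- one-time backfill:
INSERT INTO emp_empnos SELECT DISTINCT empno FROM emp ON CONFLICT DO NOTHING;

-- keep it maintained as new rows land in emp:
CREATE FUNCTION emp_empnos_sync() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    INSERT INTO emp_empnos VALUES (NEW.empno) ON CONFLICT DO NOTHING;
    RETURN NEW;
END $$;

CREATE TRIGGER emp_empnos_sync AFTER INSERT ON emp
    FOR EACH ROW EXECUTE FUNCTION emp_empnos_sync();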



Re: HIGH IO and Less CPU utilization

From
Mladen Gogala
Date:
On 3/29/22 14:04, Rambabu g wrote:
Hi All,

We have an issue with high load and I/O waits but low CPU utilization on a Postgres database. The emp table size is around 500GB, and there are very few connections.

Please suggest whether we need to change any configuration parameters at the system level or in the Postgres configuration.


The "emp" table is 500 GB? You're doing something wrong, The "emp" table should have 14 rows and the "dept" table should have 4 rows The "bonus" and "salgrade" tables should also be very small. The guy named Bruce Scott could probably help you with that schema. Other than that, do you have a SQL causing all this ruckus and a detailed explain plan ("explain (analyze,costs,buffers)") for the SQL using most of the time? You can analyze the log file with PgBadger to get the queries consuming the most time.

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

Re: HIGH IO and Less CPU utilization

From
Rambabu g
Date:
Hi Justin,

I executed the same query twice: the first time it takes 6+ seconds, but if I run it again it takes 34 milliseconds. It seems the shared buffer reads (from disk) are what take the time, since the second time there are no shared buffer reads. Please suggest if I need to change any parameters to tune this.


postgres=# explain (analyze,buffers)   SELECT * FROM emp WHERE (empno='C3916271986');

                                                                  QUERY PLAN                                                                   

-----------------------------------------------------------------------------------------------------------------------------------------------

 Index Scan using pd_activity_empi on pd_activity  (cost=0.57..23391.90 rows=7956 width=9202) (actual time=4.346..6442.761 rows=12771 loops=1)

   Index Cond: ((empno)::text = 'C3916271986'::text)

   Buffers: shared hit=598 read=12224

 Planning Time: 0.130 ms

 Execution Time: 6446.664 ms

(5 rows)


postgres=# explain (analyze,buffers)    SELECT * FROM emp WHERE (empno='C3916271986');

                                                                 QUERY PLAN                                                                  

---------------------------------------------------------------------------------------------------------------------------------------------

 Index Scan using pd_activity_empi on pd_activity  (cost=0.57..23391.90 rows=7956 width=9202) (actual time=0.027..33.921 rows=12771 loops=1)

   Index Cond: ((empi)::text = 'C3916271986'::text)

   Buffers: shared hit=12822

 Planning Time: 0.138 ms

 Execution Time: 34.344 ms

(5 rows)



empno Changed :

postgres=# explain (analyze,buffers)  SELECT * FROM emp WHERE (empno='C6853372011');

                                                                 QUERY PLAN                                                                 

--------------------------------------------------------------------------------------------------------------------------------------------

 Index Scan using pd_activity_empi on pd_activity  (cost=0.57..23391.90 rows=7956 width=9202) (actual time=2.764..430.357 rows=758 loops=1)

   Index Cond: ((empi)::text = 'C6853372011'::text)

   Buffers: shared hit=46 read=718

 Planning Time: 0.136 ms

 Execution Time: 430.617 ms

(5 rows)
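
The difference between the two runs is cold vs. warm cache: the first run had to read 12224 pages from disk (shared read=12224), while the second found every page already in shared buffers (shared hit=12822). If warming the cache after a restart matters, the contrib pg_prewarm extension can preload a relation, though with shared_buffers=12GB a 20GB+ index will not fit entirely; a sketch:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('pd_activity_empi');  -- returns the number of blocks loaded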



Regards,
Rambabu.

On Thu, 31 Mar 2022 at 12:19, Justin Pryzby <pryzby@telsasoft.com> wrote:
On Wed, Mar 30, 2022 at 10:17:38AM +0530, Rambabu g wrote:
> Hi Justin,
>
> Only one query is causing the issue, sharing the def of indexes. Please
> have a look.
>
> > > There are three indexes defined on the table, each one is around 20 to 25GB
>
>  tp    | character varying(2000)   | yes    | tp       | extended |
>
>    852 | 00:09:56.131136 | IO              | DataFileRead  | explain
> analyze select distinct  empno  from emp where sname='test' and tp='EMP
> NAME 1'

The server is doing a scan of the large table.
The tp index matches a lot of rows (13e6) which probably aren't clustered, so
it elects to scan the 500GB table each time.

Looking at this in isolation, maybe it'd be enough to create an index on
tp,empno (and maybe drop the tp index).  CREATE INDEX CONCURRENTLY if you don't
want to disrupt other queries.

But this seems like something that should be solved in a better way, though;
like keeping a table with all the necessary "empno" values maintained with
"INSERT ... ON CONFLICT DO NOTHING".  Or a trigger.