Thread: High I/O and low CPU utilization
postgres=# select version();
version
----------------------------------------------------------------------------------------------------------
PostgreSQL 11.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
(1 row)
postgres=# \q
Postgres parameter settings:
shared_buffers=12GB
work_mem=128MB
14428 | 04:45:59.712892 | active | INSERT INTO target (empno, name) SELECT empno, '' AS name FROM (select distinct empno from emp where sname='test' and tp='EMP NAME 1' LIMIT 10) AS query ;
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14428 postgres 20 0 12.6g 12.2g 12.2g D 5.3 13.3 4:43.57 postgres: postgres postgres (59436) INSERT
29136 postgres 20 0 12.6g 401812 398652 D 4.7 0.4 0:01.20 postgres: postgres postgres (48220) SELECT
29119 postgres 20 0 12.6g 677704 674064 S 3.3 0.7 0:02.05 postgres: postgres postgres (37684) idle
29121 postgres 20 0 12.6g 758428 755252 S 3.0 0.8 0:02.33 postgres: postgres postgres (57392) idle
29166 postgres 20 0 12.6g 260436 257408 S 3.0 0.3 0:00.63 postgres: postgres postgres (59424) idle
29181 postgres 20 0 12.6g 179136 175860 D 2.3 0.2 0:00.18 postgres: postgres postgres (57092) SELECT
29129 postgres 20 0 12.6g 442444 439212 S 1.7 0.5 0:01.33 postgres: postgres postgres (36560) idle
-bash-4.2$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.9 (Maipo)
-bash-4.2$ uname
Linux
-bash-4.2$ uname -a
Linux ip.ec2.internal 3.10.0-1160.59.1.el7.x86_64 #1 SMP Wed Feb 16 12:17:35 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
-bash-4.2$ top
top - 17:02:52 up 1 day, 1:44, 2 users, load average: 11.60, 22.27, 22.22
Tasks: 316 total, 1 running, 315 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.5 us, 0.5 sy, 0.0 ni, 92.0 id, 7.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 96639952 total, 483896 free, 1693960 used, 94462096 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 81408928 avail Mem
-bash-4.2$ iostat -x
Linux 3.10.0-1160.59.1.el7.x86_64 (ip.ec2.internal) 03/29/2022 _x86_64_ (24 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.33 0.00 0.24 7.54 0.00 91.88
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
nvme1n1 0.00 3.45 1042.22 29.88 41998.88 1476.75 81.10 7.61 7.10 6.62 23.70 0.40 43.19
nvme2n1 0.00 0.02 0.02 1.06 0.15 268.80 497.00 0.09 80.87 0.85 82.56 1.40 0.15
nvme0n1 0.00 0.01 0.21 0.08 4.94 7.07 81.37 0.00 6.88 0.61 22.83 0.64 0.02
-bash-4.2$ vmstat -a
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free inact active si so bi bo in cs us sy id wa st
1 8 0 476180 40092640 53043492 0 0 1753 73 2 14 0 0 92 8 0
-bash-4.2$ vmstat -d
disk- ------------reads------------ ------------writes----------- -----IO------
total merged sectors ms total merged sectors ms cur sec
nvme1n1 99492480 0 8016369922 658540488 2849690 329519 281661496 67518819 0 41210
nvme2n1 2126 0 27946 1811 101078 2312 51264208 8344670 0 144
nvme0n1 20254 6 942763 12340 7953 641 1348866 181438 0 18
-bash-4.2$ sar
Linux 3.10.0-1160.59.1.el7.x86_64 (ip.ec2.internal) 03/29/2022 _x86_64_ (24 CPU)
04:20:01 PM CPU %user %nice %system %iowait %steal %idle
04:30:01 PM all 0.70 0.00 0.69 27.92 0.00 70.68
04:40:01 PM all 0.71 0.00 0.70 27.76 0.00 70.84
04:50:01 PM all 0.70 0.00 0.69 26.34 0.00 72.27
05:00:01 PM all 0.70 0.00 0.68 27.32 0.00 71.31
05:10:01 PM all 0.70 0.00 0.69 27.83 0.00 70.77
05:20:01 PM all 0.70 0.00 0.69 28.16 0.00 70.45
05:30:01 PM all 0.71 0.00 0.69 26.62 0.00 71.98
05:40:01 PM all 0.69 0.00 0.68 25.77 0.00 72.85
Average: all 0.70 0.00 0.69 27.21 0.00 71.40
-bash-4.2$
-bash-4.2$ free -g
total used free shared buff/cache available
Mem: 92 1 0 12 90 77
Swap: 0 0 0
-bash-4.2$ free -m
total used free shared buff/cache available
Mem: 94374 1721 474 12581 92178 79430
Swap: 0 0 0
-bash-4.2$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8252C CPU @ 3.80GHz
Stepping: 7
CPU MHz: 3799.998
BogoMIPS: 7599.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-23
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 ida arat pku ospke avx512_vnni
Thanks & Regards,
Ramababu.
Hi,
Thanks for providing all this info.
On Tue, Mar 29, 2022 at 11:34:18PM +0530, Rambabu g wrote:
> Hi All,
>
> We have an issue with high load and IO Wait's but less cpu on postgres
> Database, The emp Table size is around 500GB, and the connections are very
> less.
What indexes are defined on this table ?
How large are they ?
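You can check with something like this (assuming the table really is named "emp" as in the query):
SELECT indexrelname, pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes WHERE relname = 'emp';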
postgres=# explain select distinct empno from emp where sname='test' and tp='EMP NAME 1'
QUERY PLAN
------------------------------------------------------------------------------------------------------
HashAggregate (cost=71899575.17..71900816.97 rows=124179 width=9)
Group Key: empno
-> Gather (cost=1000.00..71820473.80 rows=31640550 width=9)
Workers Planned: 2
-> Parallel Seq Scan on emp (cost=0.00..68655418.80 rows=13183562 width=9)
Filter: (((sname)::text = 'test'::text) AND ((tp)::text = 'EMP NAME 1'::text))
> Red Hat Enterprise Linux Server release 7.9 (Maipo)
> PostgreSQL 11.15 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
>
> shared_buffers=12GB
> work_mem=128MB
> 14428 | 04:45:59.712892 | active | INSERT INTO target (empno, name)
> SELECT empno, '' AS name FROM (select distinct empno from emp where sname='test'
> and tp='EMP NAME 1' LIMIT 10) AS query ;
Is this the only problem query, or just one example?
Are your issues with loading data, querying data or both ?
> -bash-4.2$ iostat -x
It shows that you only have a few filesystems in use.
It's common to have WAL and temp_tablespaces on a separate FS.
That probably wouldn't help your performance at all, but it would help to tell
what's doing I/O. Is there anything else running on the VM besides postgres ?
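If sysstat/iotop are installed, a quick per-process I/O check can help attribute it (exact package names vary by distro):
pidstat -d 5 3    # per-process read/write kB/s, three 5-second samples
iotop -obn3       # only processes currently doing I/O; needs root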
You can also check:
SELECT COUNT(1), wait_event FROM pg_stat_activity GROUP BY 2 ORDER BY 1 DESC;
postgres=# SELECT COUNT(1), wait_event FROM pg_stat_activity GROUP BY 2 ORDER BY 1 DESC;
count | wait_event
-------+---------------------
70 | ClientRead
34 | DataFileRead
3 |
1 | LogicalLauncherMain
1 | WalWriterMain
1 | BgWriterMain
1 | AutoVacuumMain
And the pg_buffercache extension:
SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) AS "all", COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;
postgres=# SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) AS "all", COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;
dirty | all | coalesce
-------+---------+----------------------------------------------------
189 | 237348 | emp_status
97 | 1214949 | emp
77 | 259 | public_group
75 | 432 | public_gid
74 | 233 | public_utpu
26 | 115 | code_evd
15 | 55 | group
15 | 49 | output
14 | 77 | output_status
(9 rows)
> Hypervisor vendor: KVM
Are KSM or THP enabled on the hypervisor ?
tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
-bash-4.2$ tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
==> /sys/kernel/mm/ksm/run <==
0
==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
1
==> /sys/kernel/mm/transparent_hugepage/enabled <==
[always] madvise never
==> /sys/kernel/mm/transparent_hugepage/defrag <==
[always] madvise never
--
Justin
On Wed, Mar 30, 2022 at 12:52:05AM +0530, Rambabu g wrote:
> > What indexes are defined on this table ?
> > How large are they ?
>
> There are three indexes defined on the table, each one is around 20 to 25GB
> and the indexes is create on
Did you mean to say something else after "on" ?
Show the definition of the indexes from psql \d
postgres=# \d+ idx_empno
Index "l2.pd_activity_empi"
Column | Type | Key? | Definition | Storage | Stats target
--------+-------------------------+------+------------+----------+--------------
empno | character varying(2000) | yes | empno | extended |
btree, for table "emp"
postgres=# \d+ id_dt
Index "dt"
Column | Type | Key? | Definition | Storage | Stats target
--------+-----------------------------+------+------------+---------+--------------
dt | timestamp without time zone | yes | dt | plain |
btree, for table "emp"
postgres=# \d+ idx_tp
Index "idx_tp"
Column | Type | Key? | Definition | Storage | Stats target
--------+-------------------------+------+------------+----------+--------------
tp | character varying(2000) | yes | tp | extended |
btree, for table "emp"
> postgres=# explain select distinct empno from emp where sname='test' and tp='EMP NAME 1'
Is this the only query that's performing poorly ?
You should send explain (analyze,buffers) for the problematic queries.
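For example (enabling track_io_timing first makes the plan show time spent on reads):
SET track_io_timing = on;
explain (analyze, buffers) select distinct empno from emp where sname='test' and tp='EMP NAME 1';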
postgres=# select pid,(now()-query_start) as age,wait_event_type,wait_event,query from pg_stat_activity where state!='idle';
pid | age | wait_event_type | wait_event | query
-------+-----------------+-----------------+---------------+-------------------------------------------------------------------------------------------------------------------
32154 | 00:09:56.131136 | IPC | ExecuteGather | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
847 | 00:09:56.131136 | IO | DataFileRead | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
848 | 00:09:56.131136 | IO | DataFileRead | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
849 | 00:09:56.131136 | IO | DataFileRead | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
850 | 00:09:56.131136 | IO | DataFileRead | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
851 | 00:09:56.131136 | IO | DataFileRead | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
852 | 00:09:56.131136 | IO | DataFileRead | explain analyze select distinct empno from emp where sname='test' and tp='EMP NAME 1'
645 | 00:00:00 | | | select pid,(now()-query_start) as age,wait_event_type,wait_event,query from pg_stat_activity where state!='idle'
postgres=# SELECT COUNT(nullif(isdirty,'f')) dirty, COUNT(1) AS "all", COALESCE(c.relname, b.relfilenode::text) FROM pg_buffercache b LEFT JOIN pg_class c ON b.relfilenode=pg_relation_filenode(c.oid) GROUP BY 3 ORDER BY 1 DESC,2 DESC LIMIT 9;
dirty | all | coalesce
-------+---------+---------------------------------
32 | 136 | fn_deployment
18 | 176 | fn_deployment_key
8 | 12 | event_logs_pkey
6 | 157 | event_logs
1 | 355 | pg_class
0 | 2890261 |
0 | 252734 | utput_status
0 | 378 | emp
0 | 299 | 1249
(9 rows)
-bash-4.2$ sar
Linux 3.10.0-1160.59.1.el7.x86_64 (ip-10-54-145-108.ec2.internal) 03/30/2022 _x86_64_ (24 CPU)
12:00:01 AM CPU %user %nice %system %iowait %steal %idle
12:10:01 AM all 1.19 0.00 0.82 36.17 0.00 61.81
12:20:01 AM all 0.72 0.00 0.75 35.59 0.00 62.94
12:30:01 AM all 0.74 0.00 0.77 35.04 0.00 63.46
12:40:02 AM all 0.74 0.00 0.76 34.65 0.00 63.85
12:50:01 AM all 0.77 0.00 0.78 33.36 0.00 65.09
01:00:01 AM all 0.83 0.00 0.78 27.46 0.00 70.93
01:10:01 AM all 0.85 0.00 0.78 30.11 0.00 68.26
01:20:01 AM all 0.70 0.00 0.61 20.46 0.00 78.24
01:30:01 AM all 0.15 0.00 0.06 0.02 0.00 99.77
01:40:01 AM all 0.14 0.00 0.05 0.00 0.00 99.80
01:50:01 AM all 0.14 0.00 0.05 0.00 0.00 99.80
02:00:01 AM all 0.15 0.00 0.06 0.00 0.00 99.78
02:10:01 AM all 0.14 0.00 0.05 0.00 0.00 99.80
02:20:01 AM all 0.14 0.00 0.05 0.00 0.00 99.81
02:30:01 AM all 0.15 0.00 0.06 0.00 0.00 99.80
02:40:01 AM all 0.14 0.00 0.05 0.00 0.00 99.80
02:50:01 AM all 0.14 0.00 0.05 0.00 0.00 99.80
03:00:01 AM all 0.14 0.00 0.05 0.00 0.00 99.80
03:10:01 AM all 0.14 0.00 0.05 0.00 0.00 99.81
03:20:01 AM all 0.14 0.00 0.05 0.00 0.00 99.81
03:30:01 AM all 0.23 0.00 0.15 2.18 0.00 97.44
03:40:01 AM all 1.16 0.00 0.87 22.76 0.00 75.21
03:50:01 AM all 0.75 0.00 0.60 13.89 0.00 84.76
04:00:01 AM all 1.13 0.00 0.87 22.75 0.00 75.26
04:10:01 AM all 0.87 0.00 0.79 22.91 0.00 75.43
04:20:01 AM all 0.71 0.00 0.71 22.07 0.00 76.50
Average: all 0.50 0.00 0.41 13.81 0.00 85.28
-bash-4.2$ iostat
Linux 3.10.0-1160.59.1.el7.x86_64 (ip-.ec2.internal) 03/30/2022 _x86_64_ (24 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.44 0.00 0.34 13.35 0.00 85.86
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme1n1 1370.20 54514.54 4964.18 7297971937 664565000
nvme2n1 0.92 0.12 223.19 16085 29878260
nvme0n1 0.30 5.12 5.23 685029 699968
-bash-4.2$ iostat -d
Linux 3.10.0-1160.59.1.el7.x86_64 (ip-ec2.internal) 03/30/2022 _x86_64_ (24 CPU)
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme1n1 1370.25 54518.06 4963.95 7298793425 664565248
nvme2n1 0.92 0.12 223.17 16085 29878260
nvme0n1 0.30 5.12 5.23 685029 699968
-bash-4.2$ free -g
total used free shared buff/cache available
Mem: 92 1 0 2 90 87
Swap: 0 0 0
> > > Hypervisor vendor: KVM
> >
> > Are KSM or THP enabled on the hypervisor ?
> No, the EC2 VM is dedicated to the postgres DB instance only.
Oh, so this is an EC2 and you cannot change the hypervisor itself.
> -bash-4.2$ tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
...
> ==> /sys/kernel/mm/transparent_hugepage/defrag <==
> [always] madvise never
I doubt it will help, but you could try disabling these.
It's a quick experiment anyway.
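For example (as root; this takes effect immediately but does not survive a reboot):
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag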
-bash-4.2$ tail /sys/kernel/mm/ksm/run /sys/kernel/mm/transparent_hugepage/khugepaged/defrag /sys/kernel/mm/transparent_hugepage/enabled /sys/kernel/mm/transparent_hugepage/defrag
==> /sys/kernel/mm/ksm/run <==
0
==> /sys/kernel/mm/transparent_hugepage/khugepaged/defrag <==
1
==> /sys/kernel/mm/transparent_hugepage/enabled <==
always madvise [never]
==> /sys/kernel/mm/transparent_hugepage/defrag <==
always madvise [never]
Hi All,

We have an issue with high load and I/O waits but low CPU usage on a Postgres database. The emp table is around 500GB, and there are very few connections. Please suggest whether we need to change any configuration parameters at the system level or in the Postgres configuration.
The "emp" table is 500 GB? You're doing something wrong, The "emp" table should have 14 rows and the "dept" table should have 4 rows The "bonus" and "salgrade" tables should also be very small. The guy named Bruce Scott could probably help you with that schema. Other than that, do you have a SQL causing all this ruckus and a detailed explain plan ("explain (analyze,costs,buffers)") for the SQL using most of the time? You can analyze the log file with PgBadger to get the queries consuming the most time.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
postgres=# explain (analyze,buffers) SELECT * FROM emp WHERE (empno='C3916271986');
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using pd_activity_empi on pd_activity (cost=0.57..23391.90 rows=7956 width=9202) (actual time=4.346..6442.761 rows=12771 loops=1)
Index Cond: ((empno)::text = 'C3916271986'::text)
Buffers: shared hit=598 read=12224
Planning Time: 0.130 ms
Execution Time: 6446.664 ms
(5 rows)
postgres=# explain (analyze,buffers) SELECT * FROM emp WHERE (empno='C3916271986');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using pd_activity_empi on pd_activity (cost=0.57..23391.90 rows=7956 width=9202) (actual time=0.027..33.921 rows=12771 loops=1)
Index Cond: ((empi)::text = 'C3916271986'::text)
Buffers: shared hit=12822
Planning Time: 0.138 ms
Execution Time: 34.344 ms
(5 rows)
postgres=# explain (analyze,buffers) SELECT * FROM emp WHERE (empno='C6853372011');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using pd_activity_empi on pd_activity (cost=0.57..23391.90 rows=7956 width=9202) (actual time=2.764..430.357 rows=758 loops=1)
Index Cond: ((empi)::text = 'C6853372011'::text)
Buffers: shared hit=46 read=718
Planning Time: 0.136 ms
Execution Time: 430.617 ms
(5 rows)
On Wed, Mar 30, 2022 at 10:17:38AM +0530, Rambabu g wrote:
> Hi Justin,
>
> Only one query is causing the issue, sharing the def of indexes. Please
> have a look.
>
> > > There are three indexes defined on the table, each one is around 20 to 25GB
>
> tp | character varying(2000) | yes | tp | extended |
>
> 852 | 00:09:56.131136 | IO | DataFileRead | explain
> analyze select distinct empno from emp where sname='test' and tp='EMP
> NAME 1'
The server is doing a scan of the large table.
The tp index matches a lot of rows (13e6) which probably aren't clustered, so
it elects to scan the 500GB table each time.
Looking at this in isolation, maybe it'd be enough to create an index on
tp,empno (and maybe drop the tp index). CREATE INDEX CONCURRENTLY if you don't
want to disrupt other queries.
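For example (the index name here is just an illustration):
CREATE INDEX CONCURRENTLY emp_tp_empno_idx ON emp (tp, empno);
-- and later, once the new index is being used:
-- DROP INDEX CONCURRENTLY idx_tp;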
But this seems like something that should be solved in a better way though;
like keeping a table with all the necessary "empno" maintained with "INSERT ON
CONFLICT DO NOTHING". Or a trigger.