Disk wait problem... - Mailing list pgsql-general

From pf@pfortin.com
Subject Disk wait problem...
Date
Msg-id 20231023145459.143faba7.pfortin@pfortin.com
Whole thread Raw
Responses Re: Disk wait problem...  (Adrian Klaver <adrian.klaver@aklaver.com>)
Re: Disk wait problem... 15.4  (pf@pfortin.com)
List pgsql-general
Hi,

I have a 1.6TB database with over 330 tables on a 4TB NVMe SSD. All
tables are static (no updates); most in 8M and 33M row sizes. Queries have
been great, until recently.

I use SQL-workbench/J (WB) and starting at any table, if I use up/down
arrow to switch to another table, all that happens is WB queries the
table to get the row count and a LIMIT of rows to display.

If I set LIMIT to 10, I see one process in disk wait:
$ top -b -n 1 | grep postgre

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1913118 postgres  20   0  172620  22184  21064 S   0.0   0.0   0:04.26 postgres
1913119 postgres  20   0  172744  11428  10240 S   0.0   0.0   0:00.02 postgres
1913120 postgres  20   0  172756   8868   7680 S   0.0   0.0   0:00.42 postgres
1913122 postgres  20   0  172620   9988   8800 S   0.0   0.0   0:00.36 postgres
1913123 postgres  20   0  178300   9508   8160 S   0.0   0.0   0:02.29 postgres
1913124 postgres  20   0  176232   7908   6560 S   0.0   0.0   0:00.02 postgres
1931492 postgres  20   0  178044  21668  18720 S   0.0   0.0   0:00.22 postgres
1977002 postgres  20   0  182984  36992  31360 D   0.0   0.0   0:01.61 postgres
3851279 postgres  20   0   13544   6560   2880 S   0.0   0.0   0:00.01 bash
3854720 postgres  20   0   10716   5416   4456 S   0.0   0.0   0:00.02 psql


With LIMIT set to 500, this table has 2,148,035 rows (disk-wait process
only):
$ top -b -n 1 | grep postgre
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1977002 postgres  20   0  184708  37900  32000 D   0.0   0.0   0:01.83 postgres

2,152,372 rows -- took nearly 30 seconds
$ top -b -n 1 | grep postgre
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
1977002 postgres  20   0  184708  37900  32000 D   0.0   0.0   0:01.83 postgres

Earlier, with LIMIT 500, I was seeing 3 processes in disk-wait.

Also attached is the relevant system journal entries for one query that
took 2 seconds and two more that took 62 and 52 seconds... Unlike the
above, these had 3 processes in disk-wait -- see attached screenshot...

I just stopped PG, updated the full backup which was done overnight (so
the update took about a second), restarted PG and the problem persists...

Any suggestions where to look next?

Thanks, 
Pierre

Attachment

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: psql _ and \e -- what does clearing the query buffer actually do?
Next
From: Adrian Klaver
Date:
Subject: Re: Disk wait problem...