Re: recoveryCheck/008_fsm_truncation is failing on dodo in v14- (due to slow fsync?) - Mailing list pgsql-hackers

From Robins Tharakan
Subject Re: recoveryCheck/008_fsm_truncation is failing on dodo in v14- (due to slow fsync?)
Msg-id CAEP4nAxQizWfM+TpK_0WBBZD2CDtutFeKg-_0+Jpa080-=gDBw@mail.gmail.com
In response to recoveryCheck/008_fsm_truncation is failing on dodo in v14- (due to slow fsync?)  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers

On Sat, 22 Jun 2024 at 18:30, Alexander Lakhin <exclusion@gmail.com> wrote:
> So it doesn't seem impossible for this operation to last for more than two
> minutes.
>
> The facts that SyncDataDirectory() is executed between these two messages
> logged, 008_fsm_truncation is the only test which turns fsync on, and we
> see no such failures in newer branches (because of a7f417107), make me
> suspect that dodo is slow on fsync.


Not sure if it helps, but I can confirm that dodo is used for multiple tasks and
that it uses a (slow) external USB3 disk. Also, while using dodo last week (for
something unrelated), I noticed iotop showing ~30MB/s of disk usage and a 1-min
CPU load of around 7.

Right now (while dodo is idle), dd shows that ~30MB/s is pretty much the max for
direct writes, and that synchronous (dsync) writes are considerably slower:

pi@pi4:/media/pi/250gb $ dd if=/dev/zero of=./test count=1024 oflag=direct bs=128k
1024+0 records in
1024+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 4.51225 s, 29.7 MB/s

pi@pi4:/media/pi/250gb $ dd if=/dev/zero of=./test count=1024 oflag=dsync bs=128k
1024+0 records in
1024+0 records out
134217728 bytes (134 MB, 128 MiB) copied, 24.4916 s, 5.5 MB/s
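
To put the two dd runs in per-write terms, here is a minimal Python sketch. The helper names dd_summary and time_synced_writes are hypothetical, chosen only for illustration. The first converts dd's totals into throughput and mean latency per 128k write. The second is a rough way to repeat the test without dd, assuming that write() followed by fsync() is a close enough stand-in for oflag=dsync (dd opens the file with O_DSYNC instead, so numbers may differ somewhat).

```python
import os
import tempfile
import time


def dd_summary(total_bytes, seconds, count):
    """Return (throughput in MB/s, mean latency per write in ms)."""
    return total_bytes / seconds / 1e6, seconds / count * 1000.0


def time_synced_writes(directory, count=64, block_size=128 * 1024):
    """Write `count` blocks, calling fsync after each; return elapsed seconds.

    Assumption: write() + fsync() only approximates dd's oflag=dsync,
    which opens the output file with O_DSYNC.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            os.fsync(fd)  # force each block to stable storage before continuing
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # Figures copied from the two dd runs above (1024 writes of 128k each).
    for label, secs in (("direct", 4.51225), ("dsync", 24.4916)):
        mb_s, ms = dd_summary(134217728, secs, 1024)
        print(f"{label}: {mb_s:.1f} MB/s, {ms:.2f} ms per 128k write")
    # prints:
    # direct: 29.7 MB/s, 4.41 ms per 128k write
    # dsync: 5.5 MB/s, 23.92 ms per 128k write
```

So each synchronous 128k write takes roughly 24 ms on this disk, about 5.4 times longer than a direct write. Calling time_synced_writes("/media/pi/250gb") would time the fsync-per-block variant on the same mount point.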

-
robins
