Thank you so much, Amit! I have registered the patch in the commitfest:
https://commitfest.postgresql.org/22/2003/
Please let me know should you have more suggestions. Thank you!
Best regards,
--
Chengchao Yu
Software Engineer | Microsoft | Azure Database for PostgreSQL
https://azure.microsoft.com/en-us/services/postgresql/
-----Original Message-----
From: Amit Kapila <amit.kapila16@gmail.com>
Sent: Friday, February 1, 2019 6:58 PM
To: Chengchao Yu <chengyu@microsoft.com>
Cc: Thomas Munro <thomas.munro@enterprisedb.com>; Pg Hackers <pgsql-hackers@postgresql.org>; Prabhat Tripathi <ptrip@microsoft.com>; Sunil Kamath <Sunil.Kamath@microsoft.com>; Michal Primke <mprimke@microsoft.com>; TEJA Mupparti <Tejeswar.Mupparti@microsoft.com>
Subject: Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs
On Sat, Feb 2, 2019 at 4:42 AM Chengchao Yu <chengyu@microsoft.com> wrote:
>
> Hi Amit, Thomas,
>
> Thank you very much for your feedback! Apologies, but I only just saw both messages.
>
> > We generally reserve the space in a relation before attempting to write, so not sure how you are able to hit the
> > disk-full situation via mdwrite. If you see the description of the function, that also indicates the same.
>
> Absolutely agree, this isn't a PG issue. The issue manifests for us at Microsoft because our own storage layer treats
> mdextend() actions as only setting the length of the file. We have a workaround, and no change is needed in Postgres.
>
> > I am not saying that mdwrite can never lead to an error, but just trying to understand the issue you actually faced.
> > I haven't read your proposed solution yet; let's first try to establish the problem you are facing.
>
> We see transient IO errors when reading a block, after which the PG instance deadlocks in single user mode until we
> kill the instance. The stack trace below shows the behavior: mdread() failed while the buffer's lw-lock was still
> held. This happens because in single user mode there is no callback to release the lock, so when AbortBufferIO()
> tries to acquire the same lock again, it waits for the lock indefinitely.
>
I think you can register your patch for next CF [1] so that we don't forget about it.
[1] - https://commitfest.postgresql.org/22/
--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com