Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IOFailure Occurs - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IOFailure Occurs
Date	January 25, 2019 01:31:46
Msg-id	CAEepm=0u5aKOj3sanw9oXaJ8L521R+xgLhMEN8sNsfVbA-ndvQ@mail.gmail.com
In response to	Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IOFailure Occurs (Amit Kapila <amit.kapila16@gmail.com>)
Responses	RE: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IOFailure Occurs (Chengchao Yu <chengyu@microsoft.com>)
List	pgsql-hackers

On Sun, Jan 20, 2019 at 4:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Sat, Dec 1, 2018 at 2:30 AM Chengchao Yu <chengyu@microsoft.com> wrote:
> > Recently, we hit a few occurrences of deadlock when IO failure (including disk full, random remote disk IO
> > failures) happens in single user mode. We found the issue exists on both Linux and Windows in multiple postgres
> > versions.
> >
> > 3. Because the unable to write relation data scenario is difficult to hit naturally even reserved space is
> > turned off, I have prepared a small patch (see attachment “emulate-error.patch”) to force an error when PG tries to
> > write data to relation files. We can just apply the patch and there is no need to put efforts flooding data to disk any
> > more.
>
> I have one question related to the way you have tried to emulate the error.
>
> @@ -840,6 +840,10 @@ mdwrite(SMgrRelation reln, ForkNumber forknum,
> BlockNumber blocknum,
> nbytes,
> BLCKSZ);
> + ereport(ERROR,
> + (errcode(ERRCODE_INTERNAL_ERROR),
> + errmsg("Emulate exception in mdwrite() when writing to disk")));
> +
>
> We generally reserve the space in a relation before attempting to
> write, so not sure how you are able to hit the disk full situation via
> mdwrite.  If you see the description of the function, that also
> indicates same.

Presumably ZFS or BTRFS or something more exotic could still get
ENOSPC here, and of course any filesystem could give us EIO here
(because the disk is on fire or the remote NFS server is rebooting due
to an automatic Windows update).
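
To make the point concrete, here is a minimal sketch in plain POSIX C of a
block write that can still fail even when the space was nominally reserved
beforehand. It is not PostgreSQL code: WriteResult, classify_write_errno()
and write_block() are hypothetical names made up for this illustration, and
treating a short write as "out of space" is an assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical classification for this sketch; not a PostgreSQL type. */
typedef enum
{
    WRITE_OK,
    WRITE_FAIL_NO_SPACE,    /* ENOSPC, or a short write treated as such */
    WRITE_FAIL_IO,          /* EIO: device or remote server trouble */
    WRITE_FAIL_OTHER        /* anything else, e.g. EBADF */
} WriteResult;

/* Map an errno value from a failed write() to a coarse category. */
static WriteResult
classify_write_errno(int err)
{
    switch (err)
    {
        case 0:
            return WRITE_OK;
        case ENOSPC:
            return WRITE_FAIL_NO_SPACE;
        case EIO:
            return WRITE_FAIL_IO;
        default:
            return WRITE_FAIL_OTHER;
    }
}

/*
 * Write len bytes at the current offset of fd.  The file may already have
 * been extended to cover this range, but a copy-on-write filesystem can
 * still need new space for an overwrite, and any filesystem can report EIO.
 */
static WriteResult
write_block(int fd, const char *buf, size_t len)
{
    ssize_t     n;

    /* Retry only if a signal interrupted the call before any data moved. */
    do
    {
        errno = 0;
        n = write(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    if (n < 0)
        return classify_write_errno(errno);

    /* Assumption of this sketch: a short write means we ran out of space. */
    if ((size_t) n != len)
        return WRITE_FAIL_NO_SPACE;

    return WRITE_OK;
}
```

A caller that gets anything other than WRITE_OK back has to raise an error
from inside the write path, which is the situation the emulation patch
forces.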

--
Thomas Munro
http://www.enterprisedb.com
