Re: BUG #14929: Unchecked AllocateDir() return value in restoreTwoPhaseData() - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #14929: Unchecked AllocateDir() return value in restoreTwoPhaseData()
Date
Msg-id CAB7nPqS5ua9ktEX7p_2WzFcTVCzo0zP9xAx4A7thPwV-8kq7bA@mail.gmail.com
Whole thread Raw
In response to Re: BUG #14929: Unchecked AllocateDir() return value in restoreTwoPhaseData()  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: BUG #14929: Unchecked AllocateDir() return value in restoreTwoPhaseData()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On Tue, Nov 28, 2017 at 12:08 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> PanBian <bianpan2016@163.com> writes:
>> On Mon, Nov 27, 2017 at 07:53:30PM +0900, Michael Paquier wrote:
>>> You are missing the fact that ReadDir goes through ReadDirExtended,
>>> which drops an ERROR log if the folder allocated is NULL.
>
>> You are right. Its my carelessness. ReadDir will not return back on a
>> NULL dir parameter. The code is bug free. Sorry for the trouble.
>
> There are some issues here, though:
>
> 1. twophase.c does this:
>
>     cldir = AllocateDir(TWOPHASE_DIR);
>     LWLockAcquire(TwoPhaseStateLock, LW_EXCLUSIVE);
>     while ((clde = ReadDir(cldir, TWOPHASE_DIR)) != NULL)
>     {
>
> which is flat out wrong because LWLockAcquire might well clobber errno.
> I don't see any good reason why we couldn't just swap the order of
> those two calls.

I have not checked if it actually updates errno or not, but relying on
the fact that it may do it sucks.

> 2. basebackup.c and some other places do things like
>
>         dir = AllocateDir("pg_wal");
>         if (!dir)
>             ereport(ERROR,
>                     (errmsg("could not open directory \"%s\": %m", "pg_wal")));
>         while ((de = ReadDir(dir, "pg_wal")) != NULL)
>
> Not only is this a waste of code, because the error message is no better
> than what ReadDir would provide, but it's wrong because it omits
> errcode_for_file_access(), causing the SQLSTATE to be reported as XX000.
> There are other places that are even lazier and use elog(), failing
> translatability as well as the errcode test.

I agree with using ereport() everywhere, a path may have been created
by initdb, but anything used after a base backup should be reported to
the user.

> There might be some other problems I missed in a quick scan.
>
> So there's definitely room for a cleanup patch here, but the originally
> proposed change isn't it.

I have spotted more problems. In pg_available_extensions,
AllocateDir() does nothing in the event of an error but forgets to
reset errno. In perform_base_backup, CheckXLogRemoved() is called
without saving errno, so the error message generated after may be
wrong.

I think that this requires its own new thread with a more extended
analysis on -hackers to attract attention, this goes way beyond the
original complain about a pointer dereference. And there is a
collection of small issues. I'll try to look at that after I am done
with my CF duties except if someone beats me or volunteers for it.
-- 
Michael


pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value
Next
From: Amit Langote
Date:
Subject: Re: BUG #14928: Unchecked SearchSysCacheCopy1() return value