Re: .ready and .done files considered harmful - Mailing list pgsql-hackers

From Robert Haas
Subject Re: .ready and .done files considered harmful
Msg-id CA+TgmobswMqycLSJ7GVj3+oaGWqr_685TiHULqDCeH9RGLKOJA@mail.gmail.com
In response to Re: .ready and .done files considered harmful  ("Bossart, Nathan" <bossartn@amazon.com>)
List pgsql-hackers
On Tue, Aug 24, 2021 at 1:26 PM Bossart, Nathan <bossartn@amazon.com> wrote:
> I think Horiguchi-san made a good point that the .ready file creators
> should ideally not need to understand archiving details.  However, I
> think this approach requires them to be inextricably linked.  In the
> happy case, the archiver will follow the simple path of processing
> each consecutive WAL file without incurring a directory scan.  Any
> time there is something other than a regular WAL file to archive, we
> need to take special action to make sure it is picked up.

I think they should be inextricably linked, really. If we know
something - like that there's a file ready to be archived - then it
seems like we should not throw that information away and force
somebody else to rediscover it through an expensive process. The whole
problem here comes from the fact that we're using the filesystem as an
IPC mechanism, and it's sometimes a very inefficient one.
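To make the cost concrete, here is a minimal sketch of a consumer that rediscovers pending work by listing a directory each time. It is an illustration only, not PostgreSQL source: the status directory layout, the `.ready`/`.done` marker handling, and the names `next_ready_by_scan` and `mark_done` are assumptions made for this example, and it ignores details such as giving timeline history files priority.

```python
import os

READY_SUFFIX = ".ready"
DONE_SUFFIX = ".done"


def next_ready_by_scan(status_dir):
    """Return the lowest-named segment that has a .ready marker, or None.

    Every call lists the whole directory, so draining N pending markers
    one at a time reads on the order of N * N directory entries.
    """
    candidates = [
        name[: -len(READY_SUFFIX)]
        for name in os.listdir(status_dir)
        if name.endswith(READY_SUFFIX)
    ]
    # Assumption: names are fixed-width hex, so string order matches age.
    return min(candidates) if candidates else None


def mark_done(status_dir, segment):
    """Flip a segment's marker from .ready to .done after archiving it."""
    os.rename(
        os.path.join(status_dir, segment + READY_SUFFIX),
        os.path.join(status_dir, segment + DONE_SUFFIX),
    )
```

The process that created the marker already knew which segment was ready; the consumer here has to pay for a full listing to learn the same thing.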

I can't quite decide whether the problems we're worrying about here
are real issues or just kind of hypothetical. I mean, today, it seems
to be possible that we fail to mark some file ready for archiving,
emit a log message, and then a huge amount of time could go by before
we try again to mark it ready for archiving. Are the problems we're
talking about here objectively worse than that, or just different? Is
it a problem in practice, or just in theory?

I really want to avoid getting backed into a corner where we decide
that the status quo is the best we can do, because I'm pretty sure
that has to be the wrong conclusion. If we think that the
get-a-bunch-of-files-per-readdir approach is better than the
keep-trying-the-next-file approach, I mean that's OK with me; I just
want to do something about this. I am not sure whether or not that's
the right course of action.
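The two approaches named above could be sketched roughly as follows. This is a hedged illustration, not code from any proposed patch: the function names, the `batch_size` parameter, and the status directory layout are hypothetical, and `next_segment_name` treats the trailing 16 hex digits of a name as one counter, which simplifies real WAL segment naming.

```python
import os

READY_SUFFIX = ".ready"


def batch_from_scan(status_dir, batch_size):
    """Get-a-bunch-of-files-per-readdir: one listing yields up to
    batch_size segments, oldest first, amortizing the scan cost."""
    ready = sorted(
        name[: -len(READY_SUFFIX)]
        for name in os.listdir(status_dir)
        if name.endswith(READY_SUFFIX)
    )
    return ready[:batch_size]


def next_segment_name(segment):
    """Predict the successor name: same 8-digit timeline prefix,
    remaining hex digits incremented by one (a simplification)."""
    timeline, counter = segment[:8], int(segment[8:], 16)
    return f"{timeline}{counter + 1:016X}"


def try_next_file(status_dir, last_archived):
    """Keep-trying-the-next-file: check only the predicted marker.

    Returns the segment name if its .ready marker exists, else None,
    in which case a caller would have to fall back to a directory scan.
    """
    candidate = next_segment_name(last_archived)
    marker = os.path.join(status_dir, candidate + READY_SUFFIX)
    return candidate if os.path.exists(marker) else None
```

The batch version still pays for a listing but spreads it over several files; the next-file version avoids the listing entirely in the happy case and needs some fallback for anything that is not the predicted segment.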

--
Robert Haas
EDB: http://www.enterprisedb.com


