On Mon, Apr 19, 2021 at 12:17 PM PegoraroF10 <marcos@f10.com.br> wrote:
> I'm sure the problem was hardware, and I hope it does not occur anymore.
> If I have logical replication and on the replica I run VACUUM FULL, CLUSTER,
> or any other operation that takes an exclusive lock, replication will wait
> for it. What I was thinking about was a time limit for releasing that lock
> or, in my situation, for a hardware problem. If N seconds later that file
> has still not been released, then change DataFileRead to
> DataFileRead + relfilenode

But how would we implement that with reasonable efficiency? If we
called setitimer() before every read() call to set the timeout, and
then again to clear it after the read(), that would probably be
hideously expensive. Perhaps it would work to have a background
"heartbeat" process that pings every backend in the system every 1s or
something like that, and make the signal handler do this, but that
supposes that the signal handler would have ready access to this
information, which doesn't seem totally straightforward to arrange,
and that it would be OK for the signal handler to grab a lock to
update shared memory, which as things stand today is definitely not
safe.
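
To make the cost concrete, here is a minimal sketch of the
setitimer()-around-read() idea. It is an illustration under stated
assumptions, not PostgreSQL code: the names timed_read,
install_slow_read_handler, slow_read_flagged, and read_is_slow are
hypothetical, and the handler does nothing but set a flag, because it
cannot safely take a lock or update shared memory.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical flag: set if the timer fires while a read is pending. */
static volatile sig_atomic_t read_is_slow = 0;

/* Signal handler: restricted to async-signal-safe work (a flag store). */
static void
on_alarm(int signo)
{
    (void) signo;
    read_is_slow = 1;
}

/* Install the SIGALRM handler once, at startup. Returns 0 on success. */
int
install_slow_read_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* ask the kernel to restart an interrupted read() */
    return sigaction(SIGALRM, &sa, NULL);
}

/* Report whether the most recent timed_read() outlived its timeout. */
int
slow_read_flagged(void)
{
    return read_is_slow != 0;
}

/*
 * read() wrapped in a one-shot timer. Note the two extra system calls
 * on every single read: one to arm the timer, one to disarm it.
 */
ssize_t
timed_read(int fd, void *buf, size_t len, long timeout_sec)
{
    struct itimerval arm;
    struct itimerval disarm;
    ssize_t     n;
    int         saved_errno;

    assert(timeout_sec > 0);

    memset(&arm, 0, sizeof(arm));
    memset(&disarm, 0, sizeof(disarm));
    arm.it_value.tv_sec = timeout_sec;      /* one-shot, no repeat interval */

    read_is_slow = 0;
    setitimer(ITIMER_REAL, &arm, NULL);     /* extra syscall #1 */
    n = read(fd, buf, len);
    saved_errno = errno;
    setitimer(ITIMER_REAL, &disarm, NULL);  /* extra syscall #2 */
    errno = saved_errno;                    /* preserve read()'s errno */
    return n;
}
```

A caller would run install_slow_read_handler() once and then use
timed_read(fd, buf, len, N) in place of read(). Even in this toy form,
every read costs three system calls instead of one, and the handler
still has no safe way to publish "DataFileRead + relfilenode" anywhere
other than a local flag.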

I am not trying to say that there is no way that something like this
could be made to work. There's probably something that can be done. I
don't think I know what that thing is, though.
--
Robert Haas
EDB: http://www.enterprisedb.com