Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
Date
Msg-id 76df9206-559c-09fb-efac-b9bca70f7079@oss.nttdata.com
Whole thread Raw
In response to Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Retry Cached Remote Connections for postgres_fdw in case remote backend gets killed/goes away
List pgsql-hackers

On 2020/10/03 20:40, Bharath Rupireddy wrote:
> On Fri, Oct 2, 2020 at 11:30 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>>> Attaching v8 patch, please review it.. Both make check and make
>>> check-world passes on v8.
>>
>> Thanks for updating the patch! It basically looks good to me.
>> I tweaked the patch as follows.
>>
>> +               if (!entry->conn ||
>> +                       PQstatus(entry->conn) != CONNECTION_BAD ||
>>
>> With the above change, if entry->conn is NULL, an error is thrown and no new
>> connection is reestablished. But why? IMO it's more natural to reestablish
>> new connection in that case. So I removed "!entry->conn" from the above
>> condition.
>>
> 
> Yeah, that makes sense.
> 
>>
>> +               ereport(DEBUG3,
>> +                               (errmsg("could not start remote transaction on connection %p",
>> +                                entry->conn)),
>>
>> I replaced errmsg() with errmsg_internal() because the translation of
>> this debug message is not necessary.
>>
> 
> I'm okay with this as we don't have any specific strings that need
> translation in the debug message. But, should we also try to have
> errmsg_internal in a few other places in connection.c?
> 
>                   errmsg("could not obtain message string for remote error"),
>                   errmsg("cannot PREPARE a transaction that has
> operated on postgres_fdw foreign tables")));
>                   errmsg("password is required"),
> 
> I see the errmsg() with plain texts in other places in the code base
> as well. Is it  that we look at the error message and if it is a plain
> text(without database objects or table data), we decide to have no
> translation? Or is there any other policy?

I was thinking that elog() basically should be used to report this
debug message, instead, but you used ereport() because maybe
you'd like to add detail message about connection error. Is this
understanding right? elog() uses errmsg_internal(). So if ereport()
is used as an aternative of elog() for some reasons,
IMO errmsg_internal() should be used. Thought?

OTOH, the messages you mentioned are not debug ones,
so basically ereport() and errmsg() should be used, I think.


> 
>>
>> +SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE
>> +backend_type = 'client backend' AND application_name = 'fdw_retry_check';
>> +CALL wait_for_backend_termination();
>>
>> Since we always use pg_terminate_backend() and wait_for_backend_termination()
>> together, I merged them into one function.
>>
>> I simplied the comments on the regression test.
>>
> 
> +1. I slightly adjusted comments in connection.c and ran pg_indent to
> keep them upt 80 char limit.
> 
>>
>>    Attached is the updated version of the patch. If this patch is ok,
>>    I'd like to mark it as ready for committer.
>>
> 
> Thanks. Attaching v10 patch. Please have a look.

Thanks for updating the patch! I will mark the patch as ready for committer in CF.
Barring any objections, I will commit that.

> 
>>
>>> I have another question not related to this patch: though we have
>>> wait_pid() function, we are not able to use it like
>>> pg_terminate_backend() in other modules, wouldn't be nice if we can
>>> make it generic under the name pg_wait_pid() and usable across all pg
>>> modules?
>>
>> I thought that, too. But I could not come up with good idea for *real* use case
>> of that function. At least that's useful for the regression test, though.
>> Anyway, IMO it's worth proposing that and hearing more opinions about that
>> from other hackers.
>>
> 
> Yes it will be useful for testing when coupled with
> pg_terminate_backend(). I will post the idea in a separate thread soon
> for more thoughts.

Sounds good!
ISTM that he function should at least check the target process is PostgreSQL one.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Logical Replication - detail message with names of missing columns
Next
From: Kyotaro Horiguchi
Date:
Subject: Re: Asynchronous Append on postgres_fdw nodes.