Re: pgsql_fdw, FDW for PostgreSQL server - Mailing list pgsql-hackers

From Shigeru HANADA
Subject Re: pgsql_fdw, FDW for PostgreSQL server
Date
Msg-id 4F8275B1.1040203@gmail.com
In response to Re: pgsql_fdw, FDW for PostgreSQL server  (Thom Brown <thom@linux.com>)
Responses Re: pgsql_fdw, FDW for PostgreSQL server  (Gerald Devotta <gdevotta@newtglobal.com>)
Re: pgsql_fdw, FDW for PostgreSQL server  (Thom Brown <thom@linux.com>)
List pgsql-hackers
(2012/04/08 5:19), Thom Brown wrote:
> 2012/4/7 Shigeru HANADA<shigeru.hanada@gmail.com>:
>> I've updated pgsql_fdw so that it can collect statistics from foreign
>> data with new FDW API.
> 
> I notice that if you restart the remote server, the connection is
> broken, but the client doesn't notice this until it goes to fire off
> another command.  Should there be an option to automatically
> re-establish the connection upon noticing the connection has dropped,
> and issue a NOTICE that it had done so?

Hm, I'd prefer reporting the connection failure and aborting the local
transaction, because reconnecting to the server would break consistency
between the results coming from multiple foreign tables.  A server shutdown
(or other trouble, e.g. a network failure) might happen at any point
in the sequence of remote queries (or sampling in ANALYZE).  For example,
when we execute a local query which contains two foreign tables, foo and
bar, the sequence of libpq activity would be like this:
   1) connect to the server at the beginning of the local query
   2) execute EXPLAIN for foreign table foo
   3) execute EXPLAIN for foreign table bar
   4) execute actual query for foreign table foo
   5) execute actual query for foreign table bar
   6) disconnect from the server at the end of the local query
 

If the connection breaks between 4) and 5) and an immediate reconnect
succeeds, the results retrieved for foo and bar might be inconsistent from
the viewpoint of transaction isolation.

In the current implementation, the next local query which contains a foreign
table on the failed foreign server tries to reconnect to the server.
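
The sequence above can be modelled with a small, self-contained sketch.
Nothing here is pgsql_fdw code: RemoteSession, run_local_query() and the
snapshot counter are hypothetical names used only to illustrate why a silent
reconnect between 4) and 5) mixes results from two different remote
sessions, while aborting keeps the local query consistent.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one remote session; not a pgsql_fdw structure. */
typedef struct
{
    bool alive;        /* false once the remote server has gone away */
    int  snapshot_id;  /* stands in for the remote session's snapshot */
} RemoteSession;

typedef enum
{
    LOCAL_QUERY_OK,            /* foo and bar were read in the same session */
    LOCAL_QUERY_ABORTED,       /* failure reported, local query aborted */
    LOCAL_QUERY_INCONSISTENT   /* foo and bar were read in different sessions */
} LocalQueryResult;

/*
 * Walk through steps 2) to 5) of the sequence.
 *
 * fail_before_step: step (2..5) before which the connection breaks,
 *                   or 0 for no failure.
 * reconnect:        if true, silently open a new session and carry on;
 *                   if false, abort the local query on failure.
 */
LocalQueryResult
run_local_query(int fail_before_step, bool reconnect)
{
    RemoteSession session = { true, 1 };    /* step 1: connect */
    int snapshot_foo = 0;
    int snapshot_bar = 0;

    assert(fail_before_step == 0 ||
           (fail_before_step >= 2 && fail_before_step <= 5));

    for (int step = 2; step <= 5; step++)
    {
        if (step == fail_before_step)
            session.alive = false;          /* remote server went away */

        if (!session.alive)
        {
            if (!reconnect)
                return LOCAL_QUERY_ABORTED;
            session.alive = true;           /* new session ... */
            session.snapshot_id++;          /* ... sees a newer snapshot */
        }

        if (step == 4)
            snapshot_foo = session.snapshot_id;   /* actual query for foo */
        if (step == 5)
            snapshot_bar = session.snapshot_id;   /* actual query for bar */
    }

    /* step 6: disconnect */
    return (snapshot_foo == snapshot_bar) ? LOCAL_QUERY_OK
                                          : LOCAL_QUERY_INCONSISTENT;
}
```

In this model, run_local_query(5, true) reads foo and bar under different
snapshots, whereas run_local_query(5, false) simply aborts.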

> Also I'm not particularly keen on the message provided to the user in
> this event:
> 
> ERROR:  could not execute EXPLAIN for cost estimation
> DETAIL:  FATAL:  terminating connection due to administrator command
> FATAL:  terminating connection due to administrator command
> 
> There's no explanation what the "administrator" command was, and I
> suspect this is really just a "I don't know what's happened here"
> condition.  I don't think we should reach that point.

That FATAL message is returned by the remote backend's ProcessInterrupts()
during some administrator commands, such as fast shutdown or
pg_terminate_backend().  If the remote backend died from immediate shutdown
or SIGKILL, no error message is available (see the sample below).

postgres=# select * From pgsql_branches ;
ERROR:  could not execute EXPLAIN for cost estimation
DETAIL:
HINT:  SELECT bid, bbalance, filler FROM public.pgbench_branches

I agree that the message is confusing.  How about showing a message like
"pgsql_fdw connection failure on <servername>", together with the remote
error message, for such cases?  It can be achieved by adding an extra check
of the connection status right after PQexec()/PQexecParams(), although
some word polishing would be required :)

postgres=# select * from pgsql_branches ;
ERROR:  pgsql_fdw connection failure on subaru_pgbench
DETAIL:  FATAL:  terminating connection due to administrator command
FATAL:  terminating connection due to administrator command

This should give users the impression that the remote side has some trouble.
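
A rough sketch of how the primary message could be chosen after such a
check follows.  format_remote_error() and its parameters are hypothetical,
not part of pgsql_fdw; in real code the connection_bad flag would come from
testing PQstatus(conn) == CONNECTION_BAD right after PQexec()/PQexecParams(),
and the DETAIL line would carry the remote error text.  The libpq calls are
left out so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the primary ERROR line for a failed remote
 * command.
 *
 * connection_bad: assumed to be the result of a connection status check
 *                 made right after the remote command failed.
 * server_name:    name of the foreign server, e.g. "subaru_pgbench".
 * activity:       what was being attempted, e.g.
 *                 "EXPLAIN for cost estimation".
 *
 * Returns the value of snprintf(), i.e. the length the full message needs.
 */
int
format_remote_error(char *buf, size_t buflen, bool connection_bad,
                    const char *server_name, const char *activity)
{
    assert(buf != NULL && server_name != NULL && activity != NULL);

    if (connection_bad)
    {
        /* The connection itself is gone: say so, and name the server. */
        return snprintf(buf, buflen,
                        "pgsql_fdw connection failure on %s", server_name);
    }

    /* The connection is still fine: keep the activity-specific message. */
    return snprintf(buf, buflen, "could not execute %s", activity);
}
```

Keeping the two cases apart means the user sees a connection-level message
only when the connection is actually lost, and the existing message
otherwise.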

Regards,
-- 
Shigeru HANADA

