Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur - Mailing list pgsql-hackers
From | Tsunakawa, Takayuki |
---|---|
Subject | Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur |
Date | |
Msg-id | 0A3221C70F24FB45833433255569204D1F6F9B26@G01JPEXMBYT05 |
In response to | Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur
(Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur ("David G. Johnston" <david.g.johnston@gmail.com>) |
List | pgsql-hackers |
Hello Robert, Tom,

Thank you for being kind enough to explain. I think I now understand your concern.

From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
> Who is right is a judgement call, but I don't think it's self-evident that
> users want to ignore anything and everything that might have gone wrong
> with the connection to the first server, rather than only those things which
> resemble a down server. It seems quite possible to me that if we had defined
> it as you are proposing, somebody would now be arguing for a behavior change
> in the other direction.

Judgment call... so, I understand that it's a matter of choosing between helping to detect configuration errors early and service continuity. Hmm, I'd like to know how other databases treat this, but I couldn't find useful information after some Google searching. I wonder whether I should ask the PgJDBC people if they know something, because they chose service continuity.

From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> The bigger picture here is that we only want to fail past transient errors,
> not configuration errors. I'm willing to err in favor of regarding doubtful
> cases as transient, but most server login rejections aren't for transient
> causes.

I took "doubtful cases" to mean ones such as specifying a non-existent host or an unused port number. In those cases, the configuration error can't be distinguished from a server failure.

What do you think of the following cases? Don't you want to connect to other servers?

* The DBA shuts down the database. The server takes a long time to do checkpointing. During the shutdown checkpoint, libpq tries to connect to the server and receives the error "the database system is shutting down."

* The former primary failed and is now trying to start as a standby, catching up by applying WAL. During the recovery, libpq tries to connect to the server and receives the error "the database system is performing recovery."

* The database server crashed due to a bug. Unfortunately, the server takes an unexpectedly long time to shut down because it takes many seconds to write the stats file (as you remember, Tom-san experienced 57 seconds to write the stats file during regression tests.) During the stats file write, libpq tries to connect to the server and receives the error "the database system is shutting down."

These are equivalent to server failure. I believe we should prioritize recovering from errors during operation over detecting configuration errors.

> Of course, the user would have to try connections to both foo and bar to
> be sure that they're both configured correctly. But he might try
> "host=foo,bar" and "host=bar,foo" and figure he was OK, not noticing that
> both connections had silently been made to bar.

In that case, I think he would specify "host=foo" and "host=bar" in turn, because he would be worried about where he's connected if he specified multiple hosts.

Regards
Takayuki Tsunakawa
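
For illustration only, the policy question in this message can be modeled with a small Python sketch. It is not libpq code and not a proposed patch. The names `looks_like_down_server`, `choose_host`, and `try_connect` are hypothetical, and so is the convention that `None` means no server answered at all. The one fact it relies on is that the server reports SQLSTATE 57P03 (cannot_connect_now) while it is starting up, shutting down, or in recovery, which covers the three cases listed above.

```python
# Hypothetical sketch: not libpq code. It models the policy debated in this
# thread, where only failures that resemble a down server fall through to
# the next host in a "host=foo,bar" list.

# SQLSTATE 57P03 (cannot_connect_now) is what the server reports while it is
# starting up, shutting down, or in recovery.
TRANSIENT_SQLSTATES = {"57P03"}

# Assumption: None stands for "no server answered" (refused, timed out).
NO_RESPONSE = None


def looks_like_down_server(sqlstate):
    """Return True if the failure should be treated like a server outage."""
    return sqlstate is NO_RESPONSE or sqlstate in TRANSIENT_SQLSTATES


def choose_host(hosts, try_connect):
    """Try hosts in order and return (chosen_host, failures).

    try_connect(host) is a caller-supplied stand-in that returns "ok" on
    success, NO_RESPONSE if nothing answered, or the SQLSTATE of a rejection.
    """
    failures = []
    for host in hosts:
        result = try_connect(host)
        if result == "ok":
            return host, failures
        failures.append((host, result))
        if not looks_like_down_server(result):
            # e.g. 28P01 (invalid_password): likely a configuration error,
            # so stop instead of silently moving on to the next host.
            break
    return None, failures
```

With `choose_host(["foo", "bar"], {"foo": "57P03", "bar": "ok"}.get)` the sketch moves past foo and picks bar, while a 28P01 rejection from foo stops the search. Returning the `failures` list is one way a caller could notice that the connection quietly went to bar, which is the concern raised in the last quoted paragraph.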