Patroni question - Mailing list pgsql-general

From Zwettler Markus (OIZ)
Subject Patroni question
Date
Msg-id d1ee012b1c9c4367ade1e8662e80a0dc@zuerich.ch
List pgsql-general
We had a failover.
I read the Patroni logs below as follows:

2022-09-21 11:13:56,384 secondary did an HTTP GET request to primary. This failed with a read timeout.
2022-09-21 11:13:56,792 secondary promoted itself to primary
2022-09-21 11:13:57,279 primary did an HTTP GET request to secondary. An exception happened, probably also due to a read timeout.
2022-09-21 11:13:57,983 primary demoted itself
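
To double-check this ordering, the timestamps from both log files can be merged and sorted. The following is only a minimal Python sketch: the timestamps and host names are copied from the logs below, while the event labels and the function names `parse` and `merged_timeline` are my own shorthand, not anything Patroni provides.

```python
from datetime import datetime

# (timestamp, host, label). Timestamps are copied from the two log files;
# the labels are shorthand summaries of the corresponding log lines.
EVENTS = [
    ("2022-09-21 11:13:57,279", "szhm49346", "API thread logs GET /patroni 200, latency 2245.090 ms"),
    ("2022-09-21 11:13:57,378", "szhm49346", "etcd.EtcdCompareFailed while updating leader key"),
    ("2022-09-21 11:13:57,983", "szhm49346", "Demoting self (immediate-nolock)"),
    ("2022-09-21 11:13:54,381", "szhm49345", "Starting new HTTP connection to szhm49346:8009"),
    ("2022-09-21 11:13:56,384", "szhm49345", "Request failed: read timeout=2"),
    ("2022-09-21 11:13:56,562", "szhm49345", "PUT leader key answered with 201"),
    ("2022-09-21 11:13:56,792", "szhm49345", "promoted self to leader"),
]


def parse(ts):
    """Parse a Patroni log timestamp such as '2022-09-21 11:13:56,384'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def merged_timeline(events):
    """Return the events of all hosts sorted by timestamp."""
    return sorted((parse(ts), host, label) for ts, host, label in events)


if __name__ == "__main__":
    timeline = merged_timeline(EVENTS)
    start = timeline[0][0]
    for when, host, label in timeline:
        offset = (when - start).total_seconds()
        print(f"+{offset:6.3f}s  {host}  {label}")
```

Sorted this way, the standby's leader-key write and its promotion both come before the old primary's failed lock update and demotion. This assumes that the clocks of the two hosts are reasonably in sync.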

So the failover was caused by a network timeout between primary and secondary.
QUESTION 1: Do you agree?

I thought that Patroni nodes do not communicate directly with each other, but only via the DCS.
QUESTION 2: Is this no longer correct?



===========================


patroni version: 2.1.3


===========================


Patroni Logfile of Host szhm49346 (IP 10.9.132.13) => Primary until Failover
...
...
2022-09-21 11:13:57,279 DEBUG: API thread: 10.9.132.16 - - "GET /patroni HTTP/1.1" 200 - latency: 2245.090 ms
2022-09-21 11:13:57,378 ERROR:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 566, in wrapper
    retval = func(self, *args, **kwargs) is not None
  File "/usr/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 696, in _update_leader
    return self.retry(self._client.write, self.leader_path, self._name, prevValue=self._name, ttl=self._ttl)
  File "/usr/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 447, in retry
    return retry(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/patroni/utils.py", line 334, in __call__
    return func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/etcd/client.py", line 500, in write
    response = self.api_execute(path, method, params=params)
  File "/usr/lib/python3.6/site-packages/patroni/dcs/etcd.py", line 257, in api_execute
    return self._handle_server_response(response)
  File "/usr/lib/python3.6/site-packages/etcd/client.py", line 987, in _handle_server_response
    etcd.EtcdError.handle(r)
  File "/usr/lib/python3.6/site-packages/etcd/__init__.py", line 306, in handle
    raise exc(msg, payload)
etcd.EtcdCompareFailed: Compare failed : [pcl_p011@szhm49346 != pcl_p011@szhm49345]
2022-09-21 11:13:57,558 WARNING: Exception happened during processing of request from 10.9.132.16:49080
2022-09-21 11:13:57,965 ERROR: failed to update leader lock
2022-09-21 11:13:57,983 INFO: Demoting self (immediate-nolock)
2022-09-21 11:13:58,214 WARNING: Traceback (most recent call last):
  File "/usr/lib64/python3.6/socketserver.py", line 654, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib64/python3.6/socketserver.py", line 364, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib64/python3.6/socketserver.py", line 724, in __init__
    self.handle()
  File "/usr/lib64/python3.6/http/server.py", line 418, in handle
    self.handle_one_request()
  File "/usr/lib/python3.6/site-packages/patroni/api.py", line 652, in handle_one_request
    BaseHTTPRequestHandler.handle_one_request(self)
  File "/usr/lib64/python3.6/http/server.py", line 406, in handle_one_request
    method()
  File "/usr/lib/python3.6/site-packages/patroni/api.py", line 198, in do_GET_patroni
    self._write_status_response(200, response)
  File "/usr/lib/python3.6/site-packages/patroni/api.py", line 94, in _write_status_response
    self._write_json_response(status_code, response)
  File "/usr/lib/python3.6/site-packages/patroni/api.py", line 53, in _write_json_response
    self._write_response(status_code, json.dumps(response, default=str), content_type='application/json')
  File "/usr/lib/python3.6/site-packages/patroni/api.py", line 50, in _write_response
    self.wfile.write(body.encode('utf-8'))
  File "/usr/lib64/python3.6/socketserver.py", line 803, in write
    self._sock.sendall(b)
BrokenPipeError: [Errno 32] Broken pipe
...
...
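
The `etcd.EtcdCompareFailed` error in the traceback above is raised by a conditional write: `_update_leader` passes `prevValue=self._name`, so the update can only succeed while the leader key still holds this node's own name. Below is a toy in-memory sketch of that compare-and-swap behaviour. `ToyStore`, `CompareFailed` and `update_leader` are hypothetical stand-ins written for illustration; they are not python-etcd or Patroni code.

```python
class CompareFailed(Exception):
    """Hypothetical stand-in for etcd.EtcdCompareFailed."""


class ToyStore:
    """In-memory key/value store with a conditional write (illustration only)."""

    def __init__(self):
        self._data = {}

    def write(self, key, value, prev_value=None):
        # With prev_value set, the write succeeds only if the key currently
        # holds exactly that value (compare-and-swap).
        current = self._data.get(key)
        if prev_value is not None and current != prev_value:
            raise CompareFailed(f"[{prev_value} != {current}]")
        self._data[key] = value
        return value


def update_leader(store, key, name):
    """Return True if `name` could refresh its own lock, False otherwise."""
    try:
        store.write(key, name, prev_value=name)
        return True
    except CompareFailed:
        return False


if __name__ == "__main__":
    store = ToyStore()
    key = "/patroni/pcl_p011/leader"
    store.write(key, "pcl_p011@szhm49346")   # old primary holds the lock
    print(update_leader(store, key, "pcl_p011@szhm49346"))
    store.write(key, "pcl_p011@szhm49345")   # the other node now owns the key
    print(update_leader(store, key, "pcl_p011@szhm49346"))
```

In this toy model the second refresh fails because the key already names the other node, which matches the `[pcl_p011@szhm49346 != pcl_p011@szhm49345]` message in the log.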


===========================


Patroni Logfile of Host szhm49345 (IP 10.9.132.16) => Standby until Failover
...
...
2022-09-21 11:13:54,381 DEBUG: Starting new HTTP connection (1): szhm49346.global.szh.loc:8009
2022-09-21 11:13:56,384 WARNING: Request failed to pcl_p011@szhm49346: GET http://szhm49346.global.szh.loc:8009/patroni
(HTTPConnectionPool(host='szhm49346.global.szh.loc',port=8009): Max retries exceeded with url: /patroni (Caused by
ReadTimeoutError("HTTPConnectionPool(host='szhm49346.global.szh.loc',port=8009): Read timed out. (read timeout=2)",))) 
2022-09-21 11:13:56,484 DEBUG: Writing pcl_p011@szhm49345 to key /patroni/pcl_p011/leader ttl=30 dir=False append=False
2022-09-21 11:13:56,485 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0,
status=None)
2022-09-21 11:13:56,562 DEBUG: http://10.7.211.13:2379 "PUT /v2/keys/patroni/pcl_p011/leader HTTP/1.1" 201 197
2022-09-21 11:13:56,562 DEBUG: Issuing read for key /patroni/pcl_p011/ with args {'recursive': True, 'retry':
<patroni.utils.Retry object at 0x7fcbb0d0c2b0>}
2022-09-21 11:13:56,563 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0,
status=None)
2022-09-21 11:13:56,634 DEBUG: http://10.7.211.13:2379 "GET /v2/keys/patroni/pcl_p011/?recursive=true HTTP/1.1" 200
None
2022-09-21 11:13:56,635 DEBUG: Writing {"leader":"pcl_p011@szhm49345","sync_standby":null} to key
/patroni/pcl_p011/sync ttl=None dir=False append=False
2022-09-21 11:13:56,635 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0,
status=None)
2022-09-21 11:13:56,713 DEBUG: http://10.7.211.13:2379 "PUT /v2/keys/patroni/pcl_p011/sync HTTP/1.1" 200 368
2022-09-21 11:13:56,713 DEBUG: Writing
{"conn_url":"postgres://szhm49345.global.szh.loc:5432/pcl_p011","api_url":"http://szhm49345.global.szh.loc:8009/patroni","state":"running","role":"replica","version":"2.1.3","checkpoint_after_promote":false,"xlog_location":9087609453816,"timeline":6}
to key /patroni/pcl_p011/members/pcl_p011@szhm49345 ttl=30 dir=False append=False
2022-09-21 11:13:56,714 DEBUG: Converted retries value: 0 -> Retry(total=0, connect=None, read=None, redirect=0,
status=None)
2022-09-21 11:13:56,791 DEBUG: http://10.7.211.13:2379 "PUT /v2/keys/patroni/pcl_p011/members/pcl_p011@szhm49345
HTTP/1.1" 200 896
2022-09-21 11:13:56,792 INFO: promoted self to leader by acquiring session lock
2022-09-21 11:13:56,798 INFO: cleared rewind state after becoming the leader
...
...




