I recently ran into this same problem when calling bdr_group_join(). The
dboid generated for the joining node is the same as that of an existing
node, and the tuple (sysid, timeline, dboid) is the same as well.
I saw this show up in two different ways in the logs:
A)
< 2021-03-30 02:42:56.942 UTC >FATAL: could not send replication command
"CREATE_REPLICATION_SLOT "bdr_16386_6924805489516289687_1_17615__" LOGICAL
bdr": status PGRES_FATAL_ERROR: ERROR: replication slot
"bdr_16386_6924805489516289687_1_17615__" already exists
B)
< 2021-03-30 21:02:29.260 UTC >LOG: Creating replica with: <…>
Restoring dump to <…>
< 2021-03-30 21:02:31.929 UTC >ERROR: duplicate key value violates unique
constraint "bdr_nodes_pkey"
< 2021-03-30 21:02:31.929 UTC >DETAIL: Key (node_sysid, node_timeline,
node_dboid)=(6924805489516289687, 1, 17615) already exists.
I did not see this issue on the earlier version of the OS we were using.
The Postgres/BDR version has not changed.
It seems that (on this platform, for the experiments I’ve tried thus far)
‘17615’ is always generated as the first dboid of an added node, hence the
conflict. When we remove the node and try again, another dboid is
predictably tried. In general (except for the addition of the 2nd node,
which is always successful), for the Nth node added, (N-1) tries are always
needed to ensure a unique dboid (and a unique tuple).
It would be great if there were a way to avoid this programmatically. As
far as I can tell, I can only detect this error condition in the logs; the
bdr_group_join() call itself does not return an error.
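The log-based detection could be sketched roughly as follows. This is only
an illustration in plain Python: the function name find_join_conflicts and
both regular expressions are hypothetical, derived solely from the two log
excerpts above, so the message wording may differ on other versions. It also
assumes the slot name always has the shape seen in excerpt A.

```python
import re

# Both patterns are assumptions based only on the two log excerpts above.
SLOT_EXISTS = re.compile(
    r'replication slot\s+"(bdr_\d+_\d+_\d+_\d+__[^"]*)"\s+already exists'
)
DUP_NODE = re.compile(
    r'Key \(node_sysid,\s*node_timeline,\s*node_dboid\)='
    r'\((\d+),\s*(\d+),\s*(\d+)\)\s+already exists'
)


def find_join_conflicts(log_text):
    """Return (kind, detail) pairs for node identity conflicts in log_text."""
    # Log messages may be hard-wrapped, so collapse whitespace runs first.
    flat = " ".join(log_text.split())
    conflicts = []
    for match in SLOT_EXISTS.finditer(flat):
        # Case A: the replication slot for this node identity already exists.
        conflicts.append(("slot_exists", match.group(1)))
    for match in DUP_NODE.finditer(flat):
        # Case B: the (sysid, timeline, dboid) key is already in bdr_nodes.
        key = tuple(int(g) for g in match.groups())
        conflicts.append(("duplicate_node", key))
    return conflicts
```

A wrapper could call this on the new log output after each join attempt and
remove and re-add the node whenever the returned list is non-empty.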
Is there a way to make the sysid, timeline, or dboid unique?
Thank you very much for your help.
--
Sent from: https://www.postgresql-archive.org/PostgreSQL-general-f1843780.html