Thread: Reuse an existing slot with a new initdb
I normalized my replication slots with the names of my nodes. My recovery script tries two options: first pg_basebackup to recover and sync the hot standby, but with a big DB that sometimes fails due to a very slow or unstable network. So my second option is to do a completely new initdb and import an SQL file produced by pg_dumpall on the master, since it takes fewer bytes once compressed. But I'm facing an issue: the slot complains (obviously) that the hot standby node does not match the slot identifier. So my question is simple: is there a way to reinitialize the slot so it does not renew the identifier with the newly initdb'd hot standby?

Thanks

David
On 5/8/2020 1:51 PM, Support wrote:
> So my question is simple, is there a way to reinitialize the slot to
> not renew the identifier with the new hot standby initdb?

No one has an answer to my question?

thanks!

David
On Wed, 2020-05-13 at 06:18 -0700, Support wrote:
> > So my question is simple, is there a way to reinitialize the slot to
> > not renew the identifier with the new hot standby initdb?
>
> No one has an answer to my question?

That may be because your question is hard to understand.

You cannot create a standby server using "pg_dumpall", so it is
unclear what exactly you are doing here.

Also, it is not clear what error message you get.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On 5/13/2020 1:24 PM, Laurenz Albe wrote:
> You cannot create a standby server using "pg_dumpall", so it is
> unclear what exactly you are doing here.

Thanks Laurenz,

I didn't recall that it was not possible to create a hot standby from a fresh new install plus pg_dumpall :(.

Only pg_basebackup or an exact copy of the data folder can do it, right? Is the reason technical or something else?

Best Regards
LIVE-SCHOOL FOUNDATION
http://www.live-school.net
infos@live-school.net
On 5/13/2020 1:24 PM, Laurenz Albe wrote:
> Also, it is not clear what error message you get.

Each cluster apparently has an internal identifier (a unique bigint) tied to the hot standby's initdb, recorded when it first connected to the master(?) or when a previous pg_basebackup occurred. That identifier obviously does not match when I connect a hot standby built from a new initdb plus a restore of a pg_dumpall copy of the master. It's sad, because everything seems to be running, but the master just doesn't like that the identifier doesn't match up.

(Sorry, I cannot show you the original error since the DB is now running in prod.)
On Wed, May 13, 2020 at 02:12:45PM -0700, live-school support wrote:
> Only pg_basebackup or an exact copy of the data folder can do it,
> right? Is the reason technical or something else?

When using physical replication, both the primary and standby need to have the same system ID, and both instances need to share the same architecture to work properly, as data is physically replayed from one cluster to the other using WAL, which includes, for example, copies of on-disk 8kB relation pages (ever heard of full_page_writes?).

Like Laurenz, I am not sure what your problem is, what errors PostgreSQL is producing, or what you expect from Postgres. If you could describe clearly, step by step, what you are doing and what you expect the result to be based on your configuration, we may be able to help, but it is not really possible to help out without more details. For example, the first sentence of your first email mentions the use of replication slots. You may want to explain better where the slots are used, how they get dropped and/or recreated, etc.
- Michael
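[For reference: the system ID Michael mentions can be inspected on both nodes. A quick sketch; the data directory path, host, and connection details below are assumptions to adapt to your setup:]

```shell
# On the primary: show the system identifier stored in pg_control
# (pg_controldata ships in the PostgreSQL bin directory; the data
# directory path here is an assumed example)
pg_controldata /var/lib/postgresql/data | grep 'system identifier'

# Or via SQL (PostgreSQL 9.6 and later):
psql -c "SELECT system_identifier FROM pg_control_system();"

# Run the same command on the standby. Physical replication requires
# the two values to be identical, which is only the case for a cluster
# cloned with pg_basebackup or a file-level copy -- a fresh initdb
# always generates a brand-new identifier.
```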
On Wednesday, May 13, 2020, Michael Paquier <michael@paquier.xyz> wrote:
> When using physical replication, both the primary and standby need to
> have the same system ID, and both instances need to share the same
> architectures to work properly as data is physically replayed from one
> cluster to the other using WAL, which includes for example copies of
> on disk relation 8kB pages (ever heard of full_page_writes?).
This basically hits the nail on the head. My reading is that the OP has two abstractly identical restored databases, one created from a physical copy and the other from a logical copy. The question is why the original server can use the same replication slot name to continue synchronizing with the physically restored one but not with the logically restored one. The above is why.

The OP asks whether the identifier error can be overcome. It cannot, but even if it could, the attempt would still end up failing due to fundamental differences in the physical data layouts produced by physical and logical restoration. If the OP needs to rebuild a physical-replication hot standby, they must use a physical backup of the original database as the starting point. Using a logically restored database as the target would require logical replication instead.
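[For reference: the physical rebuild described above, reusing the existing slot, might look like the following sketch. The slot name, host, user, and data directory are assumed examples:]

```shell
# Stop the broken standby and clear its old data directory (assumed path)
pg_ctl -D /var/lib/postgresql/data stop
rm -rf /var/lib/postgresql/data/*

# Re-clone from the primary, streaming WAL through the existing
# replication slot so no WAL is recycled mid-copy; -R writes the
# standby connection settings into the new data directory
pg_basebackup -h primary.example.com -U replicator \
    -D /var/lib/postgresql/data -R -X stream \
    -S standby_node_slot

pg_ctl -D /var/lib/postgresql/data start
```

Because pg_basebackup copies the cluster byte for byte, the new standby inherits the primary's system identifier, so the existing slot can be reused without any identifier mismatch.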
David J.
On 5/13/2020 9:28 PM, David G. Johnston wrote:
> The OP asks whether the identifier error can be overcome. It cannot,
> but even if it could, the attempt would still end up failing due to
> fundamental differences in the physical data layouts produced by
> physical and logical restoration.

Thanks Michael and David for your answers.
I think David caught it. The question is why, as long as we have an exact copy of the master (from pg_dumpall), we cannot start a newly initdb'd hot standby against an already existing physical replication slot without the master complaining that the "identifier doesn't match up",
knowing that everything seems to be synchronized?
Sorry Michael for not showing you more logs; I made these tests weeks ago and cannot rerun them for now, too busy on other jobs.
David
Support <admin@e-blokos.com> writes:
> I think David caught it. The question is why, as long as we have an
> exact copy of the master (from pg_dumpall)

Stop right there. pg_dumpall does *not* produce an exact copy.
It produces a logically equivalent copy, which is not close enough
for physical replication to work.

regards, tom lane
On 5/14/2020 6:33 AM, Tom Lane wrote:
> Stop right there. pg_dumpall does *not* produce an exact copy.
> It produces a logically equivalent copy, which is not close enough
> for physical replication to work.

Ah ok, so there is no way to jump between a physical and a logical copy then... :(
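[For reference: as David J. noted earlier in the thread, a dump-restored copy can still be kept in sync, but only via logical replication, which does not care about the system identifier since changes ship as row-level operations rather than WAL pages. A minimal sketch, assuming PostgreSQL 10+, wal_level = logical on the master, and example host/database/user names:]

```shell
# On the master: publish the tables to replicate
psql -h master.example.com -d mydb -c \
  "CREATE PUBLICATION all_tables_pub FOR ALL TABLES;"

# On the freshly initdb'd, pg_dumpall-restored node: subscribe.
# This creates its own *logical* replication slot on the master,
# independent of any physical slots.
psql -d mydb -c \
  "CREATE SUBSCRIPTION standby_sub
     CONNECTION 'host=master.example.com dbname=mydb user=replicator'
     PUBLICATION all_tables_pub;"
```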