Thread: Database migration across different operating systems

Database migration across different operating systems

From
Natalie Wenz
Date:
We are in an unusual circumstance, where we need to move all of our Postgres 10 databases from their homes on servers
runningFreeBSD 11 with ZFS to servers in another data center running Red Hat Linux 7.6 (also with ZFS), with minimal
downtime. 

I understand that the recommended, safest, approach would be to dump/restore. We intend to do that wherever possible.

However, we have some databases that require high availability/minimal downtime that are also very large. (50 TB, for
example)

Because we use ZFS on all of our current FreeBSD database hosts, one idea that we are exploring is to use filesystem
replicationto move the data ahead of time, and keep it in sync. Then we could start the databases on the Linux hosts at
anytime and it would be up to date. With that option, there’s almost no downtime.  

We’ve tested this procedure on a small database, and it starts, it runs, returns queries. There seems to be no issues.

Is this a reasonable approach? Are there any known concerns/things to investigate before choosing this course?

If it helps:
The cpu architecture is the same on both hosts, and we’ll be running Postgres with the same version and same build
options. 

Many thanks,
Natalie

Re: Database migration across different operating systems

From
Robert Simmons
Date:


On Tue, Jan 8, 2019, 2:28 AM Natalie Wenz <nataliewenz@gmail.com wrote:
We are in an unusual circumstance, where we need to move all of our Postgres 10 databases from their homes on servers running FreeBSD 11 with ZFS to servers in another data center running Red Hat Linux 7.6 (also with ZFS), with minimal downtime.

I understand that the recommended, safest, approach would be to dump/restore. We intend to do that wherever possible.

However, we have some databases that require high availability/minimal downtime that are also very large. (50 TB, for example)

Because we use ZFS on all of our current FreeBSD database hosts, one idea that we are exploring is to use filesystem replication to move the data ahead of time, and keep it in sync. Then we could start the databases on the Linux hosts at any time and it would be up to date. With that option, there’s almost no downtime.

We’ve tested this procedure on a small database, and it starts, it runs, returns queries. There seems to be no issues.

Is this a reasonable approach? Are there any known concerns/things to investigate before choosing this course?

If it helps:
The cpu architecture is the same on both hosts, and we’ll be running Postgres with the same version and same build options.

Many thanks,
Natalie

Re: Database migration across different operating systems

From
Stéphane KANSCHINE
Date:
Hey !

Le lun.  7 janv., vers 22:51, Natalie Wenz exprimait :
>
> We are in an unusual circumstance, where we need to move all of our
> Postgres 10 databases from their homes on servers running FreeBSD 11
> with ZFS to servers in another data center running Red Hat Linux 7.6
> (also with ZFS), with minimal downtime. 
> 
> I understand that the recommended, safest, approach would be to
> dump/restore. We intend to do that wherever possible. 

It's not anymore since PostgreSQL have logical replication. You can
initialize you FreeBSD cluster with a pg_basebackup and you follow
updates with logical streaming.

> However, we have some databases that require high
> availability/minimal downtime that are also very large. (50 TB, for
> example)

With logical replication, is it's permanently up to date in
miliseconds, you can switch when you want.

> If it helps: The cpu architecture is the same on both hosts, and
> we’ll be running Postgres with the same version and same build
> options. 

Your OS or cpu architecture doesn't matter, really :-)

Regards,
-- 
Stéphane KANSCHINE - https://www.hexack.fr./ - https://www.nuajik.io./
@ stephane@hexack.fr
  +33 6 64 31 72 52


Re: Database migration across different operating systems

From
Achilleas Mantzios
Date:
Hello,
On 8/1/19 6:51 π.μ., Natalie Wenz wrote:
> We are in an unusual circumstance, where we need to move all of our Postgres 10 databases from their homes on servers
runningFreeBSD 11 with ZFS to servers in another data center running Red Hat Linux 7.6 (also with ZFS), with minimal
downtime.
any technical reason behind the switch to linux?
>
> I understand that the recommended, safest, approach would be to dump/restore. We intend to do that wherever
possible.
>
> However, we have some databases that require high availability/minimal downtime that are also very large. (50 TB, for
example)
>
> Because we use ZFS on all of our current FreeBSD database hosts, one idea that we are exploring is to use filesystem
replicationto move the data ahead of time, and keep it in sync. Then we could start the databases on the Linux hosts at
anytime and it would be up to date. With that option, there’s almost no downtime.
 
Not at any time, you should :
- pause (all) the application(s)/service(s) accessing the DB
- stop the filesystem replication
- start the new cluster
- point the apps/services to the new server
- resume the apps/services
So, you should not start the new cluster with the filesystem replication still active.

There are many other solutions as well, you should give logical replication a try. But you have to test first on a
similarsized db with similar sized tables, ideally with a copy of the production db. 
 
Some tuning might be needed.
>
> We’ve tested this procedure on a small database, and it starts, it runs, returns queries. There seems to be no
issues.
>
> Is this a reasonable approach? Are there any known concerns/things to investigate before choosing this course?
>
> If it helps:
> The cpu architecture is the same on both hosts, and we’ll be running Postgres with the same version and same build
options.
>
> Many thanks,
> Natalie


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: Database migration across different operating systems

From
Andrew Gierth
Date:
>>>>> "Natalie" == Natalie Wenz <nataliewenz@gmail.com> writes:

 Natalie> We are in an unusual circumstance, where we need to move all
 Natalie> of our Postgres 10 databases from their homes on servers
 Natalie> running FreeBSD 11 with ZFS to servers in another data center
 Natalie> running Red Hat Linux 7.6 (also with ZFS), with minimal
 Natalie> downtime.

 [...]

 Natalie> We’ve tested this procedure on a small database, and it
 Natalie> starts, it runs, returns queries. There seems to be no issues.

 Natalie> Is this a reasonable approach? Are there any known
 Natalie> concerns/things to investigate before choosing this course?

Locales will bite you unless your databases all have lc_collate=C or
you're using ICU for all collations); you will need, at minimum, to
reindex every index containing a collatable column with an OS-provided
collation (i.e. not ICU and not C / POSIX).

--
Andrew (irc:RhodiumToad)


Re: Database migration across different operating systems

From
Achilleas Mantzios
Date:
On 8/1/19 11:01 π.μ., Stéphane KANSCHINE wrote:
> Hey !
>
> Le lun.  7 janv., vers 22:51, Natalie Wenz exprimait :
>> We are in an unusual circumstance, where we need to move all of our
>> Postgres 10 databases from their homes on servers running FreeBSD 11
>> with ZFS to servers in another data center running Red Hat Linux 7.6
>> (also with ZFS), with minimal downtime.
>>
>> I understand that the recommended, safest, approach would be to
>> dump/restore. We intend to do that wherever possible.
> It's not anymore since PostgreSQL have logical replication. You can
> initialize you FreeBSD cluster with a pg_basebackup and you follow
> updates with logical streaming.

Hmm, maybe you mean pg_dump --schema-only? I cannot see how taking physical copy with pg_basebackup will help with
logicalreplication. Logical replication has its own initial snapshot phase. Doing a 
 
pg_basebackup would work if followed by truncating all tables, but that's duplicate work and equivalent to pg_dumpall
--schema-only| psql -f -
 

>
>> However, we have some databases that require high
>> availability/minimal downtime that are also very large. (50 TB, for
>> example)
> With logical replication, is it's permanently up to date in
> miliseconds, you can switch when you want.
>
>> If it helps: The cpu architecture is the same on both hosts, and
>> we’ll be running Postgres with the same version and same build
>> options.
> Your OS or cpu architecture doesn't matter, really :-)
>
> Regards,


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt



Re: Database migration across different operating systems

From
Natalie Wenz
Date:
I thought about logical replication, but it seems that the initial sync is just a plain copy for each table, which will
getus into xid trouble on the publisher side while large tables copy. Unless someone can say conclusively that the copy
doesNOT hold an xid open that would prevent autovacuums from completing on the publisher.  

Then of course there’s the excitement of having to freeze everything on the subscriber side, which can take weeks.

> On Jan 8, 2019, at 3:01 AM, Stéphane KANSCHINE <stephane@hexack.fr> wrote:
>
>
> Hey !
>
> Le lun.  7 janv., vers 22:51, Natalie Wenz exprimait :
>>
>> We are in an unusual circumstance, where we need to move all of our
>> Postgres 10 databases from their homes on servers running FreeBSD 11
>> with ZFS to servers in another data center running Red Hat Linux 7.6
>> (also with ZFS), with minimal downtime.
>>
>> I understand that the recommended, safest, approach would be to
>> dump/restore. We intend to do that wherever possible.
>
> It's not anymore since PostgreSQL have logical replication. You can
> initialize you FreeBSD cluster with a pg_basebackup and you follow
> updates with logical streaming.
>
>> However, we have some databases that require high
>> availability/minimal downtime that are also very large. (50 TB, for
>> example)
>
> With logical replication, is it's permanently up to date in
> miliseconds, you can switch when you want.
>
>> If it helps: The cpu architecture is the same on both hosts, and
>> we’ll be running Postgres with the same version and same build
>> options.
>
> Your OS or cpu architecture doesn't matter, really :-)
>
> Regards,
> --
> Stéphane KANSCHINE - https://www.hexack.fr./ - https://www.nuajik.io./
> @ stephane@hexack.fr
>   +33 6 64 31 72 52


Re: Database migration across different operating systems

From
Stéphane KANSCHINE
Date:
Le mar.  8 janv., vers 11:26, Achilleas Mantzios exprimait :

> On 8/1/19 11:01 π.μ., Stéphane KANSCHINE wrote:
> > 
> > Le lun.  7 janv., vers 22:51, Natalie Wenz exprimait :
> > > 
> > > I understand that the recommended, safest, approach would be to
> > > dump/restore. We intend to do that wherever possible.
> >
> > It's not anymore since PostgreSQL have logical replication. You can
> > initialize you FreeBSD cluster with a pg_basebackup and you follow
> > updates with logical streaming.
> 
> Hmm, maybe you mean pg_dump --schema-only? I cannot see how taking physical
> copy with pg_basebackup will help with logical replication.

You're right, it's useless and counterproductive. I'm still on
holydays. But still, take a look at logical replication, it's way more
efficient than pg_dump.

-- 
Stéphane KANSCHINE - https://www.hexack.fr./ - https://www.nuajik.io./
@ stephane@hexack.fr
  +33 6 64 31 72 52


Re: Database migration across different operating systems

From
Natalie Wenz
Date:
Thanks, Andrew.

All of our databases do have lc_collate=C. To check my understanding, if that were not the case, we’d need to reindex,
butsince it is all lc_collate=C, we shouldn’t run into that particular problem. Did I get that right?  

On Jan 8, 2019, at 3:06 AM, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:

>>>>>> "Natalie" == Natalie Wenz <nataliewenz@gmail.com> writes:
>
> Natalie> We are in an unusual circumstance, where we need to move all
> Natalie> of our Postgres 10 databases from their homes on servers
> Natalie> running FreeBSD 11 with ZFS to servers in another data center
> Natalie> running Red Hat Linux 7.6 (also with ZFS), with minimal
> Natalie> downtime.
>
> [...]
>
> Natalie> We’ve tested this procedure on a small database, and it
> Natalie> starts, it runs, returns queries. There seems to be no issues.
>
> Natalie> Is this a reasonable approach? Are there any known
> Natalie> concerns/things to investigate before choosing this course?
>
> Locales will bite you unless your databases all have lc_collate=C or
> you're using ICU for all collations); you will need, at minimum, to
> reindex every index containing a collatable column with an OS-provided
> collation (i.e. not ICU and not C / POSIX).
>
> --
> Andrew (irc:RhodiumToad)


Re: Database migration across different operating systems

From
Andrew Gierth
Date:
>>>>> "Natalie" == Natalie Wenz <nataliewenz@gmail.com> writes:

 Natalie> Thanks, Andrew.

 Natalie> All of our databases do have lc_collate=C. To check my
 Natalie> understanding, if that were not the case, we’d need to
 Natalie> reindex, but since it is all lc_collate=C, we shouldn’t run
 Natalie> into that particular problem. Did I get that right?

To be completely precise, it's not enough to have lc_collate=C, you also
need to not have indexed columns with individually specified non-default
collations. i.e. what matters is not so much the database setting, but
the actual collation being used for indexed columns (because it's the
collation that determines the order in which values appear in the
index).

--
Andrew (irc:RhodiumToad)