Thread: How to validate restore of backup?

How to validate restore of backup?

From
Vince McMahon
Date:
Hi,

I have some questions When doing pg_restore of backup of a database to a NEW server.  

Is there a way to ensure the data integrity is in tact, and user ID and access works liked how it was in the old server?

How to properly handle the materialized views when backing up and restoring?

Thanks.

Re: How to validate restore of backup?

From
Muhammad Usman Khan
Date:
Hi Vince,
For validation of databases, you can use the following approach

/usr/pgsql-16/bin/pg_dump -d postgres -h localhost -p 5428 | md5sum > /var/lib/pgsql/db1.txt
/usr/pgsql-16/bin/pg_dump -d postgres -h localhost -p 5420 | md5sum > /var/lib/pgsql/db2.txt
diff db1.txt db2.txt

By executing above queries, if diff is null then it means there is no difference between source and destination databases. Adjust your port and databases accordingly.

On Thu, 22 Aug 2024 at 16:06, Vince McMahon <sippingonesandzeros@gmail.com> wrote:
Hi,

I have some questions When doing pg_restore of backup of a database to a NEW server.  

Is there a way to ensure the data integrity is in tact, and user ID and access works liked how it was in the old server?

How to properly handle the materialized views when backing up and restoring?

Thanks.

Re: How to validate restore of backup?

From
Ron Johnson
Date:
On Thu, Aug 22, 2024 at 7:06 AM Vince McMahon <sippingonesandzeros@gmail.com> wrote:
Hi,

I have some questions When doing pg_restore of backup of a database to a NEW server.  

Is there a way to ensure the data integrity is in tact, and user ID and access works liked how it was in the old server?

pg_restore is just a bunch of CREATE, COPY and ALTER statements, since pg_restore replays what was generated by pg_dump. 
 
There can be errors, of course.  That's why I  run "createdb $DB 2> /dev/null" to ensure that there's something for pg_restore to drop, then "pg_restore --create --clean --exit-on-error", and  most importantly, check the return code!!!

How to properly handle the materialized views when backing up and restoring?

Thanks.


--
Death to America, and butter sauce.
Iraq lobster!

Re: How to validate restore of backup?

From
Ron Johnson
Date:
That's great on small databases.  Not so practical when they're big.

On Thu, Aug 22, 2024 at 7:10 AM Muhammad Usman Khan <usman.k@bitnine.net> wrote:
Hi Vince,
For validation of databases, you can use the following approach

/usr/pgsql-16/bin/pg_dump -d postgres -h localhost -p 5428 | md5sum > /var/lib/pgsql/db1.txt
/usr/pgsql-16/bin/pg_dump -d postgres -h localhost -p 5420 | md5sum > /var/lib/pgsql/db2.txt
diff db1.txt db2.txt

By executing above queries, if diff is null then it means there is no difference between source and destination databases. Adjust your port and databases accordingly.

On Thu, 22 Aug 2024 at 16:06, Vince McMahon <sippingonesandzeros@gmail.com> wrote:
Hi,

I have some questions When doing pg_restore of backup of a database to a NEW server.  

Is there a way to ensure the data integrity is in tact, and user ID and access works liked how it was in the old server?

How to properly handle the materialized views when backing up and restoring?

Thanks.


--
Death to America, and butter sauce.
Iraq lobster!

Re: How to validate restore of backup?

From
o1bigtenor
Date:


On Thu, Aug 22, 2024 at 6:24 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
That's great on small databases.  Not so practical when they're big.

So - - - - what is the recommended procedure for 'large' databases?

(Might be useful to have a definition for what a large database is as well.) 

Regards

Re: How to validate restore of backup?

From
Ron Johnson
Date:
On Thu, Aug 22, 2024 at 8:49 AM o1bigtenor <o1bigtenor@gmail.com> wrote:


On Thu, Aug 22, 2024 at 6:24 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
That's great on small databases.  Not so practical when they're big.

So - - - - what is the recommended procedure for 'large' databases?

(Might be useful to have a definition for what a large database is as well.) 

"Large" is when it takes too long to run TWO text mode pg_dump commands in addition to the pg_dump and pg_restore.

--
Death to America, and butter sauce.
Iraq lobster!

Re: How to validate restore of backup?

From
o1bigtenor
Date:


On Thu, Aug 22, 2024 at 8:03 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Thu, Aug 22, 2024 at 8:49 AM o1bigtenor <o1bigtenor@gmail.com> wrote:


On Thu, Aug 22, 2024 at 6:24 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
That's great on small databases.  Not so practical when they're big.

So - - - - what is the recommended procedure for 'large' databases?

(Might be useful to have a definition for what a large database is as well.) 

"Large" is when it takes too long to run TWO text mode pg_dump commands in addition to the pg_dump and pg_restore.


Hmmmmmmmmm - - - - I'd say that's about as neat a non-answer as I've ever seen. 

Can you try again?

(You forgot the first question - - - maybe you could try that one too - - - what is the recommended procedure
for 'large' databases?)

Regards

Re: How to validate restore of backup?

From
Greg Sabino Mullane
Date:
On Thu, Aug 22, 2024 at 8:49 AM o1bigtenor <o1bigtenor@gmail.com> wrote:


On Thu, Aug 22, 2024 at 6:24 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
That's great on small databases.  Not so practical when they're big.

So - - - - what is the recommended procedure for 'large' databases?

Use a real backup system like pgBackRest. Stop using pg_dump.

Cheers,
Greg

Re: How to validate restore of backup?

From
Ron Johnson
Date:
On Thu, Aug 22, 2024 at 10:22 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:
On Thu, Aug 22, 2024 at 8:49 AM o1bigtenor <o1bigtenor@gmail.com> wrote:


On Thu, Aug 22, 2024 at 6:24 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
That's great on small databases.  Not so practical when they're big.

So - - - - what is the recommended procedure for 'large' databases?

Use a real backup system like pgBackRest. Stop using pg_dump.
 
Not useful when you're migrating not only between major versions but glibc levels.

Use logical replication!!  Maybe.  It gets difficult with partitioned tables that regularly have children added and dropped; mistakes can be made.  pg_dump/pg_restore is guaranteed to work.

--
Death to America, and butter sauce.
Iraq lobster!

Re: How to validate restore of backup?

From
Ron Johnson
Date:
On Thu, Aug 22, 2024 at 9:59 AM o1bigtenor <o1bigtenor@gmail.com> wrote:
On Thu, Aug 22, 2024 at 8:03 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
On Thu, Aug 22, 2024 at 8:49 AM o1bigtenor <o1bigtenor@gmail.com> wrote:
On Thu, Aug 22, 2024 at 6:24 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
That's great on small databases.  Not so practical when they're big.

So - - - - what is the recommended procedure for 'large' databases?

(Might be useful to have a definition for what a large database is as well.) 

"Large" is when it takes too long to run TWO text mode pg_dump commands in addition to the pg_dump and pg_restore.


Hmmmmmmmmm - - - - I'd say that's about as neat a non-answer as I've ever seen. 
 
Eh?  If you've got hundreds of hours of down time to pipe a text-mode pg_dump of a TB-sized database through md5sum. twice, then that database isn't too big.  I don't have that much down time; thus, it's "too big".

Can you try again?

(You forgot the first question - - - maybe you could try that one too - - - what is the recommended procedure
for 'large' databases?)

I already did, in my message three hours ago.

--
Death to America, and butter sauce.
Iraq lobster!

Re: How to validate restore of backup?

From
Adrian Klaver
Date:
On 8/22/24 04:06, Vince McMahon wrote:
> Hi,
> 
> I have some questions When doing pg_restore of backup of a database to a 
> NEW server.

How large a backup?

> 
> Is there a way to ensure the data integrity is in tact, and user ID and 
> access works liked how it was in the old server?

As to user access, write tests that cover that and run on the new cluster.

Data is trickier and if that is possible to a degree of certainty is 
going to depend on answer to the first question above.

> 
> How to properly handle the materialized views when backing up and restoring?

create materialized view prj_mv(p_item_no, year) as select p_item_no, 
year from projection with data;

pg_dump -d production -U postgres -h localhost -t projection -t prj_mv 
-f prj.sql

In prj.sql:

CREATE MATERIALIZED VIEW public.prj_mv AS
  SELECT p_item_no,
     year
    FROM public.projection
   WITH NO DATA;

COPY public.projection ( ...


[...]

REFRESH MATERIALIZED VIEW public.prj_mv;

It is done for you.

> 
> Thanks.

-- 
Adrian Klaver
adrian.klaver@aklaver.com




Re: How to validate restore of backup?

From
Vince McMahon
Date:
Hi, Adrian.

The largest one is 8 GB after compression.

I have a window of 8 hours to handle 30 GB total of backup at various sizes.



On Thu, Aug 22, 2024, 11:36 AM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 8/22/24 04:06, Vince McMahon wrote:
> Hi,
>
> I have some questions When doing pg_restore of backup of a database to a
> NEW server.

How large a backup?

>
> Is there a way to ensure the data integrity is in tact, and user ID and
> access works liked how it was in the old server?

As to user access, write tests that cover that and run on the new cluster.

Data is trickier and if that is possible to a degree of certainty is
going to depend on answer to the first question above.

>
> How to properly handle the materialized views when backing up and restoring?

create materialized view prj_mv(p_item_no, year) as select p_item_no,
year from projection with data;

pg_dump -d production -U postgres -h localhost -t projection -t prj_mv
-f prj.sql

In prj.sql:

CREATE MATERIALIZED VIEW public.prj_mv AS
  SELECT p_item_no,
     year
    FROM public.projection
   WITH NO DATA;

COPY public.projection ( ...


[...]

REFRESH MATERIALIZED VIEW public.prj_mv;

It is done for you.

>
> Thanks.

--
Adrian Klaver
adrian.klaver@aklaver.com

Re: How to validate restore of backup?

From
Adrian Klaver
Date:
On 8/22/24 14:31, Vince McMahon wrote:
> Hi, Adrian.
> 
> The largest one is 8 GB after compression.
> 
> I have a window of 8 hours to handle 30 GB total of backup at various sizes.

I assume by compression you mean using some form of pg_dump -Fc.

As to your timeline determining whether that can be met is going to 
depend on a more detailed explanation on your part of what you expect 
from the dump/restore process.


-- 
Adrian Klaver
adrian.klaver@aklaver.com