Re: Comparing primary/HS standby in tests - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Comparing primary/HS standby in tests
Date
Msg-id CAMkU=1xoevNRUFSA-mu2Ri-9KmXeqCD0cySYsV8FKAkAQnZpDQ@mail.gmail.com
In response to Comparing primary/HS standby in tests  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Comparing primary/HS standby in tests  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Tue, Mar 3, 2015 at 7:49 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Hi,

I've regularly wished we had automated tests that set up HS and then
compare primary/standby at the end to verify replay worked
correctly.

Heikki's page comparison tool deals with some of that verification, but
it's really quite expensive and doesn't care about runtime-only
differences. I.e. it doesn't test HS at all.

Every now and then I run installcheck against a primary, verify that
replay works without errors, and then compare pg_dumpall from both
clusters. Unfortunately that currently requires hand inspection of
dumps, because there are differences like:
-SELECT pg_catalog.setval('default_seq', 1, true);
+SELECT pg_catalog.setval('default_seq', 33, true);

The reason for these differences is that the primary increases the
sequence's last_value by 1, but temporarily sets it to +SEQ_LOG_VALS
before XLogInsert(). So the two differ.

Does anybody have a good idea how to get rid of that difference? One way
to do that would be to log the value the standby is sure to have - but
that's not entirely trivial.

I'd very much like to add an automated test like this to the tree, but I
don't see a way to do that sanely without a comparison tool...

Couldn't we just arbitrarily exclude sequence internal states from the comparison?
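
To make that suggestion concrete, here is a minimal sketch of a comparison that ignores sequence state, assuming the only sequence-related differences are setval() lines like the ones quoted above. The function names (strip_sequence_state, diff_dumps) are hypothetical, and the dump text is assumed to have already been read into strings:

```python
import difflib
import re

# Matches the setval() statements seen in the dumps, e.g.
#   SELECT pg_catalog.setval('default_seq', 33, true);
SETVAL_RE = re.compile(r"^SELECT pg_catalog\.setval\(.*\);\s*$")


def strip_sequence_state(dump_text):
    """Return the dump's lines with sequence setval() statements removed."""
    return [line for line in dump_text.splitlines()
            if not SETVAL_RE.match(line)]


def diff_dumps(primary_dump, standby_dump):
    """Unified diff of two dumps, ignoring sequence state.

    Returns an empty list when the dumps match apart from setval() lines.
    """
    return list(difflib.unified_diff(
        strip_sequence_state(primary_dump),
        strip_sequence_state(standby_dump),
        fromfile="primary", tofile="standby", lineterm=""))
```

This drops the setval() lines entirely, so a sequence that really did diverge between the two clusters would go unnoticed.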

That wouldn't work where the standby has been promoted and then used in a way that draws on the sequence (with the same workload being put through the now-promoted standby and the original master), but I don't think that was what you were asking about.

How many similar issues have you seen?

In the case where you have a promoted replica and put the same workflow through both it and the master, I've seen "pg_dump -s" dump objects in different orders, for no apparent reason.  That is kind of annoying, but I never traced it back to the cause (nor have I excluded PEBCAK as the real cause).
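
One way to sidestep the ordering problem when comparing schema dumps is to compare per-object chunks without regard to order. The sketch below assumes each object in the dump is introduced by a comment line starting with "-- Name: "; the helper names are hypothetical. It skips all other "--" comment lines, so differences in SQL comments would be missed:

```python
def split_objects(dump_text):
    """Split a schema dump into one chunk per '-- Name: ' header."""
    chunks, current = [], []
    for line in dump_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("--") and not line.startswith("-- Name: "):
            continue  # ignore comment lines other than object headers
        if line.startswith("-- Name: ") and current:
            chunks.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks


def same_objects_any_order(dump_a, dump_b):
    """True if both dumps contain the same object chunks, in any order."""
    return sorted(split_objects(dump_a)) == sorted(split_objects(dump_b))
```

This only hides the reordering from the comparison; it says nothing about why the order differs.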

Cheers,

Jeff
