Stephen Frost wrote:
> Alvaro, Tom,
>
> * Alvaro Herrera (alvherre@alvh.no-ip.org) wrote:
> > Crazy idea: maybe a large fraction of that test could be replaced with
> > comparisons of the "pg_restore -l" output file rather than pg_dump's
> > text output (i.e. the TOC entry for each object, rather than the
> > object's textual representation.) Sounds easier in code than current
> > implementation. Separately, verify that textual representation for each
> > TOC entry type is what we expect.
>
> I'm not sure how that's different..? We do check that the textual
> representation is what we expect, both directly from pg_dump and from
> pg_restore output, and using the exact same code to check both (which I
> generally think is a good thing since we want the results from both to
> more-or-less match up). What you're proposing here sounds like we're
> throwing that away in favor of keeping all the same code to test the
> textual representation but then only checking for listed contents from
> pg_restore instead of checking that the textual representation is
> correct. That doesn't actually reduce the amount of code though..

Well, the current implementation compares a dozen pg_dump output text
files, three hundred lines apiece, against a thousand regexes (give
or take). Whenever there is a mismatch, what you get is "this regexp
failed to match <three hundred lines>" (or sometimes "matched when it
should not have"), so looking for the mismatch is quite annoying.

My proposal is that instead of looking at three hundred lines, you'd
look at 50 lines of `pg_restore -l` output: is element XYZ in there
or not? That is quite a bit simpler for whoever is adding a new test.
This tests combinations of pg_dump switches: are we dumping the right
set of objects?
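
To make the first half concrete, here is a rough sketch of what such a
check could look like. It is in Python purely for illustration, and
everything in it is an assumption on my part: the listing is hand-written
in the general shape of `pg_restore -l` output rather than captured from
a real run, and `toc_entries` / `has_entry` are hypothetical helper names.

```python
import re

# Hand-written sample in the general shape of `pg_restore -l` output
# (illustrative only; not captured from a real archive).
SAMPLE_LISTING = """\
;
; Selected TOC Entries:
;
5; 2615 16385 SCHEMA - dump_test postgres
200; 1259 16386 TABLE dump_test test_table postgres
2950; 0 16386 TABLE DATA dump_test test_table postgres
"""


def toc_entries(listing):
    """Return the non-comment TOC lines with the leading ids removed."""
    entries = []
    for line in listing.splitlines():
        # Comment lines start with ';'; blank lines carry no entry.
        if not line.strip() or line.startswith(";"):
            continue
        # Drop the "dumpId; catalogOid objectOid " prefix, which varies
        # between runs and is not interesting for the test.
        entries.append(re.sub(r"^\d+; \d+ \d+ ", "", line))
    return entries


def has_entry(listing, entry):
    """True if some TOC line is "<entry>" followed by the owner column."""
    return any(e == entry or e.startswith(entry + " ")
               for e in toc_entries(listing))


if __name__ == "__main__":
    for wanted in ("TABLE dump_test test_table",
                   "TABLE DATA dump_test test_table",
                   "INDEX dump_test test_idx"):
        print(wanted, "->", has_entry(SAMPLE_LISTING, wanted))
```

A failure here points at one missing or unexpected line out of a few
dozen, not at a regexp that failed somewhere in a whole dump file.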

*Separately*, test that each individual TOC entry type ("table data",
"index", "tablespace") is dumped as this or that SQL command, creating
a separate dump file for each object type. You then match a single
TOC entry against a dozen lines of SQL, half of them comments, so it
is pretty easy to see where a mismatch is.
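
The second half might look roughly like the following. Again this is
only a sketch under my own assumptions: the object names, the expected
patterns, the sample dump text, and the helpers `sql_only` and
`check_entry` are all made up for illustration.

```python
import re

# Hypothetical expected SQL, one pattern per TOC entry type.
EXPECTED_SQL = {
    "TABLE": r"CREATE TABLE dump_test\.test_table \(\n\s+col1 integer NOT NULL\n\);",
    "INDEX": r"CREATE INDEX test_idx ON dump_test\.test_table USING btree \(col1\);",
}

# Hand-written sample of a dump containing a single object.
TABLE_DUMP = """\
--
-- Name: test_table; Type: TABLE; Schema: dump_test; Owner: postgres
--

CREATE TABLE dump_test.test_table (
    col1 integer NOT NULL
);
"""

# Deliberately wrong output, to show what a failure report looks like.
BAD_INDEX_DUMP = (
    "CREATE UNIQUE INDEX test_idx ON dump_test.test_table USING btree (col1);\n"
)


def sql_only(dump_text):
    """Strip SQL comment lines and blank lines from a one-object dump."""
    return "\n".join(line for line in dump_text.splitlines()
                     if line.strip() and not line.startswith("--"))


def check_entry(entry_type, dump_text):
    """Return None on a match, else a short message showing the mismatch."""
    sql = sql_only(dump_text)
    pattern = EXPECTED_SQL[entry_type]
    if re.fullmatch(pattern, sql):
        return None
    return f"{entry_type}: expected /{pattern}/ but got:\n{sql}"


if __name__ == "__main__":
    print(check_entry("TABLE", TABLE_DUMP))
    print(check_entry("INDEX", BAD_INDEX_DUMP))
```

Since each dump holds one object, the failure message is a handful of
lines and the offending statement is right there in it.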
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services