Re: Testing DDL deparsing support - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Testing DDL deparsing support
Date
Msg-id CA+TgmoYZ+BxuYbr9=adOsfGE05wFRQ4r0tM7THLpwVDp1ahe8w@mail.gmail.com
Whole thread Raw
In response to Testing DDL deparsing support  (Ian Barwick <ian@2ndquadrant.com>)
Responses Re: Testing DDL deparsing support  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On Thu, Nov 27, 2014 at 11:43 PM, Ian Barwick <ian@2ndquadrant.com> wrote:
> DDL deparsing is a feature that allows collection of DDL commands as they
> are
> executed in a server, in some flexible, complete, and fully-contained format
> that allows manipulation, storage, and transmission.  This feature has
> several
> use cases; the two best known ones are DDL replication and DDL auditing.
>
> We have came up with a design that uses a JSON structure to store commands.
> It is similar to the C sprintf() call in spirit: there is a base format
> string, which is a generic template for each command type, and contains
> placeholders that represent the variable parts of the command.  The values
> for
> the placeholders in each specific command are members of the JSON object.  A
> helper function is provided that expands the format string and replaces the
> placeholders with the values, and returns the SQL command as text.  This
> design lets the user operate on the JSON structure in either a read-only
> fashion (for example to block table creation if the names don't satisfy a
> certain condition), or by modifying it (for example, to change the schema
> name
> so that tables are created in different schemas when they are replicated to
> some remote server).
>
> This design is mostly accepted by the community.  The one sticking point is
> testing: how do we ensure that the JSON representation we have created
> correctly deparses back into a command that has the same effect as the
> original command.  This was expressed by Robert Haas:
> http://www.postgresql.org/message-id/CA+TgmoZ=vZriJMxLkqi_V0jg4k4LEAPmwUSC6RWXS5MquXUJNA@mail.gmail.com
>
> The problem cannot be solved by a standard regression test module which runs
> a
> bunch of previously-defined commands and verifies the output.  We need not
> only check the output for the commands as they exist today, but also we need
> to ensure that this does not get broken as future patches modify the
> existing
> commands as well as create completely new commands.
>
> The challenge here is to create a new testing framework that ensures the DDL
> deparsing module will be maintained by future hackers as the DDL grammar is
> modified.
>
> What and How to Test
> --------------------
>
> Our goal should be that patch authors run "make check-world" in their
> patched
> copies and notice that the DDL deparse test is failing; they can then modify
> deparse_utility.c to add support for the new commands, which should in
> general
> be pretty straightforward.  This way, maintaining deparsing code would be
> part
> of new patches just like we require pg_dump support and documentation for
> new
> features.
>
> It would not work to require patch authors to add their new commands to a
> new
> pg_regress test file, because most would not be aware of the need, or they
> would just forget to do it, and patches would be submitted and possibly even
> committed without any realization of the breakage caused.
>
> There are two things we can rely on: standard regression tests, and pg_dump.
>
> Standard regression tests are helpful because patch authors include new test
> cases in the patches that stress their new options or commands.  This new
> test
> framework needs to be something that internally runs the regression tests
> and
> exercises DDL deparsing as the regression tests execute DDL.  That would
> mean
> that new commands and options would automatically be deparse-tested by our
> new
> framework as soon as patch authors add standard regress support.
>
> One thing is first-grade testing, that is ensure that the deparsed version
> of
> a DDL command can be executed at all, for if the deparsed version throws an
> error, it's immediately obvious that the deparse code is bogus.  This is
> because we know the original command did not throw an error: otherwise, the
> deparse code would not have run at all, because ddl_command_end triggers are
> only executed once the original command has completed execution.  So
> first-grade testing ensures that no trivial bugs are present.
>
> But there's second-grade verification as well: is the object produced by the
> deparsed version identical to the one produced by the original command?  One
> trivial but incomplete approach is to run the command, then save the
> deparsed
> version; run the deparsed version, and deparse that one too; compare both.
> The problem with this approach is that if the deparse code is omitting some
> clause (say it omits IN TABLESPACE in a CREATE TABLE command), then both
> deparsed versions would contain the same bug yet they would compare equal.
> Therefore this approach is not good enough.
>
> The best idea we have so far to attack second-grade testing is to trust
> pg_dump to expose differences: accumulate commands as they run in the
> regression database, the run the deparsed versions in a different database;
> then pg_dump both databases and compare the dumped outputs.
>
> Proof-of-concept
> ----------------
>
> We have now implemented this as a proof-of-concept; the code is available
> in the deparse branch at:
>
>   http://git.postgresql.org/gitweb/?p=2ndquadrant_bdr.git
>
> a diff is attached for reference, but relies on the deparsing functionality
> available in the deparse branch.
>
> To implement the DDL deparsing, a pseudo-test is executed first, which
> creates an event trigger and a table in which to store queries captured
> by the event trigger. Following conclusion of all regression tests, a
> further test is executed which exports the query table, imports it into
> the second database and runs pg_dump; the output of this is then compared
> against the expected output. This test can fail either at the import
> stage, if the deparsed commands are syntactically incorrect, or at the
> comparison stage, if the a deparsed command is valid but syntactically
> different to the original.
>
> To facilitate this, some minimal changes to pg_regress itself have been
> necessary. In the current proof-of-concept it automatically creates
> (and where appropriate drops) the "shadow" database used to load the
> deparsed commands; and also provides a couple of additional tokens to
> the .source files to provide information otherwise unavailable to the
> SQL scripts such as the location of pg_dump and the name of the "shadow"
> database.
>
> A simple schedule to demonstrate this is available; execute from the
> src/test/regress/ directory like this:
>
>     ./pg_regress \
>       --temp-install=./tmp_check \
>       --top-builddir=../../.. \
>       --dlpath=. \
>       --schedule=./schedule_ddl_deparse_demo

I haven't read the code, but this concept seems good to me.  It has
the unfortunate weakness that a difference could exist during the
*middle* of the regression test run that is gone by the *end* of the
run, but our existing pg_upgrade testing has the same weakness, so I
guess we can view this as one more reason not to be too aggressive
about having regression tests drop the unshared objects they create.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [REVIEW] Re: Compression of full-page-writes
Next
From: Robert Haas
Date:
Subject: Re: Using pg_rewind for differential backup