Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CAD21AoCXCAQ5QyXu9-xs30ViUHtUxQMmf-818d8GX--5pTmZ7g@mail.gmail.com Whole thread Raw |
In response to | RE: Support logical replication of DDLs ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses |
Re: Support logical replication of DDLs
RE: Support logical replication of DDLs |
List | pgsql-hackers |
On Tue, Jul 11, 2023 at 8:01 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > Hi, > > We have been researching how to create a test that detects failures resulting > from future syntax changes, where the deparser fails to update properly. > > The basic idea comes from what Robert Haas suggested in [1]: when running the > regression test(tests in parallel_schedule), we replace the executing ddl > statement with the its deparsed version and execute the deparsed statement, so > that we can run all the regression with the deparsed statement and can expect > the output to be the same as the existing expected/*.out. As developers > typically add new regression tests to test new syntax, so we expect this test > can automatically identify most of the new syntax changes. > > One thing to note is that when entering the event trigger(where the deparsing > happen), the ddl have already been executed. So we need to get the deparsed > statement before the execution and replace the current statement with it. To > achieve this, the current approach is to create another database with deparser > trigger and in the ProcessUtility hook(e.g. tdeparser_ProcessUtility in the > patch) we redirect the local statement to that remote database, then the > statment will be deparsed and stored somewhere, we can query the remote > database to get the deparsed statement and use it to replace the original > statment. > > The process looks like: > 1) create another database and deparser event trigger in it before running the regression test. > 2) run the regression test (the statement will be redirect to the remote database and get the deparsed statement) > 3) compare the output file with the expected ones. > > Attach the POC patch(0004) for this approach. Note that, the new test module only > test test_setup.sql, create_index.sql, create_table.sql and alter_table.sql as > we only support deparsing TABLE related command for now. And the current test > cannot pass because of some bugs in deparser, we will fix these bugs soon. > > > TO IMPROVE: > > 1. The current approach needs to handle the ERRORs, WARNINGs, and NOTICEs from > the remote database. Currently it will directly rethrow any ERRORs encountered > in the remote database. However, for WARNINGs and NOTICEs, we will only rethrow > them along with the ERRORs. This is done to prevent duplicate messages in the > output file during local statement execution, which would be inconsistent with > the existing expected output. Note that this approach may potentially miss some > bugs, as there could be additional WARNINGs or NOTICEs caused by the deparser > in the remote database. > > 2. The variable reference and assignment (xxx /gset and select :var_name) will > not be sent to the server(only the qualified value will be sent), but it's > possible the variable in another session should be set to a different value, so > this can cause inconsistent output. > > 3 .CREATE INDEX CONCURRENTLY will create an invalid index internally even if it > reports an ERROR later. But since we will directly rethrow the remote ERROR in > the main database, we won't execute the "CREATE INDEX CONCURRENTLY" in the main > database. This means that we cannot see the invalid index in the main database. > > To improve the above points, another variety is: run the regression test twice. > The first run is solely intended to collect all the deparsed statements. We can > dump these statements from the database and then reload them in the second > regression run. This allows us to utilize the deparsed statements to replace > the local statements in the second regression run. This approach does not need > to handle any remote messages and client variable stuff during execution, > although it could take more time to finsh the test. > I've considered some alternative approaches but I prefer the second approach. A long test time could not be a big problem unless we run it by default. We can prepare a buildfarm animal that is configured to run the DDL deparse tests. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: