Re: logical changeset generation v6 - Mailing list pgsql-hackers
From | Steve Singer |
---|---|
Subject | Re: logical changeset generation v6 |
Date | |
Msg-id | BLU0-SMTP47C2FBC5826B581F0CFFA4DC280@phx.gbl |
In response to | Re: logical changeset generation v6 (Steve Singer <steve@ssinger.info>) |
Responses | Re: logical changeset generation v6 |
List | pgsql-hackers |
On 09/25/2013 01:20 PM, Steve Singer wrote:
> On 09/25/2013 11:08 AM, Andres Freund wrote:
>> On 2013-09-25 11:01:44 -0400, Steve Singer wrote:
>>> On 09/17/2013 10:31 AM, Andres Freund wrote:
>>>> This patch set now fails to apply because of the commit "Rename
>>>> various "freeze multixact" variables".
>>>> And I am even partially guilty for that patch...
>>>>
>>>> Rebased patches attached.
>>> While testing the logical replication changes against my WIP logical
>>> slony I am sometimes getting error messages from the WAL sender of
>>> the form:
>>> unexpected duplicate for tablespace X relfilenode X
>> Any chance you could provide a setup to reproduce the error?
>>
>
> The steps to build a setup that should reproduce this error are:
>
> 1. I had to apply the attached patch on top of your logical replication
> branch so my pg_decode_init would know if it was being called as part
> of an INIT_REPLICATION or START_REPLICATION.
> Unless I have misunderstood something you probably will want to merge
> this fix in
>
> 2. Get my WIP for adding logical support to slony from:
> git@github.com:ssinger/slony1-engine.git branch logical_repl
> (4af1917f8418a)
> (My code changes to slony are more prototype level code quality than
> production code quality)
>
> 3.
> cd slony1-engine
> ./configure --with-pgconfigdir=/usr/local/pg94wal/bin (or whatever)
> make
> make install
>
> 4. Grab the clustertest framework JAR from
> https://github.com/clustertest/clustertest-framework and build up a
> clustertest jar file
>
> 5. Create a file
> slony1-engine/clustertest/conf/java.conf
> that contains the path to the above JAR file as a shell variable
> assignment: ie
> CLUSTERTESTJAR=/home/ssinger/src/clustertest/clustertest_git/build/jar/clustertest-coordinator.jar
>
> 6.
> cp clustertest/conf/disorder.properties.sample clustertest/conf/disorder.properties
>
> edit disorder.properties to have the proper values for your
> environment. All 6 databases can point at the same postgres instance,
> this test will only actually use 2 of them (so far).
>
> 7. Run the test
> cd clustertest
> ./run_all_disorder_tests.sh
>
> This involves having the slon connect to the walsender on the database
> test1 and replicate the data into test2 (which is a different database
> on the same postmaster)
>
> If this setup seems like too much effort I can request one of the
> commitfest VM's from Josh and get everything set up there for you.
>
> Steve
>
>>> Any ideas?
>> I'll look into it. Could you provide any context to what you're doing
>> that's being decoded?
>>

I've determined that in this test the walsender seems to be hitting
this when it is decoding the transactions that are behind the slonik
commands to add tables to replication (set add table, set add
sequence). This is before the SUBSCRIBE SET is submitted.

I've also noticed something else that is strange (but might be
unrelated). If I stop my slon process and restart it I get messages
like:

WARNING: Starting logical replication from 0/a9321360
ERROR: cannot stream from 0/A9321360, minimum is 0/A9320B00

Where 0/A9321360 was sent in the last packet my slon received from the
walsender before the restart.

If I force it to restart replication from 0/A9320B00 I see datarows
that I appear to have already seen before the restart.
I think this is happening when I process the data for 0/A9320B00 but
don't get the feedback message sent before my slon was killed. Is this
expected?

>> Greetings,
>>
>> Andres Freund
>>
>
>
>
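
Editorial sketch, not part of the original message: the restart behaviour described above (rows seen again after restarting from an earlier position) is the usual consequence of a consumer being killed before it confirms what it has processed. One way a consumer can tolerate this is to remember the last position it applied and skip anything at or below it after a restart. The Python below is a minimal illustration under stated assumptions: every change arrives tagged with an LSN in the textual `X/Y` hex form shown in the log messages, LSNs increase strictly from one change to the next, and the consumer stores its last applied LSN durably. The function names `parse_lsn`, `format_lsn` and `apply_once` are hypothetical and are not taken from slony or from the patch set.

```python
def parse_lsn(text: str) -> int:
    """Convert an 'X/Y' LSN string (two hex halves) into one 64-bit integer.

    Hex digits are case-insensitive, so 0/a9321360 and 0/A9321360 are equal.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def format_lsn(value: int) -> str:
    """Render a 64-bit position back into the 'X/Y' upper-case hex form."""
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


def apply_once(changes, last_applied: int):
    """Apply only changes newer than last_applied.

    `changes` is an iterable of (lsn_text, row) pairs in stream order.
    Assumption (hypothetical consumer design): positions are strictly
    increasing, so anything at or below last_applied was already applied
    before the restart and can be skipped.
    Returns the rows applied and the new last applied position.
    """
    applied = []
    for lsn_text, row in changes:
        lsn = parse_lsn(lsn_text)
        if lsn <= last_applied:
            continue  # already processed before the restart
        applied.append(row)
        last_applied = lsn
    return applied, last_applied


if __name__ == "__main__":
    # Restart from the earlier position while remembering the later one.
    replayed = [("0/A9320B00", "row 1"), ("0/A9321360", "row 2"),
                ("0/A9321400", "row 3")]
    rows, position = apply_once(replayed, parse_lsn("0/a9321360"))
    print(rows, format_lsn(position))
```

Whether the walsender should accept the later start position at all is the question posed in the message; this sketch only shows how a client could avoid applying replayed rows twice if it has to restart from the earlier one.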