Re: [HACKERS] logical decoding of two-phase transactions - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] logical decoding of two-phase transactions |
Date | |
Msg-id | CAA4eK1J_WyGPnHvwwG41JFJTKFZztZKwZXBrGzMZ6dqwY_ZoJg@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] logical decoding of two-phase transactions (Nikhil Sontakke <nikhils@2ndquadrant.com>) |
Responses |
Re: [HACKERS] logical decoding of two-phase transactions
Re: [HACKERS] logical decoding of two-phase transactions |
List | pgsql-hackers |
On Tue, Oct 6, 2020 at 10:23 AM Peter.B.Smith@fujitsu.com <Peter.B.Smith@fujitsu.com> wrote: > > > [BEGIN] > > ========== > Patch V6-0001, File: contrib/test_decoding/expected/prepared.out (so > prepared.sql also) > ========== > > COMMENT > Line 30 - The INSERT INTO test_prepared1 VALUES (2); is kind of > strange because it is not really part of the prior test nor the > following test. Maybe it would be better to have a comment describing > the purpose of this isolated INSERT and to also consume the data from > the slot so it does not get jumbled with the data of the following > (abort) test. > > ; > > COMMENT > Line 53 - Same comment for this test INSERT INTO test_prepared1 VALUES > (4); It kind of has nothing really to do with either the prior (abort) > test nor the following (ddl) test. > > ; > > COMMENT > Line 60 - Seems to check which locks are held for the test_prepared_1 > table while the transaction is in progress. Maybe it would be better > to have more comments describing what is expected here and why. > > ; > > COMMENT > Line 88 - There is a comment in the test saying "-- We should see '7' > before '5' in our results since it commits first." but I did not see > any test code that actually verifies that happens. > > ; All the above comments are genuine and I think it is mostly because the author has blindly modified the existing tests without completely understanding the intent of the test. I suggest we write a completely new regression file (decode_prepared.sql) for these and just copy whatever is required from prepared.sql. Once we do that we might also want to rename existing prepared.sql to decode_commit_prepared.sql or something like that. I think modifying the existing test appears to be quite ugly and also it is changing the intent of the existing tests. > > QUESTION > Line 120 - I did not really understand the SQL checking the pg_class. > I expected this would be checking table 'test_prepared1' instead. Can > you explain it? > SELECT 'pg_class' AS relation, locktype, mode > FROM pg_locks > WHERE locktype = 'relation' > AND relation = 'pg_class'::regclass; > relation | locktype | mode > ----------+----------+------ > (0 rows) > > ; Yes, I also think your expectation is correct and this should be on 'test_prepared_1'. > > QUESTION > Line 139 - SET statement_timeout = '1s'; is 1 seconds short enough > here for this test, or might it be that these statements would be > completed in less than one seconds anyhow? > > ; Good question. I think we have to mention the reason why logical decoding is not blocked while it needs to acquire a shared lock on the table and the previous commands already held an exclusive lock on the table. I am not sure if I am missing something but like you, it is not clear to me as well what this test intends to do, so surely more commentary is required. > > QUESTION > Line 163 - How is this testing a SAVEPOINT? Or is it only to check > that the SAVEPOINT command is not part of the replicated changes? > > ; It is more of testing that subtransactions will not create a problem while decoding. > > COMMENT > Line 175 - Missing underscore in comment. Code requires also underscore: > "nodecode" --> "_nodecode" > makes sense. > ========== > Patch V6-0001, File: contrib/test_decoding/test_decoding.c > ========== > > COMMENT > Line 43 > @@ -36,6 +40,7 @@ typedef struct > bool skip_empty_xacts; > bool xact_wrote_changes; > bool only_local; > + TransactionId check_xid; /* track abort of this txid */ > } TestDecodingData; > > The "check_xid" seems a meaningless name. Check what? > IIUC maybe should be something like "check_xid_aborted" > > ; > > COMMENT > Line 105 > @ -88,6 +93,19 @@ static void > pg_decode_stream_truncate(LogicalDecodingContext *ctx, > ReorderBufferTXN *txn, > int nrelations, Relation relations[], > ReorderBufferChange *change); > +static bool pg_decode_filter_prepare(LogicalDecodingContext *ctx, > + ReorderBufferTXN *txn, > > Remove extra blank line after these functions > > ; The above two sounds reasonable suggestions. > > COMMENT > Line 149 > @@ -116,6 +134,11 @@ _PG_output_plugin_init(OutputPluginCallbacks *cb) > cb->stream_change_cb = pg_decode_stream_change; > cb->stream_message_cb = pg_decode_stream_message; > cb->stream_truncate_cb = pg_decode_stream_truncate; > + cb->filter_prepare_cb = pg_decode_filter_prepare; > + cb->prepare_cb = pg_decode_prepare_txn; > + cb->commit_prepared_cb = pg_decode_commit_prepared_txn; > + cb->abort_prepared_cb = pg_decode_abort_prepared_txn; > + > } > > There is a confusing mix of terminology where sometimes things are > referred as ROLLBACK/rollback and other times apparently the same > operation is referred as ABORT/abort. I do not know the root cause of > this mixture. IIUC maybe the internal functions and protocol generally > use the term "abort", whereas the SQL syntax is "ROLLBACK"... but > where those two terms collide in the middle it gets quite confusing. > > At least I thought the names of the "callbacks" which get exposed to > the user (e.g. in the help) might be better if they would match the > SQL. > "abort_prepared_cb" --> "rollback_prepared_db" > This suggestion sounds reasonable. I think it is to entertain the case where due to error we need to rollback the transaction. I think it is better if use 'rollback' terminology in the exposed functions. We already have a function with the name stream_abort_cb in the code which we also might want to rename but that is a separate thing and we can deal it with a separate patch. > There are similar review comments like this below where the > alternating terms caused me some confusion. > > ~ > > Also, Remove the extra blank line before the end of the function. > > ; > > COMMENT > Line 267 > @ -227,6 +252,42 @@ pg_decode_startup(LogicalDecodingContext *ctx, > OutputPluginOptions *opt, > errmsg("could not parse value \"%s\" for parameter \"%s\"", > strVal(elem->arg), elem->defname))); > } > + else if (strcmp(elem->defname, "two-phase-commit") == 0) > + { > + if (elem->arg == NULL) > + continue; > > IMO the "check-xid" code might be better rearranged so the NULL check > is first instead of if/else. > e.g. > if (elem->arg == NULL) > ereport(FATAL, > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("check-xid needs an input value"))); > ~ > > Also, is it really supposed to be FATAL instead or ERROR. That is not > the same as the other surrounding code. > > ; +1. > > COMMENT > Line 296 > if (data->check_xid <= 0) > ereport(ERROR, > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("Specify positive value for parameter \"%s\"," > " you specified \"%s\"", > elem->defname, strVal(elem->arg)))); > > The code checking for <= 0 seems over-complicated. Because conversion > was using strtoul() I fail to see how this can ever be < 0. Wouldn't > it be easier to simply test the result of the strtoul() function? > > BEFORE: if (errno == EINVAL || errno == ERANGE) > AFTER: if (data->check_xid == 0) > Better to use TransactionIdIsValid(data->check_xid) here. > ~ > > Also, should this be FATAL? Everything else similar is ERROR. > > ; It should be an error. > > COMMENT > (general) > I don't recall seeing any of these decoding options (e.g. > "two-phase-commit", "check-xid") documented anywhere. > So how can a user even know these options exist so they can use them? > Perhaps options should be described on this page? > https://www.postgresql.org/docs/13/functions-admin.html#FUNCTIONS-REPLICATION > > ; I think we should do what we are doing for other options, if they are not documented then why to document this one separately. I guess we can make a case to document all the existing options and write a separate patch for that. > > COMMENT > (general) > "check-xid" is a meaningless option name. Maybe something like > "checked-xid-aborted" is more useful? > Suggest changing the member, the option, and the error messages to > match some better name. > > ; > > COMMENT > Line 314 > @@ -238,6 +299,7 @@ pg_decode_startup(LogicalDecodingContext *ctx, > OutputPluginOptions *opt, > } > > ctx->streaming &= enable_streaming; > + ctx->enable_twophase &= enable_2pc; > } > > The "ctx->enable_twophase" is inconsistent naming with the > "ctx->streaming" member. > "enable_twophase" --> "twophase" > > ; +1. > > COMMENT > Line 374 > @@ -297,6 +359,94 @@ pg_decode_commit_txn(LogicalDecodingContext *ctx, > ReorderBufferTXN *txn, > OutputPluginWrite(ctx, true); > } > > + > +/* > + * Filter out two-phase transactions. > + * > + * Each plugin can implement its own filtering logic. Here > + * we demonstrate a simple logic by checking the GID. If the > + * GID contains the "_nodecode" substring, then we filter > + * it out. > + */ > +static bool > +pg_decode_filter_prepare(LogicalDecodingContext *ctx, ReorderBufferTXN *txn, > > Remove the extra preceding blank line. > > ~ > > I did not find anything in the help about "_nodecode". Should it be > there or is this deliberately not documented feature? > > ; I guess we can document it along with filter_prepare API, if not already documented. > > QUESTION > Line 440 > +pg_decode_abort_prepared_txn(LogicalDecodingContext *ctx, > ReorderBufferTXN *txn, > > Is this a wrong comment > "ABORT PREPARED" --> "ROLLBACK PREPARED" ?? > > ; > > COMMENT > Line 620 > @@ -455,6 +605,22 @@ pg_decode_change(LogicalDecodingContext *ctx, > ReorderBufferTXN *txn, > } > data->xact_wrote_changes = true; > > + /* if check_xid is specified */ > + if (TransactionIdIsValid(data->check_xid)) > + { > + elog(LOG, "waiting for %u to abort", data->check_xid); > + while (TransactionIdIsInProgress(dat > > The check_xid seems a meaningless name, and the comment "/* if > check_xid is specified */" was not helpful either. > IIUC purpose of this is to check that the nominated xid always is rolled back. > So the appropriate name may be more like "check-xid-aborted". > > ; Yeah, this part deserves better comments. -- With Regards, Amit Kapila.
pgsql-hackers by date: