Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CALDaNm1WEnw2Oykb90PO1c4oDAVrAR+16W8Cm_F-KzgNvqmmKg@mail.gmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Fri, 11 Nov 2022 at 10:39, Peter Smith <smithpb2250@gmail.com> wrote: > > On Fri, Nov 11, 2022 at 3:47 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Here are more review comments for the v32-0001 file ddl_deparse.c > > > > *** NOTE - my review post became too big, so I split it into smaller parts. > > THIS IS PART 2 OF 4. > > ======= > > src/backend/commands/ddl_deparse.c > > 1. deparse_AlterExtensionStmt > > +/* > + * Deparse an AlterExtensionStmt (ALTER EXTENSION .. UPDATE TO VERSION) > + * > + * Given an extension OID and a parse tree that modified it, return an ObjTree > + * representing the alter type. > + */ > +static ObjTree * > +deparse_AlterExtensionStmt(Oid objectId, Node *parsetree) > > Spurious blank space before "OID" Modified > 2. > > + ObjTree *stmt; > + ObjTree *tmp; > + List *list = NIL; > + ListCell *cell; > > Variable 'tmp' can be declared only in the scope that it is used. Modified > 3. > > + foreach(cell, node->options) > + { > + DefElem *opt = (DefElem *) lfirst(cell); > + > + if (strcmp(opt->defname, "new_version") == 0) > + { > + tmp = new_objtree_VA("TO %{version}L", 2, > + "type", ObjTypeString, "version", > + "version", ObjTypeString, defGetString(opt)); > + list = lappend(list, new_object_object(tmp)); > + } > + else > + elog(ERROR, "unsupported option %s", opt->defname); > + } > > This code seems strange to be adding new versions to a list. How can > there be multiple new versions? It does not seem compatible with the > command syntax [1] Modified > 4. deparse_CreateCastStmt > > + initStringInfo(&func); > + appendStringInfo(&func, "%s(", > + quote_qualified_identifier(get_namespace_name(funcForm->pronamespace), > + NameStr(funcForm->proname))); > + for (i = 0; i < funcForm->pronargs; i++) > + appendStringInfoString(&func, > + format_type_be_qualified(funcForm->proargtypes.values[i])); > + appendStringInfoChar(&func, ')'); > > Is this correct, or should there be some separators (e.g. commas) > between multiple arg-types? Modified > 5. deparse_AlterDefaultPrivilegesStmt > > + > +static ObjTree * > +deparse_AlterDefaultPrivilegesStmt(CollectedCommand *cmd) > > Missing function comment Modified > 6. > > + schemas = lappend(schemas, > + new_string_object(strVal(val))); > > Unnecessary wrapping. Modified > 7. > > + /* Add the IN SCHEMA clause, if any */ > + tmp = new_objtree("IN SCHEMA"); > + append_array_object(tmp, "%{schemas:, }I", schemas); > + if (schemas == NIL) > + append_bool_object(tmp, "present", false); > + append_object_object(alterStmt, "%{in_schema}s", tmp); > + > + /* Add the FOR ROLE clause, if any */ > + tmp = new_objtree("FOR ROLE"); > + append_array_object(tmp, "%{roles:, }R", roles); > + if (roles == NIL) > + append_bool_object(tmp, "present", false); > + append_object_object(alterStmt, "%{for_roles}s", tmp); > > > I don't really understand why the logic prefers to add a whole new > empty tree with "present: false" versus just adding nothing at all > unless it is relevant. This code was intended to generate a verbose json node. So that user can easily modify the command by changing the value to generate a new ddl command. > 8. > > + if (stmt->action->is_grant) > + grant = new_objtree("GRANT"); > + else > + grant = new_objtree("REVOKE"); > + > + /* add the GRANT OPTION clause for REVOKE subcommand */ > + if (!stmt->action->is_grant) > + { > + tmp = new_objtree_VA("GRANT OPTION FOR", > + 1, "present", ObjTypeBool, > + stmt->action->grant_option); > + append_object_object(grant, "%{grant_option}s", tmp); > + } > > That 2nd 'if' can just be combined with the 'else' logic of the prior if. Modified > 9. > > + Assert(priv->cols == NIL); > + privs = lappend(privs, > + new_string_object(priv->priv_name)); > > Unnecessary wrapping. Modified > 10. deparse_AlterTableStmt > > Maybe this function name should be different because it is not only > for TABLEs but also serves for INDEX, VIEW, TYPE, etc Changed the function name > 11. > > AFAICT every case in the switch (subcmd->subtype) is doing subcmds = > lappend(subcmds, new_object_object(tmpobj)); > > Just doing this in common code at the end might be an easy way to > remove ~50 lines of duplicate code. There are fewer cases where we don't do anything: case AT_ReAddIndex: case AT_ReAddConstraint: case AT_ReAddComment: case AT_ReplaceRelOptions: case AT_CheckNotNull: case AT_ReAddStatistics: /* Subtypes used for internal operations; nothing to do here */ break; case AT_AddColumnToView: /* CREATE OR REPLACE VIEW -- nothing to do here */ break; This lappend cannot be moved to end as it is not applicable for all the cases. > 12. deparse_ColumnDef > > + * NOT NULL constraints in the column definition are emitted directly in the > + * column definition by this routine; other constraints must be emitted > + * elsewhere (the info in the parse node is incomplete anyway.). > + */ > +static ObjTree * > +deparse_ColumnDef(Relation relation, List *dpcontext, bool composite, > + ColumnDef *coldef, bool is_alter, List **exprs) > > "anyway.)." -> "anyway)." Modified > 13. > > + /* USING clause */ > + tmpobj = new_objtree("COMPRESSION"); > + if (coldef->compression) > + append_string_object(tmpobj, "%{compression_method}I", coldef->compression); > + else > + { > + append_null_object(tmpobj, "%{compression_method}I"); > + append_bool_object(tmpobj, "present", false); > + } > > Why is it necessary to specify a NULL compression method if the entire > "COMPRESSION" is anyway flagged as present=false? This code was intended to generate a verbose json node. So that user can easily modify the command by changing the value to generate a new ddl command. > 14. > > + foreach(cell, coldef->constraints) > + { > + Constraint *constr = (Constraint *) lfirst(cell); > + > + if (constr->contype == CONSTR_NOTNULL) > + saw_notnull = true; > + } > > Why not break immediately from this loop the first time you find > 'saw_notnull' true? Modified > 15. > > + tmpobj = new_objtree("DEFAULT"); > + if (attrForm->atthasdef) > + { > + char *defstr; > + > + defstr = RelationGetColumnDefault(relation, attrForm->attnum, > + dpcontext, exprs); > + > + append_string_object(tmpobj, "%{default}s", defstr); > + } > + else > + append_bool_object(tmpobj, "present", false); > + append_object_object(column, "%{default}s", tmpobj); > > Something seems a bit strange here. It looks like there are formats > called "%{default}s" at 2 levels in this tree, so will it cause a > hierarchy of objects with the same name? Yes it will have contents something like: "default": {"fmt": "DEFAULT %{default}s", "default": "11"}}}], I don't not find any issues even though it has same name, it is able to replicate the statement without any issue > 16. deparse_ColumnIdentity > > + column = new_objtree(""); > + > + if (!OidIsValid(seqrelid)) > + { > + append_bool_object(column, "present", false); > + return column; > + } > > I don't really understand the point of making empty tree structures > for not "present" elements. IIUC this is just going to make the tree > bigger for no reason and all these not "present" branches will be > ultimately thrown away, right? I guess the justification is that it > might be for debugging/documentation but that does not really stand up > in this case because it seems like just a nameless tree here. Modified > 17. deparse_CreateDomain > > + createDomain = new_objtree("CREATE"); > + > + append_object_object(createDomain, > + "DOMAIN %{identity}D AS", > + new_objtree_for_qualname_id(TypeRelationId, > + objectId)); > + append_object_object(createDomain, > + "%{type}T", > + new_objtree_for_type(typForm->typbasetype, typForm->typtypmod)); > + > + if (typForm->typnotnull) > + append_string_object(createDomain, "%{not_null}s", "NOT NULL"); > + else > + append_string_object(createDomain, "%{not_null}s", ""); > > 17a. > I don't understand why this is not just a single _VA() call instead of > spread over multiple append_objects like this. Modified > 17b. > In other places, something like the "%{not_null}s" is done with a > ternary operator instead of the excessive if/else. Modified > 18. deparse_CreateFunction > > + if (isnull) > + probin = NULL; > + else > + { > + probin = TextDatumGetCString(tmpdatum); > + if (probin[0] == '\0' || strcmp(probin, "-") == 0) > + probin = NULL; > + } > > Maybe it is simpler to assign prbin = NULL where it is declared, then > here you only need to test the !isnull case. Modified > 19. > > + append_string_object(createFunc, "%{or_replace}s", > + node->replace ? "OR REPLACE" : ""); > > It is not clear to me what is the point of such code - I mean if > node->replace is false why do append at all? ... Why not use > appen_format_string() instead()? > My guess is that this way is preferred to simplify the calling code, > but knowing that a "" value will just do nothing anyway - seems an > overcomplicated way to do it though. We generally use this along with new_objtree_VA, in this case it was not used along with new_objtree_VA. I have changed it to new_objtree_VA so that it can be combined > 20. > > + typarray = palloc(list_length(node->parameters) * sizeof(Oid)); > + if (list_length(node->parameters) > procForm->pronargs) > + { > + Datum alltypes; > + Datum *values; > + bool *nulls; > + int nelems; > + > + alltypes = SysCacheGetAttr(PROCOID, procTup, > + Anum_pg_proc_proallargtypes, &isnull); > + if (isnull) > + elog(ERROR, "NULL proallargtypes, but more parameters than args"); > + deconstruct_array(DatumGetArrayTypeP(alltypes), > + OIDOID, 4, 't', 'i', > + &values, &nulls, &nelems); > + if (nelems != list_length(node->parameters)) > + elog(ERROR, "mismatched proallargatypes"); > + for (i = 0; i < list_length(node->parameters); i++) > + typarray[i] = values[i]; > + } > + else > + { > + for (i = 0; i < list_length(node->parameters); i++) > + typarray[i] = procForm->proargtypes.values[i]; > + } > > The list_length(node->parameters) is used multiple times here; it > might have been cleaner code to assign that to some local variable. Modified > 21. > > + * Note that %{name}s is a string here, not an identifier; the reason > + * for this is that an absent parameter name must produce an empty > + * string, not "", which is what would happen if we were to use > + * %{name}I here. So we add another level of indirection to allow us > + * to inject a "present" parameter. > + */ > > The above comment says: > must produce an empty string, not "" > > I didn't get the point - what is the difference between an empty string and ""? if we specify as %{variable}s and if variable is specified as "", the append_XXX functions will add a " " while deparsing json to string in the subscriber side. It is not intended to add " " in this case. I have changed it to below which is better and being followed similarly in other places too: if (param->name) append_string_object(name, "%{name}I", param->name); else { append_null_object(name, "%{name}I"); append_bool_object(name, "present", false); } append_object_object(paramobj, "%{name}s", name); I have removed the comment which is confusing and changed it to above which is common way that is used in these scenarios > 22. > > + append_string_object(paramobj, "%{mode}s", > + param->mode == FUNC_PARAM_IN ? "IN" : > + param->mode == FUNC_PARAM_OUT ? "OUT" : > + param->mode == FUNC_PARAM_INOUT ? "INOUT" : > + param->mode == FUNC_PARAM_VARIADIC ? "VARIADIC" : > + "IN"); > > There doesn't seem to be much point to test for param->mode == > FUNC_PARAM_IN here since "IN" is the default mode anyway. Modified > 23. > > + name = new_objtree(""); > + append_string_object(name, "%{name}I", > + param->name ? param->name : "NULL"); > + > + append_bool_object(name, "present", > + param->name ? true : false); > > IIUC it is uncommon to inject a "present" object if it was "true", so > why do it like that here? Modified to keep it similar as we do in other places > 24. > > + append_format_string(tmpobj, "("); > + append_array_object(tmpobj, "%{arguments:, }s", params); > + append_format_string(tmpobj, ")"); > > Is it necessary to do that in 3 lines? IIUC it would be the same if > the parens were just included in the append_array_object format, > right? There is a possibility that arguments is null, if it is null then append_array_object will not add "(" and ")". That is the reason we need to keep it separate. > 25. > > + if (procForm->prosupport) > + { > + Oid argtypes[1]; > + > + /* > + * We should qualify the support function's name if it wouldn't be > + * resolved by lookup in the current search path. > + */ > + argtypes[0] = INTERNALOID; > > Might as well just declare this as: > > Oid argtypes[] = { INTERNALOID }; Modified > 26. deparse_CreateOpClassStmt > > + > + stmt = new_objtree_VA("CREATE OPERATOR CLASS %{identity}D", 1, > + "identity", ObjTypeObject, > + new_objtree_for_qualname(opcForm->opcnamespace, > + NameStr(opcForm->opcname))); > + > + /* Add the DEFAULT clause */ > + append_string_object(stmt, "%{default}s", > + opcForm->opcdefault ? "DEFAULT" : ""); > + > + /* Add the FOR TYPE clause */ > + append_object_object(stmt, "FOR TYPE %{type}T", > + new_objtree_for_type(opcForm->opcintype, -1)); > + > + /* Add the USING clause */ > + append_string_object(stmt, "USING %{amname}I", > get_am_name(opcForm->opcmethod)); > > This can all be done just as a single VA call I think. Modified > 27. > > + append_format_string(tmpobj, "("); > + append_array_object(tmpobj, "%{argtypes:, }T", arglist); > + append_format_string(tmpobj, ")"); > > AFAIK this can just be done by a single call including the parens in > the format string of appen_array_object. There is a possibility that arguments is null, if it is null then append_array_object will not add "(" and ")". That is the reason we need to keep it separate. > 28. deparse_CreatePolicyStmt > > + > +static ObjTree * > +deparse_CreatePolicyStmt(Oid objectId, Node *parsetree) > > Missing function comment. Modified > 29. > > + /* Add the rest of the stuff */ > + add_policy_clauses(policy, objectId, node->roles, !!node->qual, > + !!node->with_check); > > The !! to cast the pointer parameter to boolean is cute, but IIUC that > is not commonly used in the PG source. Maybe it is more conventional > to just pass node->qual != NULL etc? Modified > 30. deparse_AlterPolicyStmt > > + > +static ObjTree * > +deparse_AlterPolicyStmt(Oid objectId, Node *parsetree) > > Missing function comment. Modified > 31. > > + /* Add the rest of the stuff */ > + add_policy_clauses(policy, objectId, node->roles, !!node->qual, > + !!node->with_check); > > The !! to cast the pointer parameter to boolean is cute, but IIUC that > technique is not commonly used in the PG source. Maybe it is more > conventional to just pass node->qual != NULL etc? Modifed > 32. deparse_CreateSchemaStmt > > + else > + { > + append_null_object(auth, "%{authorization_role}I "); > + append_bool_object(auth, "present", false); > + } > > 32a. > Why append a NULL object if the "present" says it is false anyway? This code was intended to generate a verbose json node. So that user can easily modify the command by changing the value to generate a new ddl command. > 32b. > "%{authorization_role}I " -- why do they have extra space on the end? > Just let the append_XXX functions can take care of the space > separators automagically instead. Modified > 33. deparse_AlterDomainStmt > > + { > + fmt = "ALTER DOMAIN"; > + type = "drop default"; > + alterDom = new_objtree_VA(fmt, 1, "type", ObjTypeString, type); > > This code style of assigning the 'fmt' and 'type' like this is not > typical of all the other deparse_XXX functions which just pass > parameter literals. Also, I see no good reason that the 'fmt' is > unconditionally assigned to "ALTER DOMAIN" in 6 different places. Removed it > 34. > > AFAICT all these cases can be simplified to use single VA() calls and > remove all the append_XXX. Modified > 35. > > + > + break; > + case 'N': > > Spurious or misplaced blank line. Modified > 36. > > + case 'C': > + > + /* > + * ADD CONSTRAINT. Only CHECK constraints are supported by > + * domains > + */ > > A spurious blank line is inconsistent with the other cases. Modified > 36. > > + > + break; > + default: > > Spurious or misplaced blank line. Modified > 37. deparse_CreateStatisticsStmt > > + append_format_string(createStat, "FROM"); > + > + append_object_object(createStat, "%{stat_table_identity}D", > + new_objtree_for_qualname(get_rel_namespace(statform->stxrelid), > + get_rel_name(statform->stxrelid))); > > It would be easier to do things like this using a single call using a > format of "FROM %{stat_table_identity}D", rather than have the extra > append_format_string call. Modified > 38. deparse_CreateForeignServerStmt > > + /* Add a TYPE clause, if any */ > + tmp = new_objtree_VA("TYPE", 0); > + if (node->servertype) > + append_string_object(tmp, "%{type}L", node->servertype); > + else > + append_bool_object(tmp, "present", false); > + append_object_object(createServer, "%{type}s", tmp); > + > + /* Add a VERSION clause, if any */ > + tmp = new_objtree_VA("VERSION", 0); > > Why use the VA() function if passing 0 args? Modified > 39. > > + append_string_object(createServer, "FOREIGN DATA WRAPPER %{fdw}I", > node->fdwname); > + /* add an OPTIONS clause, if any */ > + append_object_object(createServer, "%{generic_options}s", > + deparse_FdwOptions(node->options, NULL)); > > 39a. > Use uppercase comment. Modified > 39b. > Missing blank line above comment? Modified > 40. deparse_AlterForeignServerStmt > > + /* Add a VERSION clause, if any */ > + tmp = new_objtree_VA("VERSION", 0); > > Why use the VA() function if passing 0 args? This was already fixed > 41. > > + /* Add a VERSION clause, if any */ > + tmp = new_objtree_VA("VERSION", 0); > + if (node->has_version && node->version) > + append_string_object(tmp, "%{version}L", node->version); > + else if (node->has_version) > + append_string_object(tmp, "version", "NULL"); > + else > + append_bool_object(tmp, "present", false); > + append_object_object(alterServer, "%{version}s", tmp); > + > + /* Add a VERSION clause, if any */ > + tmp = new_objtree_VA("VERSION", 0); > + if (node->has_version && node->version) > + append_string_object(tmp, "%{version}L", node->version); > + else if (node->has_version) > + append_string_object(tmp, "version", "NULL"); > + else > + append_bool_object(tmp, "present", false); > + append_object_object(alterServer, "%{version}s", tmp); > > Huh? Looks like a cut/paste error of duplicate VERSION clauses. Is this correct? Removed the duplicate code > 42. deparse_CreateStmt > > + if (tableelts == NIL) > + { > + tmpobj = new_objtree(""); > + append_bool_object(tmpobj, "present", false); > + } > + else > + tmpobj = new_objtree_VA("(%{elements:, }s)", 1, > + "elements", ObjTypeArray, tableelts); > > This fragment seemed a bit complicated. IIUC this is the same as just: > > tmpobj = new_objtree(""); > if (tableelts) > append_array_object(tmpobj, "(%{elements:, }s)", tableelts); > else > append_bool_object(tmpobj, "present", false); Modified > 43. > > + tmpobj = new_objtree("INHERITS"); > + if (list_length(node->inhRelations) > 0) > + append_array_object(tmpobj, "(%{parents:, }D)", > deparse_InhRelations(objectId)); > + else > + { > + append_null_object(tmpobj, "(%{parents:, }D)"); > + append_bool_object(tmpobj, "present", false); > + } > + append_object_object(createStmt, "%{inherits}s", tmpobj); > > 43a. > AFAIK convention for checking non-empty List is just "if > (node->inhRelations != NIL)" or simply "if (node->inhRelations) Modified > 43b. > Maybe I misunderstand something but I don't see why append_null_object > is needed for tree marked as "present"=false anyhow. This similar > pattern happens multiple times in this function. This code was intended to generate a verbose json node. So that user can easily modify the command by changing the value to generate a new ddl command. > 44. deparse_DefineStmt > > + switch (define->kind) > + { > > IMO better to put all these OBJECT_XXX cases in alphabetical order > instead of just random. Modified > 45. > > + default: > + elog(ERROR, "unsupported object kind"); > + } > > Should this also log what the define->kind was attempted? Modified > 46. deparse_DefineStmt_Collation > > + stmt = new_objtree_VA("CREATE COLLATION", 0); > + > + append_object_object(stmt, "%{identity}D", > + new_objtree_for_qualname(colForm->collnamespace, > + NameStr(colForm->collname))); > > Why not combine there to avoid VA args with 0 and use VA args with 1 instead? Modified > 47. > > + if (fromCollid.objectId != InvalidOid) > > Use OisIsValid macro. Modified > 48. > > + append_object_object(stmt, "FROM %{from_identity}D", > + new_objtree_for_qualname(fromColForm->collnamespace, > + NameStr(fromColForm->collname))); > + > + > + ReleaseSysCache(tp); > + ReleaseSysCache(colTup); > + return stmt; > > Extra blank line. Modified > 49. > > + if (!isnull) > + { > + tmp = new_objtree_VA("LOCALE=", 1, > + "clause", ObjTypeString, "locale"); > + append_string_object(tmp, "%{locale}L", > + psprintf("%s", TextDatumGetCString(datum))); > > IMO it should be easy enough to express this using a single VA(2 args) > function, so avoiding the extra append_string. e.g. other functions > like deparse_DefineStmt_Operator do this. > > And this same comment also applies to the rest of this function: > - tmp = new_objtree_VA("LC_COLLATE=", 1, > - tmp = new_objtree_VA("LC_CTYPE=", 1, > - tmp = new_objtree_VA("PROVIDER=", 1, > - tmp = new_objtree_VA("PROVIDER=", 1, > - tmp = new_objtree_VA("DETERMINISTIC=", 1, > - tmp = new_objtree_VA("VERSION=", 1, Modified Thanks for the comments, the attached v40 patch has the changes for the same. Regards, Vignesh
Attachment
pgsql-hackers by date: