Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CALDaNm2CazsD+Vibr8Khd_JXAvb1-FDtqhAaMS34HmdH0D9ATw@mail.gmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Peter Smith <smithpb2250@gmail.com>) |
List | pgsql-hackers |
On Thu, 3 Nov 2022 at 13:11, Peter Smith <smithpb2250@gmail.com> wrote: > > Here are some more review comments for the v32-0001 file ddl_deparse.c > > (This is a WIP since it is such a large file) > > ====== > > 1. General - calling VA with 0 args > > There are some calls to new_objtree_VA() where 0 extra args are passed. > > e.g. See in deparse_AlterFunction > * alterFunc = new_objtree_VA("ALTER FUNCTION", 0); > * ObjTree *tmpobj = new_objtree_VA("%{type}T", 0); > * tmpobj = new_objtree_VA(intVal(defel->arg) ? "RETURNS NULL ON NULL > INPUT" : "CALLED ON NULL INPUT", 0); > * tmpobj = new_objtree_VA(intVal(defel->arg) ? "SECURITY DEFINER" : > "SECURITY INVOKER", 0); > * tmpobj = new_objtree_VA(intVal(defel->arg) ? "LEAKPROOF" : "NOT > LEAKPROOF", 0); > * etc. > > Shouldn't all those just be calling the new_objtree() function instead > of new_objtree_VA()? > > Probably there are more than just those cited - please search for others. Modified wherever possible > 2. General - when to call append_xxx functions? > > I did not always understand what seems like an arbitrary choice of > function calls to append_xxx. > > e.g. Function deparse_AlterObjectSchemaStmt() does: > > + append_string_object(alterStmt, "%{identity}s", ident); > + > + append_string_object(alterStmt, "SET SCHEMA %{newschema}I", newschema); > > Why doesn't the above just use new_objtree_VA instead -- it seemed to > me like the _VA function is underused in some places. Maybe search all > the append_xxx usage - I suspect many of those can in fact be combined > to use new_objtree_VA(). Modified wherever possible > 3. General - extract object names from formats > > IIUC the calls to append_XXX__object will call deeper to > append_object_to_format_string(), which has a main side-effect loop to > extract the "name" part out of the sub_fmt string. But this logic all > seems inefficient and unnecessary to me. I think in most (all?) cases > the caller already knows what the object name should be, so instead of > making code work to figure it out again, it can just be passed in the > same way the _VA() function passes the known object name. > > There are many cases of this: > > e.g. > > BEFORE > append_string_object(alterop, "(%{left_type}s", "NONE"); > > AFTER - just change the signature to pass the known object name > append_string_object(alterop, "(%{left_type}s", "left_type", "NONE"); Modified > 4. General - fwd declares > > static void append_array_object(ObjTree *tree, char *name, List *array); > static void append_bool_object(ObjTree *tree, char *name, bool value); > static void append_float_object(ObjTree *tree, char *name, float8 value); > static void append_null_object(ObjTree *tree, char *name); > static void append_object_object(ObjTree *tree, char *name, ObjTree *value); > static char *append_object_to_format_string(ObjTree *tree, char *sub_fmt); > static void append_premade_object(ObjTree *tree, ObjElem *elem); > static void append_string_object(ObjTree *tree, char *name, char *value); > > > I think those signatures are misleading. AFAIK seems what is called > the 'name' param above is often a 'sub_fmt' param in the > implementation. Modified > 5. General - inconsistent append_array object calls. > > Sometimes enclosing brackets are included as part of the format string > to be appended and other times they are appended separately. IIUC > there is no difference but IMO the code should always be consistent to > avoid it being confusing. > > e.g.1 (brackets in fmt) > append_array_object(tmpobj, "(%{rettypes:, }s)", rettypes); > > e.g.2 (brackets appended separately) > + append_format_string(tmpobj, "("); > + append_array_object(tmpobj, "%{argtypes:, }T", arglist); > + append_format_string(tmpobj, ")"); We cannot change it to a single call in all cases because in some cases there is a possibility that the list can be NULL, if the list is empty then append_array_object will return without appending "(". For an empty list, we should append it with (). I'm not making any changes for this. > 6. General - missing space before '(' > > I noticed a number of comment where there is a space missing before a '('. > Here are some examples: > > - * An element of an object tree(ObjTree). > - * typmod at the middle of name(e.g. TIME(6) with time zone ). We cannot > - * Sequence for IDENTITY COLUMN output separately(via CREATE TABLE or > - * Sequence for IDENTITY COLUMN output separately(via CREATE TABLE or > > Search all the patch-code to find others and add missing spaces. Modified > 7. General - Verbose syntax comments > > Some (but not all) of the deparse_XXX functions have a comment > describing the verbose syntax. > > e.g. > /* > * Verbose syntax > * > * CREATE %{default}s CONVERSION %{identity}D FOR %{source}L TO %{dest}L > * FROM %{function}D > */ > > These are helpful for understanding the logic of the function, so IMO > similar comments should be written for *all* of the deparse_xxx > function. > > And maybe a more appropriate place to put these comments is in the > function header comment. Modified > 8. new_objtree_VA > > + /* > + * For all other param types there must be a value in the varargs. > + * Fetch it and add the fully formed subobject into the main object. > + */ > + switch (type) > > What does the comment mean when it says - for all "other" param types? I felt the comment meant "For all param types other than ObjTypeNull", changed it accordingly. > 9. objtree_to_jsonb_element > > + ListCell *cell; > + JsonbValue val; > > The 'cell' is only for the ObjTypeArray so consider declaring it for > that case only. Modified > 10. obtainConstraints > > + else > + { > + Assert(OidIsValid(domainId)); > > Looks like the above Assert is unnecessary because the earlier Assert > (below) already ensures this: > + /* Only one may be valid */ > + Assert(OidIsValid(relationId) ^ OidIsValid(domainId)); Removed it > 11. pg_get_indexdef_detailed > > + /* Output tablespace */ > + { > + Oid tblspc; > + > + tblspc = get_rel_tablespace(indexrelid); > + if (OidIsValid(tblspc)) > + *tablespace = pstrdup(quote_identifier(get_tablespace_name(tblspc))); > + else > + *tablespace = NULL; > + } > + > + /* Report index predicate, if any */ > + if (!heap_attisnull(ht_idx, Anum_pg_index_indpred, NULL)) > + { > + Node *node; > + Datum predDatum; > + char *predString; > + > + /* Convert text string to node tree */ > + predDatum = SysCacheGetAttr(INDEXRELID, ht_idx, > + Anum_pg_index_indpred, &isnull); > + Assert(!isnull); > + predString = TextDatumGetCString(predDatum); > + node = (Node *) stringToNode(predString); > + pfree(predString); > + > + /* Deparse */ > + *whereClause = > + deparse_expression(node, context, false, false); > + } > + else > + *whereClause = NULL; > > Maybe just assign defaults: > *tablespace = NULL; > *whereClause = NULL; > > then overwrite those defaults, so can avoid the 'else' code. Modified > 12. stringify_objtype > > +/* > + * Return the given object type as a string. > + */ > +static const char * > +stringify_objtype(ObjectType objtype) > > 12a. > This statics function feels like it belongs more in another module as > a utility function. Moved it > 12b. > Actually, this function looks like it might be more appropriate just > as a static lookup array/map of names keys by the ObjectType, and > using a StaticAssertDecl for sanity checking. We will not be able to use static array map in this case as this function returns multiple types like in case of OBJECT_COLUMN returns "TABLE" or "COLUMN" based on isgrant is true or false. similarly in case of OBJECT_FOREIGN_SERVER too. I have not made any changes for this. > 13. deparse_GrantStmt > > + /* > + * If there are no objects from "ALL ... IN SCHEMA" to be granted, then we > + * need not do anything. > + */ > + if (istmt->objects == NIL) > + return NULL; > > "we need not do anything." -> "nothing to do." Modified > 14. deparse_GrantStmt > > + switch (istmt->objtype) > + { > + case OBJECT_COLUMN: > + case OBJECT_TABLE: > + objtype = "TABLE"; > + classId = RelationRelationId; > + break; > + case OBJECT_SEQUENCE: > + objtype = "SEQUENCE"; > + classId = RelationRelationId; > + break; > + case OBJECT_DOMAIN: > + objtype = "DOMAIN"; > + classId = TypeRelationId; > + break; > + case OBJECT_FDW: > + objtype = "FOREIGN DATA WRAPPER"; > + classId = ForeignDataWrapperRelationId; > + break; > + case OBJECT_FOREIGN_SERVER: > + objtype = "FOREIGN SERVER"; > + classId = ForeignServerRelationId; > + break; > + case OBJECT_FUNCTION: > + objtype = "FUNCTION"; > + classId = ProcedureRelationId; > + break; > + case OBJECT_PROCEDURE: > + objtype = "PROCEDURE"; > + classId = ProcedureRelationId; > + break; > + case OBJECT_ROUTINE: > + objtype = "ROUTINE"; > + classId = ProcedureRelationId; > + break; > + case OBJECT_LANGUAGE: > + objtype = "LANGUAGE"; > + classId = LanguageRelationId; > + break; > + case OBJECT_LARGEOBJECT: > + objtype = "LARGE OBJECT"; > + classId = LargeObjectRelationId; > + break; > + case OBJECT_SCHEMA: > + objtype = "SCHEMA"; > + classId = NamespaceRelationId; > + break; > + case OBJECT_TYPE: > + objtype = "TYPE"; > + classId = TypeRelationId; > + break; > + case OBJECT_DATABASE: > + case OBJECT_TABLESPACE: > + objtype = ""; > + classId = InvalidOid; > + elog(ERROR, "global objects not supported"); > + break; > + default: > + elog(ERROR, "invalid OBJECT value %d", istmt->objtype); > + } > > > Shouldn't code be calling to the other function stringify_objtype() to > do some of this? Modified > 15. > > + grantStmt = new_objtree_VA(fmt, 0); > + > + /* build a list of privileges to grant/revoke */ > + if (istmt->all_privs) > + { > + tmp = new_objtree_VA("ALL PRIVILEGES", 0); > > Here are some more examples of the _VA function being called with 0 > args. Why use _VA function? This was already modified > 16. > > + list = NIL; > + > + if (istmt->privileges & ACL_INSERT) > + list = lappend(list, new_string_object("INSERT")); > + if (istmt->privileges & ACL_SELECT) > + list = lappend(list, new_string_object("SELECT")); > + if (istmt->privileges & ACL_UPDATE) > + list = lappend(list, new_string_object("UPDATE")); > + if (istmt->privileges & ACL_DELETE) > + list = lappend(list, new_string_object("DELETE")); > + if (istmt->privileges & ACL_TRUNCATE) > + list = lappend(list, new_string_object("TRUNCATE")); > + if (istmt->privileges & ACL_REFERENCES) > + list = lappend(list, new_string_object("REFERENCES")); > + if (istmt->privileges & ACL_TRIGGER) > + list = lappend(list, new_string_object("TRIGGER")); > + if (istmt->privileges & ACL_EXECUTE) > + list = lappend(list, new_string_object("EXECUTE")); > + if (istmt->privileges & ACL_USAGE) > + list = lappend(list, new_string_object("USAGE")); > + if (istmt->privileges & ACL_CREATE) > + list = lappend(list, new_string_object("CREATE")); > + if (istmt->privileges & ACL_CREATE_TEMP) > + list = lappend(list, new_string_object("TEMPORARY")); > + if (istmt->privileges & ACL_CONNECT) > + list = lappend(list, new_string_object("CONNECT")); > > 16a. > Shouldn't this be trying to re-use code like privilege_to_string() > mapping function already in aclchk.c to get all those ACL strings? Modified > 16b. > Is it correct that ACL_SET and ACL_ALTER_SYSTEM are missing? Yes this is intentional as event trigger is not supported for global objects > 17. > > The coding style is inconsistent in this function... > > For the same things - sometimes use the ternary operator; sometimes use if/else. > > e.g.1 > + append_string_object(grantStmt, "%{grant_option}s", > + istmt->grant_option ? "WITH GRANT OPTION" : ""); > > e.g.2 > + if (istmt->behavior == DROP_CASCADE) > + append_string_object(grantStmt, "%{cascade}s", "CASCADE"); > + else > + append_string_object(grantStmt, "%{cascade}s", ""); Modified > 18. deparse_AlterOpFamily > > + tmpobj2 = new_objtree_VA("FOR ORDER BY", 0); > + append_object_object(tmpobj2, "%{opfamily}D", > + new_objtree_for_qualname_id(OperatorFamilyRelationId, > + oper->sortfamily)); > > Why isn't something like this combined to be written as a signle > new_objtree_VA call? Modified > 19. deparse_Type_Storage > > + tmpstr = psprintf("%s", str); > + > + fmt = "STORAGE = %{value}s"; > + > + storage = new_objtree_VA(fmt, 2, > + "clause", ObjTypeString, "storage", > + "value", ObjTypeString, tmpstr); > > 19a. > What is the purpose of tmpstr? Seems unnecessary It is not required, removed it > 19b. > What is the purpose of separate 'fmt' var? Why not just pass format > string as a parameter literal to the new_objtree_VA() Modified > 20. deparse_CreateConversion > > + /* Add the DEFAULT clause */ > + append_string_object(ccStmt, "default", > + conForm->condefault ? "DEFAULT" : ""); > > 20a. > Is that code correct? I thought the fmt should look like > "%{default}s", otherwise won't the resulting string object have no > name? I have changed it to use it in new_objtree_VA and changed it to %{default}s > 20b. > Anyway, it does not seem to match what the preceding verbose syntax > comment says. Modified > 21. > > + > + > + /* Add the DEFAULT clause */ > + append_string_object(ccStmt, "default", > + conForm->condefault ? "DEFAULT" : ""); > + > + tmpObj = new_objtree_for_qualname(conForm->connamespace, > NameStr(conForm->conname)); > + append_object_object(ccStmt, "CONVERSION %{identity}D", tmpObj); > + append_string_object(ccStmt, "FOR %{source}L", (char *) > + pg_encoding_to_char(conForm->conforencoding)); > + append_string_object(ccStmt, "TO %{dest}L", (char *) > + pg_encoding_to_char(conForm->contoencoding)); > + append_object_object(ccStmt, "FROM %{function}D", > + new_objtree_for_qualname_id(ProcedureRelationId, > + conForm->conproc)); > > I don't really understand why all this is not written instead using a > single new_objtree_VA() call. Modified > 22. deparse_CreateEnumStmt > > + enumtype = new_objtree("CREATE TYPE"); > + append_object_object(enumtype, "%{identity}D", > + new_objtree_for_qualname_id(TypeRelationId, > + objectId)); > + > + values = NIL; > + foreach(cell, node->vals) > + { > + String *val = lfirst_node(String, cell); > + > + values = lappend(values, new_string_object(strVal(val))); > + } > + > + append_array_object(enumtype, "AS ENUM (%{values:, }L)", values); > + return enumtype; > > Ditto. I don't really understand why all this is not written instead > using a single new_objtree_VA() call. Modified > 23. deparse_CreateExtensionStmt > > + extStmt = new_objtree("CREATE EXTENSION"); > + > + append_string_object(extStmt, "%{if_not_exists}s", > + node->if_not_exists ? "IF NOT EXISTS" : ""); > + > + append_string_object(extStmt, "%{name}I", node->extname); > > Ditto. I don't really understand why all this is not written instead > using a single new_objtree_VA() call. Modified > 24. deparse_FdwOptions > > + tmp = new_objtree_VA("OPTIONS", 0); > > Isn't it better to call other function instead of passing zero params > to this one? This was already fixed > 25. deparse_CreateFdwStmt > > 25a. > + /* add HANDLER clause */ > + if (fdwForm->fdwhandler == InvalidOid) > + tmp = new_objtree_VA("NO HANDLER", 0); > + else > > Isn't it better to call other function instead of passing zero params > to this one? This is fixed already > 25b. > + /* add VALIDATOR clause */ > + if (fdwForm->fdwvalidator == InvalidOid) > + tmp = new_objtree_VA("NO VALIDATOR", 0); > > Ditto #25a This is fixed already > > 25c. > Both above should use OidIsValid macro. Modified > 26. deparse_AlterFdwStmt > > 26a. > + /* add HANDLER clause */ > + if (fdwForm->fdwhandler == InvalidOid) > + tmp = new_objtree_VA("NO HANDLER", 0); > > Ditto #25a This is fixed already > 26b. > + /* add VALIDATOR clause */ > + if (fdwForm->fdwvalidator == InvalidOid) > + tmp = new_objtree_VA("NO VALIDATOR", 0); > > Ditto #25a This is fixed already > 26c. > Both above should use OidIsValid macro. Modified > 27. deparse_CreateRangeStmt > > + /* SUBTYPE */ > + tmp = new_objtree_for_qualname_id(TypeRelationId, > + rangeForm->rngsubtype); > + tmp = new_objtree_VA("SUBTYPE = %{type}D", > + 2, > + "clause", ObjTypeString, "subtype", > + "type", ObjTypeObject, tmp); > + definition = lappend(definition, new_object_object(tmp)); > > > The reusing of 'tmp' variable seems a bit sneaky to me. Perhaps using > 'tmp' and 'tmp_qualid' might be a more readable way to go here. Removed usage of tmp > 28. deparse_AlterEnumStmt > > + if (node->newValNeighbor) > + { > + append_string_object(alterEnum, "%{after_or_before}s", > + node->newValIsAfter ? "AFTER" : "BEFORE"); > + append_string_object(alterEnum, "%{neighbour}L", node->newValNeighbor); > + } > > Has a mix of US and UK spelling of neighbor/neighbour? Modified to neighbor Thanks for the comments, the attached v49 patch has the changes for the same. Regards, Vignesh
Attachment
- v49-0004-Test-cases-for-DDL-replication.patch
- v49-0001-Functions-to-deparse-DDL-commands.patch
- v49-0005-Skip-ALTER-TABLE-subcommands-generated-for-Table.patch
- v49-0003-Support-CREATE-TABLE-AS-SELECT-INTO.patch
- v49-0002-Support-DDL-replication.patch
- v49-0006-Support-DDL-replication-of-alter-type-having-USI.patch
- v49-0007-Introduce-the-test_ddl_deparse_regress-test-modu.patch
pgsql-hackers by date: