Re: Support logical replication of DDLs - Mailing list pgsql-hackers
From | Peter Smith |
---|---|
Subject | Re: Support logical replication of DDLs |
Date | |
Msg-id | CAHut+Puxo_kq2toicNK_BQdeccK3REGW-Xv8tVauFvTNku6V-w@mail.gmail.com Whole thread Raw |
In response to | Re: Support logical replication of DDLs (Peter Smith <smithpb2250@gmail.com>) |
Responses |
Re: Support logical replication of DDLs
Re: Support logical replication of DDLs |
List | pgsql-hackers |
Here are some more review comments for the v32-0001 file ddl_deparse.c (This is a WIP since it is such a large file) ====== 1. General - calling VA with 0 args There are some calls to new_objtree_VA() where 0 extra args are passed. e.g. See in deparse_AlterFunction * alterFunc = new_objtree_VA("ALTER FUNCTION", 0); * ObjTree *tmpobj = new_objtree_VA("%{type}T", 0); * tmpobj = new_objtree_VA(intVal(defel->arg) ? "RETURNS NULL ON NULL INPUT" : "CALLED ON NULL INPUT", 0); * tmpobj = new_objtree_VA(intVal(defel->arg) ? "SECURITY DEFINER" : "SECURITY INVOKER", 0); * tmpobj = new_objtree_VA(intVal(defel->arg) ? "LEAKPROOF" : "NOT LEAKPROOF", 0); * etc. Shouldn't all those just be calling the new_objtree() function instead of new_objtree_VA()? Probably there are more than just those cited - please search for others. ~~~ 2. General - when to call append_xxx functions? I did not always understand what seems like an arbitrary choice of function calls to append_xxx. e.g. Function deparse_AlterObjectSchemaStmt() does: + append_string_object(alterStmt, "%{identity}s", ident); + + append_string_object(alterStmt, "SET SCHEMA %{newschema}I", newschema); Why doesn't the above just use new_objtree_VA instead -- it seemed to me like the _VA function is underused in some places. Maybe search all the append_xxx usage - I suspect many of those can in fact be combined to use new_objtree_VA(). ~~~ 3. General - extract object names from formats IIUC the calls to append_XXX__object will call deeper to append_object_to_format_string(), which has a main side-effect loop to extract the "name" part out of the sub_fmt string. But this logic all seems inefficient and unnecessary to me. I think in most (all?) cases the caller already knows what the object name should be, so instead of making code work to figure it out again, it can just be passed in the same way the _VA() function passes the known object name. There are many cases of this: e.g. BEFORE append_string_object(alterop, "(%{left_type}s", "NONE"); AFTER - just change the signature to pass the known object name append_string_object(alterop, "(%{left_type}s", "left_type", "NONE"); ~~~ 4. General - fwd declares static void append_array_object(ObjTree *tree, char *name, List *array); static void append_bool_object(ObjTree *tree, char *name, bool value); static void append_float_object(ObjTree *tree, char *name, float8 value); static void append_null_object(ObjTree *tree, char *name); static void append_object_object(ObjTree *tree, char *name, ObjTree *value); static char *append_object_to_format_string(ObjTree *tree, char *sub_fmt); static void append_premade_object(ObjTree *tree, ObjElem *elem); static void append_string_object(ObjTree *tree, char *name, char *value); I think those signatures are misleading. AFAIK seems what is called the 'name' param above is often a 'sub_fmt' param in the implementation. ~~~ 5. General - inconsistent append_array object calls. Sometimes enclosing brackets are included as part of the format string to be appended and other times they are appended separately. IIUC there is no difference but IMO the code should always be consistent to avoid it being confusing. e.g.1 (brackets in fmt) append_array_object(tmpobj, "(%{rettypes:, }s)", rettypes); e.g.2 (brackets appended separately) + append_format_string(tmpobj, "("); + append_array_object(tmpobj, "%{argtypes:, }T", arglist); + append_format_string(tmpobj, ")"); ~~~ 6. General - missing space before '(' I noticed a number of comment where there is a space missing before a '('. Here are some examples: - * An element of an object tree(ObjTree). - * typmod at the middle of name(e.g. TIME(6) with time zone ). We cannot - * Sequence for IDENTITY COLUMN output separately(via CREATE TABLE or - * Sequence for IDENTITY COLUMN output separately(via CREATE TABLE or Search all the patch-code to find others and add missing spaces. ~~~ 7. General - Verbose syntax comments Some (but not all) of the deparse_XXX functions have a comment describing the verbose syntax. e.g. /* * Verbose syntax * * CREATE %{default}s CONVERSION %{identity}D FOR %{source}L TO %{dest}L * FROM %{function}D */ These are helpful for understanding the logic of the function, so IMO similar comments should be written for *all* of the deparse_xxx function. And maybe a more appropriate place to put these comments is in the function header comment. ====== 8. new_objtree_VA + /* + * For all other param types there must be a value in the varargs. + * Fetch it and add the fully formed subobject into the main object. + */ + switch (type) What does the comment mean when it says - for all "other" param types? ~~~ 9. objtree_to_jsonb_element + ListCell *cell; + JsonbValue val; The 'cell' is only for the ObjTypeArray so consider declaring it for that case only. ~~~ 10. obtainConstraints + else + { + Assert(OidIsValid(domainId)); Looks like the above Assert is unnecessary because the earlier Assert (below) already ensures this: + /* Only one may be valid */ + Assert(OidIsValid(relationId) ^ OidIsValid(domainId)); ~~~ 11. pg_get_indexdef_detailed + /* Output tablespace */ + { + Oid tblspc; + + tblspc = get_rel_tablespace(indexrelid); + if (OidIsValid(tblspc)) + *tablespace = pstrdup(quote_identifier(get_tablespace_name(tblspc))); + else + *tablespace = NULL; + } + + /* Report index predicate, if any */ + if (!heap_attisnull(ht_idx, Anum_pg_index_indpred, NULL)) + { + Node *node; + Datum predDatum; + char *predString; + + /* Convert text string to node tree */ + predDatum = SysCacheGetAttr(INDEXRELID, ht_idx, + Anum_pg_index_indpred, &isnull); + Assert(!isnull); + predString = TextDatumGetCString(predDatum); + node = (Node *) stringToNode(predString); + pfree(predString); + + /* Deparse */ + *whereClause = + deparse_expression(node, context, false, false); + } + else + *whereClause = NULL; Maybe just assign defaults: *tablespace = NULL; *whereClause = NULL; then overwrite those defaults, so can avoid the 'else' code. ~~~ 12. stringify_objtype +/* + * Return the given object type as a string. + */ +static const char * +stringify_objtype(ObjectType objtype) 12a. This statics function feels like it belongs more in another module as a utility function. ~ 12b. Actually, this function looks like it might be more appropriate just as a static lookup array/map of names keys by the ObjectType, and using a StaticAssertDecl for sanity checking. ~~~ 13. deparse_GrantStmt + /* + * If there are no objects from "ALL ... IN SCHEMA" to be granted, then we + * need not do anything. + */ + if (istmt->objects == NIL) + return NULL; "we need not do anything." -> "nothing to do." ~~~ 14. deparse_GrantStmt + switch (istmt->objtype) + { + case OBJECT_COLUMN: + case OBJECT_TABLE: + objtype = "TABLE"; + classId = RelationRelationId; + break; + case OBJECT_SEQUENCE: + objtype = "SEQUENCE"; + classId = RelationRelationId; + break; + case OBJECT_DOMAIN: + objtype = "DOMAIN"; + classId = TypeRelationId; + break; + case OBJECT_FDW: + objtype = "FOREIGN DATA WRAPPER"; + classId = ForeignDataWrapperRelationId; + break; + case OBJECT_FOREIGN_SERVER: + objtype = "FOREIGN SERVER"; + classId = ForeignServerRelationId; + break; + case OBJECT_FUNCTION: + objtype = "FUNCTION"; + classId = ProcedureRelationId; + break; + case OBJECT_PROCEDURE: + objtype = "PROCEDURE"; + classId = ProcedureRelationId; + break; + case OBJECT_ROUTINE: + objtype = "ROUTINE"; + classId = ProcedureRelationId; + break; + case OBJECT_LANGUAGE: + objtype = "LANGUAGE"; + classId = LanguageRelationId; + break; + case OBJECT_LARGEOBJECT: + objtype = "LARGE OBJECT"; + classId = LargeObjectRelationId; + break; + case OBJECT_SCHEMA: + objtype = "SCHEMA"; + classId = NamespaceRelationId; + break; + case OBJECT_TYPE: + objtype = "TYPE"; + classId = TypeRelationId; + break; + case OBJECT_DATABASE: + case OBJECT_TABLESPACE: + objtype = ""; + classId = InvalidOid; + elog(ERROR, "global objects not supported"); + break; + default: + elog(ERROR, "invalid OBJECT value %d", istmt->objtype); + } Shouldn't code be calling to the other function stringify_objtype() to do some of this? ~~~ 15. + grantStmt = new_objtree_VA(fmt, 0); + + /* build a list of privileges to grant/revoke */ + if (istmt->all_privs) + { + tmp = new_objtree_VA("ALL PRIVILEGES", 0); Here are some more examples of the _VA function being called with 0 args. Why use _VA function? ~~~ 16. + list = NIL; + + if (istmt->privileges & ACL_INSERT) + list = lappend(list, new_string_object("INSERT")); + if (istmt->privileges & ACL_SELECT) + list = lappend(list, new_string_object("SELECT")); + if (istmt->privileges & ACL_UPDATE) + list = lappend(list, new_string_object("UPDATE")); + if (istmt->privileges & ACL_DELETE) + list = lappend(list, new_string_object("DELETE")); + if (istmt->privileges & ACL_TRUNCATE) + list = lappend(list, new_string_object("TRUNCATE")); + if (istmt->privileges & ACL_REFERENCES) + list = lappend(list, new_string_object("REFERENCES")); + if (istmt->privileges & ACL_TRIGGER) + list = lappend(list, new_string_object("TRIGGER")); + if (istmt->privileges & ACL_EXECUTE) + list = lappend(list, new_string_object("EXECUTE")); + if (istmt->privileges & ACL_USAGE) + list = lappend(list, new_string_object("USAGE")); + if (istmt->privileges & ACL_CREATE) + list = lappend(list, new_string_object("CREATE")); + if (istmt->privileges & ACL_CREATE_TEMP) + list = lappend(list, new_string_object("TEMPORARY")); + if (istmt->privileges & ACL_CONNECT) + list = lappend(list, new_string_object("CONNECT")); 16a. Shouldn't this be trying to re-use code like privilege_to_string() mapping function already in aclchk.c to get all those ACL strings? ~ 16b. Is it correct that ACL_SET and ACL_ALTER_SYSTEM are missing? ~~~ 17. The coding style is inconsistent in this function... For the same things - sometimes use the ternary operator; sometimes use if/else. e.g.1 + append_string_object(grantStmt, "%{grant_option}s", + istmt->grant_option ? "WITH GRANT OPTION" : ""); e.g.2 + if (istmt->behavior == DROP_CASCADE) + append_string_object(grantStmt, "%{cascade}s", "CASCADE"); + else + append_string_object(grantStmt, "%{cascade}s", ""); ~~~ 18. deparse_AlterOpFamily + tmpobj2 = new_objtree_VA("FOR ORDER BY", 0); + append_object_object(tmpobj2, "%{opfamily}D", + new_objtree_for_qualname_id(OperatorFamilyRelationId, + oper->sortfamily)); Why isn't something like this combined to be written as a signle new_objtree_VA call? ~~~ 19. deparse_Type_Storage + tmpstr = psprintf("%s", str); + + fmt = "STORAGE = %{value}s"; + + storage = new_objtree_VA(fmt, 2, + "clause", ObjTypeString, "storage", + "value", ObjTypeString, tmpstr); 19a. What is the purpose of tmpstr? Seems unnecessary ~ 19b. What is the purpose of separate 'fmt' var? Why not just pass format string as a parameter literal to the new_objtree_VA() ~~~ 20. deparse_CreateConversion + /* Add the DEFAULT clause */ + append_string_object(ccStmt, "default", + conForm->condefault ? "DEFAULT" : ""); 20a. Is that code correct? I thought the fmt should look like "%{default}s", otherwise won't the resulting string object have no name? ~ 20b. Anyway, it does not seem to match what the preceding verbose syntax comment says. ~~~ 21. + + + /* Add the DEFAULT clause */ + append_string_object(ccStmt, "default", + conForm->condefault ? "DEFAULT" : ""); + + tmpObj = new_objtree_for_qualname(conForm->connamespace, NameStr(conForm->conname)); + append_object_object(ccStmt, "CONVERSION %{identity}D", tmpObj); + append_string_object(ccStmt, "FOR %{source}L", (char *) + pg_encoding_to_char(conForm->conforencoding)); + append_string_object(ccStmt, "TO %{dest}L", (char *) + pg_encoding_to_char(conForm->contoencoding)); + append_object_object(ccStmt, "FROM %{function}D", + new_objtree_for_qualname_id(ProcedureRelationId, + conForm->conproc)); I don't really understand why all this is not written instead using a single new_objtree_VA() call. ~~~ 22. deparse_CreateEnumStmt + enumtype = new_objtree("CREATE TYPE"); + append_object_object(enumtype, "%{identity}D", + new_objtree_for_qualname_id(TypeRelationId, + objectId)); + + values = NIL; + foreach(cell, node->vals) + { + String *val = lfirst_node(String, cell); + + values = lappend(values, new_string_object(strVal(val))); + } + + append_array_object(enumtype, "AS ENUM (%{values:, }L)", values); + return enumtype; Ditto. I don't really understand why all this is not written instead using a single new_objtree_VA() call. ~~~ 23. deparse_CreateExtensionStmt + extStmt = new_objtree("CREATE EXTENSION"); + + append_string_object(extStmt, "%{if_not_exists}s", + node->if_not_exists ? "IF NOT EXISTS" : ""); + + append_string_object(extStmt, "%{name}I", node->extname); Ditto. I don't really understand why all this is not written instead using a single new_objtree_VA() call. ~~~ 24. deparse_FdwOptions + tmp = new_objtree_VA("OPTIONS", 0); Isn't it better to call other function instead of passing zero params to this one? ~~~ 25. deparse_CreateFdwStmt 25a. + /* add HANDLER clause */ + if (fdwForm->fdwhandler == InvalidOid) + tmp = new_objtree_VA("NO HANDLER", 0); + else Isn't it better to call other function instead of passing zero params to this one? ~ 25b. + /* add VALIDATOR clause */ + if (fdwForm->fdwvalidator == InvalidOid) + tmp = new_objtree_VA("NO VALIDATOR", 0); Ditto #25a ~ 25c. Both above should use OidIsValid macro. ~~~ 26. deparse_AlterFdwStmt 26a. + /* add HANDLER clause */ + if (fdwForm->fdwhandler == InvalidOid) + tmp = new_objtree_VA("NO HANDLER", 0); Ditto #25a ~ 26b. + /* add VALIDATOR clause */ + if (fdwForm->fdwvalidator == InvalidOid) + tmp = new_objtree_VA("NO VALIDATOR", 0); Ditto #25a ~ 26c. Both above should use OidIsValid macro. ~~~ 27. deparse_CreateRangeStmt + /* SUBTYPE */ + tmp = new_objtree_for_qualname_id(TypeRelationId, + rangeForm->rngsubtype); + tmp = new_objtree_VA("SUBTYPE = %{type}D", + 2, + "clause", ObjTypeString, "subtype", + "type", ObjTypeObject, tmp); + definition = lappend(definition, new_object_object(tmp)); The reusing of 'tmp' variable seems a bit sneaky to me. Perhaps using 'tmp' and 'tmp_qualid' might be a more readable way to go here. ~~~ 28. deparse_AlterEnumStmt + if (node->newValNeighbor) + { + append_string_object(alterEnum, "%{after_or_before}s", + node->newValIsAfter ? "AFTER" : "BEFORE"); + append_string_object(alterEnum, "%{neighbour}L", node->newValNeighbor); + } Has a mix of US and UK spelling of neighbor/neighbour? ------ Kind Regards, Peter Smith. Fujitsu Australia
pgsql-hackers by date: