RE: Support logical replication of DDLs - Mailing list pgsql-hackers

From houzj.fnst@fujitsu.com
Subject RE: Support logical replication of DDLs
Date
Msg-id OS0PR01MB5716D3B79A607B547F2AB90D94819@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: Support logical replication of DDLs  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Support logical replication of DDLs
List pgsql-hackers
On Wednesday, June 29, 2022 6:48 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> 
> On Wed, Jun 29, 2022 at 3:24 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wednesday, June 29, 2022 11:07 AM Amit Kapila
> <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jun 28, 2022 at 5:43 PM Amit Kapila
> > > <amit.kapila16@gmail.com>
> > > wrote:
> > > >
> > > > 5.
> > > > +static ObjTree *
> > > > +deparse_CreateStmt(Oid objectId, Node *parsetree)
> > > > {
> > > > ...
> > > > + tmp = new_objtree_VA("TABLESPACE %{tablespace}I", 0); if
> > > > + (node->tablespacename) append_string_object(tmp, "tablespace",
> > > > + node->tablespacename); else { append_null_object(tmp,
> > > > + node->"tablespace");
> > > > + append_bool_object(tmp, "present", false); }
> > > > + append_object_object(createStmt, "tablespace", tmp);
> > > > ...
> > > > }
> > > >
> > > > Why do we need to append the objects (tablespace, with clause,
> > > > etc.) when they are not present in the actual CREATE TABLE
> > > > statement? The reason to ask this is that this makes the string
> > > > that we want to send downstream much longer than the actual
> > > > statement given by the user on the publisher.
> > > >
> > >
> > > After thinking some more on this, it seems the user may want to
> > > optionally change some of these attributes, for example, on the
> > > subscriber, it may want to associate the table with a different
> > > tablespace. I think to address that we can append these additional
> > > attributes optionally, say via an additional parameter
> > > (append_all_options/append_all_attributes or something like that) in
> exposed APIs like deparse_utility_command().
> >
> > I agree and will research this part.
> >
> 
> Okay, note that similar handling would be required at other places like
> deparse_ColumnDef. Few other comments on
> v9-0001-Functions-to-deparse-DDL-commands.
> 
> 1.
> +static ObjElem *new_bool_object(bool value); static ObjElem
> +*new_string_object(char *value); static ObjElem
> +*new_object_object(ObjTree *value); static ObjElem
> +*new_array_object(List *array); static ObjElem
> +*new_integer_object(int64 value); static ObjElem
> +*new_float_object(float8 value);
> 
> Here, new_object_object() seems to be declared out-of-order (not in sync with
> its order of definition). Similarly, see all other append_* functions.

Changed.

> 2. The function printTypmod in ddl_deparse.c appears to be the same as the
> function with the same name in format_type.c. If so, isn't it better to have a
> single copy of that function?

Changed.

> 3. As I pointed out yesterday, there is some similarity between
> format_type_extended and format_type_detailed. Can we try to extract
> common functionality? If this is feasible, it is better to do this as a separate
> patch. Also, this can obviate the need to expose printTypmod from
> format_type.c. I am not really sure if this will be better than the current one or
> not but it seems worth trying.

It seems the main logic in the two functions is a bit different. For
format_type_detailed, we always try to schema qualify both the user-defined and
built-in type except for some special type(time ..) which we cannot schema
qualify. But in format_type_extended, we don't try to schema qualify the
built-in type. But I will research a bit more about this.

> 4.
> new_objtree_VA()
> {
> ...
> switch (type)
> + {
> + case ObjTypeBool:
> + elem = new_bool_object(va_arg(args, int)); break; case ObjTypeString:
> + elem = new_string_object(va_arg(args, char *)); break; case
> + ObjTypeObject:
> + elem = new_object_object(va_arg(args, ObjTree *)); break; case
> + ObjTypeArray:
> + elem = new_array_object(va_arg(args, List *)); break; case
> + ObjTypeInteger:
> + elem = new_integer_object(va_arg(args, int64)); break; case
> + ObjTypeFloat:
> + elem = new_float_object(va_arg(args, double)); break; case
> + ObjTypeNull:
> + /* Null params don't have a value (obviously) */ elem =
> + new_null_object();
> ...
> 
> I feel here ObjType's should be handled in the same order as they are defined in
> ObjType.

Changed.

> 5. There appears to be common code among node_*_object() functions.
> Can we just have one function instead new_object(ObjType, void *val)?
> In the function based on type, we can typecast the value. Is there a reason why
> that won't be better than current one?

I tried to extract some common code into a new function new_object(ObjType,
char *name). I didn't use 'void *' as user could pass a wrong type of value which
we cannot detect which seems a little unsafe.

> 6.
> deparse_ColumnDef()
> {
> ...
> /* Composite types use a slightly simpler format string */
> + if (composite)
> + column = new_objtree_VA("%{name}I %{coltype}T %{collation}s", 3,
> + "type", ObjTypeString, "column", "name", ObjTypeString,
> + coldef->colname, "coltype", ObjTypeObject, new_objtree_for_type(typid,
> + typmod)); else column = new_objtree_VA("%{name}I %{coltype}T
> + %{default}s
> %{not_null}s %{collation}s",
> + 3,
> + "type", ObjTypeString, "column",
> + "name", ObjTypeString, coldef->colname, "coltype", ObjTypeObject,
> + new_objtree_for_type(typid, typmod));
> ...
> }
> 
> To avoid using the same arguments, we can define fmtstr for composite and
> non-composite types as the patch is doing in deparse_CreateStmt().

Changed.

> 7. It is not clear from comments or otherwise what should be considered for
> default format string in functions like
> deparse_ColumnDef() or deparse_CreateStmt().

I think it was intended to put most of the modifiable part in default format
string so that user can change what they want using json function. But I think
we could improve, maybe by passing a parameter which decide whether to keep
these keywords in default format string. I will research this.

> 8.
> + * FIXME --- actually, what about default values?
> + */
> +static ObjTree *
> +deparse_ColumnDef_typed(Relation relation, List *dpcontext, ColumnDef
> +*coldef)
> 
> I think we need to handle default values for this FIXME.

Changed.

Attach the new version patch which address the above
comments and comments from [1][2].

I haven't addressed the latest comments from Vignesh[3],
I will investigate it.

[1] https://www.postgresql.org/message-id/CAA4eK1K88SMoBq%3DDRA4XU-F3FG6qyzCjGMMKsPpcRBPRcrELrw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CALDaNm3rEA_zmnDMOCT7NqK4aAffhAgooLf8rXjUN%3DYwA8ASFw%40mail.gmail.com
[3] https://www.postgresql.org/message-id/CALDaNm2nFPMxUo%3D0zRUUA-v3_eRwRY%2Bii5nnG_PU%2B6jT7ta9dA%40mail.gmail.com

Best regards,
Hou zj

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Next
From: Amit Kapila
Date:
Subject: Re: Handle infinite recursion in logical replication setup