RE: Data is copied twice when specifying both child and parent table in publication - Mailing list pgsql-hackers

From wangw.fnst@fujitsu.com
Subject RE: Data is copied twice when specifying both child and parent table in publication
Date
Msg-id OS3PR01MB6275298569C91BB39740C7429E809@OS3PR01MB6275.jpnprd01.prod.outlook.com
In response to Re: Data is copied twice when specifying both child and parent table in publication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses RE: Data is copied twice when specifying both child and parent table in publication  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
List pgsql-hackers
On Fri, Mar 17, 2023 at 20:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Mar 17, 2023 at 11:58 AM wangw.fnst@fujitsu.com
> <wangw.fnst@fujitsu.com> wrote:
> >
> > On Thu, Mar 16, 2023 at 20:25 PM Amit Kapila <amit.kapila16@gmail.com>
> wrote:
> > >
> >
> > Thanks for your comments.
> >
> > > + if (server_version >= 160000)
> > > + {
> > > + appendStringInfo(&cmd, "SELECT DISTINCT N.nspname, C.relname,\n"
> > > + "              ( SELECT array_agg(a.attname ORDER BY a.attnum)\n"
> > > + "                FROM pg_attribute a\n"
> > > + "                WHERE a.attrelid = GPT.relid AND a.attnum > 0 AND\n"
> > > + "                      NOT a.attisdropped AND\n"
> > > + "                      (a.attnum = ANY(GPT.attrs) OR GPT.attrs IS NULL)\n"
> > > + "              ) AS attnames\n"
> > > + " FROM pg_class C\n"
> > > + "   JOIN pg_namespace N ON N.oid = C.relnamespace\n"
> > > + "   JOIN ( SELECT (pg_get_publication_tables(VARIADIC array_agg(pubname::text))).*\n"
> > > + "          FROM pg_publication\n"
> > > + "          WHERE pubname IN ( %s )) as GPT\n"
> > > + "       ON GPT.relid = C.oid\n",
> > > + pub_names.data);
> > >
> > > The function pg_get_publication_tables()  has already handled dropped
> > > columns, so we don't need it here in this query. Also, the part to
> > > build attnames should be the same as it is in view
> > > pg_publication_tables.
> >
> > Agree. Changed.
> >
> > > Can we directly try to pass the list of
> > > pubnames to the function pg_get_publication_tables() instead of
> > > joining it with pg_publication?
> >
> > Changed.
> > I think the aim of joining it with pg_publication before is to exclude
> > non-existing publications.
> >
> 
> Okay, A comment for that would have made it clear.

Makes sense. I added a comment above the query.

> > Otherwise, we would get an error because of the call
> > to function GetPublicationByName (with 'missing_ok = false') in function
> > pg_get_publication_tables. So, I changed "missing_ok" to true. If anyone doesn't
> > like this change, I'll reconsider this in the next version.
> >
> 
> I am not sure about changing missing_ok behavior. Did you check it for
> any other similar usage in other functions?

After reviewing the pg_get_* functions in the 'pg_proc.dat' file, I think most
of them ignore incorrect input (for example, pg_get_indexdef). However, some
functions, such as pg_get_serial_sequence and pg_get_object_address, report an
error. So I think it is better to discuss this in a separate thread. I have
reverted this modification and will start a new thread for it later.
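
For readers following the missing_ok discussion, here is a minimal,
self-contained C sketch of the two behaviors. It is not PostgreSQL code:
known_pubs, lookup_pub, pub_exists, lookup_existing and the lookup_errors
counter are hypothetical stand-ins (the counter replaces a real error report),
and only the missing_ok idea is taken from GetPublicationByName.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the publications that exist on the publisher. */
static const char *const known_pubs[] = {"pub_parent", "pub_child"};

/* Counts strict-lookup failures; a stand-in for raising an error. */
static int lookup_errors = 0;

/* Returns true if the (hypothetical) catalog contains the given name. */
static bool
pub_exists(const char *name)
{
    size_t      i;

    for (i = 0; i < sizeof(known_pubs) / sizeof(known_pubs[0]); i++)
    {
        if (strcmp(known_pubs[i], name) == 0)
            return true;
    }
    return false;
}

/*
 * Looks up a publication by name. With missing_ok = true an unknown name
 * quietly yields NULL; with missing_ok = false it is also counted as an error.
 */
static const char *
lookup_pub(const char *name, bool missing_ok)
{
    size_t      i;

    for (i = 0; i < sizeof(known_pubs) / sizeof(known_pubs[0]); i++)
    {
        if (strcmp(known_pubs[i], name) == 0)
            return known_pubs[i];
    }

    if (!missing_ok)
    {
        lookup_errors++;
        fprintf(stderr, "publication \"%s\" does not exist\n", name);
    }
    return NULL;
}

/*
 * Filters the requested names to the existing ones before doing the strict
 * lookup, which is the effect the join with pg_publication has in the query.
 * Returns the number of publications found.
 */
static size_t
lookup_existing(const char *const *names, size_t n)
{
    size_t      found = 0;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (pub_exists(names[i]) && lookup_pub(names[i], false) != NULL)
            found++;
    }
    return found;
}
```

With the filter in place, the strict lookup (missing_ok = false) never sees an
unknown name, so no error is raised even if the subscriber passes a
publication that does not exist.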

> + foreach(lc, pub_elem_tables)
> + {
> + published_rel *table_info = (published_rel *) malloc(sizeof(published_rel));
> 
> Is there a reason to use malloc instead of palloc?

No, there is no reason. I think we need to use palloc here.
Changed.
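
As background on why palloc is preferred: memory obtained with palloc belongs
to a memory context and is released when that context is reset or deleted,
whereas memory from malloc has to be freed explicitly and leaks if that step is
skipped. The following is only a toy sketch of that idea in plain C, not
PostgreSQL's implementation; ToyContext, ToyChunk, toy_alloc and toy_reset are
hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation; the union keeps the payload
 * that follows it suitably aligned for any object type. */
typedef union ToyChunk
{
    union ToyChunk *next;
    max_align_t align;
} ToyChunk;

/* A toy "memory context": a linked list of all chunks allocated in it. */
typedef struct ToyContext
{
    ToyChunk   *chunks;
    size_t      nchunks;
} ToyContext;

/* Allocates size bytes and records the chunk in the context. */
static void *
toy_alloc(ToyContext *cxt, size_t size)
{
    ToyChunk   *chunk = malloc(sizeof(ToyChunk) + size);

    if (chunk == NULL)
        return NULL;

    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->nchunks++;
    return (void *) (chunk + 1);
}

/* Frees every chunk allocated in the context in one step. */
static void
toy_reset(ToyContext *cxt)
{
    while (cxt->chunks != NULL)
    {
        ToyChunk   *next = cxt->chunks->next;

        free(cxt->chunks);
        cxt->chunks = next;
    }
    cxt->nchunks = 0;
}
```

In this sketch, per-table structs allocated inside a loop would all be
released by a single toy_reset call, without tracking each pointer.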

Attached is the new patch set.

Regards,
Wang Wei
