Thread: PG DOCS - logical replication filtering
Hi. PSA a PG docs patch that is associated with the logical replication Row Filters feature which was recently pushed [1]. This patch introduces a new "Filtering" page to give a common place where all kinds of logical replication filtering can be described. (e.g. It is envisaged that a "Column Filters" section can be added sometime in the future). The main new content for this page is the "Row Filters" section. This gives a full overview of the new row filter feature, plus examples. ------ [1] https://github.com/postgres/postgres/commit/52e4f0cd472d39d07732b99559989ea3b615be78 Kind Regards, Peter Smith. Fujitsu Australia
Attachment
Hi Peter,
> PSA a PG docs patch that is associated with the logical replication
> Row Filters feature which was recently pushed [1].
The patch looks mostly OK, but I have several nitpicks.
```
By default, all data from all published tables will be replicated to the
appropriate subscribers.
[...]
By default, all operation types are replicated.
```
The second sentence seems to be redundant.
```
(This feature is available since PostgreSQL 15)
```
Please correct me if I'm wrong, but I don't think we say that in the docs. When the user opens the documentation for version X he or she sees everything that is available in this version.
```
31.3. Filtering
[...]
There are 3 different ways to filter what data gets replicated.
31.3.1. Operation Filters
[...]
31.3.2. Row Filters
[...]
```
It looks like there are 2 different ways after all.
I see that a large part of the documentation is commented and marked as TBA (Column Filters, Combining Different Kinds of Filters). Could you please clarify if it's a work-in-progress patch? If it's not, I believe the commented part should be removed before committing.
--
Best regards,
Aleksander Alekseev
> PSA a PG docs patch that is associated with the logical replication
> Row Filters feature which was recently pushed [1].
The patch looks mostly OK, but I have several nitpicks.
```
By default, all data from all published tables will be replicated to the
appropriate subscribers.
[...]
By default, all operation types are replicated.
```
The second sentence seems to be redundant.
```
(This feature is available since PostgreSQL 15)
```
Please correct me if I'm wrong, but I don't think we say that in the docs. When the user opens the documentation for version X he or she sees everything that is available in this version.
```
31.3. Filtering
[...]
There are 3 different ways to filter what data gets replicated.
31.3.1. Operation Filters
[...]
31.3.2. Row Filters
[...]
```
It looks like there are 2 different ways after all.
I see that a large part of the documentation is commented and marked as TBA (Column Filters, Combining Different Kinds of Filters). Could you please clarify if it's a work-in-progress patch? If it's not, I believe the commented part should be removed before committing.
--
Best regards,
Aleksander Alekseev
Hi again,
> The second sentence seems to be redundant.
Actually, I'm wrong on this one.
--
Best regards,
Aleksander Alekseev
On Wed, Mar 2, 2022 at 2:37 PM Aleksander Alekseev <aleksander@timescale.com> wrote: > > > I see that a large part of the documentation is commented and marked as TBA (Column Filters, Combining Different Kindsof Filters). Could you please clarify if it's a work-in-progress patch? If it's not, I believe the commented part shouldbe removed before committing. > I think we can remove any Column Filters related information (placeholders), if that patch gets committed, we can always extend the existing docs. -- With Regards, Amit Kapila.
Hi hackers,
> I see that a large part of the documentation is commented and marked as TBA (Column Filters, Combining Different Kinds of Filters). Could you please clarify if it's a work-in-progress patch? If it's not, I believe the commented part should be removed before committing.
>
I think we can remove any Column Filters related information
(placeholders), if that patch gets committed, we can always extend the
existing docs.
Here is an updated version of the patch.
Best regards,
Aleksander Alekseev
Attachment
On 02.03.22 05:47, Peter Smith wrote: > This patch introduces a new "Filtering" page to give a common place > where all kinds of logical replication filtering can be described. > (e.g. It is envisaged that a "Column Filters" section can be added > sometime in the future). The pending feature to select a subset of table columns to replicate is not "column filtering". The thread might still be still called that, but we have changed the patch to not use that terminology. Filtering is a dynamic action based on actual values. The row filtering feature does that. The column list feature is a static DDL-time configuration. It is no more filtering than specifying a list of tables in a publication is table filtering. So please consider organizing the documentation differently to not create this confusion.
On Wed, Mar 2, 2022 at 8:43 PM Aleksander Alekseev <aleksander@timescale.com> wrote: ... > Here is an updated version of the patch. Thanks for your review comments and fixes. The updated v2 patch looks good to me. ------ Kind Regards, Peter Smith. Fujitsu Australia
On Wed, Mar 2, 2022 at 8:26 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 2, 2022 at 2:37 PM Aleksander Alekseev > <aleksander@timescale.com> wrote: > > > > > > I see that a large part of the documentation is commented and marked as TBA (Column Filters, Combining Different Kindsof Filters). Could you please clarify if it's a work-in-progress patch? If it's not, I believe the commented part shouldbe removed before committing. > > > > I think we can remove any Column Filters related information > (placeholders), if that patch gets committed, we can always extend the > existing docs. > +1 ------ Kind Regards, Peter Smith. Fujitsu Australia.
On Wed, Mar 2, 2022 at 8:00 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 02.03.22 05:47, Peter Smith wrote: > > This patch introduces a new "Filtering" page to give a common place > > where all kinds of logical replication filtering can be described. > > (e.g. It is envisaged that a "Column Filters" section can be added > > sometime in the future). > > The pending feature to select a subset of table columns to replicate is > not "column filtering". The thread might still be still called that, > but we have changed the patch to not use that terminology. > > Filtering is a dynamic action based on actual values. The row filtering > feature does that. The column list feature is a static DDL-time > configuration. It is no more filtering than specifying a list of tables > in a publication is table filtering. > > So please consider organizing the documentation differently to not > create this confusion. > +1. I think Row Filters can directly be a section just before Conflicts on the logical replication page [1]. Some comments on the patch: 1. I think we can extend/add the example to have filters on more than one table. This has been noticed multiple times during development that people are not very clear on it. 2. I think we can add an example or two for row filters actions (like Insert, Update). 3. Publications can choose to limit the changes they produce to any combination of <command>INSERT</command>, <command>UPDATE</command>, - <command>DELETE</command>, and <command>TRUNCATE</command>, similar to how triggers are fired by - particular event types. By default, all operation types are replicated. + <command>DELETE</command>, and <command>TRUNCATE</command> by using + <quote>operation filters</quote>. By this, one can imply that row filters are used for Truncate as well but that is not true. I know that that patch later specifies that "Row filters have no effect for <command>TRUNCATE</command> commands." but the above modification is not very clear. [1] - https://www.postgresql.org/docs/devel/logical-replication.html -- With Regards, Amit Kapila.
On Thu, Mar 3, 2022 at 2:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 2, 2022 at 8:00 PM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: > > > > On 02.03.22 05:47, Peter Smith wrote: > > > This patch introduces a new "Filtering" page to give a common place > > > where all kinds of logical replication filtering can be described. > > > (e.g. It is envisaged that a "Column Filters" section can be added > > > sometime in the future). > > > > The pending feature to select a subset of table columns to replicate is > > not "column filtering". The thread might still be still called that, > > but we have changed the patch to not use that terminology. > > > > Filtering is a dynamic action based on actual values. The row filtering > > feature does that. The column list feature is a static DDL-time > > configuration. It is no more filtering than specifying a list of tables > > in a publication is table filtering. > > > > So please consider organizing the documentation differently to not > > create this confusion. > > > > +1. I think Row Filters can directly be a section just before > Conflicts on the logical replication page [1]. > OK. I will reorganize the page as suggested, and also attend to the other comments below. > Some comments on the patch: > 1. I think we can extend/add the example to have filters on more than > one table. This has been noticed multiple times during development > that people are not very clear on it. > 2. I think we can add an example or two for row filters actions (like > Insert, Update). > 3. > Publications can choose to limit the changes they produce to > any combination of <command>INSERT</command>, <command>UPDATE</command>, > - <command>DELETE</command>, and <command>TRUNCATE</command>, > similar to how triggers are fired by > - particular event types. By default, all operation types are replicated. > + <command>DELETE</command>, and <command>TRUNCATE</command> by using > + <quote>operation filters</quote>. > > By this, one can imply that row filters are used for Truncate as well > but that is not true. I know that that patch later specifies that "Row > filters have no effect for <command>TRUNCATE</command> commands." but > the above modification is not very clear. > > [1] - https://www.postgresql.org/docs/devel/logical-replication.html > ------ Kind Regards, Peter Smith. Fujitsu Australia.
PSA patch v3 to address all comments received so far. ------ Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Thu, Mar 3, 2022 at 2:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Wed, Mar 2, 2022 at 8:00 PM Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: > > > > On 02.03.22 05:47, Peter Smith wrote: > > > This patch introduces a new "Filtering" page to give a common place > > > where all kinds of logical replication filtering can be described. > > > (e.g. It is envisaged that a "Column Filters" section can be added > > > sometime in the future). > > > > The pending feature to select a subset of table columns to replicate is > > not "column filtering". The thread might still be still called that, > > but we have changed the patch to not use that terminology. > > > > Filtering is a dynamic action based on actual values. The row filtering > > feature does that. The column list feature is a static DDL-time > > configuration. It is no more filtering than specifying a list of tables > > in a publication is table filtering. > > > > So please consider organizing the documentation differently to not > > create this confusion. > > > > +1. I think Row Filters can directly be a section just before > Conflicts on the logical replication page [1]. Done as suggested in v3. [1] > > Some comments on the patch: > 1. I think we can extend/add the example to have filters on more than > one table. This has been noticed multiple times during development > that people are not very clear on it. Added example in v3 [1] > 2. I think we can add an example or two for row filters actions (like > Insert, Update). Added examples of INSERT and UPDATE in v3 [1] > 3. > Publications can choose to limit the changes they produce to > any combination of <command>INSERT</command>, <command>UPDATE</command>, > - <command>DELETE</command>, and <command>TRUNCATE</command>, > similar to how triggers are fired by > - particular event types. By default, all operation types are replicated. > + <command>DELETE</command>, and <command>TRUNCATE</command> by using > + <quote>operation filters</quote>. > > By this, one can imply that row filters are used for Truncate as well > but that is not true. I know that that patch later specifies that "Row > filters have no effect for <command>TRUNCATE</command> commands." but > the above modification is not very clear. > Fixed in v3 [1]. Restored original text, and added a note about row filters. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPtwp0FscVpmMjHLV6_%3DSHR5ndwvWdX_gb41_3H2UA9ecA%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia.
On Fri, Mar 4, 2022, at 12:41 AM, Peter Smith wrote:
PSA patch v3 to address all comments received so far.
Peter,
I started reviewing this documentation for row filter and I noticed that I was
changing too much lines to submit a patch on the top of it. Hence, I'm
attaching a new version based on this one.
Here as some of the changes that I did:
* links: I renamed the ids to use "logical-replication-row-filter" instead of
"logical-replication-rf" because it is used in the anchors. A compound word
is better than an abbreviation.
* titles: I changed all titles because of some stylish issue (words are
generally capitalized) or because it reads better.
* sections: I moved the "Restrictions" section to the top but it seems
important than the other sections. I removed the "psql" section. The psql
commands are shown in the "Examples" section so I don't think we should
provide a section for it.
* sentences: I expanded several sentences to provide details about the specific
topic. I also reordered some sentences and removed some duplicated sentences.
* replica identity: I added a link to replica identity. It is a key concept for
row filter.
* transformations: I added a few sentences explaining when/why a transformation
is required. I removed the "Cases" part (same as in the source code) because
it is redundant with the new sentences.
* partitioned table: the title should be _partitioned_ table. I replaced the
bullets with sentences in the same paragraph.
* initial data sync: I removed the "subscriber" from the section title. When
you say "initial data synchronization" it seems clear we're talking about the
subscriber. I also add a sentence saying why pre-15 does not consider row
filters.
* combining row filters: I renamed the section and decided to remove "for the
same table". When reading the first sentences from this section, it is clear
that row filtering is per table. So if you are combining multiple row
filters, it should be for the same table. I also added a sentence saying why
some clauses make the row filter irrelevant.
* examples: I combined the psql commands that shows row filter information
together (\dRp+ and \d). I changed to connection string to avoid "localhost".
Why? It does not seem a separate service (there is no port) and setup pub/sub
in the same cluster requires additional steps. It is better to illustrate
different clusters (at least it seems so since we don't provide details from
publisher). I changed a value in an UPDATE because both UPDATEs use 999.
We could probably reduce the number of rows in the example but I didn't bother
to remove them.
It seems we can remove some sentences from the CREATE PUBLICATION because we
have a new section that explains all of it. I think the link that was added by
this patch is sufficient.
Attachment
On Fri, Mar 11, 2022 at 9:37 AM Euler Taveira <euler@eulerto.com> wrote: > > On Fri, Mar 4, 2022, at 12:41 AM, Peter Smith wrote: > > PSA patch v3 to address all comments received so far. > > Peter, > > I started reviewing this documentation for row filter and I noticed that I was > changing too much lines to submit a patch on the top of it. Hence, I'm > attaching a new version based on this one. Sorry for my long delay in replying. I have been away. Thanks very much for the updated patch with all your suggestions. There were many changes in your v4. I have not merged everything exactly, but I did take the majority of your suggestions on-board in the attached v5. I responded below to each change. > > Here as some of the changes that I did: > > * links: I renamed the ids to use "logical-replication-row-filter" instead of > "logical-replication-rf" because it is used in the anchors. A compound word > is better than an abbreviation. OK, done as suggested. > * titles: I changed all titles because of some stylish issue (words are > generally capitalized) or because it reads better. OK, most titles changed as suggested. > * sections: I moved the "Restrictions" section to the top but it seems > important than the other sections. I removed the "psql" section. The psql > commands are shown in the "Examples" section so I don't think we should > provide a section for it. OK, moved the "Restrictions" section and removed the "psql" section. > * sentences: I expanded several sentences to provide details about the specific > topic. I also reordered some sentences and removed some duplicated sentences. I did not take everything exactly as in your v4, but wherever your suggested changes added some more information I tried to include them using similar wording to yours. > * replica identity: I added a link to replica identity. It is a key concept for > row filter. OK, done as suggested. > * transformations: I added a few sentences explaining when/why a transformation > is required. I removed the "Cases" part (same as in the source code) because > it is redundant with the new sentences. OK, I incorporated most of your new descriptions for the transformations, however I still wanted to keep the summary of "cases" part because IMO it makes the rules so much clearer than just having the text descriptions. > * partitioned table: the title should be _partitioned_ table. I replaced the > bullets with sentences in the same paragraph. OK. The title was changed, but I kept the bullets. > * initial data sync: I removed the "subscriber" from the section title. When > you say "initial data synchronization" it seems clear we're talking about the > subscriber. I also add a sentence saying why pre-15 does not consider row > filters. OK. Title is changed. Also I added your sentence about the pre-15. But I kept all the HTML note rendering that you'd removed in v4. I think this information was important enough to be a "note" instead of just buried in a paragraph. > * combining row filters: I renamed the section and decided to remove "for the > same table". When reading the first sentences from this section, it is clear > that row filtering is per table. So if you are combining multiple row > filters, it should be for the same table. I also added a sentence saying why > some clauses make the row filter irrelevant. OK. Title is changed. Your extra sentence was added. > * examples: I combined the psql commands that shows row filter information > together (\dRp+ and \d). I changed to connection string to avoid "localhost". > Why? It does not seem a separate service (there is no port) and setup pub/sub > in the same cluster requires additional steps. It is better to illustrate > different clusters (at least it seems so since we don't provide details from > publisher). I changed a value in an UPDATE because both UPDATEs use 999. > I did not combine those \dRp+ and \d. I kept them separate so I could write some separate notes about them. I'm unsure about the connection change. This documentation is for "Row Filters" so the examples were only intended to demonstrate row filters and nothing else. I wanted to have a trivial connection such that a user can just cut/paste directly from the example and get something they can immediately test without having to change it. I don't mind changing this later but probably I'd like to get some other opinions about it first. About the UPDATE (555 value) - OK, I changed that value as you suggested. > We could probably reduce the number of rows in the example but I didn't bother > to remove them. > > It seems we can remove some sentences from the CREATE PUBLICATION because we > have a new section that explains all of it. I think the link that was added by > this patch is sufficient. > Yeah, maybe some sentences can be removed. But even though some information is duplicated it might be useful to have a few things still mentioned on the CREATE PUBLICATION page just so the user does not have to chase links too much. I will wait to see if other people have an opinion about it before removing any content from that page. Meanwhile, I have made the create_publication.sgml identical to your v4. ~~~ There should be much fewer v4/v5 differences now although there might be a few things I missed that you want to re-comment on. Hopefully, it will now be easier to post review comments as BEFORE/AFTER text fragments - that would help other people to see the suggestions more easily and give their opinions too. ------ Kind Regards, Peter Smith. Fujitsu Australia.
Attachment
On Thu, Mar 24, 2022 at 11:48 AM Peter Smith <smithpb2250@gmail.com> wrote: > Review comments: =============== 1. + The <literal>WHERE</literal> clause expression is evaluated with the same + role used for the replication connection. i.e. the role specified in the + <literal>CONNECTION</literal> clause of the <xref linkend="sql-createsubscription"/>. Can we use () around i.e. sentence? It looks bit odd to me now. The <literal>WHERE</literal> clause expression is evaluated with the same role used for the replication connection (i.e., the role specified in the <literal>CONNECTION</literal> clause of the <xref linkend="sql-createsubscription"/>). 2. + <para> + Whenever an <command>UPDATE</command> is processed, the row filter + expression is evaluated for both the old and new row (i.e. before + and after the data is updated). Can we write the below part slightly differently? Before: (i.e. before and after the data is updated). After: (i.e the data before and after the update). 3. + <para> + Whenever an <command>UPDATE</command> is processed, the row filter + expression is evaluated for both the old and new row (i.e. before + and after the data is updated). + </para> + + <para> + If both evaluations are <literal>true</literal>, it replicates the + <command>UPDATE</command> change. + </para> + + <para> + If both evaluations are <literal>false</literal>, it doesn't replicate + the change. + </para> I think we can combine these three separate paragraphs. 4. + <para> + If the publication contains a partitioned table, the publication parameter + <literal>publish_via_partition_root</literal> determines which row filter + is used. + <itemizedlist> + + <listitem> + <para> + If <literal>publish_via_partition_root</literal> is <literal>false</literal> + (default), each <emphasis>partition's</emphasis> row filter is used. + </para> + </listitem> + + <listitem> + <para> + If <literal>publish_via_partition_root</literal> is <literal>true</literal>, + the <emphasis>root partitioned table's</emphasis> row filter is used. + </para> + </listitem> + + </itemizedlist> + </para> I think we can combine this into single para as Euler had in his version. 5. + <note> + <para> + Publication <literal>publish</literal> operations are ignored when copying pre-existing table data. + </para> + </note> + + <note> + <para> + If the subscriber is in a release prior to 15, copy pre-existing data + doesn't use row filters even if they are defined in the publication. + This is because old releases can only copy the entire table data. + </para> + </note> I don't see the need for the first <note> here, the second one seems to convey it. 6. + <para> + Create some tables to be used in the following examples. +<programlisting> +testpub=# CREATE TABLE t1(a int, b int, c text, primary key(a,c)); +CREATE TABLE +testpub=# CREATE TABLE t2(d int primary key, e int, f int); +CREATE TABLE +testpub=# CREATE TABLE t3(g int primary key, h int, i int); +CREATE TABLE +</programlisting> + </para> + + <para> + Create some publications. + </para> + <para> + - notice publication <literal>p1</literal> has 1 table with a row filter. + </para> + <para> + - notice publication <literal>p2</literal> has 2 tables, one without a row + filter, and one with a row filter. + </para> + <para> + - notice publication <literal>p3</literal> has 2 tables, both with row filters. +<programlisting> +testpub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a > 5 AND c = 'NSW'); +CREATE PUBLICATION +testpub=# CREATE PUBLICATION p2 FOR TABLE t1, t2 WHERE (e = 99); +CREATE PUBLICATION +testpub=# CREATE PUBLICATION p3 FOR TABLE t2 WHERE (d = 10), t3 WHERE (g = 10); +CREATE PUBLICATION +</programlisting> + </para> I think it is better to use the corresponding content from Euler's version. 7. + <para> + The PSQL command <command>\d</command> shows what publications the table is + a member of, as well as that table's row filter expression (if defined) in + those publications. + </para> + <para> + - notice table <literal>t1</literal> is a member of 2 publications, but + has a row filter only in <literal>p1</literal>. + </para> + <para> + - notice table <literal>t2</literal> is a member of 2 publications, and + has a different row filter in each of them. This looks unnecessary to me. Let's remove this part. 8. + <para> + - notice that only the rows satisfying the <literal>t1 WHERE</literal> + clause of publication <literal>p1</literal> are replicated. Again, it is better to use Euler's version for this and at the place, he had in his version. Similarly, adjust other notices if any like this one. 9. I suggest adding an example for partition tables showing how the row filter is used based on the 'publish_via_partition_root' parameter. -- With Regards, Amit Kapila.
PSA patch v6 to address some of Amit's review comments [1]. ------ [1] https://www.postgresql.org/message-id/CAA4eK1JdwQQsxa%2BzpsBW5rCxEfXopYx381nwcCyeXk6mpF8ChQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Wed, Apr 6, 2022 at 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Mar 24, 2022 at 11:48 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > Review comments: > =============== > 1. > + The <literal>WHERE</literal> clause expression is evaluated with the same > + role used for the replication connection. i.e. the role specified in the > + <literal>CONNECTION</literal> clause of the <xref > linkend="sql-createsubscription"/>. > > Can we use () around i.e. sentence? It looks bit odd to me now. > The <literal>WHERE</literal> clause expression is evaluated with the > same role used for the replication connection (i.e., the role > specified in the <literal>CONNECTION</literal> clause of the <xref > linkend="sql-createsubscription"/>). OK. Modified in v6 [1]. > 2. > + <para> > + Whenever an <command>UPDATE</command> is processed, the row filter > + expression is evaluated for both the old and new row (i.e. before > + and after the data is updated). > > Can we write the below part slightly differently? > Before: > (i.e. before and after the data is updated). > After: > (i.e the data before and after the update). OK. Modified in v6 [1]. > 3. > + <para> > + Whenever an <command>UPDATE</command> is processed, the row filter > + expression is evaluated for both the old and new row (i.e. before > + and after the data is updated). > + </para> > + > + <para> > + If both evaluations are <literal>true</literal>, it replicates the > + <command>UPDATE</command> change. > + </para> > + > + <para> > + If both evaluations are <literal>false</literal>, it doesn't replicate > + the change. > + </para> > > I think we can combine these three separate paragraphs. The first sentence is the explanation, then there are the 3 separate “If …” cases mentioned. It doesn’t seem quite right to group just the first 2 “if…” cases into one paragraph, while leaving the 3rd one separated. OTOH combining everything into one big paragraph seems worse. Please confirm if you still want this changed. > 4. > + <para> > + If the publication contains a partitioned table, the publication parameter > + <literal>publish_via_partition_root</literal> determines which row filter > + is used. > + <itemizedlist> > + > + <listitem> > + <para> > + If <literal>publish_via_partition_root</literal> is > <literal>false</literal> > + (default), each <emphasis>partition's</emphasis> row filter is used. > + </para> > + </listitem> > + > + <listitem> > + <para> > + If <literal>publish_via_partition_root</literal> is > <literal>true</literal>, > + the <emphasis>root partitioned table's</emphasis> row filter is used. > + </para> > + </listitem> > + > + </itemizedlist> > + </para> > > I think we can combine this into single para as Euler had in his version. We can do it, but I am not sure if your review was looking at the rendered HTML or just looking at the SGML text? IMO using bullets here ended up being more readable (it is also consistent with other bullet usages on this page). Please confirm if you still want this changed. > 5. > + <note> > + <para> > + Publication <literal>publish</literal> operations are ignored > when copying pre-existing table data. > + </para> > + </note> > + > + <note> > + <para> > + If the subscriber is in a release prior to 15, copy pre-existing data > + doesn't use row filters even if they are defined in the publication. > + This is because old releases can only copy the entire table data. > + </para> > + </note> > > I don't see the need for the first <note> here, the second one seems > to convey it. Well, the 2nd note is only about compatibility with older versions doing the subscribe. But the 1st note is not version-specific at all. It is saying that the COPY does not take the “publish” option into account. If you know of some other docs already mentioning this subtle behaviour of the COPY then I can remove this note and just cross-reference to the other place. But I did not know anywhere this is already mentioned, so that is why I wrote the note about it. > 6. > + <para> > + Create some tables to be used in the following examples. > +<programlisting> > +testpub=# CREATE TABLE t1(a int, b int, c text, primary key(a,c)); > +CREATE TABLE > +testpub=# CREATE TABLE t2(d int primary key, e int, f int); > +CREATE TABLE > +testpub=# CREATE TABLE t3(g int primary key, h int, i int); > +CREATE TABLE > +</programlisting> > + </para> > + > + <para> > + Create some publications. > + </para> > + <para> > + - notice publication <literal>p1</literal> has 1 table with a row filter. > + </para> > + <para> > + - notice publication <literal>p2</literal> has 2 tables, one without a row > + filter, and one with a row filter. > + </para> > + <para> > + - notice publication <literal>p3</literal> has 2 tables, both > with row filters. > +<programlisting> > +testpub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a > 5 AND c = 'NSW'); > +CREATE PUBLICATION > +testpub=# CREATE PUBLICATION p2 FOR TABLE t1, t2 WHERE (e = 99); > +CREATE PUBLICATION > +testpub=# CREATE PUBLICATION p3 FOR TABLE t2 WHERE (d = 10), t3 WHERE (g = 10); > +CREATE PUBLICATION > +</programlisting> > + </para> > > I think it is better to use the corresponding content from Euler's version. OK. Modified in v6 [1]. I changed the primary key syntax to be the same as Euler had. I also moved all the 'notice' parts to below the corresponding example and modified the text. > > 7. > + <para> > + The PSQL command <command>\d</command> shows what publications the table is > + a member of, as well as that table's row filter expression (if defined) in > + those publications. > + </para> > + <para> > + - notice table <literal>t1</literal> is a member of 2 publications, but > + has a row filter only in <literal>p1</literal>. > + </para> > + <para> > + - notice table <literal>t2</literal> is a member of 2 publications, and > + has a different row filter in each of them. > > This looks unnecessary to me. Let's remove this part. I thought something is needed to explain/demonstrate how the user can know the different row filters for all the publications that the same table is a member of. Otherwise, the user has to guess (??) what publications are using their table and then use \dRp+ to dig at all those publications to find the row filters. I can remove all this part from the Examples, but I think at least the \d should still be mentioned somewhere. IMO I should put that "PSQL commands" section back (which existed in an earlier version) and just add a sentence about this. Then this examples part can be removed. What do you think? > 8. > + <para> > + - notice that only the rows satisfying the <literal>t1 WHERE</literal> > + clause of publication <literal>p1</literal> are replicated. > > Again, it is better to use Euler's version for this and at the place, > he had in his version. Similarly, adjust other notices if any like > this one. OK. Modified in v6 [1]. Every “notice” has now been moved to follow the associated example (how Euler had them) > 9. I suggest adding an example for partition tables showing how the > row filter is used based on the 'publish_via_partition_root' > parameter. OK - I am working on this and will add it in a future patch version. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPvyxMedYY-jHaT9YSfEPHv0jU2-CZ8F_nPvhuP0b955og%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On 2022-Apr-08, Peter Smith wrote: > PSA patch v6 to address some of Amit's review comments [1]. I think the text is good (didn't read it all, just about the first half), but why is there one paragraph per sentence? Seems a bit too sparse. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "You're _really_ hosed if the person doing the hiring doesn't understand relational systems: you end up with a whole raft of programmers, none of whom has had a Date with the clue stick." (Andrew Sullivan)
On Fri, Apr 8, 2022 at 11:42 AM Peter Smith <smithpb2250@gmail.com> wrote: > > On Wed, Apr 6, 2022 at 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > 3. > > + <para> > > + Whenever an <command>UPDATE</command> is processed, the row filter > > + expression is evaluated for both the old and new row (i.e. before > > + and after the data is updated). > > + </para> > > + > > + <para> > > + If both evaluations are <literal>true</literal>, it replicates the > > + <command>UPDATE</command> change. > > + </para> > > + > > + <para> > > + If both evaluations are <literal>false</literal>, it doesn't replicate > > + the change. > > + </para> > > > > I think we can combine these three separate paragraphs. > > The first sentence is the explanation, then there are the 3 separate > “If …” cases mentioned. It doesn’t seem quite right to group just the > first 2 “if…” cases into one paragraph, while leaving the 3rd one > separated. OTOH combining everything into one big paragraph seems > worse. Please confirm if you still want this changed. > Yeah, I think we should have something like what Euler's version had and maybe keep a summary section from the current patch. > > 4. > > + <para> > > + If the publication contains a partitioned table, the publication parameter > > + <literal>publish_via_partition_root</literal> determines which row filter > > + is used. > > + <itemizedlist> > > + > > + <listitem> > > + <para> > > + If <literal>publish_via_partition_root</literal> is > > <literal>false</literal> > > + (default), each <emphasis>partition's</emphasis> row filter is used. > > + </para> > > + </listitem> > > + > > + <listitem> > > + <para> > > + If <literal>publish_via_partition_root</literal> is > > <literal>true</literal>, > > + the <emphasis>root partitioned table's</emphasis> row filter is used. > > + </para> > > + </listitem> > > + > > + </itemizedlist> > > + </para> > > > > I think we can combine this into single para as Euler had in his version. > > We can do it, but I am not sure if your review was looking at the > rendered HTML or just looking at the SGML text? IMO using bullets here > ended up being more readable (it is also consistent with other bullet > usages on this page). Please confirm if you still want this changed. > Fair enough. We can keep this part as it is. > > 5. > > + <note> > > + <para> > > + Publication <literal>publish</literal> operations are ignored > > when copying pre-existing table data. > > + </para> > > + </note> > > + > > + <note> > > + <para> > > + If the subscriber is in a release prior to 15, copy pre-existing data > > + doesn't use row filters even if they are defined in the publication. > > + This is because old releases can only copy the entire table data. > > + </para> > > + </note> > > > > I don't see the need for the first <note> here, the second one seems > > to convey it. > > Well, the 2nd note is only about compatibility with older versions > doing the subscribe. But the 1st note is not version-specific at all. > It is saying that the COPY does not take the “publish” option into > account. If you know of some other docs already mentioning this subtle > behaviour of the COPY then I can remove this note and just > cross-reference to the other place. But I did not know anywhere this > is already mentioned, so that is why I wrote the note about it. > I don't see the need to say about general initial sync (COPY) behavior here. It is already defined at [1]. If we want to enhance, we can do that as a separate patch to make changes where the initial sync is explained. I am not sure that is required though. > > > > 7. > > + <para> > > + The PSQL command <command>\d</command> shows what publications the table is > > + a member of, as well as that table's row filter expression (if defined) in > > + those publications. > > + </para> > > + <para> > > + - notice table <literal>t1</literal> is a member of 2 publications, but > > + has a row filter only in <literal>p1</literal>. > > + </para> > > + <para> > > + - notice table <literal>t2</literal> is a member of 2 publications, and > > + has a different row filter in each of them. > > > > This looks unnecessary to me. Let's remove this part. > > I thought something is needed to explain/demonstrate how the user can > know the different row filters for all the publications that the same > table is a member of. Otherwise, the user has to guess (??) what > publications are using their table and then use \dRp+ to dig at all > those publications to find the row filters. > > I can remove all this part from the Examples, but I think at least the > \d should still be mentioned somewhere. IMO I should put that "PSQL > commands" section back (which existed in an earlier version) and just > add a sentence about this. Then this examples part can be removed. > What do you think? > I think the way it is changed in the current patch by moving that explanation down seems okay to me. I feel in the initial "Row Filters" and "Row Filter Rules" sections, we don't need to have separate paragraphs. I think the same is pointed out by Alvaro as well. -- With Regards, Amit Kapila.
PSA patch v7 which addresses all the remaining review comments (from Amit [1a][1b], and from Alvaro [2]). ------ [1a] https://www.postgresql.org/message-id/CAHut%2BPvDFWGUOBugYMtcXhAiViZu%2BQ6P-kxw2%2BU83VOGx0Osdg%40mail.gmail.com [1b] https://www.postgresql.org/message-id/CAA4eK1JPyVoc1dUjeqbPd9D0_uYxWyyx-8fcsrgiZ5Tpr9OAuw%40mail.gmail.com [2] https://www.postgresql.org/message-id/202204091045.v2ei4yupxqso%40alvherre.pgsql Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Fri, Apr 8, 2022 at 4:12 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Wed, Apr 6, 2022 at 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Mar 24, 2022 at 11:48 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > > > > Review comments: > > =============== [snip] > > > 9. I suggest adding an example for partition tables showing how the > > row filter is used based on the 'publish_via_partition_root' > > parameter. > > OK - I am working on this and will add it in a future patch version. > OK. Added in v7 [1] ------ [1] https://www.postgresql.org/message-id/CAHut%2BPt1X%3D3VaWRbx3yHByEMC-GPh4oeeMeJKJeTmOELDxZJHQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Mon, Apr 11, 2022 at 1:27 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 8, 2022 at 11:42 AM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Wed, Apr 6, 2022 at 8:58 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > > > 3. > > > + <para> > > > + Whenever an <command>UPDATE</command> is processed, the row filter > > > + expression is evaluated for both the old and new row (i.e. before > > > + and after the data is updated). > > > + </para> > > > + > > > + <para> > > > + If both evaluations are <literal>true</literal>, it replicates the > > > + <command>UPDATE</command> change. > > > + </para> > > > + > > > + <para> > > > + If both evaluations are <literal>false</literal>, it doesn't replicate > > > + the change. > > > + </para> > > > > > > I think we can combine these three separate paragraphs. > > > > The first sentence is the explanation, then there are the 3 separate > > “If …” cases mentioned. It doesn’t seem quite right to group just the > > first 2 “if…” cases into one paragraph, while leaving the 3rd one > > separated. OTOH combining everything into one big paragraph seems > > worse. Please confirm if you still want this changed. > > > > Yeah, I think we should have something like what Euler's version had > and maybe keep a summary section from the current patch. > Modified in v7 [1]. Removed the bullets. Structured the text paragraphs the same way that Euler had it. The summary is kept as-is. > > > 5. > > > + <note> > > > + <para> > > > + Publication <literal>publish</literal> operations are ignored > > > when copying pre-existing table data. > > > + </para> > > > + </note> > > > + > > > + <note> > > > + <para> > > > + If the subscriber is in a release prior to 15, copy pre-existing data > > > + doesn't use row filters even if they are defined in the publication. > > > + This is because old releases can only copy the entire table data. > > > + </para> > > > + </note> > > > > > > I don't see the need for the first <note> here, the second one seems > > > to convey it. > > > > Well, the 2nd note is only about compatibility with older versions > > doing the subscribe. But the 1st note is not version-specific at all. > > It is saying that the COPY does not take the “publish” option into > > account. If you know of some other docs already mentioning this subtle > > behaviour of the COPY then I can remove this note and just > > cross-reference to the other place. But I did not know anywhere this > > is already mentioned, so that is why I wrote the note about it. > > > > I don't see the need to say about general initial sync (COPY) behavior > here. It is already defined at [1]. If we want to enhance, we can do > that as a separate patch to make changes where the initial sync is > explained. I am not sure that is required though. > Did you miss providing the link URL? Anyway, I removed the note in v7 [1]. This information can be done as a separate patch one day (or not at all). > I feel in the initial "Row Filters" and "Row Filter Rules" sections, > we don't need to have separate paragraphs. I think the same is pointed > out by Alvaro as well. > Modified in v7 [1] those sections as suggested. I also assumed these were the same sections that Alvaro was referring to. ------ [1] https://www.postgresql.org/message-id/CAHut%2BPt1X%3D3VaWRbx3yHByEMC-GPh4oeeMeJKJeTmOELDxZJHQ%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Mon, Apr 11, 2022 at 12:39 PM Peter Smith <smithpb2250@gmail.com> wrote: > > On Fri, Apr 8, 2022 at 4:12 PM Peter Smith <smithpb2250@gmail.com> wrote: > > OK. Added in v7 [1] > Thanks, this looks mostly good to me. I didn't like the new section added for partitioned tables examples, so I removed it and added some explanation of the tests. I have slightly changed a few other lines. I am planning to commit the attached tomorrow unless there are more comments. -- With Regards, Amit Kapila.
Attachment
On Mon, Apr 11, 2022, at 7:40 AM, Amit Kapila wrote:
On Mon, Apr 11, 2022 at 12:39 PM Peter Smith <smithpb2250@gmail.com> wrote:>> On Fri, Apr 8, 2022 at 4:12 PM Peter Smith <smithpb2250@gmail.com> wrote:>> OK. Added in v7 [1]>Thanks, this looks mostly good to me. I didn't like the new sectionadded for partitioned tables examples, so I removed it and added someexplanation of the tests. I have slightly changed a few other lines. Iam planning to commit the attached tomorrow unless there are morecomments.
I have a few comments.
> If a publication publishes UPDATE and/or DELETE operations ...
>
If we are talking about operations, use lowercase like I suggested in the
previous version. See the publish parameter [1]. If we are talking about
commands, the word "operations" should be removed or replaced by "commands".
The "and/or" isn't required, "or" implies the same. If you prefer "operations",
my suggestion is to adjust the last sentence that says "only INSERT" to "only
<italic>insert</italic> operation".
> If the old row satisfies the row filter expression (it was sent to the
> subscriber) but the new row doesn't, then from a data consistency perspective
> the old row should be removed from the subscriber.
>
I suggested small additions to this sentence. We should at least add a comma
after "then" and "perspective". The same applies to the next paragraph too.
Regarding the "Summary", it is redundant as I said before. We already described
all 4 cases. I vote to remove it. However, if we would go with a table, I
suggest to improve the formatting: add borders, "old row" and "new row" should
be titles, "no match" and "match" should be represented by symbols ("" and "X",
for example), and "Case X" column should be removed because this extra column
adds nothing.
Regarding the "Partitioned Tables", I suggested to remove the bullets. We
generally have paragraphs with a few sentences. I tend to avoid short
paragraphs. I also didn't like the 2 bullets using almost the same words. In
the previous version, I suggested something like
If the publication contains a partitioned table, the parameter
publish_via_partition_root determines which row filter expression is used. If
the parameter publish_via_partition_root is true, the row filter expression
associated with the partitioned table is used. Otherwise, the row filter
expression associated with the individual partition is used.
> will be copied. (see Section 31.3.6 for details).
There is an extra period after "copied" that should be removed. The other
option is to remove the parentheses and have another sentence for "See ...".
> those expressions get OR'ed together
I prefer plain English here. This part of the sentence is also redundant with
the rest of the sentence so I suggested to remove it in the previous version.
rows that satisfy any of the row filter expressions is replicated.
instead of
those expressions get OR'ed together, so that rows satisfying any of the
expressions will be replicated.
I also didn't use a different paragraph (like I suggested in the previous
version) because we are talking about the same thing.
The bullets in the example sounds strange, that's why I suggested removing it.
We can even combine the 3 sentences into one paragraph.
> The PSQL command \dRp+ shows the row filter expressions (if defined) for each
> table of the publications.
Well, we don't use PSQL (upppercase) in the documentation. I suggested a
different sentence:
The psql shows the row filter expressions (if defined) for each table.
> The PSQL command \d shows what publications the table is a member of, as well
> as that table's row filter expression (if defined) in those publications.
It is not logical replication business to explain about psql behavior. If, for
some reason, someone decided to change it, this section will contain obsolete
information. The psql output is fine, the explanation is not. That's why I
suggested this modification.
> Only the rows satisfying the t1 WHERE clause of publication p1 are
> replicated.
Again, no bullets. This sentence is useful *before* the psql output. We're
presenting the results. Let's follow the pattern, describe the action and show
the results.
> The UPDATE replicates the change as normal.
This sentence should be *before* the psql output (see my previous version).
Regarding the new examples (for partitioned tables), shouldn't we move the
parent / child definitions to the beginning of the Examples section? It seems
confusing use the same code snippet to show repeated table definitions
(publisher and subscriber). I checked fast and after a few seconds I realized
that the example is not wrong but the database name has a small difference (one
letter "s" x "p"). The publication and subscription definitions are fine there.
I think reusing the same tables and publication introduces complexity.
Shouldn't we just use different tables and publication to provide an "easy"
example? It would avoid DROP PUBLICATION, ALTER SUBSCRIPTION and TRUNCATE.
> Do the inserts same as before.
We should indicate the node (publisher) to be clear.
On Mon, Apr 11, 2022 at 11:03 PM Euler Taveira <euler@eulerto.com> wrote: > > On Mon, Apr 11, 2022, at 7:40 AM, Amit Kapila wrote: > > Regarding the new examples (for partitioned tables), shouldn't we move the > parent / child definitions to the beginning of the Examples section? > I think that will make examples less clear. > It seems > confusing use the same code snippet to show repeated table definitions > (publisher and subscriber). I checked fast and after a few seconds I realized > that the example is not wrong but the database name has a small difference (one > letter "s" x "p"). > Can you be more specific? AFAICS, dbname used (testpub) is same. > The publication and subscription definitions are fine there. > > I think reusing the same tables and publication introduces complexity. > Shouldn't we just use different tables and publication to provide an "easy" > example? It would avoid DROP PUBLICATION, ALTER SUBSCRIPTION and TRUNCATE. > I don't know. I find the current way understandable. I feel using different names won't gain much and make the example difficult to follow. -- With Regards, Amit Kapila.
PSA patch v9 which addresses most of Euler's review comments [1] ------ [1] https://www.postgresql.org/message-id/1c78ebd4-b38d-4b5d-a6ea-d583efe87d97%40www.fastmail.com Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Tue, Apr 12, 2022 at 3:33 AM Euler Taveira <euler@eulerto.com> wrote: > > On Mon, Apr 11, 2022, at 7:40 AM, Amit Kapila wrote: > > On Mon, Apr 11, 2022 at 12:39 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > On Fri, Apr 8, 2022 at 4:12 PM Peter Smith <smithpb2250@gmail.com> wrote: > > > > OK. Added in v7 [1] > > > > Thanks, this looks mostly good to me. I didn't like the new section > added for partitioned tables examples, so I removed it and added some > explanation of the tests. I have slightly changed a few other lines. I > am planning to commit the attached tomorrow unless there are more > comments. > > I have a few comments. Thanks for your review comments! I addressed most of them as suggested - see the details below. > > > If a publication publishes UPDATE and/or DELETE operations ... > > > > If we are talking about operations, use lowercase like I suggested in the > previous version. See the publish parameter [1]. If we are talking about > commands, the word "operations" should be removed or replaced by "commands". > The "and/or" isn't required, "or" implies the same. If you prefer "operations", > my suggestion is to adjust the last sentence that says "only INSERT" to "only > <italic>insert</italic> operation". Not changed. Because in fact, I copied most of this sentence (including the uppercase, "operations", "and/or") from existing documentation [1] e.g. see "The tables added to a publication that publishes UPDATE and/or DELETE operations must ..." [1] https://www.postgresql.org/docs/current/sql-createpublication.html > > > If the old row satisfies the row filter expression (it was sent to the > > subscriber) but the new row doesn't, then from a data consistency perspective > > the old row should be removed from the subscriber. > > > > I suggested small additions to this sentence. We should at least add a comma > after "then" and "perspective". The same applies to the next paragraph too. Modified the commas in [v9] as suggested. > > Regarding the "Summary", it is redundant as I said before. We already described > all 4 cases. I vote to remove it. However, if we would go with a table, I > suggest to improve the formatting: add borders, "old row" and "new row" should > be titles, "no match" and "match" should be represented by symbols ("" and "X", > for example), and "Case X" column should be removed because this extra column > adds nothing. Yeah, I know the information is the same in the summary and in the text. Personally, I find big slabs of technical text difficult to digest, so I'd have to spend 5 minutes of careful reading and drawing the exact same summary on a piece of paper anyway just to visualize what the text says. The summary makes it easy to understand at a glance. But I have modified the summary in [v9] to remove the "case" and to add other column headings as suggested. > > Regarding the "Partitioned Tables", I suggested to remove the bullets. We > generally have paragraphs with a few sentences. I tend to avoid short > paragraphs. I also didn't like the 2 bullets using almost the same words. In > the previous version, I suggested something like > > If the publication contains a partitioned table, the parameter > publish_via_partition_root determines which row filter expression is used. If > the parameter publish_via_partition_root is true, the row filter expression > associated with the partitioned table is used. Otherwise, the row filter > expression associated with the individual partition is used. > Modified in [v9] to remove the bullets. > > will be copied. (see Section 31.3.6 for details). > > There is an extra period after "copied" that should be removed. The other > option is to remove the parentheses and have another sentence for "See ...". > Modified in [v9] as suggested. > > those expressions get OR'ed together > > I prefer plain English here. This part of the sentence is also redundant with > the rest of the sentence so I suggested to remove it in the previous version. > > rows that satisfy any of the row filter expressions is replicated. > > instead of > > those expressions get OR'ed together, so that rows satisfying any of the > expressions will be replicated. Not changed. The readers of this docs page are all users who will be familiar with the filter expressions, so I felt the "OR'ed together" part will be perfectly clear to the intended audience. > > I also didn't use a different paragraph (like I suggested in the previous > version) because we are talking about the same thing. > Modified in [v9] to use a single paragraph. > The bullets in the example sounds strange, that's why I suggested removing it. > We can even combine the 3 sentences into one paragraph. Modified [v9] so the whole example now has no bullets. Also combined all these 3 sentences as suggested. > > > The PSQL command \dRp+ shows the row filter expressions (if defined) for each > > table of the publications. > > Well, we don't use PSQL (upppercase) in the documentation. I suggested a > different sentence: > > The psql shows the row filter expressions (if defined) for each table. > Modified the sentence in [v9]. Now it uses lowercase psql. > > The PSQL command \d shows what publications the table is a member of, as well > > as that table's row filter expression (if defined) in those publications. > > It is not logical replication business to explain about psql behavior. If, for > some reason, someone decided to change it, this section will contain obsolete > information. The psql output is fine, the explanation is not. That's why I > suggested this modification. Modified [v9] this sentence also to use lowercase psql. But I did not understand your point about “If, for some reason, someone decided to change it, this section will contain obsolete information”, because IIUC that will be equally true for both the explanation and the output, so I did not understand why you say "psql output is fine, the explanation is not". It is the business of this documentation to help the user to know how and where they can find the row filter information they may need to know. > > > Only the rows satisfying the t1 WHERE clause of publication p1 are > > replicated. > > Again, no bullets. This sentence is useful *before* the psql output. We're > presenting the results. Let's follow the pattern, describe the action and show > the results. > OK. Modified all the [v9] example now has all the bullets removed and follows the suggested pattern (e.g. where the notes always come *before* the results) > > The UPDATE replicates the change as normal. > > This sentence should be *before* the psql output (see my previous version). > Modified [v9] as suggested. > Regarding the new examples (for partitioned tables), shouldn't we move the > parent / child definitions to the beginning of the Examples section? Not changed. IMO if we moved those CREATE TABLE as suggested then they will then be too far away from where they are being used. > It seems > confusing use the same code snippet to show repeated table definitions > (publisher and subscriber). I checked fast and after a few seconds I realized > that the example is not wrong but the database name has a small difference (one > letter "s" x "p"). The publication and subscription definitions are fine there. > Not changed. The publisher and subscriber programlistings are always separated. If you are looking at the rendered HTML I think it is quite clear that one is at the publisher and one is at the subscriber. OTOH, if we omitted creating the tables on the subscriber then I think that really would cause some confusion. > I think reusing the same tables and publication introduces complexity. > Shouldn't we just use different tables and publication to provide an "easy" > example? It would avoid DROP PUBLICATION, ALTER SUBSCRIPTION and TRUNCATE. > Not changed. Those same tables were re-used *deliberately* so that the examples could use identical inserts, and to emphasize that the different behaviour was caused only by the "publish_via_partition_root" setting. > > Do the inserts same as before. > > We should indicate the node (publisher) to be clear. OK. Modified [v9] as suggested. ------ [v9] https://www.postgresql.org/message-id/CAHut%2BPvYqo77rwg_vHC%3DOyQ7hCHZGVm%3DNi%2BJQbf8VyBz8hoo2w%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Tue, Apr 12, 2022, at 5:30 AM, Peter Smith wrote:
Not changed. Because in fact, I copied most of this sentence(including the uppercase, "operations", "and/or") from existingdocumentation [1]e.g. see "The tables added to a publication that publishes UPDATEand/or DELETE operations must ..."
Hmm. You are checking the Notes. I'm referring the the publish parameter. IMO
this sentence should use operations in lowercase letters too because even if
you create it with uppercase letters, Postgres will provide a lowercase word
when you dump it.
Yeah, I know the information is the same in the summary and in thetext. Personally, I find big slabs of technical text difficult todigest, so I'd have to spend 5 minutes of careful reading and drawingthe exact same summary on a piece of paper anyway just to visualizewhat the text says. The summary makes it easy to understand at aglance. But I have modified the summary in [v9] to remove the "case"and to add other column headings as suggested.
Isn't it better to use a table instead of synopsis?
Not changed. The readers of this docs page are all users who will befamiliar with the filter expressions, so I felt the "OR'ed together"part will be perfectly clear to the intended audience.
If you want to keep it, change it to "ORed". It is used in indices.sgml. Let's
keep the consistence.
But I did not understand your point about “If, for some reason,someone decided to change it, this section will contain obsoleteinformation”, because IIUC that will be equally true for both theexplanation and the output, so I did not understand why you say "psqloutput is fine, the explanation is not". It is the business of thisdocumentation to help the user to know how and where they can find therow filter information they may need to know.
You are describing a psql command here. My point is keep psql explanation in
the psql section. This section is to describe the row filter feature. If we
start describing features in other sections, we will have outdated information
when the referred feature is changed and someone fails to find all references.
I tend to concentrate detailed explanation in the feature section. If I have to
add links in other sections, I use "Seee Section 1.23 for details ...".
Not changed. The publisher and subscriber programlistings are alwaysseparated. If you are looking at the rendered HTML I think it is quiteclear that one is at the publisher and one is at the subscriber. OTOH,if we omitted creating the tables on the subscriber then I think thatreally would cause some confusion.
The difference is extra space. By default, the CSS does not include a border
for programlisting. That's why I complained about it. I noticed that the
website CSS includes it. However, the PDF will not include the border. I would
add a separate description for the subscriber just to be clear.
One last suggestion, you are using identifiers in uppercase letters but
"primary key" is in lowercase.
PSA patch v10 which addresses the remaining review comments from Euler [1] ------ [1] https://www.postgresql.org/message-id/3cd8d622-6a26-4eaf-a5aa-ac78030e5f50%40www.fastmail.com Kind Regards, Peter Smith. Fujitsu Australia
Attachment
On Wed, Apr 13, 2022 at 12:08 AM Euler Taveira <euler@eulerto.com> wrote: > > On Tue, Apr 12, 2022, at 5:30 AM, Peter Smith wrote: > > Not changed. Because in fact, I copied most of this sentence > (including the uppercase, "operations", "and/or") from existing > documentation [1] > e.g. see "The tables added to a publication that publishes UPDATE > and/or DELETE operations must ..." > [1] https://www.postgresql.org/docs/current/sql-createpublication.html > > Hmm. You are checking the Notes. I'm referring the the publish parameter. IMO > this sentence should use operations in lowercase letters too because even if > you create it with uppercase letters, Postgres will provide a lowercase word > when you dump it. IIRC the row filter replication identity checking is at run-time based on the operation (not at DDL time based on the publish_parameter). For this reason, and also because this is the same format/wording used in many places already on the create_publication.sgml, I did not change this formatting. But I did modify [v10] as earlier suggested to replace the “and/or” with “or”, and also added another word “operation”. > > Yeah, I know the information is the same in the summary and in the > text. Personally, I find big slabs of technical text difficult to > digest, so I'd have to spend 5 minutes of careful reading and drawing > the exact same summary on a piece of paper anyway just to visualize > what the text says. The summary makes it easy to understand at a > glance. But I have modified the summary in [v9] to remove the "case" > and to add other column headings as suggested. > > Isn't it better to use a table instead of synopsis? Modified [v10] as suggested. > > Not changed. The readers of this docs page are all users who will be > familiar with the filter expressions, so I felt the "OR'ed together" > part will be perfectly clear to the intended audience. > > If you want to keep it, change it to "ORed". It is used in indices.sgml. Let's > keep the consistence. Modified [v10] as suggested. > > But I did not understand your point about “If, for some reason, > someone decided to change it, this section will contain obsolete > information”, because IIUC that will be equally true for both the > explanation and the output, so I did not understand why you say "psql > output is fine, the explanation is not". It is the business of this > documentation to help the user to know how and where they can find the > row filter information they may need to know. > > You are describing a psql command here. My point is keep psql explanation in > the psql section. This section is to describe the row filter feature. If we > start describing features in other sections, we will have outdated information > when the referred feature is changed and someone fails to find all references. > I tend to concentrate detailed explanation in the feature section. If I have to > add links in other sections, I use "Seee Section 1.23 for details ...". Modified [v10] so the psql descriptions are now very generic. > > Not changed. The publisher and subscriber programlistings are always > separated. If you are looking at the rendered HTML I think it is quite > clear that one is at the publisher and one is at the subscriber. OTOH, > if we omitted creating the tables on the subscriber then I think that > really would cause some confusion. > > The difference is extra space. By default, the CSS does not include a border > for programlisting. That's why I complained about it. I noticed that the > website CSS includes it. However, the PDF will not include the border. I would > add a separate description for the subscriber just to be clear. > Modified [v10] as suggested to add a separate description for the subscriber. > One last suggestion, you are using identifiers in uppercase letters but > "primary key" is in lowercase. > Modified [v10] as suggested to make this uppercase ------ [v10] https://www.postgresql.org/message-id/CAHut%2BPvMEYkCRWDoZSpFnP%2B5SExus7YzWAd%3D6ah9vwkfRhOnSg%40mail.gmail.com Kind Regards, Peter Smith. Fujitsu Australia
On Wed, Apr 13, 2022, at 12:24 AM, Peter Smith wrote:
PSA patch v10 which addresses the remaining review comments from Euler [1]
Looks good to me.
I've changed the CF entry [1] status to "ready for committer". ------ [1] https://commitfest.postgresql.org/38/3605/ Kind Regards, Peter Smith. Fujitsu Australia
On Wednesday, April 13, 2022 11:25 AM Peter Smith <smithpb2250@gmail.com> wrote: > > PSA patch v10 which addresses the remaining review comments from Euler [1] Thanks for the patch, it looks good to me. Best regards, Hou zj
On Thu, Apr 14, 2022 at 1:29 AM Euler Taveira <euler@eulerto.com> wrote: > > On Wed, Apr 13, 2022, at 12:24 AM, Peter Smith wrote: > > PSA patch v10 which addresses the remaining review comments from Euler [1] > > Looks good to me. > Thanks, this looks good to me as well. I'll check this again early next week and push unless I find something or there are more comments. -- With Regards, Amit Kapila.
On Thu, Apr 14, 2022 at 8:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 14, 2022 at 1:29 AM Euler Taveira <euler@eulerto.com> wrote: > > > > On Wed, Apr 13, 2022, at 12:24 AM, Peter Smith wrote: > > > > PSA patch v10 which addresses the remaining review comments from Euler [1] > > > > Looks good to me. > > > > Thanks, this looks good to me as well. I'll check this again early > next week and push unless I find something or there are more comments. > Pushed. -- With Regards, Amit Kapila.
On Mon, Apr 18, 2022 at 3:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 14, 2022 at 8:55 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Thu, Apr 14, 2022 at 1:29 AM Euler Taveira <euler@eulerto.com> wrote: > > > > > > On Wed, Apr 13, 2022, at 12:24 AM, Peter Smith wrote: > > > > > > PSA patch v10 which addresses the remaining review comments from Euler [1] > > > > > > Looks good to me. > > > > > > > Thanks, this looks good to me as well. I'll check this again early > > next week and push unless I find something or there are more comments. > > > > Pushed. > Thanks for pushing. I updated the CF entry [1] to say "committed'. ------ [1] https://commitfest.postgresql.org/38/3605/ Kind Regards, Peter Smith. Fujitsu Australia