Thread: Add publisher and subscriber to glossary documentation.
Hi, There are several places where publisher and subscriber terms are used across the documentation. But the publisher and subscriber were missing in the documentation. I felt this should be added in the glossary. I have created a patch for the same. Thanks and Regards Shlok Kyal
Hello
On 2024-Feb-12, Shlok Kyal wrote:
> There are several places where publisher and subscriber terms are used
> across the documentation. But the publisher and subscriber were
> missing in the documentation. I felt this should be added in the
> glossary.
I agree, but let's wordsmith those definitions first.
Should these be "publisher node" and "subscriber node" instead? Do we want to define the term "node"? I think in everyday conversations we use "node" quite a lot, so maybe we do need an entry for it. (Maybe just <glosssee otherterm="instance"> suffices, plus add under instance "also called a node".)
+ <glossterm>Publisher</glossterm>
+ <glossdef>
+ <para>
+ A node where publication is defined.
+ It replicates the publication changes to the subscriber node.
Apart from deciding what to do with "node", what are "changes"? This doesn't seem very specific.
+ <glossterm>Subscriber</glossterm>
+ <glossdef>
+ <para>
+ A node where subscription is defined.
+ It subscribe to one or more publications on a publisher node and pull the data
+ from the publications they subscribe to.
Same issues as above, plus there are some grammar issues.
I think these definitions should use the term "logical replication", which we don't currently define. We do have "replication" where we provide an overview of "logical replication". Maybe that's enough, but we should consider whether we want a separate definition of logical replication (I'm leaning towards not having one, but it's worth asking.)
-- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!" http://smylers.hates-software.com/2007/08/15/fe244d0c.html
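For illustration only, the "node" cross-reference floated in the message above might look like the DocBook fragment below. This is a sketch, not text from any posted patch: the ids `glossary-node` and `glossary-instance` are assumptions that follow the `glossary-*` naming pattern, and the accompanying "also called a node" wording under the instance entry is not shown.

```xml
<!-- Hypothetical entry: "Node" only redirects the reader to "Instance". -->
<!-- Both ids are assumed; adjust them to whatever the glossary actually uses. -->
<glossentry id="glossary-node">
 <glossterm>Node</glossterm>
 <glosssee otherterm="glossary-instance" />
</glossentry>
```

A `<glosssee>` entry carries no definition of its own, so it only fits if "node" is meant as a strict synonym of "instance".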
Hi,
I addressed the comments and updated the patch.
> Should these be "publisher node" and "subscriber node" instead? Do we
> want to define the term "node"? I think in everyday conversations we
> use "node" quite a lot, so maybe we do need an entry for it. (Maybe
> just <glosssee otherterm="instance"> suffices, plus add under instance
> "also called a node".)
Modified
> + <glossterm>Publisher</glossterm>
> + <glossdef>
> + <para>
> + A node where publication is defined.
> + It replicates the publication changes to the subscriber node.
>
> Apart from deciding what to do with "node", what are "changes"? This
> doesn't seem very specific.
Modified
> + <glossterm>Subscriber</glossterm>
> + <glossdef>
> + <para>
> + A node where subscription is defined.
> + It subscribe to one or more publications on a publisher node and pull the data
> + from the publications they subscribe to.
>
> Same issues as above, plus there are some grammar issues.
Modified
> I think these definitions should use the term "logical replication",
> which we don't currently define. We do have "replication" where we
> provide an overview of "logical replication". Maybe that's enough, but
> we should consider whether we want a separate definition of logical
> replication (I'm leaning towards not having one, but it's worth asking.)
Modified. Added the term "logical replication" in the definitions. Used reference to "replication".
Thanks and Regards,
Shlok Kyal
Here are some comments for patch v2.
======
1. There are whitespace problems
[postgres@CentOS7-x64 oss_postgres_misc]$ git apply ../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch
../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:43: trailing whitespace.
A node where publication is defined for
../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:45: trailing whitespace.
It replicates a set of changes from a table or a group of tables in
../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:66: trailing whitespace.
A node where subscription is defined for
../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:67: trailing whitespace.
<glossterm linkend="glossary-replication">logical replication</glossterm>.
../patches_misc/v2-0001-Add-publisher-and-subscriber-to-glossary-document.patch:68: trailing whitespace.
It subscribes to one or more publications on a publisher node and pulls
warning: squelched 2 whitespace errors
warning: 7 lines add whitespace errors.
~~~
2. Publisher node
+ <glossentry id="glossary-publisher">
+ <glossterm>Publisher node</glossterm>
+ <glossdef>
+ <para>
+ A node where publication is defined for
+ <glossterm linkend="glossary-replication">logical replication</glossterm>.
+ It replicates a set of changes from a table or a group of tables in
+ publication to the subscriber node.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="logical-replication-publication"/>.
+ </para>
+ </glossdef>
+ </glossentry>
+
2a. I felt the term here should be "Publication node".
Indeed, there should also be additional glossary terms like "Publisher" (i.e. it is the same as "Publication node") and "Subscriber" (i.e. it is the same as "Subscription node"). These definitions will then be consistent with node descriptions already in sections 30.1 and 30.2
~
2b. "node" should include a link to the glossary term. Same for any other terms mentioned
~
2c. /A node where publication is defined for/A node where a publication is defined for/
~
2d. "It replicates" is misleading because it is the PUBLICATION doing the replicating, not the node.
IMO it will be better to include 2 more glossary terms "Publication" and "Subscription" where you can say this kind of information. Then the link "logical-replication-publication" also belongs under the "Publication" term.
~~~
3.
+ <glossentry id="glossary-subscriber">
+ <glossterm>Subscriber node</glossterm>
+ <glossdef>
+ <para>
+ A node where subscription is defined for
+ <glossterm linkend="glossary-replication">logical replication</glossterm>.
+ It subscribes to one or more publications on a publisher node and pulls
+ a set of changes from a table or a group of tables in publications it
+ subscribes to.
+ </para>
+ <para>
+ For more information, see
+ <xref linkend="logical-replication-subscription"/>.
+ </para>
+ </glossdef>
+ </glossentry>
All comments are similar to those above.
======
In summary, IMO it should be a bit more like below:
SUGGESTION (include all the necessary links etc)
Publisher
    See "Publication node"
Publication
    A publication replicates the changes of one or more tables to a subscription. For more information, see Section 30.1
Publication node
    A node where a publication is defined for logical replication.
Subscriber
    See "Subscription node"
Subscription
    A subscription receives the changes of one or more tables from the publications it subscribes to. For more information, see Section 30.2
Subscription node
    A node where a subscription is defined for logical replication.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
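As a rough sketch of the suggestion above, the "Publisher" and "Publication" entries could be written in DocBook roughly as follows. The entry ids are assumptions that follow the `glossary-*` pattern; the `logical-replication-publication` link target is the one already used in the v2 patch. The subscriber-side entries would presumably mirror these.

```xml
<!-- Hypothetical ids; only the xref target is taken from the v2 patch. -->
<!-- "Publisher" becomes a pure cross-reference to "Publication node". -->
<glossentry id="glossary-publisher">
 <glossterm>Publisher</glossterm>
 <glosssee otherterm="glossary-publication-node" />
</glossentry>

<!-- "Publication" carries the description of what gets replicated. -->
<glossentry id="glossary-publication">
 <glossterm>Publication</glossterm>
 <glossdef>
  <para>
   A publication replicates the changes of one or more tables to a
   subscription.
  </para>
  <para>
   For more information, see
   <xref linkend="logical-replication-publication"/>.
  </para>
 </glossdef>
</glossentry>
```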
I have addressed the comments and added an updated patch. Thanks and Regards, Shlok Kyal
Here are some comments for patch v3:
1.
+ <glossentry id="glossary-publication-node">
+ <glossterm>Publication node</glossterm>
+ <glossdef>
+ <para>
+ A <glossterm linkend="glossary-instance">node</glossterm> where a
+ <glossterm linkend="glossary-publication">publication</glossterm> is defined
+ for <glossterm linkend="glossary-replication">logical replication</glossterm>.
+ </para>
+ </glossdef>
+ </glossentry>
+
I felt the word "node" here should link to the glossary term "Node", instead of directly to the term "Instance".
~~
2.
+ <glossentry id="glossary-subscription-node">
+ <glossterm>Subscription node</glossterm>
+ <glossdef>
+ <para>
+ A <glossterm linkend="glossary-instance">node</glossterm> where a
+ <glossterm linkend="glossary-subscription">subscription</glossterm> is defined
+ for <glossterm linkend="glossary-replication">logical replication</glossterm>.
+ </para>
+ </glossdef>
+ </glossentry>
+
(same comment as above)
I felt the word "node" here should link to the glossary term "Node", instead of directly to the term "Instance".
~~
Apart from those links, it looks good to me. Let's see what others think.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
I have addressed the comments and have attached the updated version. Thanks and Regards, Shlok Kyal
Hi, the patch v4 LGTM. ====== Kind Regards, Peter Smith. Fujitsu Australia
If there's a movement towards "node" to refer to the database which has the Subscription object, then perhaps the documentation for 31.2. Subscription, Chapter 31. Logical Replication should be updated as well, since it uses both the "database" and "node" terms on the same page, and to me referring to the same thing (I could be missing a subtlety).
See:
"The subscriber database..."
"A subscriber node may..."
Also, the word "database" in this sentence: "A subscription defines the connection to another database" to me works, but I think using "node" there could be more consistent if it’s referring to the server instance running the database that holds the PUBLICATION. The connection string information example later on the page shows "host" and "dbname" configured in the CONNECTION value for the SUBSCRIPTION. This sentence seems like the use of "database" in casual style to mean the "server instance" (or "node").
Also, the "The node where a subscription is defined". That one actually feels to me like "The database where a subscription is defined", but then that contradicts what I just said, and "node" is fine here but I think "node" should be on the preceding sentence too.
Anyway, hopefully these examples show “node” and “database” are mixed and perhaps others agree using one consistently might help the goals of the docs.
Thanks!
Hi Andrew,
> If there's a movement towards "node" to refer to the database which has the Subscription object, then perhaps the documentation for
> 31.2. Subscription, Chapter 31. Logical Replication should be updated as well, since it uses both the "database" and "node" terms on the same page, and to me referring to the same thing (I could be missing a subtlety).
> See:
> "The subscriber database..."
> "A subscriber node may..."
> Also, the word "database" in this sentence: "A subscription defines the connection to another database" to me works, but I think using "node" there could be more consistent if it’s referring to the server instance running the database that holds the PUBLICATION. The connection string information example later on the page shows "host" and "dbname" configured in the CONNECTION value for the SUBSCRIPTION. This sentence seems like the use of "database" in casual style to mean the "server instance" (or "node").
> Also, the "The node where a subscription is defined". That one actually feels to me like "The database where a subscription is defined", but then that contradicts what I just said, and "node" is fine here but I think "node" should be on the preceding sentence too.
> Anyway, hopefully these examples show “node” and “database” are mixed and perhaps others agree using one consistently might help the goals of the docs.
For me the existing content looks good, I felt let's keep it as it is unless others feel differently.
Thanks and regards,
Shlok Kyal
On 2024-Mar-14, Shlok Kyal wrote:
> Andrew Atkinson wrote:
> > Anyway, hopefully these examples show “node” and “database” are
> > mixed and perhaps others agree using one consistently might help the
> > goals of the docs.
>
> For me the existing content looks good, I felt let's keep it as it is
> unless others feel differently.
Actually it's these small terminology glitches that give me pause. If we're going to have terms that are interchangeable (in this case "node" and "database"), then they should be always interchangeable, not just in some unspecified cases. Maybe the idea of using "node" (which sounds like something that's instance-wide) is wrong for logical replication, which is necessarily something that happens database-locally.
Then again, maybe defining "node" as something that exists at a database-local level when used in the context of logical replication is sufficient. In that case, it would be better to avoid defining it as a synonym of "instance". Then the terms are not always interchangeable, but it's clear when they are and when they aren't.
"Node: in <glossterm>replication</>, each of the endpoints to which or from which data is replicated. In the context of physical replication, each node is an instance. In the context of logical replication, each node is a database".
Does that make sense?
I'd also look at altering "Primary" and "Standby" so that it's clearer that they're about physical replication, and don't mention "database" anymore, since that's the wrong level. Maybe turn them into "Primary (node)" and "Standby (node)" instead.
-- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
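To make the proposed wording concrete, here is one way the "Node" definition quoted above could be marked up as a DocBook glossary entry. It is only a sketch under assumptions: the entry id `glossary-node` is hypothetical, and the `glossary-replication`, `glossary-instance`, and `glossary-database` link targets are assumed to exist under those names.

```xml
<!-- Hypothetical entry; the wording is the definition proposed in the message above. -->
<!-- All ids and link targets are assumptions, not taken from a posted patch. -->
<glossentry id="glossary-node">
 <glossterm>Node</glossterm>
 <glossdef>
  <para>
   In <glossterm linkend="glossary-replication">replication</glossterm>,
   each of the endpoints to which or from which data is replicated.
   In the context of physical replication, each node is an
   <glossterm linkend="glossary-instance">instance</glossterm>.
   In the context of logical replication, each node is a
   <glossterm linkend="glossary-database">database</glossterm>.
  </para>
 </glossdef>
</glossentry>
```

Unlike a bare cross-reference to "Instance", a full `<glossdef>` leaves room to say when "node" and "instance" coincide and when they do not.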
On Thu, Mar 14, 2024 at 7:51 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2024-Mar-14, Shlok Kyal wrote:
> > Andrew Atkinson wrote:
> > > Anyway, hopefully these examples show “node” and “database” are
> > > mixed and perhaps others agree using one consistently might help the
> > > goals of the docs.
> >
> > For me the existing content looks good, I felt let's keep it as it is
> > unless others feel differently.
>
> Actually it's these small terminology glitches that give me pause. If
> we're going to have terms that are interchangeable (in this case "node"
> and "database"), then they should be always interchangeable, not just in
> some unspecified cases. Maybe the idea of using "node" (which sounds
> like something that's instance-wide) is wrong for logical replication,
> which is necessarily something that happens database-locally.
>
> Then again, maybe defining "node" as something that exists at a
> database-local level when used in the context of logical replication is
> sufficient. In that case, it would be better to avoid defining it as a
> synonym of "instance". Then the terms are not always interchangeable,
> but it's clear when they are and when they aren't.
>
> "Node: in <glossterm>replication</>, each of the endpoints to which or
> from which data is replicated. In the context of physical replication,
> each node is an instance. In the context of logical replication, each
> node is a database".
I think node should mean instance for both physical and logical replication, otherwise, it would be confusing. We need both the usages as a particular publication/subscription is defined at the database level but the server on which we define those is referred to as a node/instance. One of the usages pointed out by Andrew: "The subscriber database..." [1] is unclear but I feel we can use node there as well instead of database.
[1] - https://www.postgresql.org/docs/current/logical-replication-subscription.html
--
With Regards,
Amit Kapila.