Thread: Fix doc bug in logical replication.
Howdy folks, I noticed that the docs currently state "A different order of columns in the target table is allowed, but the column types have to match." This is untrue, as you can replicate between any two data types as long as the data can be coerced into the right format on the subscriber. Attached is a patch that attempts to clarify this, and provides some additional wordsmithing of that section. Patch is against head but the nature of the patch would apply to the docs for 11 and 10, which both have the incorrect information as well, even if the patch itself does not. Robert Treat https://xzilla.net
Attachment
Em seg, 8 de abr de 2019 às 19:38, Robert Treat <rob@xzilla.net> escreveu: > > I noticed that the docs currently state "A different order of columns > in the target table is allowed, but the column types have to match." > This is untrue, as you can replicate between any two data types as > long as the data can be coerced into the right format on the > subscriber. Attached is a patch that attempts to clarify this, and > provides some additional wordsmithing of that section. Patch is > against head but the nature of the patch would apply to the docs for > 11 and 10, which both have the incorrect information as well, even if > the patch itself does not. > I would say it is inaccurate because the actual instruction works. I agree that your words are an improvement but it lacks a comment on the slot_[store|modify]_cstrings. -- Euler Taveira Timbira - http://www.timbira.com.br/ PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
On Mon, Apr 8, 2019 at 7:19 PM Euler Taveira <euler@timbira.com.br> wrote: > > Em seg, 8 de abr de 2019 às 19:38, Robert Treat <rob@xzilla.net> escreveu: > > > > I noticed that the docs currently state "A different order of columns > > in the target table is allowed, but the column types have to match." > > This is untrue, as you can replicate between any two data types as > > long as the data can be coerced into the right format on the > > subscriber. Attached is a patch that attempts to clarify this, and > > provides some additional wordsmithing of that section. Patch is > > against head but the nature of the patch would apply to the docs for > > 11 and 10, which both have the incorrect information as well, even if > > the patch itself does not. > > > I would say it is inaccurate because the actual instruction works. I > agree that your words are an improvement but it lacks a comment on the > slot_[store|modify]_cstrings. > It is clear to me that the docs are wrong, but I don't see anything inherently incorrect about the code itself. Do you have suggestions for how you would like to see the code comments improved? Robert Treat https://xzilla.net
On 2019-04-12 19:52, Robert Treat wrote: > It is clear to me that the docs are wrong, but I don't see anything > inherently incorrect about the code itself. Do you have suggestions > for how you would like to see the code comments improved? The question is perhaps whether we want to document that non-matching data types do work. It happens to work now, but do we always want to guarantee that? There is talk of a binary mode for example. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Jun 23, 2019 at 1:25 PM Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > > On 2019-04-12 19:52, Robert Treat wrote: > > It is clear to me that the docs are wrong, but I don't see anything > > inherently incorrect about the code itself. Do you have suggestions > > for how you would like to see the code comments improved? > > The question is perhaps whether we want to document that non-matching > data types do work. It happens to work now, but do we always want to > guarantee that? There is talk of a binary mode for example. > Whether we *want* to document that it works, documenting that it doesn't work when it does can't be the right answer. If you want to couch the language to leave the door open that we may not support this the same way in the future I wouldn't be opposed to that, but at this point we will have three releases with the current behavior in production, so if we decide to change the behavior, it is likely going to break certain use cases. That may be ok, but I'd expect a documentation update to accompany a change that would cause such a breaking change. Robert Treat https://xzilla.net
On Sun, Jun 23, 2019 at 10:26:47PM -0400, Robert Treat wrote: >On Sun, Jun 23, 2019 at 1:25 PM Peter Eisentraut ><peter.eisentraut@2ndquadrant.com> wrote: >> >> On 2019-04-12 19:52, Robert Treat wrote: >> > It is clear to me that the docs are wrong, but I don't see anything >> > inherently incorrect about the code itself. Do you have suggestions >> > for how you would like to see the code comments improved? >> >> The question is perhaps whether we want to document that non-matching >> data types do work. It happens to work now, but do we always want to >> guarantee that? There is talk of a binary mode for example. >> > >Whether we *want* to document that it works, documenting that it >doesn't work when it does can't be the right answer. If you want to >couch the language to leave the door open that we may not support this >the same way in the future I wouldn't be opposed to that, but at this >point we will have three releases with the current behavior in >production, so if we decide to change the behavior, it is likely going >to break certain use cases. That may be ok, but I'd expect a >documentation update to accompany a change that would cause such a >breaking change. > I agree with that. We have this behavior for quite a bit of time, and while technically we could change the behavior in the future (using the "not supported" statement), IMO that'd be pretty annoying move. I always despised systems that "fix" bugs by documenting that it does not work, and this is a bit similar. FWIW I don't quite see why supporting binary mode would change this? Surely we can't just enable binary mode blindly, there need to be some sort of checks (alignment, type sizes, ...) with fallback to text mode. And perhaps support only for built-in types. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 27 Jun 2019 at 12:50, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Sun, Jun 23, 2019 at 10:26:47PM -0400, Robert Treat wrote:
>On Sun, Jun 23, 2019 at 1:25 PM Peter Eisentraut
><peter.eisentraut@2ndquadrant.com> wrote:
>>
>> On 2019-04-12 19:52, Robert Treat wrote:
>> > It is clear to me that the docs are wrong, but I don't see anything
>> > inherently incorrect about the code itself. Do you have suggestions
>> > for how you would like to see the code comments improved?
>>
>> The question is perhaps whether we want to document that non-matching
>> data types do work. It happens to work now, but do we always want to
>> guarantee that? There is talk of a binary mode for example.
>>
>
>Whether we *want* to document that it works, documenting that it
>doesn't work when it does can't be the right answer. If you want to
>couch the language to leave the door open that we may not support this
>the same way in the future I wouldn't be opposed to that, but at this
>point we will have three releases with the current behavior in
>production, so if we decide to change the behavior, it is likely going
>to break certain use cases. That may be ok, but I'd expect a
>documentation update to accompany a change that would cause such a
>breaking change.
>
I agree with that. We have this behavior for quite a bit of time, and
while technically we could change the behavior in the future (using the
"not supported" statement), IMO that'd be pretty annoying move. I always
despised systems that "fix" bugs by documenting that it does not work, and
this is a bit similar.
FWIW I don't quite see why supporting binary mode would change this?
Surely we can't just enable binary mode blindly, there need to be some
sort of checks (alignment, type sizes, ...) with fallback to text mode.
And perhaps support only for built-in types.
The proposed implementation of binary only supports built-in types.
The subscriber turns it on so presumably it can handle the binary data coming at it.
Dave
On Thu, Jun 27, 2019 at 01:46:47PM -0400, Dave Cramer wrote: >On Thu, 27 Jun 2019 at 12:50, Tomas Vondra <tomas.vondra@2ndquadrant.com> >wrote: > >> On Sun, Jun 23, 2019 at 10:26:47PM -0400, Robert Treat wrote: >> >On Sun, Jun 23, 2019 at 1:25 PM Peter Eisentraut >> ><peter.eisentraut@2ndquadrant.com> wrote: >> >> >> >> On 2019-04-12 19:52, Robert Treat wrote: >> >> > It is clear to me that the docs are wrong, but I don't see anything >> >> > inherently incorrect about the code itself. Do you have suggestions >> >> > for how you would like to see the code comments improved? >> >> >> >> The question is perhaps whether we want to document that non-matching >> >> data types do work. It happens to work now, but do we always want to >> >> guarantee that? There is talk of a binary mode for example. >> >> >> > >> >Whether we *want* to document that it works, documenting that it >> >doesn't work when it does can't be the right answer. If you want to >> >couch the language to leave the door open that we may not support this >> >the same way in the future I wouldn't be opposed to that, but at this >> >point we will have three releases with the current behavior in >> >production, so if we decide to change the behavior, it is likely going >> >to break certain use cases. That may be ok, but I'd expect a >> >documentation update to accompany a change that would cause such a >> >breaking change. >> > >> >> I agree with that. We have this behavior for quite a bit of time, and >> while technically we could change the behavior in the future (using the >> "not supported" statement), IMO that'd be pretty annoying move. I always >> despised systems that "fix" bugs by documenting that it does not work, and >> this is a bit similar. >> >> FWIW I don't quite see why supporting binary mode would change this? >> Surely we can't just enable binary mode blindly, there need to be some >> sort of checks (alignment, type sizes, ...) with fallback to text mode. >> And perhaps support only for built-in types. >> > >The proposed implementation of binary only supports built-in types. >The subscriber turns it on so presumably it can handle the binary data >coming at it. > I don't recall that being discussed in the patch thread, but maybe it should not be enabled merely based on what the subscriber requests. Maybe the subscriber should indicate "interest" and the decision should be made on the upstream, after some additional checks. That's why pglogical does check_binary_compatibility() - see [1]. This is necessary, because the FE/BE protocol docs [2] say: Keep in mind that binary representations for complex data types might change across server versions; the text format is usually the more portable choice. So you can't just assume the subscriber knows what it's doing, because either of the sides might be upgraded. Note: The pglogical code also does check additional stuff (like sizeof(Datum) or endianess), but I'm not sure that's actually necessary - I believe the binary protocol should be independent of that. regards [1] https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_output_plugin.c#L107 [2] https://www.postgresql.org/docs/current/protocol-overview.html -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 27 Jun 2019 at 14:20, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
On Thu, Jun 27, 2019 at 01:46:47PM -0400, Dave Cramer wrote:
>On Thu, 27 Jun 2019 at 12:50, Tomas Vondra <tomas.vondra@2ndquadrant.com>
>wrote:
>
>> On Sun, Jun 23, 2019 at 10:26:47PM -0400, Robert Treat wrote:
>> >On Sun, Jun 23, 2019 at 1:25 PM Peter Eisentraut
>> ><peter.eisentraut@2ndquadrant.com> wrote:
>> >>
>> >> On 2019-04-12 19:52, Robert Treat wrote:
>> >> > It is clear to me that the docs are wrong, but I don't see anything
>> >> > inherently incorrect about the code itself. Do you have suggestions
>> >> > for how you would like to see the code comments improved?
>> >>
>> >> The question is perhaps whether we want to document that non-matching
>> >> data types do work. It happens to work now, but do we always want to
>> >> guarantee that? There is talk of a binary mode for example.
>> >>
>> >
>> >Whether we *want* to document that it works, documenting that it
>> >doesn't work when it does can't be the right answer. If you want to
>> >couch the language to leave the door open that we may not support this
>> >the same way in the future I wouldn't be opposed to that, but at this
>> >point we will have three releases with the current behavior in
>> >production, so if we decide to change the behavior, it is likely going
>> >to break certain use cases. That may be ok, but I'd expect a
>> >documentation update to accompany a change that would cause such a
>> >breaking change.
>> >
>>
>> I agree with that. We have this behavior for quite a bit of time, and
>> while technically we could change the behavior in the future (using the
>> "not supported" statement), IMO that'd be pretty annoying move. I always
>> despised systems that "fix" bugs by documenting that it does not work, and
>> this is a bit similar.
>>
>> FWIW I don't quite see why supporting binary mode would change this?
>> Surely we can't just enable binary mode blindly, there need to be some
>> sort of checks (alignment, type sizes, ...) with fallback to text mode.
>> And perhaps support only for built-in types.
>>
>
>The proposed implementation of binary only supports built-in types.
>The subscriber turns it on so presumably it can handle the binary data
>coming at it.
>
I don't recall that being discussed in the patch thread, but maybe it
should not be enabled merely based on what the subscriber requests. Maybe
the subscriber should indicate "interest" and the decision should be made
on the upstream, after some additional checks.
That's why pglogical does check_binary_compatibility() - see [1].
This is necessary, because the FE/BE protocol docs [2] say:
Keep in mind that binary representations for complex data types might
change across server versions; the text format is usually the more
portable choice.
So you can't just assume the subscriber knows what it's doing, because
either of the sides might be upgraded.
Note: The pglogical code also does check additional stuff (like
sizeof(Datum) or endianess), but I'm not sure that's actually necessary -
I believe the binary protocol should be independent of that.
regards
[1] https://github.com/2ndQuadrant/pglogical/blob/REL2_x_STABLE/pglogical_output_plugin.c#L107
[2] https://www.postgresql.org/docs/current/protocol-overview.html
Thanks for the pointer. I'll add that to the patch.
On 2019-06-27 18:50, Tomas Vondra wrote: >> Whether we *want* to document that it works, documenting that it >> doesn't work when it does can't be the right answer. If you want to >> couch the language to leave the door open that we may not support this >> the same way in the future I wouldn't be opposed to that, but at this >> point we will have three releases with the current behavior in >> production, so if we decide to change the behavior, it is likely going >> to break certain use cases. That may be ok, but I'd expect a >> documentation update to accompany a change that would cause such a >> breaking change. >> > I agree with that. We have this behavior for quite a bit of time, and > while technically we could change the behavior in the future (using the > "not supported" statement), IMO that'd be pretty annoying move. I always > despised systems that "fix" bugs by documenting that it does not work, and > this is a bit similar. committed back to PG10 -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services