Thread: LISTEN/NOTIFY versus encoding conversion

LISTEN/NOTIFY versus encoding conversion

From: Tom Lane
Date: Sun, 14 Feb 2010 15:15:30 -0500
There's been a lot of thrashing about whether LISTEN/NOTIFY should
restrict payload strings to 7-bit ASCII to avoid possible encoding
conversion failures.  I was on the side of "yes" but I'm having
second thoughts about it.  The point I had failed to think about
is that we already restrict notifies to only be received by backends
in the same database as the sending backend.  (This is an inherent
implementation restriction in the pg_listener-based implementation,
and is kept for compatibility in the new code.)  This means that
sender and receiver must have the same server_encoding, and so no
conversion issue can arise as far as the two backends are concerned.

Now it's true that we could get an encoding conversion failure while
trying to send the payload *to the client*, but it's not apparent
to me why we should restrict the feature because of that.  There are
plenty of other reasons why we might fail to send the notification
to the client.  Most obviously, we could also get an encoding
conversion failure on the notify condition name --- but we've never
enforced a character set restriction on that, and nobody's ever
complained about it AFAIR.
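
To make the failure mode above concrete, here is a minimal sketch. It is an illustration only: Python's codecs stand in for the server's conversion from server_encoding to client_encoding, and the helper names convert_for_client and can_deliver are hypothetical, not part of PostgreSQL. It shows that the condition name and the payload are exposed to the same kind of failure.

```python
# Illustration only: Python codecs stand in for the backend's
# server_encoding -> client_encoding conversion step.

def convert_for_client(text: str, client_encoding: str) -> bytes:
    """Encode text for the client; raises UnicodeEncodeError on failure."""
    return text.encode(client_encoding)


def can_deliver(condition_name: str, payload: str, client_encoding: str) -> bool:
    """Return True only if both strings survive the conversion."""
    try:
        convert_for_client(condition_name, client_encoding)
        convert_for_client(payload, client_encoding)
    except UnicodeEncodeError:
        return False
    return True
```

In this sketch a non-ASCII condition name blocks delivery to a Latin-1 client just as a non-ASCII payload does, which is the inconsistency being pointed out.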

So the currently submitted patch is logically inconsistent.  If we
enforce a character set restriction on the payload for fear of
being unable to convert it to the destination client_encoding, then
we should logically do the same for the condition name.  But then
why not also restrict a lot of other things to pure ASCII?

I'm now thinking that we should just drop that restriction.
        regards, tom lane


Re: LISTEN/NOTIFY versus encoding conversion

From: Jeff Davis
Date:
On Sun, 2010-02-14 at 15:15 -0500, Tom Lane wrote:
> Most obviously, we could also get an encoding
> conversion failure on the notify condition name --- but we've never
> enforced a character set restriction on that, and nobody's ever
> complained about it AFAIR.

If the client successfully executed the LISTEN, then it could convert
all of the characters in one direction. I suppose some incomplete
conversion routine might not be able to convert the same characters in
the other direction -- is that what you're referring to?

The case of a condition name conversion error seems less problematic to
me anyway, because it would happen every time; so there's no danger of
making it through testing and then failing in production.

> I'm now thinking that we should just drop that restriction.

Ok. I'd feel a little better if I understood what would actually happen
in the case of an error with NOTIFY. When does the client receive the
error? Might the client code confuse it with an error for something
synchronous, like a command execution?

Regards,
	Jeff Davis



Re: LISTEN/NOTIFY versus encoding conversion

From: Tom Lane
Date: Mon, 15 Feb 2010 13:53 -0500
Jeff Davis <pgsql@j-davis.com> writes:
> On Sun, 2010-02-14 at 15:15 -0500, Tom Lane wrote:
>> Most obviously, we could also get an encoding
>> conversion failure on the notify condition name --- but we've never
>> enforced a character set restriction on that, and nobody's ever
>> complained about it AFAIR.

> If the client successfully executed the LISTEN, then it could convert
> all of the characters in one direction.

You're assuming that the LISTEN was transmitted across the connection,
and not for example executed by a pre-existing function.

> The case of a condition name conversion error seems less problematic to
> me anyway, because it would happen every time; so there's no danger of
> making it through testing and then failing in production.

mmm ... that's assuming that condition names are constants, which isn't
necessarily the case either (I seem to recall generating condition names
even back in 1997).
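
A rough sketch of why generated condition names weaken the "it would fail every time" argument. Everything here is assumed for illustration: condition_name_for is a hypothetical application helper, and Python's codecs again stand in for the conversion to client_encoding.

```python
# Hypothetical helper: the condition name is derived from run-time data,
# so it is not a constant that testing would necessarily exercise.
def condition_name_for(queue_label: str) -> str:
    return "queue_" + queue_label


def deliverable(name: str, client_encoding: str) -> bool:
    """Return True if the name can be converted to the client's encoding."""
    try:
        name.encode(client_encoding)
    except UnicodeEncodeError:
        return False
    return True
```

Whether a given name is deliverable then depends on the data seen at run time, so a test suite that only uses ASCII labels would not catch the failure.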

> Ok. I'd feel a little better if I understood what would actually happen
> in the case of an error with NOTIFY. When does the client receive the
> error? Might the client code confuse it with an error for something
> synchronous, like a command execution?

Yeah, that's possible, but avoiding encoding conversion failures doesn't
eliminate that little hole in the protocol :-(.  There are other ways
for the send attempt to fail.  Admittedly, many of them involve a
connection drop, but not all.

In practice, since encoding conversion failures could interfere with the
results of almost any operation, it's not apparent to me why we should
single out NOTIFY as being so fragile it has to have an ASCII-only
restriction.
        regards, tom lane


Re: LISTEN/NOTIFY versus encoding conversion

From: Martijn van Oosterhout
Date:
On Sun, Feb 14, 2010 at 03:15:30PM -0500, Tom Lane wrote:
> So the currently submitted patch is logically inconsistent.  If we
> enforce a character set restriction on the payload for fear of
> being unable to convert it to the destination client_encoding, then
> we should logically do the same for the condition name.  But then
> why not also restrict a lot of other things to pure ASCII?

AFAICS this essentially settles on "payload is a text string", and
people who want "payload as binary" will have to do hex encoding or
some such. At least, I thought one of the reasons why it got limited
was because we couldn't decide.
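
For reference, a minimal sketch of the hex-encoding workaround mentioned above, written as client-side Python. The function names encode_payload and decode_payload are hypothetical; the point is only that the encoded text is pure ASCII, so it is unaffected by any encoding conversion on the way to the client.

```python
def encode_payload(data: bytes) -> str:
    """Turn arbitrary bytes into an ASCII-only hex string for NOTIFY."""
    return data.hex()


def decode_payload(text: str) -> bytes:
    """Recover the original bytes from a hex-encoded payload string."""
    return bytes.fromhex(text)
```

The cost is that the payload doubles in size, which matters if the payload length is limited.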

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.

Re: LISTEN/NOTIFY versus encoding conversion

From: Jeff Davis
Date:
On Mon, 2010-02-15 at 13:53 -0500, Tom Lane wrote:
> You're assuming that the LISTEN was transmitted across the connection,
> and not for example executed by a pre-existing function.

Ok, good point.

> In practice, since encoding conversion failures could interfere with the
> results of almost any operation, it's not apparent to me why we should
> single out NOTIFY as being so fragile it has to have an ASCII-only
> restriction.

Ok, it sounds reasonable to lift the restriction.

Regards,
	Jeff Davis