Joe Conway <mail@joeconway.com> writes:
> But the error comes from pg_verifymbstr. Since bytea has no encoding
> (it's just an array of bytes after all), why does pg_verifymbstr get
> applied at all to bytea data?
Because textin() is used for the initial conversion to an "unknown"
constant --- see make_const() in parse_node.c.
> pg_verifymbstr is called by textin, bpcharin, and varcharin. Would it
> help to rewrite this as:
> INSERT INTO t1(bytea_col) VALUES('characters produced by
> PQescapeBytea'::bytea);
Probably that would cause the error to disappear, but it's hardly a
desirable answer.
I wonder whether this says that TEXT is not a good implementation of
type UNKNOWN. That choice was made on the assumption that TEXT would
faithfully preserve the contents of a C string ... but it seems that in
the multibyte world it ain't so. It would not be a huge amount of work
to write a couple more I/O routines and give UNKNOWN its own I/O
behavior.
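To make the distinction concrete, here is a rough standalone sketch of the two behaviors. It is a toy model, not PostgreSQL source: ToyVarlena, toy_verify_utf8, toy_text_in and toy_unknown_in are all hypothetical names, and the simplified UTF-8 check merely stands in for whatever pg_verifymbstr does for the database encoding. The text-like routine refuses byte sequences that fail the check, while the unknown-like routine preserves the C string byte for byte.

```c
/* Toy model only; none of this is PostgreSQL code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Length-prefixed buffer standing in for a varlena value. */
typedef struct ToyVarlena
{
    size_t        len;
    unsigned char data[];       /* C99 flexible array member */
} ToyVarlena;

/*
 * Simplified UTF-8 structure check (assumed stand-in for a multibyte
 * verifier). It checks lead and continuation bytes only; it does not
 * reject overlong forms or surrogates.
 */
static bool
toy_verify_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];
        size_t        need;

        if (c < 0x80)
            need = 0;
        else if ((c & 0xE0) == 0xC0)
            need = 1;
        else if ((c & 0xF0) == 0xE0)
            need = 2;
        else if ((c & 0xF8) == 0xF0)
            need = 3;
        else
            return false;       /* stray continuation or invalid lead byte */

        if (need > len - i - 1)
            return false;       /* sequence truncated at end of string */

        for (size_t k = 1; k <= need; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* expected a continuation byte */

        i += need + 1;
    }
    return true;
}

/* Copy a C string into a freshly allocated ToyVarlena (caller frees). */
static ToyVarlena *
toy_copy(const char *str)
{
    size_t      len = strlen(str);
    ToyVarlena *v = malloc(sizeof(ToyVarlena) + len);

    if (v == NULL)
        return NULL;
    v->len = len;
    memcpy(v->data, str, len);
    return v;
}

/* Text-like input: returns NULL when the bytes fail verification. */
ToyVarlena *
toy_text_in(const char *str)
{
    assert(str != NULL);
    if (!toy_verify_utf8((const unsigned char *) str, strlen(str)))
        return NULL;
    return toy_copy(str);
}

/* Unknown-like input: copies the C string verbatim, with no encoding check. */
ToyVarlena *
toy_unknown_in(const char *str)
{
    assert(str != NULL);
    return toy_copy(str);
}
```

With this split, a string containing a byte such as 0xFF is rejected by toy_text_in but passes through toy_unknown_in unchanged, so a later conversion to bytea would still see the original bytes.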
OTOH, I was surprised to read your message because I had assumed the
damage was being done much further upstream, viz during collection of
the query string by pq_getstr(). Do we need to think twice about that
processing, as well?
regards, tom lane