Re: Binary support for pgoutput plugin - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Binary support for pgoutput plugin |
Date | |
Msg-id | 4098881.1595091212@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Binary support for pgoutput plugin (Dave Cramer <davecramer@gmail.com>) |
Responses |
Re: Binary support for pgoutput plugin
|
List | pgsql-hackers |
I've pushed this patch, with a number of adjustments, some cosmetic and some not so much (no pg_dump support!?). We're not quite done though ... Dave Cramer <davecramer@gmail.com> writes: > So looking at how to confirm that the subscriber has receive functions for > all of the types. > AFAICT we don't have that information since the publication determines what > is sent? Yeah, at the point where we need to send the option, we seem not to have a lot of info. In practice, if the sender has a typsend function, the only way the subscriber doesn't have a matching typreceive function is if it's an older PG version. I think it's sufficient to document that you can't use binary mode in that case, so that's what I did. (Note that getTypeBinaryInputInfo will say "no binary input function available for type %s" in such a case, so that seemed like adequate error handling.) > On Tue, 14 Jul 2020 at 22:47, Andres Freund <andres@anarazel.de> wrote: >> An oid mismatch error without knowing what that's about isn't very >> helpful either. >> How about adding an errcontext that shows the "source type oid", the >> target type oid & type name and, for records, the column name of the >> target table? That'd make this a lot easier to debug. > This code line 482 in proto.c attempts to limit what is sent in binary. We > could certainly be more restrictive here. I think Andres' point is to be *less* restrictive. I left that logic as-is in the committed patch, but we could do something like the attached to improve the situation. regards, tom lane diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c index 2b1356ee24..d9837a5971 100644 --- a/src/backend/replication/logical/proto.c +++ b/src/backend/replication/logical/proto.c @@ -494,22 +494,9 @@ logicalrep_write_tuple(StringInfo out, Relation rel, HeapTuple tuple, bool binar typclass = (Form_pg_type) GETSTRUCT(typtup); /* - * Choose whether to send in binary. Obviously, the option must be - * requested and the type must have a send function. Also, if the - * type is not built-in then it must not be a composite or array type. - * Such types contain type OIDs, which will likely not match at the - * receiver if it's not a built-in type. - * - * XXX this could be relaxed if we changed record_recv and array_recv - * to be less picky. - * - * XXX this fails to apply the restriction to domains over such types. + * Send in binary if requested and type has suitable send function. */ - if (binary && - OidIsValid(typclass->typsend) && - (att->atttypid < FirstGenbkiObjectId || - (typclass->typtype != TYPTYPE_COMPOSITE && - typclass->typelem == InvalidOid))) + if (binary && OidIsValid(typclass->typsend)) { bytea *outputbytes; int len; diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 800107d4e7..392445ea03 100644 --- a/src/backend/utils/adt/arrayfuncs.c +++ b/src/backend/utils/adt/arrayfuncs.c @@ -1308,13 +1308,34 @@ array_recv(PG_FUNCTION_ARGS) (errcode(ERRCODE_INVALID_BINARY_REPRESENTATION), errmsg("invalid array flags"))); + /* Check element type recorded in the data */ element_type = pq_getmsgint(buf, sizeof(Oid)); + + /* + * From a security standpoint, it doesn't matter whether the input's + * element type matches what we expect: the element type's receive + * function has to be robust enough to cope with invalid data. However, + * from a user-friendliness standpoint, it's nicer to complain about type + * mismatches than to throw "improper binary format" errors. But there's + * a problem: only built-in types have OIDs that are stable enough to + * believe that a mismatch is a real issue. So complain only if both OIDs + * are in the built-in range. Otherwise, carry on with the element type + * we "should" be getting. + */ if (element_type != spec_element_type) { - /* XXX Can we allow taking the input element type in any cases? */ - ereport(ERROR, - (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("wrong element type"))); + if (element_type < FirstGenbkiObjectId && + spec_element_type < FirstGenbkiObjectId) + ereport(ERROR, + (errcode(ERRCODE_DATATYPE_MISMATCH), + errmsg("binary data has array element type %u (%s) instead of expected %u (%s)", + element_type, + format_type_extended(element_type, -1, + FORMAT_TYPE_ALLOW_INVALID), + spec_element_type, + format_type_extended(spec_element_type, -1, + FORMAT_TYPE_ALLOW_INVALID)))); + element_type = spec_element_type; } for (i = 0; i < ndim; i++) diff --git a/src/backend/utils/adt/rowtypes.c b/src/backend/utils/adt/rowtypes.c index 80cba2f4c2..674cf0a55d 100644 --- a/src/backend/utils/adt/rowtypes.c +++ b/src/backend/utils/adt/rowtypes.c @@ -551,13 +551,33 @@ record_recv(PG_FUNCTION_ARGS) continue; } - /* Verify column datatype */ + /* Check column type recorded in the data */ coltypoid = pq_getmsgint(buf, sizeof(Oid)); - if (coltypoid != column_type) + + /* + * From a security standpoint, it doesn't matter whether the input's + * column type matches what we expect: the column type's receive + * function has to be robust enough to cope with invalid data. + * However, from a user-friendliness standpoint, it's nicer to + * complain about type mismatches than to throw "improper binary + * format" errors. But there's a problem: only built-in types have + * OIDs that are stable enough to believe that a mismatch is a real + * issue. So complain only if both OIDs are in the built-in range. + * Otherwise, carry on with the column type we "should" be getting. + */ + if (coltypoid != column_type && + coltypoid < FirstGenbkiObjectId && + column_type < FirstGenbkiObjectId) ereport(ERROR, (errcode(ERRCODE_DATATYPE_MISMATCH), - errmsg("wrong data type: %u, expected %u", - coltypoid, column_type))); + errmsg("binary data has type %u (%s) instead of expected %u (%s) in record column %d", + coltypoid, + format_type_extended(coltypoid, -1, + FORMAT_TYPE_ALLOW_INVALID), + column_type, + format_type_extended(column_type, -1, + FORMAT_TYPE_ALLOW_INVALID), + i + 1))); /* Get and check the item length */ itemlen = pq_getmsgint(buf, 4);
pgsql-hackers by date: