Re: Confusion over Python drivers - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: Confusion over Python drivers |
Date | |
Msg-id | 1265683127.29919.2552.camel@jdavis Whole thread Raw |
In response to | Re: Confusion over Python drivers (Andrew McNamara <andrewm@object-craft.com.au>) |
Responses |
Re: Confusion over Python drivers
|
List | pgsql-hackers |
On Tue, 2010-02-09 at 12:51 +1100, Andrew McNamara wrote: > Now, with the text format parameters, the parser usually does the right > thing, since text formats have plenty of hints for us humans. The parser doesn't care whether it's text format or binary format when determining the type. > However, with the binary format, unless the caller tells us, there's no way > to tell whether we're correctly parsing the data. If the context implies > one type, but the user passes another, we'll either get an ugly error or, > worse, silently misparse their data. The difference between text and binary format is this: after it has already determined the type of the parameter, (a) if the format is text, it passes it to the type input function to constructthe value; or (b) if the format is binary, it passes it to the type recv function to construct the value. The argument to the input or recv functions may: (a) be valid input; or (b) be invalid input, and be detected as an errorby the input or recv function; or (c) be invalid input, and not be detected as an error by the input or recvfunction. For a given type, the input function may be more likely to catch an input error than the recv function; or the reverse. Either way, it is very type-specific, and the only difference is the whether the input is misinterpreted (type error not caught; bad) or an error is thrown (type error caught; better). > Using my ocpgdb module, and interacting directly with the libpq wrapping > code, you can see how postgres reacts to various inputs: None of the examples show a difference in the inferred type of a text versus binary parameter for the same query. > No parameters: > > >>> r=db.execute('select 1', ()) > >>> r.status > PGRES_TUPLES_OK > >>> list(r) > [(<PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x01' at 0xb7514200>,)] Expected, because the literal 1 (without quotes) is an integer literal, not an unknown literal. > Int4 parameter, type specified: > > >>> r=db.execute('select $1', [(pgoid.int4, '\x00\x00\x00\x01')]) > >>> r.status > PGRES_TUPLES_OK > >>> list(r) > [(<PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x01' at 0xb75141c0>,)] Expected, because you specified the type, and sent the binary data to the integer recv function, and it was valid input. > Int4 parameter, type unknown, can't be determined from context: > > >>> r=db.execute('select $1', [(pgoid.unknown, '\x00\x00\x00\x01')]) > >>> r.status > PGRES_FATAL_ERROR > >>> r.errorMessage > 'ERROR: could not determine data type of parameter $1\n' Expected -- there is no context to determine the type. Why do you call it an int4 parameter? It's just bytes, and you never told postgres what they are (as you did in the previous example). > Int4 parameter, type unknown, can be determined from context: > > >>> r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x00\x01')]) > >>> r.status > PGRES_TUPLES_OK > >>> list(r) > [(<PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x00\x02' at 0xb7514200>,)] Expected: the function + provides the context that allows the server to interpret the left argument as an integer. (Again, not an int4 parameter, it's unknown) > Text parameter, type unknown, mismatching context - surprising: > > >>> r=db.execute('select $1 + 1', [(pgoid.unknown, '1111')]) > >>> r.status > PGRES_TUPLES_OK > >>> list(r) > [(<PyPgCell name '?column?', type 23, modifier -1, value '1112' at 0xb7514360>,)] Expected, because this is exactly the same as the previous one except for the data you pass in. Notice that the same type is inferred (23). Why do you call this "mismatching context" when the context is exactly the same as above? The only difference is which 4 bytes you provide. You never told postgres that the bytes were text bytes anywhere. You may think that it's doing 1111 + 1, but it's actually doing addition on the bytes. That is apparent in the next example: > Date parameter, type unknown, int context, the value gets misinterpreted: > > >>> r=db.execute('select $1 + 1', [(pgoid.unknown, '\x00\x00\x01n')]) > >>> r.status > PGRES_TUPLES_OK > >>> list(r) > [(<PyPgCell name '?column?', type 23, modifier -1, value '\x00\x00\x01o' at 0xb75144a0>,)] > Expected, because the only thing that could possibly detect the error is the int4recv function, which happens to accept any 4-byte input (so it will never fail on any 4 bytes of data). Regards,Jeff Davis
pgsql-hackers by date: