pgtcl large object read/write corrupts binary data - Mailing list pgsql-bugs
From | ljb |
---|---|
Subject | pgtcl large object read/write corrupts binary data |
Date | |
Msg-id | bni06l$2jh$2@news.hub.org |
Responses |
Re: pgtcl large object read/write corrupts binary data
Re: pgtcl large object read/write corrupts binary data |
List | pgsql-bugs |
[Using PostgreSQL-7.3.4 and -7.4beta5, Tcl-8.4.x.]

Binary data written to a Large Object with libpgtcl's pg_lo_write is corrupted. Tcl is mangling the data - something to do with UTF-8 conversion. Example: 0x80 becomes 0xc2 0x80, and 0xff becomes 0xc3 0xbf.

The problem with pg_lo_read is more subtle. If you compare the expected and actual data with == or [string equal], they do not match, but if you check byte by byte, or write the two values to files, they do match. I believe this is happening because pg_lo_read is returning an object whose Tcl "string rep" is inconsistent with its internal byte array.

Here are 2 test scripts to show the problem. They assume your environment variables are set up to allow a connection to PostgreSQL with an empty 'conninfo' string.

Quick test script for pg_lo_write problem:
========================
# Write to large object with pg_lo_write, export with pg_lo_export:
set data "\x80\xffzzzz"
set datalen 6
set conn [pg_connect -conninfo ""]
pg_execute $conn begin
set loid [pg_lo_creat $conn INV_READ|INV_WRITE]
set lofd [pg_lo_open $conn $loid w]
pg_lo_write $conn $lofd $data $datalen
pg_lo_close $conn $lofd
pg_lo_export $conn $loid lo.out
pg_lo_unlink $conn $loid
pg_execute $conn commit
pg_disconnect $conn
========================
Run this script with pgtclsh, then hexdump the file "lo.out".
Expected result: file contains "0x80 0xff 0x7a 0x7a 0x7a 0x7a"
Observed result: file contains "0xc2 0x80 0xc3 0xbf 0x7a 0x7a"

Quick test script for pg_lo_read problem:
========================
# Import large object with pg_lo_import, read back with pg_lo_read:
set data "\x80\xffzzzz"
set datalen 6
set f [open lo.in w]
fconfigure $f -translation binary
puts -nonewline $f $data
close $f
set conn [pg_connect -conninfo ""]
pg_execute $conn begin
set loid [pg_lo_import $conn lo.in]
set lofd [pg_lo_open $conn $loid r]
pg_lo_read $conn $lofd buf $datalen
pg_lo_close $conn $lofd
pg_lo_unlink $conn $loid
pg_execute $conn commit
pg_disconnect $conn
if {[string equal $buf $data]} {
    puts Match
} else {
    puts Differ
}
set f [open lo.in2 w]
fconfigure $f -translation binary
puts -nonewline $f $buf
close $f
========================
Run this script with pgtclsh.
Expected result: prints "Match"
Observed result: prints "Differ"
But hexdump the files "lo.in" and "lo.in2" to see identical contents.

Proposed Patch: (I think this requires Tcl >= 8.1)
===================
--- src/interfaces/libpgtcl/pgtclCmds.c.orig	2003-08-03 22:40:16.000000000 -0400
+++ src/interfaces/libpgtcl/pgtclCmds.c	2003-10-25 20:36:58.000000000 -0400
@@ -1215,7 +1215,7 @@
 	buf = ckalloc(len + 1);

 	nbytes = lo_read(conn, fd, buf, len);
-	bufObj = Tcl_NewStringObj(buf, nbytes);
+	bufObj = Tcl_NewByteArrayObj(buf, nbytes);

 	if (Tcl_ObjSetVar2(interp, bufVar, NULL, bufObj,
 					   TCL_LEAVE_ERR_MSG | TCL_PARSE_PART1) == NULL)
@@ -1307,7 +1307,7 @@
 	if (Tcl_GetIntFromObj(interp, objv[2], &fd) != TCL_OK)
 		return TCL_ERROR;

-	buf = Tcl_GetStringFromObj(objv[3], &nbytes);
+	buf = Tcl_GetByteArrayFromObj(objv[3], &nbytes);

 	if (Tcl_GetIntFromObj(interp, objv[4], &len) != TCL_OK)
 		return TCL_ERROR;
===================