DBD::Pg BYTEA Character Escaping - Mailing list pgsql-general
From | David Wheeler |
---|---|
Subject | DBD::Pg BYTEA Character Escaping |
Date | |
Msg-id | 1006059527.1307.14.camel@mercury.atomicode.com Whole thread Raw |
Responses |
Re: DBD::Pg BYTEA Character Escaping
(Bruce Momjian <pgman@candle.pha.pa.us>)
Re: DBD::Pg BYTEA Character Escaping (Tom Lane <tgl@sss.pgh.pa.us>) Re: DBD::Pg BYTEA Character Escaping (Alex Pilosov <alex@pilosoft.com>) |
List | pgsql-general |
Hi All, I recently noticed that the DBD::Pg Perl module appears to be doing a lot of work escaping characters for BYTEA data types. It's importing Perl's POSIX support to check every character in BYTEA data with isprint(), and replacing it with its octal representation if its not printable. However, there are two issues with this approach. The first is efficiency. The way the code is currently written in DBD::Pg does a *lot* of unnecessary work, and I'd like to suggest an optimization (based on discussions on this topic on the Fun with Perl mail list: http://archive.develooper.com/fwp%40perl.org/msg00458.html -- patch supplied upon request). The second issue, however, is that it doesn't appear to me that it's even necessary that non-printable characters be replaced. Although Alex Pilosov says that such an approach is needed: http://www.geocrawler.com/mail/msg.php3?msg_id=6509224&list=10 Joe Conway found that there were only three characters ('\', "'", and "\0") that needed to be escaped, and it was those three characters that Bruce Momjian documented for the forthcoming 7.2 release: http://www.geocrawler.com/mail/msg.php3?msg_id=6547225&list=10 If that's true, then any solution escaping non-printable characters is overkill, and therefore only the three characters need to be escaped. And since it looks like two of them ('\' and "'") are already escaped before the non-printable characters are escaped in DBD::Pg, it then it seems that this code: if ($data_type == DBI::SQL_BINARY || $data_type == DBI::SQL_VARBINARY || $data_type == DBI::SQL_LONGVARBINARY) { $str=join("", map { isprint($_)?$_:'\\'.sprintf("%03o",ord($_)) } split //, $str); } Could be changed to: s/\0/\\000/g if $data_type == DBI::SQL_BINARY || $data_type == DBI::SQL_VARBINARY || $data_type == DBI::SQL_LONGVARBINARY; So, the reason I'm posting this query is because I'd like to get confirmation, if possible, on this conclusion. Based on the feedback I receive, I will submit patches to the DBD::Pg maintainer. Thanks! David PS: If discussion of this issue needs to be moved to the Hackers list, I'll be happy to do so. I just thought I'd try here, first. -- David Wheeler AIM: dwTheory David@Wheeler.net ICQ: 15726394 Yahoo!: dew7e Jabber: Theory@jabber.org
pgsql-general by date: