At 09:56 AM 5/15/02 -0500, Bill Gribble wrote:
>So the only escaping I do in my app currently is to replace ' with '' in
>user-input strings. If I assume that the goal is to prevent any
>user-input strings from being evaluated as SQL statements (only to allow
>user input as constant values), what other escaping do I need to do?
If you already know all the following, I'm sorry for going a bit off topic.
It's just that many don't seem to.
Every app (including DBs) has its own quirks and preferences, so you should
use appropriate filters for each. Do NOT combine them[1]. I strongly
suggest you have filters for each data entry and exit point for your
application e.g:
Application input filter
|
application--Output filter #1 - Output #1
| |___Output filter #2 - Output #2
|
Database filter
|
database
App input filter - filter stuff so your app doesn't choke or break.
DB filter - filter stuff so DB doesn't choke or break and accepts things
correctly.
Output filters - filter stuff so viewer/destination doesn't choke or break.
DB filter - assuming you use plain ASCII (otherwise good luck! Maybe
someone here knows how to filter multibytes/unicode safely for Pg or XYZ, I
don't so I only support ASCII) - filter out all nonprintable characters -
except maybe cr/lf/tab for certain cases. This includes filtering out the
null character - I believe you don't want to try sending \000 into the DB
or unsuspecting apps :).
(Then you may wish to do some post processing e.g. trim
leading/trailing/extra spaces).
AFTER that, do the normal DB quoting. For postgresql backslashes for
backslashes, single quotes for single quotes. Seems backslashes work for
single quotes too.
Many DB connection modules provide quoting. For perl use DBI, DBD and use
the placeholders ? and execute($var1,$var2) to stick in variables - that
way you get automatic quoting. Otherwise you'll have to use DBH->quote
everywhere. For java there's similar placeholder sort of stuff (prepared
queries or something like that).
WARNING!!!
Many quoting functions do not filter out SQL wildcard characters e.g. % and
_. This is probably intentional. But these wildcard characters have special
meaning in LIKE queries. If you are not careful with LIKE queries, people
could force full table scans everywhere and|or see more than they are
supposed to. So you may want to carefully consider these cases - some cases
they are useful, some too useful ;). For instance I've managed to find out
how many customers a certificate authority (not a big one) had - tsk tsk :).
Note[1]: Combining filters (esp input and output) is a BAD idea. You may
notice spurious backslashes (e.g. isn\'t) or other out-of-place characters
in the content of some websites (esp PHP ones) - this is probably due to a
braindead filtering architecture - data is filtered for the DB/XYZ even as
it enters the application (e.g. PHP's addslashes), thus the app doesn't
know whether the backslash is from the user or not, and can't treat it
appropriately - when sending back to itself or to other apps. So keep the
filters separate.
Hope that helps,
Link.