Thread: troublesome inputs

troublesome inputs

From
Michelle Murrain
Date:
Hi Folks,

I've got a question. Users that use the web-based databases that I've
recently written (cgi-perl and pgsql) sometimes come up with errors when they
input information. I have realized that I need to write some very rigorous
error-checking routines in order to make sure that whatever they are typing
in makes it into the database, or flags them for re-input, and doesn't result
in an unrecoverable error. What I need to do is two-fold - escape stuff that
is easy (like single quotes), and do testing of types (make sure they are not
putting non-numerics in int fields, etc.)

I'm not a newbie at either perl or pgsql, but I'm definitely not yet a guru.

I've identified quotes in particualr as characters I need to make sure
get taken care of in some way or another before I pass them to a query. Are
there others that folks know about, especially combinations of characters
that lead to wierdness?

Other things I should be thinking about?

Thanks!

Michelle
--
------------
Michelle Murrain, Ph.D.
President
Norwottuck Technology Resources
mpm@norwottuck.com
http://www.norwottuck.com

Re: troublesome inputs

From
Lincoln Yeoh
Date:
At 02:49 PM 2/17/01 -0500, Michelle Murrain wrote:
>in an unrecoverable error. What I need to do is two-fold - escape stuff that
>is easy (like single quotes), and do testing of types (make sure they are
not
>putting non-numerics in int fields, etc.)
>
>I'm not a newbie at either perl or pgsql, but I'm definitely not yet a guru.
>
>I've identified quotes in particualr as characters I need to make sure
>get taken care of in some way or another before I pass them to a query. Are
>there others that folks know about, especially combinations of characters
>that lead to wierdness?
>
>Other things I should be thinking about?

What you should also think about are cases where people intentionally try
to abuse your program.

Generally you filter what goes into your program so that your program can
handle it (input filtering).

Then you filter what leaves your program for another program (output
filtering).

And you have to assume that your user can send you whatever they want. For
example in a web app, if you store stuff in hidden form fields, the user
can always save the form, alter the values and submit it to you. Same goes
for storing stuff in cookies, or trying to limit options by limiting the
options shown in HTML (users can always add more options, or even do
multiple ones ;) ).

**Input filtering
For example in perl if you are using CGI what you can start with is:
$CGI::POST_MAX=$MAXBUF;

This limits the HTTP posts somewhat. HTTP GETs are usually limited by the
webserver (if it's a decent one).

Then you can do stuff like:

sub myparam {
$cgiparamname=shift;
$defaultvalue=shift;
$usermaxlength=shift||8;
$userminlength=shift||0;
$temp=param($cgiparamname);
unless (defined(temp)) {
    $temp=$defaultvalue;
}
$value=substr($temp,0,$usermaxlength);
$value=~tr/A-Za-z0-9//cd;
if ($value ne $temp) return ($value,INFO_LOSS);
if (length($value)<$userminlength) return ($value,TOO_SHORT);
return $value,OK;
}

where value is assumed to be an alphanumeric.

**Output filtering
For perl you should use the built-in DBI quoting function, or use bind vars
to stuff things into the database (after clamping down data to sane
sizes/values first).

Also, the DBI quoting feature doesn't quote %. So if you are going to use
LIKE queries and you don't want to allow users to do full table scans you
better limit how the % can appear, or just quote all occurences of % out.
Personally I prefer to support a * for wildcard, and translate the * to %
for the user with the usual sanity checks. This provides a useful layer of
abstraction.

When you are printing stuff out, make sure you quote it so that the web
browser sees it correctly - e.g. not HTML if it's not supposed to be html.
Similarly when you are printing a URL out.

**Really troublesome stuff.

I'm not sure how to quote or filter Unicode/custom encoded stuff properly.
For example if a person sends some encoded stuff, and it's fine to your
program, but when you pass it to another program, it decodes the info, and
stuff may go boom (e.g. for instance document paths suddenly look like
../../../bankdirectory/bankkeys).

I don't have any experience in dealing with this - it looks like a real
pain. Basically I've been avoiding the problem by not supporting it (I only
support known safe subsets of ASCII), but of course one day I'll have to.
Any ideas on how to do it securely are very welcome.

Cheerio,
Link.


Re: troublesome inputs

From
"Mark Cowlishaw"
Date:
> At 02:49 PM 2/17/01 -0500, Michelle Murrain wrote:
> >in an unrecoverable error. What I need to do is two-fold - escape stuff
that
> >is easy (like single quotes), and do testing of types (make sure they are
> not
> >putting non-numerics in int fields, etc.)
> >

Input parameter testing in perl is quite straightforward using regex
matching.  I do this on my input parameters (assumes you are using CGI.pm):

my $bar = validate(trim($cgi->param('bar')), '^\d+$')
    or push(@error_list, "Missing or Invalid Bar");

where trim() removes leading/trailing whitespace: (s/^\s*/; s/\s*$/;)
and validate() performs a regex match and returns undef on no match.

Save all the errors in a list for processing later so you dont make the user
re-enter the form details 5 times before they get told about all their
errors.

For doing stuff like escaping strings for input into the database, I've
found that using the prepared statement mechanism very helpful since it does
this for you  -- there is probably some performance hit doing it this way
but I think using Perl to start with obviates most performance concerns ;-)

my $sth = $dbh->prepare("insert into foo (a,b,c) values (?,?,?)");
$sth->execute(1,2,"some random string with funny chars");




troublesome inputs

From
Dan Lyke
Date:
Michelle Murrain writes:
> I've identified quotes in particualr as characters I need to make sure
> get taken care of in some way or another before I pass them to a query. Are
> there others that folks know about, especially combinations of characters
> that lead to wierdness?

If you're using Perl, $dbh->quote() is your friend. Really.

Dan