The Lord of the Sinks

One of the most popular mantras in information security is: “Do not trust user input”. This is very misleading. Many people interpret it as: “If a user of your application sends you some data, validate or escape it before use because it may contain malicious characters”. That is only partially correct, and it causes an enormous number of issues.
Let's analyze what is wrong with this point of view in the case of web applications:
1. Do not trust user input. WTF is USER INPUT? If you ask any web developer what it is, I believe more than half of them will answer something like: “Any request from the frontend that reaches your backend”. Very few of them realize that any data the application reads from a database or a file should also be treated as “user input”. Misunderstanding this causes issues such as second-order SQL injection and escaping data against XSS before putting it into the database. But what if we have a hard-coded string in our application, or the value of a variable is randomly generated, e.g.
<?php
function generateRandomString($length = 10) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }
    return $randomString;
}

$foo = 'bar';
$baz = generateRandomString();
Do $foo and $baz have to be interpreted as user input? I believe your answer is “NO”.
2. Why should we validate or escape data in the first place? Let's look at a very common attack vector, SQL injection (assume MySQL as the database):
$user = $_GET['user'];
$password = sha1($_GET['password']);

$sql = "SELECT * FROM users WHERE username = '{$user}' and password='{$password}'";
If a malicious user sends this as the username (the trailing dash keeps the whitespace that MySQL requires after --)
' or 1=1 -- -
we have a classic injection. To prevent it we have to escape the single quote (of course, the correct answer is to use prepared statements). Let's escape:
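// mysql_real_escape_string() was removed in PHP 7.0; $mysqli is assumed to be an open mysqli connection.
$user = $mysqli->real_escape_string($_GET['user']);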
...
Now we are safe. But what if the user was not trying to hack our database? What if they simply want to use O’Reilly as a username? We have to escape it anyway!
So let me repeat my question in a different way: “Why should I treat user input differently from any other input? If I hard-code O’Reilly as the value of a variable in my script, can I omit escaping?”.
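The prepared-statement approach mentioned above can be sketched roughly as follows. This is only an illustration under some assumptions: an in-memory SQLite database accessed through PDO stands in for MySQL so the snippet is self-contained, and the findUser() helper, the table layout, and the sample account are all hypothetical. The point it tries to show is that a bound parameter is handled the same way whether it comes from a hard-coded string or from a request.

```php
<?php
// Assumption: in-memory SQLite via PDO stands in for MySQL to keep the sketch self-contained.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (username TEXT, password TEXT)');

// Hypothetical helper: values are bound as parameters, so they never
// become part of the SQL text, whatever their source is.
function findUser(PDO $pdo, string $user, string $password): ?array
{
    $stmt = $pdo->prepare(
        'SELECT username FROM users WHERE username = :user AND password = :password'
    );
    // sha1() only mirrors the query in the article; it is not a recommendation for password storage.
    $stmt->execute([':user' => $user, ':password' => sha1($password)]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Seed one account whose name is a hard-coded string containing a single quote.
$insert = $pdo->prepare('INSERT INTO users (username, password) VALUES (?, ?)');
$insert->execute(["O'Reilly", sha1('secret')]);

$hardcoded = findUser($pdo, "O'Reilly", 'secret');       // legitimate value with a quote in it
$attack    = findUser($pdo, "' or 1=1 -- ", 'anything'); // classic injection attempt
```

In this sketch $hardcoded holds the O'Reilly row and $attack is null: the quote is just data in both cases, and the code never had to ask where the value came from.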
In fact, you should not care about the source of the input (data); you have to do context-specific escaping on architectural boundaries. To prevent ANY type of INJECTION attack (SQL injection, command injection, XSS, path traversal …) all you need is to properly escape the data as close to the sink as possible.
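As a minimal sketch of what "escape as close to the sink as possible" might look like, the snippet below keeps one raw value and escapes it separately for each sink right where it is used. The helper names (toHtml, toUrlComponent, toShellArg) and the sample value are hypothetical; the PHP functions they wrap are standard.

```php
<?php
// Hypothetical sink helpers: each one escapes for exactly one output context.

// HTML sink: text placed into an HTML body or a quoted attribute value.
function toHtml(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// URL sink: a single query-string component.
function toUrlComponent(string $value): string
{
    return rawurlencode($value);
}

// Shell sink: a single command-line argument (the quoting style depends on the platform).
function toShellArg(string $value): string
{
    return escapeshellarg($value);
}

// The raw value stays untouched; its source (request, database, constant) is irrelevant here.
$name = "O'Reilly <admin>";

$html = '<p>Hello, ' . toHtml($name) . '</p>';   // <p>Hello, O&#039;Reilly &lt;admin&gt;</p>
$url  = '/profile?user=' . toUrlComponent($name); // /profile?user=O%27Reilly%20%3Cadmin%3E
$cmd  = 'echo ' . toShellArg($name);              // only built here, never executed
```

The same string ends up with three different encodings, which is why escaping it once "on input" cannot be right for every sink.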
Taint Analysis attempts to identify variables that have been ‘tainted’ with user-controllable input and traces them to possibly vulnerable functions, also known as ‘sinks’.
STOP INJECTIONS — BECOME THE LORD OF THE SINKS.