Adoption of Rule #0 from AppSec Manifesto
Introduction
This is the first article in a series about adopting the AppSec Manifesto.
«Every Input Is a Program»
«Information Is Instruction»
(c) Language-theoretic Security
Background
All the described rules and principles are language agnostic, but the examples provided are written in PHP.
AppSec Manifesto
There is an illusion that your program is manipulating its data. But it is an illusion: The data is controlling your program.
In fact, it is possible to write an absolutely secure application by adopting only Rule #0 of the AppSec Manifesto:
Rule #0 (Absolute Zero)
No code = no issues. No sinks = no vulnerabilities. No user-controlled input = no vector of attack.
It’s quite straightforward, but an application with zero lines of code is useless. Absolute Zero is therefore unattainable, yet we must strive for it.
A real application is always a trade-off between UX and security.
To follow Rule #0, we should:
- always delete obsolete, dead, unreachable, unreferenced code
- not ask the user to provide more input than needed
- use the least expressive language to talk to the application
Delete obsolete, dead, unreachable, unreferenced code
Obsolete code: code that may have been useful in the past but is no longer used, e.g., code that supports a deprecated protocol. It may be called dead code as well.
Dead code: code that is executed but is redundant: either its results are never used or it adds nothing to the rest of the program. It wastes CPU time.
Unreferenced code: a variable (method, function, etc.) that is defined but never used.
Unreachable code: code that will never be reached regardless of logic flow.
PHP
function foobar(): string
{
    return 'foobar';
    // The following line is unreachable because it comes after an unconditional return.
    $a = $b + 1;
}
Unreachable, unreferenced, and dead code can be found with static analysis (PHPStan, Phan, Psalm).
Obsolete (dead) code can be found with dynamic analysis and the tombstone concept.
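A tombstone is a marker placed in code that is suspected to be obsolete; it records when that code is actually executed. The following is a minimal sketch, not a library API: the tombstone() helper, the legacyExport() function, the planting date, and the log file path are all hypothetical names chosen for illustration.

```php
<?php
// Hypothetical helper: records that supposedly dead code was executed.
function tombstone(string $plantedOn, string $label, string $logFile): void
{
    // Frame 0 describes the tombstone() call itself, so its file and line
    // point at the place where the tombstone was planted.
    $frame = debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS, 1)[0];
    $entry = sprintf(
        "%s tombstone '%s' (planted %s) hit at %s:%d\n",
        date('c'),
        $label,
        $plantedOn,
        $frame['file'],
        $frame['line']
    );
    file_put_contents($logFile, $entry, FILE_APPEND | LOCK_EX);
}

// Suspected obsolete code (illustrative). If the log stays empty for a few
// release cycles, the function is a candidate for deletion.
function legacyExport(string $logFile): string
{
    tombstone('2024-01-15', 'legacyExport', $logFile);
    return 'legacy';
}
```

If a tombstone is ever hit, the log shows which supposedly obsolete code path is still alive and where it lives.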
WHY?
- Unused code adds complexity
- Unused code is misleading
- Dead code can come alive
Trade-off
There is always a temptation to leave the code in place just in case. Everything works, so if it ain’t broke, don’t fix it!
When dealing with dead code, you have to choose an approach: a Hospice or a Field Hospital. Most companies choose the Hospice approach because it requires no effort: you simply keep all the dead code (disk storage is cheap). This usually makes sense until the dead code rises:
Rise of dead code: During the summer of 2012, Knight Capital Group caused a major stock market disruption and suffered a loss of over $400 million when a botched software deployment caused dead order handling code to be executed. The code had not been tested in many years and resulted in a deluge of orders hitting the market that could not be cancelled.
If you choose the Field Hospital approach, you remove as much as you can. In that case, you may remove more code than needed, especially when dealing with obsolete code, so you need good test coverage and regression testing to catch such situations. Don’t combine a new feature or a bug fix with dead code removal in a single commit (assuming you are using a version control system, e.g., git).
Do not ask the user to provide more input than needed.
Assume you have a form for file uploads:
HTML
<form name="upload" action="upload.php" method="POST" enctype="multipart/form-data">
Select image to upload: <input type="file" name="image">
<input type="submit" name="upload" value="upload">
</form>
and a corresponding PHP script that handles uploads:
PHP
<?php
$uploaddir = 'uploads/';
$uploadfile = $uploaddir . basename($_FILES['image']['name']);
if (move_uploaded_file($_FILES['image']['tmp_name'], $uploadfile)) {
    echo "Image successfully uploaded.";
} else {
    echo "Image uploading failed.";
}
This is a common way to handle file uploads, and it is vulnerable to a number of attacks. For example, an attacker can upload a file named shell.php and, if the uploads directory is served by a web server that executes PHP, run it. As usual, the source of the problem is the conflict between UX and security: we want to keep the filename provided by the user, assuming it is meaningful. But let’s try to apply Rule #0 here. Do we really need a user-provided filename? Will the application still work if we generate the filename on the server side? Of course it will, and doing so significantly reduces the attack surface.
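One way to generate the filename on the server side is sketched below. It is an illustration under assumptions, not a complete upload handler: the function names generateStoredName() and storeUpload() and the allow-list of image types are hypothetical, and the content-type detection relies on PHP's fileinfo extension being available.

```php
<?php
// Builds a storage name that contains no user-controlled characters.
// Returns null when the detected type is not on the (illustrative) allow-list.
function generateStoredName(string $detectedMimeType): ?string
{
    $allowed = [
        'image/jpeg' => 'jpg',
        'image/png'  => 'png',
        'image/gif'  => 'gif',
    ];
    if (!isset($allowed[$detectedMimeType])) {
        return null;
    }
    // 16 random bytes become 32 hexadecimal characters.
    return bin2hex(random_bytes(16)) . '.' . $allowed[$detectedMimeType];
}

// Stores one upload; $file is a single entry of $_FILES.
// Returns the generated name, or null if the upload was rejected.
function storeUpload(array $file, string $uploadDir): ?string
{
    if (($file['error'] ?? UPLOAD_ERR_NO_FILE) !== UPLOAD_ERR_OK) {
        return null;
    }
    // Detect the type from the file contents, not from the client-supplied name.
    $mime = (new finfo(FILEINFO_MIME_TYPE))->file($file['tmp_name']);
    $name = is_string($mime) ? generateStoredName($mime) : null;
    if ($name === null) {
        return null;
    }
    $target = rtrim($uploadDir, '/') . '/' . $name;
    return move_uploaded_file($file['tmp_name'], $target) ? $name : null;
}
```

In upload.php this could be called as storeUpload($_FILES['image'], 'uploads'). The user-supplied name is never read, so it cannot influence the path or the extension of the stored file.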
Use the least expressive language to talk to the application:
All data is a stream of tokens of almost arbitrary complexity, and this stream of tokens is a sequence of instructions to the parser of its language
The Chomsky hierarchy ranks formal languages by expressive power: regular, context-free, context-sensitive, and recursively enumerable, each more powerful than the one before.
Based on that, we should design our applications to accept regular or, at most, context-free input.
Let’s look at a SQL injection attack to clarify this:
Assume we have a SQL query somewhere in our codebase:
SQL
SELECT * FROM users WHERE username='$username'
-- or
SELECT * FROM users WHERE id='$id'
and $username ($id) is controlled by the user's input, e.g.:
PHP
$username = $_GET['username'];
$id = $_GET['id'];
If $username is ' OR 1=1 -- then we have an SQL injection and, as a result, a list of all users. What happened? The developer expected the input to be regular, e.g., $username to be Bob or Alice and $id to be 1, 3, or 42.
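To see why this input changes the meaning of the query, it helps to print the string that actually reaches the database. The sketch below only builds the string; buildUserQuery() is a hypothetical name, and it is vulnerable on purpose because it mirrors the query above.

```php
<?php
// Deliberately vulnerable: interpolates user input straight into the SQL text.
function buildUserQuery(string $username): string
{
    return "SELECT * FROM users WHERE username='$username'";
}

echo buildUserQuery('Bob'), PHP_EOL;
// SELECT * FROM users WHERE username='Bob'

echo buildUserQuery("' OR 1=1 -- "), PHP_EOL;
// SELECT * FROM users WHERE username='' OR 1=1 -- '
```

In the second query the quote from the input closes the string literal, OR 1=1 makes the condition true for every row, and -- comments out the trailing quote.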
But an attacker understands that the input is in fact not restricted in any way and supplies a more powerful (more expressive) input. Taking this into account, we can protect our $id variable by forcing it to be an integer:
PHP
$id = intval($_GET['id']);
The case of $username is more complex. Can we use a similar approach? It depends. On the one hand, we can restrict the input with the regexp [a-zA-Z]+, but this will prevent users with names like O'Reilly from registering. Again, this is a trade-off between UX and security.
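Both restrictions can be sketched as small parsing functions that reject anything outside the expected language instead of trying to repair it. The names parseId() and parseUsername() are hypothetical. Note one difference from intval(): intval('1 OR 1=1') silently returns 1, while filter_var() with FILTER_VALIDATE_INT rejects the whole string.

```php
<?php
// Returns the id as an integer, or null when the input is not a positive integer.
function parseId(string $raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

// Accepts only names from the regular language [a-zA-Z]+.
// \A and \z anchor the match to the whole string, including any trailing newline.
function parseUsername(string $raw): ?string
{
    return preg_match('/\A[a-zA-Z]+\z/', $raw) === 1 ? $raw : null;
}
```

With these functions, parseUsername("O'Reilly") returns null, which is exactly the UX cost described above.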
Note: There is another aspect that forces the database to treat $username as a SQL expression instead of the value of the username field. We’ll go deep into it in the next article of this series.
Another complex case is XSS. Let’s look at two scenarios:
- We need to show a username in the UI, e.g., a user profile
- We need to allow HTML markup in user comments on the forum
In the case of a profile, we may follow the logic from the SQL injection example and strip all characters except [a-zA-Z] (or, more likely, just <, >, " and ') before the output. (Note: This is only an example of how Rule #0 may be applied. The appropriate way to mitigate XSS is context-specific escaping, which will be explained in the next article.)
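Both variants of stripping can be sketched in a few lines. The function names are hypothetical, and, as the note above says, this only illustrates Rule #0; it is not a replacement for context-specific escaping.

```php
<?php
// Strictest option: keep only characters from [a-zA-Z].
function keepLettersOnly(string $name): string
{
    return preg_replace('/[^a-zA-Z]/', '', $name) ?? '';
}

// Looser option: drop only the characters that can open a tag
// or terminate a quoted attribute value.
function dropMarkupCharacters(string $name): string
{
    return str_replace(['<', '>', '"', "'"], '', $name);
}

echo keepLettersOnly('<script>alert(1)</script>'), PHP_EOL; // scriptalertscript
echo dropMarkupCharacters("O'Reilly <b>"), PHP_EOL;         // OReilly b
```

The second call shows the UX cost again: the apostrophe in O'Reilly is lost.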
In the case of the forum, business requirements mean we can’t just strip HTML tags, so we can’t restrict the user input. Fortunately, what matters is not how complex the input is, but how much of that complexity the corresponding language parser can understand. So we can simply force the browser not to understand JS in the user’s input:
HTML
<iframe sandbox srcdoc="<script>alert('look ma - no xss');</script>"></iframe>
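When such a frame is generated on the server, the comment has to be escaped for the attribute context, otherwise a double quote in the comment would end the srcdoc value and escape the sandbox. A minimal sketch, assuming a hypothetical renderSandboxedComment() helper:

```php
<?php
// Wraps user-supplied HTML in a sandboxed iframe.
// An empty sandbox attribute applies all restrictions, including "no scripts".
function renderSandboxedComment(string $commentHtml): string
{
    // Escape for the attribute context; the browser decodes the entities
    // and parses the result as the document inside the frame.
    $attr = htmlspecialchars($commentHtml, ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML5, 'UTF-8');
    return '<iframe sandbox srcdoc="' . $attr . '"></iframe>';
}
```

The markup in the comment is still rendered inside the frame, but scripts in it do not run because the sandbox does not include allow-scripts.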
Using a strict Content-Security-Policy is another example of following Rule #0, because it significantly reduces the attack surface:
Content-Security-Policy:
object-src 'none';
script-src 'nonce-{random}' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' https: http:;
base-uri 'none';
report-uri https://your-report-collector.example.com/
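The {random} placeholder in the policy stands for a nonce that must be freshly generated for every response. A sketch of how the header could be assembled in PHP follows; newCspNonce() and buildStrictCsp() are hypothetical helper names, and the report URL is the placeholder from the policy above.

```php
<?php
// Generates a fresh nonce; it must be different for every response.
function newCspNonce(): string
{
    return base64_encode(random_bytes(16));
}

// Assembles the policy shown above as a single header value.
function buildStrictCsp(string $nonce, string $reportUri): string
{
    return implode('; ', [
        "object-src 'none'",
        "script-src 'nonce-$nonce' 'unsafe-inline' 'unsafe-eval' 'strict-dynamic' https: http:",
        "base-uri 'none'",
        "report-uri $reportUri",
    ]);
}

// Usage inside a request handler, before any output is sent:
//   $nonce = newCspNonce();
//   header('Content-Security-Policy: '
//       . buildStrictCsp($nonce, 'https://your-report-collector.example.com/'));
//   ...and later in the template: <script nonce="<?= $nonce ?>">
```

Only script tags that carry the current nonce are executed by browsers that support nonces, so markup injected by an attacker, who cannot predict the nonce, is ignored.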
Another example is PHAR deserialization. This type of attack abuses the fact that using the phar:// stream wrapper to perform read/write operations on PHAR files causes their metadata to be deserialized automatically. (PHP 8.0 and later no longer deserialize this metadata automatically.)
You can see the details at this link. In general, to exploit this vulnerability, it’s enough to upload a specially crafted .phar archive and point any filesystem-related function to it, e.g., if in your code you have something like:
PHP
<?php
if (file_exists($filename)) {
    echo file_get_contents($filename);
}
and an attacker can control $filename, then to perform an attack, it's enough to set $filename to:
phar://./path/to/uploaded.phar
To be safe, we have to prevent .phar uploads and ensure that values of the variables that represent filenames do not start with phar://. But in most cases, it’s enough to apply Rule #0 here:
PHAR archives are rarely used in web applications; in most cases, they are used in CLI apps. So for a web app, we can simply reduce the number of stream wrappers our backend understands:
PHP
stream_wrapper_unregister('phar');
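In a bootstrap file, the call can be guarded so that it also works on installations where the phar extension is not loaded. This is a sketch with a hypothetical disableStreamWrapper() helper; stream_get_wrappers() and stream_wrapper_unregister() are the standard PHP functions.

```php
<?php
// Unregisters a stream wrapper if it is currently registered.
function disableStreamWrapper(string $name): void
{
    if (in_array($name, stream_get_wrappers(), true)) {
        stream_wrapper_unregister($name);
    }
}

disableStreamWrapper('phar');

// After this point phar:// is no longer a scheme the backend understands,
// while ordinary file access keeps working.
var_dump(in_array('phar', stream_get_wrappers(), true)); // bool(false)
```

The same helper can be applied to any other wrapper the application never uses.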
Conclusion: Functionality that is not developed cannot be abused.