Let’s say you conduct a penetration test on a web application and find zero critical vulnerabilities. Is that a good sign or a bad one? How would you interpret the result? To make a meaningful judgment, there is no way around measuring code coverage. Of course, you could also take a leap of faith and conclude that the reason you didn’t find any bugs is that there are none. But do you really want to take that risk?
In this article, I want to clarify why code coverage is so important and how security professionals can measure and increase it effortlessly.
[Image: when your tests report 0 vulnerabilities but you have no code coverage data]
The Benefits of Measuring Code Coverage
Pentesting certainly plays an essential role in securing web applications. Nonetheless, most enterprise pentests leave out an important metric: code coverage. But how can you decide whether a test was useful if you can’t accurately tell which parts of your application it exercised? Managers can potentially cover their ass with such a pentest, but as a developer, you always need an accurate picture of how secure your application really is.
Use Case: How Code Coverage Helped Us Find 3 Critical Vulnerabilities
I would like to demonstrate why measuring code coverage is so important with a use case. For this example, I simply ran a vulnerability scanner on WebGoat, a deliberately insecure web application maintained by OWASP to demonstrate common vulnerabilities in Java-based web applications.
In our initial test run, we generated 9 bug findings. If we had left it at that, 9 findings would have already been quite a success. However, we measured code coverage and quickly saw that something was wrong: our tests had only traversed 16% of the project.
After looking at the log, it was pretty clear that our test inputs were getting stuck at the web application’s login. By simply logging in and rerunning the test, we were able to uncover 22 new bugs, three of them security-critical SQL injections.
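For illustration, here is a minimal sketch of how a test client can authenticate before sending further requests, using Java’s built-in HttpClient. The WebGoat URLs, form fields, and credentials below are assumptions for the example and depend on your local setup.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class AuthenticatedScanClient {

    public static void main(String[] args) throws Exception {
        // Keep the session cookie so all later test requests run as a logged-in user.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();

        // Assumed login endpoint, form fields, and credentials for a local WebGoat instance.
        String loginForm = "username=testuser&password=testpassword";
        HttpRequest login = HttpRequest.newBuilder(URI.create("http://localhost:8080/WebGoat/login"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(loginForm))
                .build();
        HttpResponse<String> loginResponse = client.send(login, HttpResponse.BodyHandlers.ofString());
        System.out.println("Login status: " + loginResponse.statusCode());

        // Any request sent with the same client now reuses the session cookie,
        // so generated test inputs reach the code behind the login instead of getting stuck.
        HttpRequest probe = HttpRequest.newBuilder(URI.create("http://localhost:8080/WebGoat/welcome.mvc"))
                .GET()
                .build();
        HttpResponse<String> probeResponse = client.send(probe, HttpResponse.BodyHandlers.ofString());
        System.out.println("Authenticated probe status: " + probeResponse.statusCode());
    }
}
```

The design choice is simple: one authenticated client shared by all generated requests, so the scan spends its time behind the login instead of bouncing off it.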
[Image: how to improve your code coverage using the CI Fuzz extension for VS Code]
Interpreting the Results: Why Is Feedback on Code Coverage So Important?
For web applications with a login, it is fairly obvious that you cannot achieve high coverage without logging in. Any experienced tester would recognize this immediately, and even for black-box scans, most developers would use a login to improve their code coverage.
However, as the testing process proceeds, it becomes harder to optimize your results without actually measuring code coverage. For example, low code coverage may also indicate missing permissions: perhaps there are user groups with different access levels, and without coverage measurement the untested areas simply go unnoticed.
Even experienced pentesters can miss such roadblocks, as they often work under immense time pressure and get no direct feedback on code coverage. Continuously measuring code coverage helps you detect these issues much faster. Minor modifications to your tests can then significantly improve the reliability and security of your code.
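To make the permissions scenario concrete, one quick sanity check is to replay the same endpoints under each user role and compare the responses: endpoints that only ever return 401 or 403 for your test account are code your scan never reached. The roles, credentials, and endpoints in this sketch are hypothetical.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.Map;

public class RoleCoverageProbe {

    // Hypothetical role-to-credentials mapping; replace with your real test accounts.
    private static final Map<String, String> ROLE_CREDENTIALS = Map.of(
            "viewer", "username=viewer&password=viewerpass",
            "editor", "username=editor&password=editorpass",
            "admin",  "username=admin&password=adminpass");

    // Endpoints the scan should be able to reach; also assumed for the example.
    private static final List<String> ENDPOINTS = List.of(
            "http://localhost:8080/app/reports",
            "http://localhost:8080/app/admin/users");

    public static void main(String[] args) throws Exception {
        for (Map.Entry<String, String> role : ROLE_CREDENTIALS.entrySet()) {
            // Fresh client per role so sessions do not leak between roles.
            HttpClient client = HttpClient.newBuilder()
                    .cookieHandler(new CookieManager())
                    .build();

            HttpRequest login = HttpRequest.newBuilder(URI.create("http://localhost:8080/app/login"))
                    .header("Content-Type", "application/x-www-form-urlencoded")
                    .POST(HttpRequest.BodyPublishers.ofString(role.getValue()))
                    .build();
            client.send(login, HttpResponse.BodyHandlers.ofString());

            for (String endpoint : ENDPOINTS) {
                HttpRequest request = HttpRequest.newBuilder(URI.create(endpoint)).GET().build();
                int status = client.send(request, HttpResponse.BodyHandlers.ofString()).statusCode();
                // A 401/403 here means this role never exercises the code behind the endpoint.
                System.out.printf("role=%s endpoint=%s status=%d%n", role.getKey(), endpoint, status);
            }
        }
    }
}
```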
Coverage-Guided vs Black-Box Testing
Implementing code coverage in a testing cycle requires specific tools. Automated web security testing tools such as Burp, OWASP ZAP, or RESTler are brilliant when used by experts who know how to fine-tune them and where to look for security vulnerabilities. For the average developer, however, they generally require too much manual adaptation.
The tools listed above treat the application as a black box, which means they have no access to the source code and thus cannot measure code coverage. Black-box approaches will get the job done, especially if you leave them running for a long time. Nonetheless, you will get more reliable results if you incorporate code coverage into your tests.
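As an illustration of what coverage-guided testing looks like in practice, here is a minimal JVM fuzz target written for Jazzer, Code Intelligence’s open-source coverage-guided fuzzer. The handleSearch method is a hypothetical stand-in for application code; the point is that the fuzzer instruments the code under test and uses coverage feedback to steer the inputs it generates.

```java
import com.code_intelligence.jazzer.api.FuzzedDataProvider;

public class SearchEndpointFuzzTest {

    // Stand-in for the code under test; in practice you would call into your
    // application's service layer or request handling here.
    static void handleSearch(String query) {
        if (query.contains("'")) {
            // Simulates an input-handling bug the fuzzer can reach via coverage feedback.
            throw new IllegalStateException("Unescaped quote reached the query builder: " + query);
        }
    }

    // Entry point recognized by Jazzer: it is invoked repeatedly, and the fuzzer mutates
    // inputs based on which code branches previous inputs have already covered.
    public static void fuzzerTestOneInput(FuzzedDataProvider data) {
        String query = data.consumeRemainingAsString();
        handleSearch(query);
    }
}
```

Because the fuzzer observes coverage, it learns which mutations reach new branches — exactly the feedback loop a black-box scanner lacks.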
How to Measure and Report Code Coverage
In the use case described above, we used CI Fuzz to automatically generate test inputs and measure code coverage. The testing platform uses modern fuzz testing approaches to automate security testing for web applications. It continuously measures code coverage and comes with detailed reporting that also shows how much progress you have made since the last test run.
Dashboards in CI Sense allow developers to monitor the performance of fuzz tests in real time, including KPIs such as the number of findings, code branches tested, auto-generated inputs, code coverage, and unit test equivalents.
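If you want to produce coverage numbers yourself for a Java application, one common approach is to attach the JaCoCo agent (via -javaagent) to the application while the tests run, dump the execution data, and analyze it against the compiled classes. Below is a minimal sketch using the org.jacoco.core API; the file paths and bundle name are assumptions for the example.

```java
import java.io.File;

import org.jacoco.core.analysis.Analyzer;
import org.jacoco.core.analysis.CoverageBuilder;
import org.jacoco.core.analysis.IBundleCoverage;
import org.jacoco.core.tools.ExecFileLoader;

public class CoverageReport {

    public static void main(String[] args) throws Exception {
        // Execution data dumped by the JaCoCo agent after the test run (path is an assumption).
        ExecFileLoader loader = new ExecFileLoader();
        loader.load(new File("target/jacoco.exec"));

        // Match the recorded probes against the application's compiled classes.
        CoverageBuilder coverageBuilder = new CoverageBuilder();
        Analyzer analyzer = new Analyzer(loader.getExecutionDataStore(), coverageBuilder);
        analyzer.analyzeAll(new File("target/classes"));

        // Aggregate line coverage across all analyzed classes.
        IBundleCoverage bundle = coverageBuilder.getBundle("webapp-under-test");
        int covered = bundle.getLineCounter().getCoveredCount();
        int total = bundle.getLineCounter().getTotalCount();
        System.out.printf("Line coverage: %d/%d (%.1f%%)%n",
                covered, total, 100.0 * covered / total);
    }
}
```

Running a report like this after each test session gives you the same kind of trend data a platform dashboard provides: if the percentage stalls while findings stay flat, your tests are stuck somewhere, not done.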