CAREER: At-scale Analysis of Issues in Cyber-security and Software Engineering
One of the most significant challenges in cybersecurity is that humans are involved in software engineering and inevitably make security mistakes when implementing specifications, leading to software vulnerabilities. A challenge to eliminating these mistakes is the relative lack of empirical evidence regarding which secure coding practices (e.g., secure defaults and validating client data), threat modeling techniques, and educational solutions are effective in reducing the number of application-level vulnerabilities that software engineers produce. This research aims to perform experiments that analyze programming assignment submissions to Massive Open Online Courses (MOOCs) before and after secure coding and threat modeling techniques are taught, in order to empirically measure the impact of these techniques on the rate of security vulnerabilities in assignment implementations. A key component of this research will be the use of MOOC assignment specifications and variations that are susceptible to common cybersecurity vulnerabilities, such as problems with input validation in web applications or privilege escalation on mobile platforms. Because these critical security implementation issues will be known ahead of time, the MOOC assignments will allow automated assessment of how successfully each assignment implementation manages these security issues.
Key questions investigated by this research include the impact of varying secure coding and threat modeling techniques on vulnerability production in software, the level of abstraction at which these techniques need to be taught to be effective, the relative return on investment of threat modeling versus automated vulnerability assessment effort, and the comparative effectiveness of making developers aware of security issues versus requiring active application of secure coding and threat modeling techniques. The broader impact of this research is substantial. Very little empirical data is available for organizations to use in properly valuing the secure coding and threat modeling techniques that have been developed. By creating a large body of rigorous evidence showing how effective (or ineffective) different techniques are, the research will allow organizations to evaluate their return on investment and improve the use of these techniques in the software engineering process.