Publications
publication
Biases and Differences in Code Review Using Medical Imaging and Eye-Tracking: Genders, Humans, and Machines
Code review is a critical step in modern software quality assurance, yet it is vulnerable to human biases. Previous studies have clarified the extent of the problem, particularly regarding biases against the authors of code, but no consensus understanding has emerged. Advances in medical imaging are increasingly applied to software engineering, supporting grounded neurobiological explorations of computing activities, including the review, reading, and writing of source code. In this paper, we present the results of a controlled experiment using both medical imaging and eye tracking to investigate the neurological correlates of biases and differences between genders of humans and machines (e.g., automated program repair tools) in code review. We find that men and women conduct code reviews differently, in ways that are measurable and supported by behavioral, eye-tracking, and medical imaging data. We also find biases in how humans review code as a function of its apparent author, when controlling for code quality. In addition to advancing our fundamental understanding of how cognitive biases relate to the code review process, the results may inform subsequent training and tool design to reduce bias.
publication
KShot: Live Kernel Patching with SMM and SGX
Live kernel patching is an increasingly common trend in operating system distributions, enabling dynamic updates to include new features or to fix vulnerabilities without having to reboot the system. Patching the kernel at runtime lowers downtime and reduces the loss of useful state from running applications. However, existing kernel live patching techniques (1) rely on specific support from the target operating system, and (2) admit patch failures resulting from kernel faults. We present KSHOT, a kernel live patching mechanism based on x86 SMM and Intel SGX that focuses on patching Linux kernel security vulnerabilities. Our patching processes are protected by hardware-assisted Trusted Execution Environments. We demonstrate that our technique can successfully patch vulnerable kernel functions at the binary level without support from the underlying OS and regardless of whether the kernel patching mechanism is compromised. We demonstrate the applicability of KSHOT by successfully patching 30 indicative critical kernel vulnerabilities.
publication
MARTINI: Memory Access Traces to Detect Attacks
Hardware architectural vulnerabilities, such as Spectre and Meltdown, are difficult or inefficient to mitigate in software. Although revised hardware designs may address some architectural vulnerabilities going forward, most current remedies increase execution time significantly. Techniques are needed to rapidly and efficiently detect these and other emerging threats. We present an anomaly detector, MARTINI, that analyzes traces of memory accesses in real time to detect attacks. Our experimental evaluation shows that anomalies in these traces are strongly correlated with unauthorized program execution, including architectural side-channel attacks of multiple types. MARTINI consists of a finite automaton that models normal program behavior in terms of memory addresses that are read from, and written to, at runtime. The model uses a compact representation of n-grams, i.e., short sequences of memory accesses, which can be stored and processed efficiently. Once the system is trained on authorized behavior, it rapidly detects a variety of low-level anomalous behaviors and attacks not otherwise easily discernible at the software level. MARTINI's implementation leverages recent advances in in-cache and in-memory automata for computation, and we present a hardware unit that repurposes a small portion of a last-level cache slice to monitor memory addresses. Our detector directly inspects the addresses of memory accesses, using the pre-constructed automaton to identify anomalies with high accuracy, negligible runtime overhead, and trivial increase in CPU chip area. We present analyses of expected hardware properties based on indicative cache and memory hierarchy simulations and empirical evaluations.
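To make the n-gram idea concrete, here is a minimal software sketch of the detection principle the abstract describes: model normal behavior as the set of short memory-access sequences seen during training, then flag traces containing unseen sequences. The class and trace format are illustrative assumptions; the paper's detector is a hardware automaton in a last-level cache slice, not Python.

```python
from collections import deque

class NGramDetector:
    """Sketch of n-gram anomaly detection over a memory-access trace.

    Trains on traces assumed to be benign, then reports the fraction of
    n-grams in a new trace that were never seen during training.
    """

    def __init__(self, n=4):
        self.n = n
        self.known = set()

    def train(self, trace):
        # trace: iterable of (op, address) pairs, e.g. ('R', 0x7f3a2000)
        for gram in self._ngrams(trace):
            self.known.add(gram)

    def anomaly_rate(self, trace):
        grams = list(self._ngrams(trace))
        unseen = sum(1 for g in grams if g not in self.known)
        return unseen / len(grams) if grams else 0.0

    def _ngrams(self, trace):
        window = deque(maxlen=self.n)
        for access in trace:
            window.append(access)
            if len(window) == self.n:
                yield tuple(window)
```

In use, a monitor would raise an alarm when `anomaly_rate` exceeds some tuned threshold, mirroring the trained-automaton matching described above.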
publication
Graph-Theoretic Approach for Increasing Participation in Networks With Assorted Resources
In many cooperative networks, individuals participate actively as long as they recognize a sufficient value in participation, which depends not only on the number, but also on the attributes of other participating members. In this paper, we present a generalized model of individuals' participation in such networks, and a strategy to maximize the number of participating individuals. Unlike most of the existing literature, our model incorporates both the network structure and the heterogeneity of individuals in terms of their attributes and resources. We consider that each individual possesses a subset of the available resources (attributes), which it shares with neighbors as long as the neighbors reciprocate and provide the missing resources to the individual. However, an individual leaves the network if it cannot find all the resources in its neighborhood. To model this phenomenon, we introduce a graph-theoretic notion of the (r,s)-core, which is the sub-network consisting of only those individuals who can access all the resources by collaborating with their neighbors. Since the disengagement of an individual could initiate a cascading withdrawal of more individuals from the network, one of our main goals is to prevent this unraveling and maximize the number of participating individuals. For this purpose, we utilize the notion of anchors: individuals that continue to participate (due to incentives) even if they cannot find all of the resources in their neighborhood. By introducing only a few anchors, we can significantly increase the number of participating individuals, which in our model corresponds to increasing the size of the (r,s)-core. We formulate and thoroughly analyze the anchor selection problem by classifying the cases in which the problem is polynomial-time solvable, NP-complete, and inapproximable. Further, we provide greedy and metaheuristic search algorithms to compute a set of anchors and evaluate our results on various networks. Our results are applicable to a large number of cooperative networking applications, including participatory sensing, in which users develop an elaborate knowledge of their environment through sharing measurements.
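As an illustration of the fixpoint computation behind this core notion, the sketch below iteratively removes non-anchor individuals whose neighborhoods no longer supply every resource, modeling the cascading withdrawal the abstract describes. This simplified version ignores the paper's exact (r,s) parameterization; the function and argument names are hypothetical.

```python
def participating_core(graph, resources, all_resources, anchors=frozenset()):
    """Fixpoint sketch of cascading withdrawal with anchors.

    graph: dict node -> set of neighbor nodes
    resources: dict node -> set of resources the node owns
    anchors: nodes that never withdraw (incentivized to stay)
    """
    active = set(graph)
    changed = True
    while changed:
        changed = False
        for v in list(active):
            if v in anchors:
                continue  # anchors stay regardless of coverage
            reachable = set(resources[v])
            for u in graph[v]:
                if u in active:
                    reachable |= resources[u]
            if not set(all_resources) <= reachable:
                active.remove(v)   # v withdraws...
                changed = True     # ...possibly triggering a cascade
    return active
```

Adding a well-chosen anchor keeps its neighbors' resource coverage intact, which is exactly how a few anchors can grow the surviving core disproportionately.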
Power grids are undergoing major changes due to the rapid adoption of intermittent renewable energy resources and the increased availability of energy storage devices. These trends drive smart-grid operators to envision a future where peer-to-peer energy trading occurs within microgrids, leading to the development of Transactive Energy Systems (TES). Blockchains have garnered significant interest from both academia and industry for their potential application in decentralized TES, in large part due to their high level of resilience. In this paper, we introduce a novel class of attacks against blockchain-based TES, which target the gateways that connect market participants to the system. We introduce a general model of blockchain-based TES and study multiple threat models and attack strategies. We also demonstrate the impact of these attacks using a testbed based on GridLAB-D and a private Ethereum network. Finally, we study how to mitigate these attacks.
We study the strong structural controllability (SSC) of diffusively coupled networks, where external control inputs are injected at only some nodes, namely the leaders. For such systems, one measure of controllability is the dimension of the strong structurally controllable subspace, which is equal to the smallest possible rank of the controllability matrix under admissible (positive) coupling weights. In this paper, we compare two tight lower bounds on the dimension of the strong structurally controllable subspace: one based on the distances of followers to leaders, and the other based on the graph coloring process known as zero forcing. We show that the distance-based lower bound is usually better than the zero-forcing-based bound when the leaders do not constitute a zero-forcing set. On the other hand, we also show that any set of leaders that can be shown to achieve complete SSC via the distance-based bound is necessarily a zero-forcing set. Furthermore, we present a novel bound based on the combination of these two approaches, which is always at least as good as, and in some cases strictly greater than, the maximum of the two bounds. Finally, we present some numerical results to compare the bounds on various graphs.
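The zero-forcing process referenced above is a standard graph coloring rule: a black (initially, leader) node with exactly one white neighbor forces that neighbor black, repeated to a fixpoint. A small sketch, assuming an adjacency-dict graph representation:

```python
def zero_forcing_closure(graph, leaders):
    """Zero-forcing color-change process.

    Rule: a black node with exactly one white neighbor forces that
    neighbor to become black. Returns the final black set; the leaders
    form a zero-forcing set iff the closure covers every node.
    """
    black = set(leaders)
    changed = True
    while changed:
        changed = False
        for v in list(black):
            white = [u for u in graph[v] if u not in black]
            if len(white) == 1:
                black.add(white[0])
                changed = True
    return black
```

The size of the closure obtained this way is what drives the zero-forcing-based lower bound compared in the paper.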
The rowhammer bug is a software-induced hardware fault that has been exploited to form a wide range of powerful rowhammer attacks. Yet, how to effectively detect such attacks remains a challenging problem. In this paper, we propose a novel approach named RADAR (Rowhammer Attack Detection via A Radio) that leverages certain electromagnetic (EM) signals to detect rowhammer attacks. In particular, we have found that there are recognizable hammering-correlated sideband patterns in the spectrum of the DRAM clock signal. As such patterns are inevitable physical side effects of hammering the DRAM, they can "expose" any potential rowhammer attacks, including the extremely elusive ones hidden inside encrypted and isolated environments like Intel SGX enclaves. However, the patterns of interest may be obscured by the common use of spread-spectrum clocking (SSC) in computer systems. We propose a de-spreading method that can reassemble the hammering-correlated sideband patterns scattered by SSC. Using a common classification technique, we can achieve both effective and robust detection-based defense against rowhammer attacks, as evaluated on a RADAR prototype under various scenarios. In addition, RADAR does not impose any performance overhead on the protected system. There has been little prior work that uses physical side-channel information to perform rowhammer defenses, and to the best of our knowledge, this is the first investigation of leveraging EM side-channel information for this purpose.
An air-gapped computer is physically isolated from unsecured networks to guarantee effective protection against data exfiltration. Due to air gaps, unauthorized data transfer seems impossible over legitimate communication channels, but in reality many so-called physical covert channels can be constructed to allow data exfiltration across the air gaps. Most such covert channels are very slow and often require certain strict conditions to work (e.g., no physical obstacles between the sender and the receiver). In this paper, we introduce a new physical covert channel named BitJabber that is extremely fast and strong enough to even penetrate concrete walls. We show that this covert channel can be easily created by an unprivileged sender running on a victim's computer. Specifically, the sender constructs the channel by using only memory accesses to modulate the electromagnetic (EM) signals generated by the DRAM clock. While possessing a very high bandwidth (up to 300,000 bps), this new covert channel is also very reliable (less than 1% error rate). More importantly, this covert channel can enable data exfiltration from an air-gapped computer enclosed in a room with concrete walls up to 15 cm thick.
With the proliferation of location-based services enabled by a large number of mobile devices and applications, the quantity of location data, such as trajectories collected by service providers, is gigantic. If these datasets could be published, they would be valuable assets to various service providers for exploring business opportunities and for studying commuter behavior to improve transport management, which in turn benefits the general public in day-to-day commuting. However, there are two major concerns that considerably limit the availability and the usage of these trajectory datasets. The first is the threat to individual privacy, as users' trajectories may be misused to discover sensitive information, such as home locations, their children's school locations, or social information like habits or relationships. The other concern is the ability to analyze the exabytes of location data in a timely manner. Although trajectory anonymization approaches have been proposed in the past to mitigate privacy concerns, none of these prior works addresses the scalability issue, a newly arising problem brought by the significantly increasing adoption of location-based services. In this article, we tackle these two challenges by designing a novel parallel trajectory anonymization algorithm that achieves scalability, strong privacy protection, and a high utility rate for the anonymized trajectory datasets. We have conducted extensive experiments using MapReduce and Spark on real maps with different topologies, and our results demonstrate both effectiveness and efficiency when compared with centralized approaches.
People constantly share their photographs with others through various social media sites. With the aid of the privacy settings provided by social media sites, image owners can designate the scope of sharing, e.g., close friends and acquaintances. However, even if the owner of a photograph carefully sets the privacy setting to exclude a given individual who is not supposed to see the photograph, the photograph may still eventually reach a wider audience, including clearly undesired viewers, through unanticipated channels of disclosure, causing a privacy breach. Moreover, it is often the case that a given image involves multiple stakeholders who are also depicted in the photograph. Because these stakeholders may have different privacy preferences, it is even more challenging to reach agreement on the privacy settings for these multi-owner photographs. In this paper, we propose a privacy risk reminder system, called REMIND, which estimates the probability that a shared photograph may be seen, through the social graph, by unwanted people who are not included in the original sharing list. We tackle this problem from a novel angle by digging into the big data regarding image sharing history. Specifically, social media providers possess a huge amount of image sharing information (e.g., what photographs are shared with whom) about their users. By analyzing and modeling such rich information, we build a sophisticated probability model that efficiently aggregates the image disclosure probabilities along different possible image propagation chains and loops. If the computed disclosure probability indicates a high risk of privacy breach, a reminder is issued to the image owner to help revise the privacy settings (or, at least, inform the user about this accidental disclosure risk). The proposed REMIND system also offers a policy harmonization feature that helps resolve privacy differences in multi-owner photographs. We have carried out a user study to validate the rationale of our proposed solutions and also conducted experimental studies to evaluate the efficiency of the proposed REMIND system.
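A drastically simplified sketch of the aggregation idea: treat each edge of the social graph as carrying a re-share probability and iterate to a fixpoint, so that probability mass accumulates along chains and loops. The edge probabilities and the fixpoint iteration here are assumptions for illustration, not REMIND's actual model fitted from sharing history.

```python
def disclosure_probability(graph, share_prob, seeds, target, iters=50):
    """Fixpoint sketch of aggregating image-disclosure probability.

    graph: dict node -> iterable of friends from whom the node may
           receive the photo
    share_prob: dict (u, v) -> assumed chance u re-shares with v
    seeds: users on the original sharing list (probability 1)
    """
    p = {v: (1.0 if v in seeds else 0.0) for v in graph}
    for _ in range(iters):
        for v in graph:
            if v in seeds:
                continue
            keep = 1.0  # probability the photo reaches v via no friend
            for u in graph[v]:
                keep *= 1.0 - p[u] * share_prob.get((u, v), 0.0)
            p[v] = 1.0 - keep
    return p[target]
```

A reminder would then fire when the returned probability for an excluded user crosses a risk threshold.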
Using the Smart Home as a use case, we examine the vulnerabilities in the system across the technologies used in its implementation. A typical smart home will contain a variety of sensors, actuators (e.g., for opening doors), communication links, storage devices, video cameras, network interfaces, and control units. Each of these physical components and subsystems must be secure in order for the overall system to be secure. Typical security analysis focuses on the defined interfaces of the system: network security via firewalls, communications encryption, and authentication at terminals. Unfortunately, many of these devices in the Internet of Things (IoT) space are susceptible to physical attacks via electromagnetic energy or other sound/heat energy. Properly designed electromagnetic (EM) waveforms can access a range of vulnerabilities, providing unanticipated entry points into the system. In this chapter, we discuss a multi-modeling methodology for analyzing cyber-physical vulnerabilities, assessing the system across the geometry, electronic, and behavioral domains. A home automation system is used as an example, showing a methodology for assessing vulnerabilities in hardware. The example exploits the use of EM energy injection. A multi-model of the system captures the geometric structure of the hardware with links to behavioral models. Low-energy EM pathways are discovered that may impact system behavior. Computation is minimized by applying analysis of EM effects only at behavior-critical inputs and outputs. The chapter also discusses a methodology for system-level impact analysis. The final conclusion is that susceptibility at the physical layer presents many attack surfaces, due to the large number of heterogeneous IoT devices, mandating consideration of the physical dimension in vulnerability analysis and risk mitigation.
With the prevalence of smartphones, mobile websites have become more and more popular. However, many mobile websites collect location information, which greatly increases users' risks of being tracked unexpectedly. The current location access control setting is not sufficient, since it cannot prevent service providers that have been granted location-access permissions from tracking users. In this paper, we propose a novel location privacy preservation mobile app, called MoveWithMe, which automatically generates decoy queries to hide real users' locations and intentions when they are using location-based mobile services. Unlike existing work on dummy trajectories, which may be easily discovered by attackers through data analysis, the uniqueness of the MoveWithMe app is that our generated decoys closely behave like real humans. Each decoy in our system has its own moving patterns, daily schedules, and social behaviors, which ensures that its movements are semantically different from the real user's trace and satisfy geographic constraints. Thus, our decoys can hardly be distinguished even by advanced data mining techniques. Another advantage of the MoveWithMe app is that it guarantees the same level of user experience without affecting the response time or introducing extra control burdens. Decoys move independently in the back end and automatically submit queries to the same service provider whenever the user does so. Our proposed MoveWithMe app has both iOS and Android versions and has been tested on different brands of smartphones against various location-based services, such as Yelp and TripAdvisor. Experimental results demonstrate its practicality, effectiveness, and efficiency.
Moving-target defenses (MTDs) have been widely studied for common general-purpose and enterprise-computing applications. Indeed, such work has produced highly effective, low-overhead defenses that are now commonly deployed in many systems today. One application space that has seen comparatively little focus is that of safety- and mission-critical systems, which are often real-time systems (RTS) with temporal requirements. Furthermore, such systems are increasingly being targeted by attackers, as seen in industrial control systems (ICS), including power grids. The strict timing requirements of these systems present a different design objective than is common in general-purpose applications: systems should be designed around the worst-case performance, rather than the average case. Perhaps in part due to these alternative design considerations, many real-time systems have not benefited from much of the work on software security that general-purpose and enterprise applications have, despite the ubiquity of real-time systems that actively control so many applications society has come to rely on, from power generation and distribution to automotive and avionic applications, and many others. This paper explores the application of moving-target defenses in the context of real-time systems. In particular, the worst-case performance of several address-space randomization defenses is evaluated to study the implications of such designs in real-time applications. These results suggest that current moving-target defenses, while performant in the average case, can exhibit significant tail latencies, which can be problematic in real-time applications, especially if such overheads are not considered in the design and analysis of the system. These results inform future research directions for moving-target defenses in real-time applications.
publication
Enabling Strong Isolation for Distributed Real-time Applications in Edge Computing Scenarios
Distributed, co-existing applications in the military and space domains, which operate over managed but shared computing resources at the edge, require strong isolation from each other. The state of the art for computation sharing at the edge is traditionally based on Docker and similar pseudo-virtualization features. Our team has been working on an end-to-end architecture that provides strong spatial and temporal isolation similar to what has become standard in avionics communities. In this paper, we describe an open-source extension to Linux that we have designed and implemented for our distributed real-time embedded managed systems (DREMS) architecture. The key concepts are the partitioning scheduler, a strong security design, and a health management interface.
publication
BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services
Pre-trained deep learning models are increasingly being used to offer a variety of compute-intensive predictive analytics services such as fitness tracking, speech and image recognition. The stateless and highly parallelizable nature of deep learning models makes them well-suited to the serverless computing paradigm. However, making effective resource management decisions for these services is a hard problem due to dynamic workloads and a diverse set of available resource configurations, each with its own deployment and management costs. To address these challenges, we present a distributed and scalable deep-learning prediction serving system called Barista and make the following contributions. First, we present a fast and effective methodology for forecasting workloads by identifying various trends. Second, we formulate an optimization problem to minimize the total cost incurred while ensuring bounded prediction latency with reasonable accuracy. Third, we propose an efficient heuristic to identify suitable compute resource configurations. Fourth, we propose an intelligent agent to allocate and manage the compute resources through horizontal and vertical scaling to maintain the required prediction latency. Finally, using representative real-world workloads for an urban transportation service, we demonstrate and validate the capabilities of Barista.
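As a rough illustration of the configuration-selection step, the sketch below picks the cheapest configuration whose measured latency profile meets the latency bound at the forecasted load. The field names and the idea of a static per-configuration latency profile are assumptions for illustration; Barista's actual heuristic and cost model are defined in the paper.

```python
def pick_configuration(configs, forecast_rps, slo_ms):
    """Choose the cheapest configuration that satisfies the latency SLO
    at the forecasted request rate.

    configs: list of dicts like
      {"name": "c1", "cost": 0.8, "capacity_rps": 120, "p99_ms": 95}
    forecast_rps: forecasted requests per second
    slo_ms: latency bound to maintain
    """
    feasible = [
        c for c in configs
        if c["capacity_rps"] >= forecast_rps and c["p99_ms"] <= slo_ms
    ]
    return min(feasible, key=lambda c: c["cost"]) if feasible else None
```

A scaling agent would re-run this selection as the workload forecast changes, scaling horizontally or vertically toward the chosen configuration.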
publication
The Leakage-Resilience Dilemma
Many control-flow-hijacking attacks rely on information leakage to disclose the location of gadgets. To address this, several leakage-resilient defenses have been proposed that fundamentally limit the power of information leakage. Examples of such defenses include address-space re-randomization, destructive code reads, and execute-only code memory. Underlying all of these defenses is some form of code randomization. In this paper, we illustrate that randomization at the granularity of a page or coarser is not secure, and can be exploited by generalizing the idea of partial pointer overwrites, which we call the Relative ROP (RelROP) attack. We then analyzed more than 1,300 common binaries and found that 94% of them contained sufficient gadgets for an attacker to spawn a shell. To demonstrate this concretely, we built a proof-of-concept exploit against PHP 7.0.0. Furthermore, randomization at a granularity finer than a memory page faces practicality challenges when applied to shared libraries. Our findings highlight the dilemma that faces randomization techniques: coarse-grained techniques are efficient but insecure, and fine-grained techniques are secure but impractical.
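The arithmetic behind the page-granularity observation is easy to see: randomizing at 4 KiB page granularity leaves the low 12 bits of every code address fixed, so overwriting only the low bytes of a pointer can retarget it within its page without any leak. A small illustration with made-up addresses:

```python
# 4 KiB pages: page-granularity randomization leaves the low 12 bits
# of every address unchanged across randomizations.
PAGE_MASK = 0xFFF

def partial_overwrite(ptr, gadget_offset):
    """Illustrative arithmetic behind a RelROP-style partial pointer
    overwrite: replace only the low bits of a code pointer so it lands
    on a gadget sharing the victim pointer's (randomized) page base.
    Addresses here are invented for the example."""
    return (ptr & ~PAGE_MASK) | (gadget_offset & PAGE_MASK)

func_ptr = 0x7f12_3456_7890          # randomized base: unknown to attacker
gadget = partial_overwrite(func_ptr, 0xA10)
assert (gadget & ~PAGE_MASK) == (func_ptr & ~PAGE_MASK)  # same page, no leak needed
```

This is why the dilemma above bites: only randomizing below page granularity breaks this relative addressing, and that is exactly what is impractical for shared libraries.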
Attacks on real-time embedded systems can endanger lives and critical infrastructure. Despite this, techniques for securing embedded systems software have not been widely studied. Many existing security techniques for general-purpose computers rely on assumptions that do not hold in the embedded case. This paper focuses on one such technique, control-flow integrity (CFI), that has been vetted as an effective countermeasure against control-flow hijacking attacks on general-purpose computing systems. Without the process isolation and fine-grained memory protections provided by a general-purpose computer with a rich operating system, CFI cannot provide any security guarantees. This work proposes RECFISH, a system for providing CFI guarantees on ARM Cortex-R devices running minimal real-time operating systems. We provide techniques for protecting runtime structures, isolating processes, and instrumenting compiled ARM binaries with CFI protection. We empirically evaluate RECFISH and its performance implications for real-time systems. Our results suggest RECFISH can be directly applied to binaries without compromising real-time performance; in a test of over six million realistic task systems running FreeRTOS, 85% were still schedulable after adding RECFISH.
Learning-enabled components (LECs) trained using data-driven algorithms are increasingly being used in autonomous robots commonly found in factories, hospitals, and educational laboratories. However, these LECs do not provide any safety guarantees, and testing them is challenging. In this paper, we introduce a framework that performs weighted-simplex-strategy-based supervised safety control, resource management, and confidence estimation for autonomous robots. Specifically, we describe two weighted simplex strategies: (a) a simple weighted simplex strategy (SW-Simplex) that computes a weighted controller output by combining the decisions of a safety supervisor and an LEC, and (b) a context-sensitive weighted simplex strategy (CSW-Simplex) that computes a context-aware weighted controller output. We use reinforcement learning to learn the contextual weights. We also introduce a system monitor that uses the current state information and a Bayesian network model learned from past data to estimate the probability of the robotic system staying in the safe working region. To aid resource-constrained robots in performing the complex computations of these weighted simplex strategies, we describe a resource manager that offloads tasks to available fog nodes. The paper also describes a hardware testbed called DeepNNCar, a low-cost, resource-constrained RC car built to perform autonomous driving. Using this hardware, we show that SW-Simplex and CSW-Simplex have 40% and 60% fewer safety violations, respectively, while maintaining higher speeds during indoor driving (~0.40 m/s) than the original system (using only LECs).
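The weighted output can be pictured as a convex blend of the two controllers' commands. A one-line sketch, where the weight value and the example numbers are illustrative, and CSW-Simplex would instead choose the weight per context (e.g., via a learned policy):

```python
def weighted_simplex(u_safe, u_lec, w):
    """SW-Simplex-style blend (sketch): combine the safety supervisor's
    output and the LEC's output with weight w in [0, 1]. Larger w
    favors the conservative safety controller."""
    return w * u_safe + (1.0 - w) * u_lec

# e.g. a steering command that leans on the safety supervisor near obstacles
steer = weighted_simplex(u_safe=0.2, u_lec=-0.1, w=0.8)
```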
Recent advances in machine learning have led to the appearance of Learning-Enabled Components (LECs) in Cyber-Physical Systems. LECs are being evaluated and used for various complex functions, including perception and control. However, very little tool support is available for design automation in such systems. This paper introduces an integrated toolchain that supports not only the architectural modeling of CPS with LECs but also the engineering and integration of LECs, including support for training data collection, LEC training, LEC evaluation and verification, and system software deployment. Additionally, the toolsuite supports the modeling and analysis of safety cases, a critical part of the engineering process for mission- and safety-critical systems.
Symbolic execution is a well-known program analysis technique that explores multiple program paths simultaneously. Among other things, it is used to uncover subtle bugs and corner cases in programs, as well as to produce high-coverage test suites. Even though symbolic execution has seen successful use in practice, there remain challenges in applying it to programs like web servers that use features such as multithreading and callbacks. This paper describes our dynamic symbolic execution framework for Java that was designed with these types of features in mind. Our framework uses bytecode instrumentation combined with a run-time agent to perform the symbolic execution. We give a detailed description of the challenges we faced along with our design choices. We also present benchmark results on various examples including programs that use web server frameworks.
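The core step of dynamic symbolic execution is to negate one branch of the current path condition and ask a constraint solver for inputs that reach the unexplored side. A minimal sketch using the z3 solver's Python bindings (an assumed dependency here; the paper's framework instruments Java bytecode rather than Python):

```python
from z3 import Int, Solver, Not, sat

def explore_branch(path_condition, branch):
    """Negate the last branch of the current path condition and ask
    the solver for concrete inputs driving execution down the
    unexplored side."""
    s = Solver()
    s.add(path_condition)   # constraints already on this path
    s.add(Not(branch))      # flip the branch we want to explore
    if s.check() == sat:
        return s.model()    # concrete inputs for the new path
    return None             # the flipped branch is infeasible

x = Int('x')
# Suppose the concrete run with x = 3 took the branch (x <= 10):
model = explore_branch(x > 0, x <= 10)  # solve x > 0 and not(x <= 10)
print(model)  # e.g. [x = 11]
```

Repeating this flip-and-solve loop over a growing tree of path conditions is what lets the technique systematically cover multiple program paths, including the multithreaded and callback-heavy paths the paper targets.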
Conventional network access control approaches are static (e.g., user roles in Active Directory), coarse-grained (e.g., 802.1x), or both (e.g., VLANs). Such systems are unable to meaningfully stop or hinder motivated attackers seeking to spread throughout an enterprise network. To address this threat, we present Dynamic Flow Isolation (DFI), a novel architecture for supporting dynamic, fine-grained access control policies enforced in a Software-Defined Network (SDN). These policies can emit and revoke specific access control rules automatically in response to network events like users logging off, letting the network adaptively reduce unnecessary reachability that attackers could potentially leverage. DFI is oblivious to the SDN controller implementation and processes new packets prior to the controller, making DFI's access control resilient to a malicious or faulty controller or its applications. We implemented DFI for OpenFlow networks and demonstrated it on an enterprise SDN testbed with around 100 end hosts and servers. Finally, we evaluated the performance of DFI and how it enables a novel policy, otherwise difficult to enforce, that protects against a surrogate of the recent NotPetya malware in an infection scenario. We found that the threat was most limited in its ability to spread under our policy, which automatically restricted network flows over the course of the attack, compared to no access control or a static role-based policy.
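A minimal sketch of the event-driven policy style this enables, in which network events emit and revoke fine-grained flow rules; the event names, rule tuples, and class shape are hypothetical, not DFI's actual API:

```python
class FlowPolicy:
    """Sketch of event-driven, fine-grained network access control:
    events grant least-privilege flow rules and revoke them as soon as
    they are no longer needed."""

    def __init__(self):
        self.rules = set()  # (src_host, dst_host, dst_port)

    def on_user_login(self, user, host, services):
        # grant only the flows this user's services require
        for dst, port in services[user]:
            self.rules.add((host, dst, port))

    def on_user_logoff(self, user, host):
        # revoke everything granted to the session: reachability
        # shrinks automatically, limiting lateral movement
        self.rules = {r for r in self.rules if r[0] != host}

    def allows(self, src, dst, port):
        return (src, dst, port) in self.rules
```

The key property mirrored here is that unused reachability disappears on its own, which is what constrained the NotPetya surrogate in the evaluation above.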
Program analysis is a popular method to determine properties about program behavior, such as execution times and potential security vulnerabilities. One of the biggest challenges faced by almost every form of program analysis is scalability. One way to address scalability issues is to distribute the analysis across multiple machines. However, this is not an easy task; designing a distribution framework that is capable of supporting multiple types of program analysis requires careful thought and consideration. This paper presents the cloud-based execution framework that we built for performing distributed analysis of Java bytecode programs. We describe the design decisions that allow this framework to be generic enough to support multiple types of analysis but remain efficient at the same time. We also present a simple, static work partitioning algorithm that we have found to work well in practice and provide benchmarks to show its efficiency.
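For flavor, here is a generic static work-partitioning heuristic (longest processing time first) of the kind such a framework might use to split analysis units across machines; the paper's own algorithm may differ, and the per-unit cost estimates are assumed inputs:

```python
def partition_work(units, costs, n_workers):
    """Greedy static partitioning sketch: assign each unit (e.g. a
    method or entry point to analyze) to the currently least-loaded
    worker, processing the most expensive units first."""
    loads = [0] * n_workers
    assignment = [[] for _ in range(n_workers)]
    for u in sorted(units, key=lambda u: -costs[u]):
        i = loads.index(min(loads))  # least-loaded worker so far
        loads[i] += costs[u]
        assignment[i].append(u)
    return assignment
```

Because the split is computed once up front, workers need no coordination at runtime, which is one reason static partitioning can work well in practice.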
Fault Protection Assemblies are used in cyber-physical systems for automated fault isolation. These devices alter the mode of the system using locally available information in order to stop fault propagation. For example, in electrical networks, relays and breakers isolate faults in order to arrest failure propagation and protect the healthy parts of the system. However, these assemblies themselves can have faults, which may inadvertently induce secondary failures. Often these secondary failures lead to cascade effects, which then lead to total system collapse. This behavior is often seen in electrical transmission systems, where failures of relays and breakers may cause overloading and the disconnection of parts of an otherwise healthy system. In the past, we developed a consistency-based diagnosis approach for physical systems based on the temporal failure propagation graph. We now describe an extension that uses the concept of timed discrete-event observers in combination with timed failure propagation graphs to extend the hypotheses to include the possibility of failures in the fault protection units. Using a simulated power system case study, we show that the combined approach is able to diagnose faults in both the plant and the protection devices.
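One way to picture the consistency check over a timed failure propagation graph: each edge carries a propagation-time interval, and a hypothesis survives only if all observed alarm timestamps respect those intervals. A minimal sketch with assumed data structures:

```python
def consistent(tfpg, observed):
    """Check whether observed alarm times are consistent with a timed
    failure propagation graph hypothesis.

    tfpg: dict (cause, effect) -> (tmin, tmax) propagation interval
    observed: dict alarm -> timestamp at which it was observed
    """
    for (cause, effect), (tmin, tmax) in tfpg.items():
        if cause in observed and effect in observed:
            dt = observed[effect] - observed[cause]
            if not (tmin <= dt <= tmax):
                return False  # timing violates this hypothesis
    return True
```

Diagnosing a faulty protection device then amounts to keeping only the hypotheses (plant fault, protection fault, or both) whose predicted alarm timing remains consistent with what was observed.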
Reliable operation of power systems is a primary challenge for system operators. With the advancement in technology and grid automation, power systems are becoming more vulnerable to cyber-attacks. The main goal of adversaries is to take advantage of these vulnerabilities and destabilize the system. This paper describes a game-theoretic approach to attacker/defender modeling in power systems. In our models, the attacker can strategically identify the subset of substations that maximizes damage when compromised, while the defender can identify the critical subset of substations to protect in order to minimize the damage when an attacker launches a cyber-attack. The algorithms for these models are applied to the standard IEEE 14-, 39-, and 57-bus examples to identify the critical set of substations given an attacker and a defender budget.
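The attacker/defender interaction can be framed as a minimax over substation subsets; the brute-force sketch below conveys that structure under a given damage model, though its exponential enumeration is exactly why realistic bus systems need the paper's algorithms. The damage function is an assumed stand-in for a power-flow-based impact model.

```python
from itertools import combinations

def best_defense(substations, damage, atk_budget, def_budget):
    """Minimax sketch: the defender picks the protected set minimizing
    the worst-case damage an attacker with atk_budget compromised
    substations can cause among the unprotected ones.

    damage: callable taking a set of attacked substations and
            returning a damage score (assumed model)."""
    best = None
    for protect in combinations(substations, def_budget):
        exposed = [s for s in substations if s not in protect]
        worst = max(
            (damage(set(a)) for a in combinations(exposed, atk_budget)),
            default=0,
        )
        if best is None or worst < best[0]:
            best = (worst, set(protect))
    return best  # (worst-case damage, defended substations)
```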
The rowhammer bug is a software-induced hardware fault that has posed great security challenges to numerous systems. On x86, many approaches to triggering the rowhammer bug have been found; yet, for several reasons, few such approaches have been discovered on ARM. In this paper, we revisit the problem of how to trigger the rowhammer bug on ARM-based devices by carefully investigating whether it is possible to translate the original x86-oriented rowhammer approaches to ARM. We provide a thorough study of the unprivileged ARMv8-A cache maintenance instructions and give two previously overlooked reasons to support their use in rowhammer attacks. Moreover, we present a previously undiscovered instruction that can be exploited to trigger the rowhammer bug on many ARM-based devices. A potential approach to quickly evicting ARM CPU caches is also discussed, and experimental evaluations are carried out to show the effectiveness of our findings.