2019
An effective real-time estimation of the travel time for vehicles, using AVL (Automatic Vehicle Locators), has added a new dimension to smart city planning. In this paper, we use data collected over several months from a transit agency and show how these data can be used to learn patterns of travel time during specially planned events such as NFL (National Football League) games and music award ceremonies. The impact of NFL games, along with other factors such as weather, traffic conditions, and distance, is discussed in terms of their relative importance to the prediction of travel time. Statistical learning models are used to predict travel time and subsequently assess the cascading effects of delay. Model performance is determined by predictive accuracy according to the out-of-sample error. In addition, the models help identify the most significant variables that influence delay in the transit system. To compare the actual and predicted travel time for days with special events, heat maps are generated showing the delay impacts in different time windows between two timepoint segments in comparison to a non-game day. This work focuses on the prediction and visualization of delay in the public transit system and the analysis of its cascading effects on the entire transportation network. According to the study results, we are able to explain more than 80% of the variance in bus travel time at each segment and can make future travel predictions during planned events with an out-of-sample error of 2.0 minutes using information on the bus schedule, traffic, weather, and scheduled events. According to the variable importance analysis, traffic information is the most significant in predicting delay in the transit system.
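As a rough illustration of this kind of segment-level prediction pipeline, the sketch below fits a regressor to travel-time observations, reports an out-of-sample error, and extracts relative variable importance. The random-forest choice and the feature names are assumptions for illustration, not the exact specification used in the paper.

```python
# Hypothetical sketch of a segment-level travel-time model; the feature names
# and the random-forest choice are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

def fit_travel_time_model(df: pd.DataFrame):
    features = ["scheduled_time", "traffic_speed", "precipitation",
                "segment_distance", "is_game_day"]          # assumed columns
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df["observed_travel_time"],
        test_size=0.25, random_state=0)
    model = RandomForestRegressor(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)
    # Out-of-sample error (in the units of the target, e.g. minutes)
    # and relative variable importance.
    mae = mean_absolute_error(y_test, model.predict(X_test))
    importance = dict(zip(features, model.feature_importances_))
    return model, mae, importance
```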
The rise of deep learning models in recent years has led to various innovative solutions for intelligent transportation technologies. While some prediction models focus on estimating the state of the network efficiently and accurately, such as traffic congestion and transit delay, other models use those predicted states to find the sequence of decisions that commuters need to make to travel from their origin to their destination. The performance of these models is often evaluated using prediction accuracy, but there is a growing need to understand their overall impact at the societal scale. In this paper, we leverage MATSim, an agent-based simulation framework, to incorporate various decision-making models and provide a standardized environment to evaluate their efficacy in terms of system impact. For example, we describe the integration of a model that captures the altruistic behavior of an agent in addition to the disutility a user incurs in proportion to travel time and cost. This model can then be used to evaluate the sensitivity of an agent to the system disutility and to the monetary incentives given by the transportation authority of the city. We show the effectiveness of the approach and provide an analysis using a case study from the Metropolitan Nashville area.
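A minimal sketch of the kind of scoring term described above is given below: a plan's utility combines the usual time and cost disutilities with an altruism term that internalizes part of the system-wide disutility and with any incentive offered by the authority. The linear form and the coefficient values are assumptions, not MATSim's default scoring function or the exact model from the paper.

```python
# Illustrative plan utility for an altruistic agent; coefficients are made up.
def agent_utility(travel_time_h: float, cost: float,
                  system_disutility: float, incentive: float,
                  beta_time: float = -6.0, beta_cost: float = -1.0,
                  altruism: float = 0.2) -> float:
    # The monetary incentive offsets out-of-pocket cost; the altruism
    # coefficient scales how much of the system-wide disutility the agent
    # internalizes when scoring a plan.
    return (beta_time * travel_time_h
            + beta_cost * (cost - incentive)
            - altruism * system_disutility)
```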
publication
Cyber-Physical Simulation Platform for Security Assessment of the Transactive Energy Systems
Transactive energy systems (TES) are emerging as a transformative solution for the problems that distribution system operators face due to the increasing use of distributed energy resources and the rapid growth in the scale of managing active distribution systems (ADS). On the one hand, these changes pose a decentralized power system control problem, requiring strategic control to maintain reliability and resiliency for the community and for the utility. On the other hand, they require robust financial markets while allowing participation from diverse prosumers. To support the computing requirements of TES with the required flexibility while preserving privacy and security, a distributed software platform is required. In this paper, we enable the study and analysis of security concerns by developing the Transactive Energy Security Simulation Testbed (TESST), a TES testbed for simulating various cyber attacks. In this work, the testbed is used for TES simulation with a centralized clearing market, highlighting weaknesses in a centralized system. Additionally, we present a blockchain-enabled decentralized market solution supported by distributed computing for TES, which on one hand can alleviate some of the problems we identify, but on the other hand may introduce new issues. Future study of these differing paradigms is necessary and will continue as we develop our security simulation testbed.
The Internet of Things (IoT) requires distributed, large-scale data collection via geographically distributed devices. While IoT devices typically send data to the cloud for processing, this is problematic for bandwidth-constrained applications. Fog and edge computing (processing data near where it is gathered and sending only results to the cloud) have become more popular, as they lower network overhead and latency. Edge computing often uses devices with low computational capacity, so service frameworks and middleware are needed to efficiently compose services. While many frameworks take a top-down perspective, quality of service is an emergent property of the entire system and often requires a bottom-up approach. We define services as multi-modal, allowing resource and performance tradeoffs. Different modes can be composed to meet an application's high-level goal, which is modeled as a function. We examine a case study of counting vehicle traffic through intersections in Nashville. We apply object detection and tracking to video of the intersection, which must be performed at the edge due to privacy and bandwidth constraints. We explore the hardware and software architectures and identify the various modes. This paper lays the foundation for formulating the online optimization problem presented by the system, which makes tradeoffs between the quantity of services and their quality, constrained by the available resources.
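To make the mode-composition tradeoff concrete, the sketch below selects one mode per service so that total quality is maximized within an edge device's resource budget. The brute-force search, the service names, and the (quality, cost) numbers are illustrative assumptions, not the optimization formulation from the paper.

```python
# Hypothetical mode-selection sketch: each service offers several modes with a
# (quality, resource-cost) pair; pick one mode per service within a budget.
from itertools import product

def select_modes(services, budget):
    """services: list of lists of (mode_name, quality, cost) tuples."""
    best, best_quality = None, float("-inf")
    for combo in product(*services):
        cost = sum(mode[2] for mode in combo)
        quality = sum(mode[1] for mode in combo)
        if cost <= budget and quality > best_quality:
            best, best_quality = combo, quality
    return best, best_quality

# Assumed example: an object detector and a tracker, each with two modes.
detector = [("full_frame", 0.9, 6.0), ("cropped", 0.7, 3.0)]
tracker = [("deep_sort", 0.8, 4.0), ("centroid", 0.5, 1.0)]
print(select_modes([detector, tracker], budget=7.0))
```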
Simulation-based analysis is essential in the model-based design process of Cyber-Physical Systems (CPS). Since heterogeneity is inherent to CPS, virtual prototyping of CPS designs and the simulation of their behavior in various environments typically involve a number of physical and computation/communication domains interacting with each other. Affordability of the model-based design process makes the use of existing domain-specific modeling and simulation tools all but mandatory. However, this pressure establishes the requirement for integrating the domain-specific models and simulators into a semantically consistent and efficient system-of-system simulation. The focus of the paper is the interoperability of popular integration platforms supporting heterogeneous multi-model simulations. We examine the relationship among three existing platforms: the High-Level Architecture (HLA)-based CPS Wind Tunnel (CPSWT), mosaik, and the Functional Mockup Unit (FMU). We discuss approaches to establish interoperability and present results of ongoing work in the context of an example.
Learning-enabled components (LECs) trained using data-driven algorithms are increasingly being used in autonomous robots commonly found in factories, hospitals, and educational laboratories. However, these LECs do not provide any safety guarantees, and testing them is challenging. In this paper, we introduce a framework that performs weighted simplex strategy based supervised safety control, resource management, and confidence estimation of autonomous robots. Specifically, we describe two weighted simplex strategies: (a) a simple weighted simplex strategy (SW-Simplex) that computes a weighted controller output by comparing the decisions of a safety supervisor and an LEC, and (b) a context-sensitive weighted simplex strategy (CSW-Simplex) that computes a context-aware weighted controller output. We use reinforcement learning to learn the contextual weights. We also introduce a system monitor that uses the current state information and a Bayesian network model learned from past data to estimate the probability of the robotic system staying in the safe working region. To aid resource-constrained robots in performing the complex computations of these weighted simplex strategies, we describe a resource manager that offloads tasks to available fog nodes. The paper also describes a hardware testbed called DeepNNCar, a low-cost, resource-constrained RC car built to perform autonomous driving. Using this hardware, we show that SW-Simplex and CSW-Simplex have 40% and 60% fewer safety violations, respectively, while maintaining a higher optimized speed during indoor driving (~0.40 m/s) than the original system (using only LECs).
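The weighted blending idea behind both strategies can be sketched as a convex combination of the LEC action and the safety controller action; what distinguishes SW-Simplex from CSW-Simplex is how the weight is chosen (a fixed comparison rule versus a context-dependent weight learned with reinforcement learning). The thresholds and weight values below are illustrative assumptions, not the parameters used in the paper.

```python
# Minimal sketch of weighted simplex blending; all numeric values are assumed.
def blended_steering(lec_action: float, safe_action: float, weight: float) -> float:
    """weight in [0, 1]: 1.0 trusts the LEC fully, 0.0 defers to the safety controller."""
    weight = max(0.0, min(1.0, weight))
    return weight * lec_action + (1.0 - weight) * safe_action

# SW-Simplex-style rule: shrink the LEC's weight when it disagrees strongly
# with the safety supervisor (the threshold is an assumed parameter).
def simple_weight(lec_action: float, safe_action: float,
                  disagreement_threshold: float = 0.3) -> float:
    return 0.3 if abs(lec_action - safe_action) > disagreement_threshold else 0.8
```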
Recent advances in machine learning have led to the appearance of Learning-Enabled Components (LECs) in Cyber-Physical Systems (CPS). LECs are being evaluated and used for various complex functions, including perception and control. However, very little tool support is available for design automation in such systems. This paper introduces an integrated toolchain that not only supports the architectural modeling of CPS with LECs, but also has extensive support for the engineering and integration of LECs, including support for training data collection, LEC training, LEC evaluation and verification, and system software deployment. Additionally, the toolsuite supports the modeling and analysis of safety cases, a critical part of the engineering process for mission- and safety-critical systems.
Symbolic execution is a well-known program analysis technique that explores multiple program paths simultaneously. Among other things, it is used to uncover subtle bugs and corner cases in programs, as well as to produce high-coverage test suites. Even though symbolic execution has seen successful use in practice, there remain challenges in applying it to programs like web servers that use features such as multithreading and callbacks. This paper describes our dynamic symbolic execution framework for Java that was designed with these types of features in mind. Our framework uses bytecode instrumentation combined with a run-time agent to perform the symbolic execution. We give a detailed description of the challenges we faced along with our design choices. We also present benchmark results on various examples including programs that use web server frameworks.
Conventional network access control approaches are static (e.g., user roles in Active Directory), coarse-grained (e.g., 802.1x), or both (e.g., VLANs). Such systems are unable to meaningfully stop or hinder motivated attackers seeking to spread throughout an enterprise network. To address this threat, we present Dynamic Flow Isolation (DFI), a novel architecture for supporting dynamic, fine-grained access control policies enforced in a Software-Defined Network (SDN). These policies can emit and revoke specific access control rules automatically in response to network events like users logging off, letting the network adaptively reduce unnecessary reachability that could be potentially leveraged by attackers. DFI is oblivious to the SDN controller implementation and processes new packets prior to the controller, making DFI's access control resilient to a malicious or faulty controller or its applications. We implemented DFI for OpenFlow networks and demonstrated it on an enterprise SDN testbed with around 100 end hosts and servers. Finally, we evaluated the performance of DFI and how it enables a novel policy, which is otherwise difficult to enforce, that protects against a surrogate of the recent NotPetya malware in an infection scenario. We found that the threat was most limited in its ability to spread using our policy, which automatically restricted network flows over the course of the attack, compared to no access control or a static role-based policy.
Attacks on real-time embedded systems can endanger lives and critical infrastructure. Despite this, techniques for securing embedded systems software have not been widely studied. Many existing security techniques for general-purpose computers rely on assumptions that do not hold in the embedded case. This paper focuses on one such technique, control-flow integrity (CFI), that has been vetted as an effective countermeasure against control-flow hijacking attacks on general-purpose computing systems. Without the process isolation and fine-grained memory protections provided by a general-purpose computer with a rich operating system, CFI cannot provide any security guarantees. This work proposes RECFISH, a system for providing CFI guarantees on ARM Cortex-R devices running minimal real-time operating systems. We provide techniques for protecting runtime structures, isolating processes, and instrumenting compiled ARM binaries with CFI protection. We empirically evaluate RECFISH and its performance implications for real-time systems. Our results suggest RECFISH can be directly applied to binaries without compromising real-time performance; in a test of over six million realistic task systems running FreeRTOS, 85% were still schedulable after adding RECFISH.
publication
The Leakage-Resilience Dilemma
Many control-flow-hijacking attacks rely on information leakage to disclose the location of gadgets. To address this, several leakage-resilient defenses have been proposed that fundamentally limit the power of information leakage. Examples of such defenses include address-space re-randomization, destructive code reads, and execute-only code memory. Underlying all of these defenses is some form of code randomization. In this paper, we illustrate that randomization at the granularity of a page or coarser is not secure, and can be exploited by generalizing the idea of partial pointer overwrites, which we call the Relative ROP (RelROP) attack. We then analyzed more than 1,300 common binaries and found that 94% of them contained sufficient gadgets for an attacker to spawn a shell. To demonstrate this concretely, we built a proof-of-concept exploit against PHP 7.0.0. Furthermore, randomization at a granularity finer than a memory page faces practicality challenges when applied to shared libraries. Our findings highlight the dilemma that faces randomization techniques: coarse-grained techniques are efficient but insecure, and fine-grained techniques are secure but impractical.
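The intuition behind partial pointer overwrites can be shown with simple address arithmetic: page-granular randomization never changes the low bits of a code pointer, so overwriting only the least-significant byte(s) retargets the pointer to nearby code without knowing the randomized base. The addresses below are made up for illustration and are not taken from the paper's exploit.

```python
# Back-of-the-envelope illustration (assumed addresses, 4 KiB pages).
PAGE_MASK = 0xFFF                       # low 12 bits = offset within a page

original_ptr   = 0x7F3A_1234_5678       # pointer before re-randomization
randomized_ptr = 0x7F55_9ABC_5678       # same page offset after page-level randomization

# Page-granular randomization preserves the low 12 bits of every pointer.
assert original_ptr & PAGE_MASK == randomized_ptr & PAGE_MASK

# Partial overwrite: replace only the low byte to reach nearby code,
# independent of the (unknown) randomized base.
gadget_low_byte = 0x10
hijacked_ptr = (randomized_ptr & ~0xFF) | gadget_low_byte
print(hex(hijacked_ptr))                # always ends in 0x10
```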
publication
BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services
Pre-trained deep learning models are increasingly being used to offer a variety of compute-intensive predictive analytics services such as fitness tracking, speech and image recognition. The stateless and highly parallelizable nature of deep learning models makes them well-suited for the serverless computing paradigm. However, making effective resource management decisions for these services is a hard problem due to the dynamic workloads and the diverse set of available resource configurations, each with its own deployment and management costs. To address these challenges, we present a distributed and scalable deep-learning prediction serving system called Barista and make the following contributions. First, we present a fast and effective methodology for forecasting workloads by identifying various trends. Second, we formulate an optimization problem to minimize the total cost incurred while ensuring bounded prediction latency with reasonable accuracy. Third, we propose an efficient heuristic to identify suitable compute resource configurations. Fourth, we propose an intelligent agent to allocate and manage the compute resources by horizontal and vertical scaling to maintain the required prediction latency. Finally, using representative real-world workloads for an urban transportation service, we demonstrate and validate the capabilities of Barista.
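A simplified version of the configuration-selection step is sketched below: given a forecasted request rate, pick the cheapest compute configuration whose estimated serving latency stays within the latency bound. The configuration catalog, latency model, and prices are illustrative assumptions, not Barista's actual heuristic.

```python
# Hypothetical configuration-selection sketch; all numbers are assumed.
from dataclasses import dataclass

@dataclass
class Config:
    name: str
    cost_per_hour: float
    max_requests_per_sec: float
    base_latency_ms: float

def cheapest_feasible(configs, forecast_rps, latency_slo_ms):
    # Keep only configurations that can absorb the forecasted load
    # within the latency bound, then pick the cheapest one.
    feasible = [c for c in configs
                if forecast_rps <= c.max_requests_per_sec
                and c.base_latency_ms <= latency_slo_ms]
    return min(feasible, key=lambda c: c.cost_per_hour, default=None)

catalog = [Config("small", 0.10, 50, 180),
           Config("medium", 0.25, 120, 120),
           Config("gpu", 0.90, 400, 40)]
print(cheapest_feasible(catalog, forecast_rps=100, latency_slo_ms=150))
```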
2018
The International Space Station (ISS) plans to launch 100+ small-sat missions for different science experiments in the coming years. At present these missions are limited to a couple of months, but in the future they will last longer, and it becomes crucial to monitor and predict the health of these systems as they age in order to prolong their usage time. This paper describes a hierarchical architecture which combines data-driven anomaly detection methods with a fine-grained, model-based diagnosis and prognostics architecture. At the core of the architecture is a distributed stack of deep neural networks that detects and classifies the data traces from nearby satellites based on prior observations. Any identified anomaly is transmitted to the ground, which then uses model-based diagnosis and prognosis methods. In parallel, the data traces from the satellites are periodically transported to the ground and analyzed using model-based techniques. These data are then used to train the neural networks, which are run from ground systems and periodically updated. This collaborative architecture enables quick data-driven inference on the satellite and more intensive analysis on the ground, where time and power consumption are often not key concerns. We demonstrate this architecture on an initial battery data set. In the future we propose to apply this framework to other electric and electronic components onboard small satellites.
publication
Universal CPS Environment for Federation (UCEF)
NIST, in collaboration with Vanderbilt University, has assembled an open-source tool set for designing and implementing federated, collaborative and interactive experiments with cyber-physical systems (CPS). These capabilities are used in our research on CPS at scale for Smart Grid, Smart Transportation, IoT and Smart Cities. This tool set, "Universal CPS Environment for Federation (UCEF)," includes a virtual machine (VM) to house the development environment, a graphical experiment designer, a model repository, and an initial set of integrated tools including the ability to compose Java, C++, MATLAB™, OMNeT++, GridLAB-D, and LabVIEW™ based federates into consolidated experiments. The experiments themselves are orchestrated using a ‘federation manager federate,’ and progressed using courses of action (COA) experiment descriptions. UCEF utilizes a method of uniformly wrapping federates into a federation. The UCEF VM is an integrated toolset for creating and running these experiments and uses High Level Architecture (HLA) Evolved to facilitate the underlying messaging and experiment orchestration. Our paper introduces the requirements and implementation of the UCEF technology and indicates how we intend to use it in CPS Measurement Science.
Systems-of-Systems (SoS) are composed of several interacting and interdependent systems that necessitate the integration of complex, heterogeneous models that represent the ensemble from different points of view, such as organizational workflows, cyber infrastructure, and various engineering or physical domains. These models are complex and require different dynamic simulators to compute their behavior over time. Thus, evaluation of SoS as-a-whole necessitates integration of these heterogeneous simulators. This is highly challenging because it requires integrating both the heterogeneous system models with different semantics and concepts from different system domains (physical, computational, or human), and the heterogeneous system simulators that use different time-stepping and event handling methods. Further, real-world SoS simulation and experimentation requires a comprehensive framework for integration modeling, efficient model and system composition, parametric experiments, run-time deployment, simulation control, scenario-based experimentation, and system analysis.
This dissertation presents a model-based integration approach for integrating large-scale heterogeneous simulations. The approach is illustrated by developing a generic simulation integration and experimentation framework called the Command and Control Wind Tunnel (C2WT). It allows modeling systems with their interdependencies as well as connecting and relating the corresponding heterogeneous simulators in a logically and temporally coherent manner. Its generalizable methods and tools enable rapid synthesis of industry standards based integrated simulations. For real-world integrated simulation experiments, several novel techniques are presented such as mapping methods for integrating legacy components that cannot directly interface with SoS-level data models, a generic cyber communication network simulation component that can be reused for different SoSs, a reusable cyber-attack library for evaluating SoS’ security and resilience against cyber threats, and modeling and orchestration of alternative what-if scenarios for SoS evaluations. Further, for efficient simulation of complex dynamical models that exhibit different rate dynamics in different parts, a partitioning method is developed to split them into different sampling rate groups. In addition, a novel approach is presented for ontology based model composition. In-depth case studies are also provided to demonstrate the effectiveness of the overall integration approach.
https://etd.library.vanderbilt.edu/available/etd-01172018-232437/unrestricted/Neema.pdf
Reliable operation of power systems is a primary challenge for system operators. With advances in technology and grid automation, power systems are becoming more vulnerable to cyber-attacks. The main goal of adversaries is to take advantage of these vulnerabilities and destabilize the system. This paper describes a game-theoretic approach to attacker/defender modeling in power systems. In our models, the attacker can strategically identify the subset of substations that maximizes damage when compromised, while the defender can identify the critical subset of substations to protect in order to minimize the damage when an attacker launches a cyber-attack. The algorithms for these models are applied to the standard IEEE 14-, 39-, and 57-bus examples to identify the critical set of substations given an attacker and a defender budget.
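The structure of this game can be sketched as a small minimax search: the defender commits to a protected set within its budget, and the attacker then compromises the unprotected substations that maximize damage within its own budget. The brute-force enumeration and the damage values below are illustrative stand-ins, not the paper's algorithm or a power-flow-based loss model.

```python
# Illustrative brute-force attacker/defender game over substations.
from itertools import combinations

def best_defense(damage, defender_budget, attacker_budget):
    """damage: dict mapping substation name -> assumed damage if compromised."""
    substations = list(damage)
    best_set, best_worst_case = None, float("inf")
    for protect in combinations(substations, defender_budget):
        exposed = [s for s in substations if s not in protect]
        # Attacker's best response: the most damaging attack on exposed substations.
        worst_case = max(
            (sum(damage[s] for s in attack)
             for attack in combinations(exposed, attacker_budget)),
            default=0.0)
        if worst_case < best_worst_case:
            best_set, best_worst_case = set(protect), worst_case
    return best_set, best_worst_case

damage = {"S1": 4.0, "S2": 9.0, "S3": 2.5, "S4": 7.0}   # assumed values
print(best_defense(damage, defender_budget=1, attacker_budget=1))
```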
publication
Hierarchical Reasoning about Faults in Cyber-Physical Energy Systems using Temporal Causal Diagrams
The resiliency and reliability of critical cyber-physical systems like electrical power grids are of paramount importance. These systems are often equipped with specialized protection devices to detect anomalies and isolate faults in order to arrest failure propagation and protect the healthy parts of the system. However, due to limited situational awareness and hidden failures, the protection devices themselves, through their operation (or mis-operation), may cause overloading and the disconnection of parts of an otherwise healthy system. This can result in cascading failures that lead to a blackout. Diagnosis of failures in such systems is extremely challenging because of the need to account for faults in both the physical systems and the protection devices, and for the propagation of failure effects across the system.
Our approach for diagnosing such cyber-physical systems is based on the concept of Temporal Causal Diagrams (TCDs), which capture the timed discrete models of protection devices and their interactions with a system failure propagation graph. In this paper we present a refinement of the TCD language with a layer of independent local observers that aid in diagnosis. We describe a hierarchical two-tier failure diagnosis approach and showcase results for four different scenarios involving both cyber and physical faults in a standard Western System Coordinating Council (WSCC) 9-bus system.
publication
Towards a Design Studio for Collaborative Modeling and Co-Simulations of Mixed Electrical Energy Systems
Despite the known benefits of simulations in the study of mixed energy systems in the context of the smart grid, the lack of collaboration facilities between multiple domain experts prevents a holistic analysis of smart grid operations. Current solutions do not provide a unified tool-chain that supports a secure and collaborative platform for not only the modeling and simulation of mixed electrical energy systems, but also the elastic execution of co-simulation experiments. To address the above limitations, this paper proposes a design studio that provides an online collaborative platform for modeling and simulation of smart grids with mixed energy resources.
Owing to the immense growth of internet-connected and learning-enabled cyber-physical systems (CPSs) [1], several new types of attack vectors have emerged. Analyzing the security and resilience of these complex CPSs is difficult because it requires evaluating many subsystems and factors in an integrated manner. Integrated simulation of physical systems and communication networks can provide an underlying framework for creating a reusable and configurable testbed for such analyses. Using a model-based integration approach and IEEE High-Level Architecture (HLA) [2] based distributed simulation software, we have created a testbed for the integrated evaluation of large-scale CPS. Our testbed supports web-based collaborative metamodeling and modeling of CPS systems and experiments, and a cloud computing environment for executing integrated networked co-simulations. A modular and extensible cyber-attack library enables validating the CPS under a variety of configurable cyber-attacks, such as DDoS and integrity attacks. Hardware-in-the-loop simulation is also supported, along with several hardware attacks. Further, a scenario modeling language allows modeling of alternative paths for what-if scenarios. These capabilities make our testbed well suited for analyzing the security and resilience of CPS. In addition, the web-based modeling and cloud-hosted execution infrastructure enables one to exercise the entire testbed using just a web browser, with an integrated live display of experimental results.