2023
Compiled binary executables are often the only available artifact in reverse engineering, malware analysis, or maintenance of software systems. Unfortunately, the lack of semantic information like variable names makes comprehending binaries difficult. In efforts to improve the comprehensibility of binaries, researchers have recently used machine learning techniques to predict semantic information contained in the original source code. Chen et al. implemented DIRTY, a Transformer-based encoder-decoder architecture capable of augmenting decompiled code with variable names and types by leveraging decompiler output tokens and variable size information. Chen et al. demonstrated a substantial increase in name and type extraction accuracy on Hex-Rays decompiler outputs compared to existing static analysis and AI-based techniques. We extend the original DIRTY results by re-training the DIRTY model on a dataset produced by the open-source Ghidra decompiler. Although Chen et al. concluded that Ghidra was not a suitable decompiler candidate due to its difficulty in parsing DWARF, we demonstrate that straightforward parsing of variable data generated by Ghidra results in similar retyping performance. We hope this work inspires further interest in, and adoption of, the Ghidra decompiler for research projects.
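For readers who want to experiment with a similar pipeline, the following is a minimal sketch of a Ghidra script that dumps variable names, types, and sizes from the decompiler; it assumes it runs inside Ghidra's Script Manager (where `currentProgram` and `monitor` are provided) and is an illustration, not the exact extraction code used in this work.

```python
# Ghidra script (Jython): dump decompiler variable info per function.
# Run from Ghidra's Script Manager; `currentProgram` and `monitor` are
# globals that Ghidra injects into every script.
from ghidra.app.decompiler import DecompInterface

decomp = DecompInterface()
decomp.openProgram(currentProgram)

for func in currentProgram.getFunctionManager().getFunctions(True):
    results = decomp.decompileFunction(func, 60, monitor)
    high = results.getHighFunction()
    if high is None:
        continue  # decompilation failed or timed out
    for sym in high.getLocalSymbolMap().getSymbols():
        # Name, declared type, and size are the fields DIRTY-style
        # models train on.
        print("%s,%s,%s,%d" % (func.getName(), sym.getName(),
                               sym.getDataType(), sym.getSize()))
```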
A binary’s behavior is greatly influenced by how the compiler builds its source code. Although most compiler configuration details are abstracted away during compilation, recovering them is useful for reverse engineering and program comprehension tasks on unknown binaries, such as code similarity detection. We observe that previous work has thoroughly explored this on x86-64 binaries; however, there has been limited investigation of ARM binaries, which are increasingly prevalent. In this paper, we extend previous work with a shallow-learning model that efficiently and accurately recovers compiler configuration properties for ARM binaries. We apply opcode- and register-derived features that have previously been effective on x86-64 binaries to ARM binaries. Furthermore, we compare this work with Pizzolotto et al., a recent architecture-agnostic model that uses deep learning, whose dataset and code are available. We observe that the lightweight features are reproducible on ARM binaries. We achieve over 99% accuracy, on par with state-of-the-art deep learning approaches, while achieving a 583-times speedup during training and a 3,826-times speedup during inference. Finally, we also discuss findings of overfitting that went undetected in prior work.
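As a hedged illustration of the kind of lightweight pipeline described above, the sketch below trains a shallow classifier on opcode n-gram counts; the placeholder disassembly strings, labels, and model choice are assumptions for the example, not the authors' exact setup.

```python
# Sketch: compiler-provenance classification from opcode sequences.
# Assumes each binary has already been disassembled into a string of
# opcodes with a known compiler/optimization label.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

opcode_seqs = [
    "push ldr ldr add str bx",   # placeholder sequences; real features
    "stp ldr ldr add ldp ret",   # come from disassembling each binary
]
labels = ["gcc-O2", "clang-O2"]  # hypothetical provenance labels

model = make_pipeline(
    CountVectorizer(ngram_range=(1, 2)),  # opcode unigrams + bigrams
    LogisticRegression(max_iter=1000),
)
model.fit(opcode_seqs, labels)
print(model.predict(["push ldr ldr add str bx"]))
```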
Translating natural language into Bash Commands is an emerging research field that has gained attention in recent years. Most efforts have focused on producing more accurate translation models. To the best of our knowledge, only two datasets are available, with one based on the other. Both datasets involve scraping through known data sources (through platforms like Stack Overflow, crowdsourcing, etc.) and hiring experts to validate and correct either the English text or the Bash Commands.

This paper provides two contributions to research on synthesizing Bash Commands from scratch. First, we describe a state-of-the-art translation model used to generate Bash Commands from the corresponding English text. Second, we introduce a new NL2CMD dataset that is automatically generated, involves minimal human intervention, and is over six times larger than prior datasets. Since the generation pipeline does not rely on existing Bash Commands, the distribution and types of commands can be custom adjusted. Our empirical results show how the scale and diversity of our dataset can offer unique opportunities for semantic parsing researchers.
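To make the "generate commands from scratch" idea concrete, here is a minimal sketch of grammar-style command synthesis; the tiny grammar and the pairing of commands with English templates are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch: synthesize (English, Bash) pairs from a hand-written grammar,
# so no scraped commands are needed and the command mix is adjustable.
import random

FLAGS = {"find": ["-name", "-type f -name"], "grep": ["-r", "-ri"]}
TEMPLATES = {
    "find": ("find {dir} {flag} '{pat}'",
             "find files matching {pat} under {dir}"),
    "grep": ("grep {flag} '{pat}' {dir}",
             "search recursively for {pat} in {dir}"),
}

def sample_pair(rng):
    util = rng.choice(list(TEMPLATES))
    cmd_t, nl_t = TEMPLATES[util]
    args = {"dir": rng.choice(["/tmp", "./src"]),
            "flag": rng.choice(FLAGS[util]),
            "pat": rng.choice(["*.log", "TODO"])}
    return nl_t.format(**args), cmd_t.format(**args)

rng = random.Random(0)
for _ in range(3):
    print(sample_pair(rng))
```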
Highly Efficient Traffic Planning for Autonomous Vehicles to Cross Intersections Without a Stop
Waiting in a long queue at traffic lights not only wastes valuable time but also pollutes the environment. With the advances in autonomous vehicles and 5G networks, the previous jamming scenarios at intersections may be turned into non-stop weaving traffic flows. Toward this vision, we propose a highly efficient traffic planning system, namely DASHX, which enables connected autonomous vehicles to cross multi-way intersections without a stop. Specifically, DASHX has a comprehensive model to represent intersections and vehicle status. It can constantly process large volumes of vehicle information, resolve scheduling conflicts, and generate optimal travel plans for all vehicles coming toward the intersection in real time. Unlike existing works that are limited to certain types of intersections and lack considerations of practicability, DASHX is universal for any type of 3D intersection and yields near-maximum throughput while still ensuring riding comfort. To better evaluate the effectiveness of traffic scheduling systems in real-world scenarios, we developed a sophisticated open-source 3D traffic simulation platform (DASHX-SIM) that can handle complicated 3D road layouts and simulate vehicles’ networking and decision-making processes. We have conducted extensive experiments, and the experimental results demonstrate the practicality, effectiveness, and efficiency of the DASHX system and the simulator.
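The core scheduling idea (resolving conflicts so vehicles cross without stopping) can be illustrated with a toy reservation scheme; the conflict model and timing below are simplified assumptions for exposition, not DASHX's actual planner.

```python
# Toy intersection scheduler: each vehicle requests a crossing time and
# is assigned the earliest slot whose conflict zone is free.
HEADWAY = 1.0  # assumed separation (s) between conflicting movements

# Which movements share a conflict point in the intersection (assumed).
CONFLICTS = {("NS", "EW"), ("EW", "NS")}

def schedule(requests):
    """requests: list of (vehicle_id, movement, earliest_arrival)."""
    booked = []  # (time, movement) assignments so far
    plans = {}
    for vid, move, eta in sorted(requests, key=lambda r: r[2]):
        t = eta
        # Push the crossing time back until no conflicting booking is too close.
        while any(abs(t - bt) < HEADWAY and (move, bm) in CONFLICTS
                  for bt, bm in booked):
            t += 0.1
        booked.append((t, move))
        plans[vid] = t
    return plans

print(schedule([("a", "NS", 0.0), ("b", "EW", 0.2), ("c", "NS", 0.4)]))
```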
GPUs have been favored for training deep learning models due to their highly parallelized architecture. As a result, most studies on training optimization focus on GPUs. There is often a trade-off, however, between cost and efficiency when deciding how to choose the proper hardware for training. In particular, CPU servers can be beneficial if training on CPUs is made more efficient, as they incur fewer hardware update costs and better utilize existing infrastructure.

This paper makes three contributions to research on training deep learning models using CPUs. First, it presents a method for optimizing the training of deep learning models on Intel CPUs and a toolkit called ProfileDNN, which we developed to improve performance profiling. Second, we describe a generic training optimization method that guides our workflow and explores several case studies where we identified performance issues and then optimized the Intel® Extension for PyTorch, resulting in an overall 2x training performance increase for the RetinaNet-ResNext50 model. Third, we show how to leverage the visualization capabilities of ProfileDNN, which enabled us to pinpoint bottlenecks and create a custom focal loss kernel that was two times faster than the official reference PyTorch implementation.
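For context on the focal loss mentioned above, here is a minimal reference-style implementation in PyTorch; it follows the standard binary formulation from the RetinaNet paper and is not the custom optimized kernel described in this work.

```python
# Standard binary focal loss: FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    # Per-element cross entropy, kept unreduced so it can be re-weighted.
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = p * targets + (1 - p) * (1 - targets)        # prob of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1.0 - p_t) ** gamma * ce).mean()

logits = torch.randn(8, 4)                       # e.g., 8 anchors x 4 classes
targets = torch.randint(0, 2, (8, 4)).float()
print(focal_loss(logits, targets))
```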
Most existing image privacy protection works focus mainly on the privacy of photo owners and their friends, but lack consideration of other people who appear in the background of the photos and the related location privacy issues. In fact, when a person is in the background of someone else’s photos, he/she may be unintentionally exposed to the public when the photo owner shares the photo online. Not only could a single visited place be exposed; attackers may also be able to piece together a person’s travel route from images. In this article, we propose a novel image privacy protection system, called LAMP, which aims to light up location awareness for people during online image sharing. The LAMP system is based on a newly designed location-aware multi-party image access control model. Unlike previous works on small scales, the LAMP system is highly efficient and scalable, as it can enforce privacy protection for billions of users on social networks in real time. The LAMP system automatically detects the user’s occurrences in photos regardless of whether the user is the photo owner or not. Once a user is identified and the location of the photo is deemed sensitive according to the user’s privacy policy, the user’s face will be replaced with a synthetic face. A prototype of the system was implemented and evaluated to demonstrate its applicability in the real world.
This paper presents prompt design techniques for software engineering, in the form of patterns, to solve common problems when using large language models (LLMs) such as ChatGPT to automate common software engineering activities, such as ensuring code is decoupled from third-party libraries and simulating a web application API before it is implemented. This paper provides two contributions to research on using LLMs for software engineering. First, it provides a catalog of patterns for software engineering that classifies patterns according to the types of problems they solve. Second, it explores several prompt patterns that have been applied to improve requirements elicitation, rapid prototyping, code quality, refactoring, and system design.
Prompt engineering is an increasingly important skill set needed to converse effectively with large language models (LLMs), such as ChatGPT. Prompts are instructions given to an LLM to enforce rules, automate processes, and ensure specific qualities (and quantities) of generated output. Prompts are also a form of programming that can customize the outputs and interactions with an LLM. This paper describes a catalog of prompt engineering techniques presented in pattern form that have been applied to solve common problems when conversing with LLMs. Prompt patterns are a knowledge transfer method analogous to software patterns since they provide reusable solutions to common problems faced in a particular context, i.e., output generation and interaction when working with LLMs. This paper provides the following contributions to research on prompt engineering that apply LLMs to automate software development tasks. First, it provides a framework for documenting patterns for structuring prompts to solve a range of problems so that they can be adapted to different domains. Second, it presents a catalog of patterns that have been applied successfully to improve the outputs of LLM conversations. Third, it explains how prompts can be built from multiple patterns and illustrates prompt patterns that benefit from combination with other prompt patterns.
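As an illustration of combining patterns, the snippet below composes a persona-style instruction with an output-structuring template; the wording is a hypothetical example in the spirit of the catalog, not a prompt taken from the paper.

```python
# Hypothetical prompt built by combining two patterns: a persona
# ("act as X") and an output template that constrains the response shape.
persona = "From now on, act as a senior security reviewer."
template = (
    "Whenever I share code, respond using exactly this template:\n"
    "RISK: <one line>\nEXPLANATION: <short paragraph>\nFIX: <patched code>"
)
task = "Review the following function for injection vulnerabilities: ..."

prompt = "\n\n".join([persona, template, task])
print(prompt)  # send to an LLM chat endpoint of your choice
```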
With the advances in sensing, networking, controlling, and computing technologies, more and more IoT (Internet of Things) devices are emerging. They are envisioned to become part of public infrastructure in the near future. In the face of the potentially large-scale deployment of smart devices in public venues, public IoT services impose new challenges on existing access control mechanisms, especially in terms of efficiency. In this work, we design a flexible access control management technique which not only provides automatic and fine-grained access control management, but also incurs low overhead in large-scale settings, making it suitable for product deployment. The flexible access control technique comprises a highly efficient dual-hierarchy access control structure and associated information retrieval algorithms. Using this technique, we develop a large-scale IoT device access control mechanism named FACT to overcome the efficiency problems in granting and inquiring access control status over millions of devices in distributed environments. Our mechanism also offers a convenient pay-and-consume scheme and plug-and-play device management for easy adoption by service providers. We have conducted extensive experiments, and the results have demonstrated the practicality, effectiveness, and efficiency of our flexible access control technique.
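To give a flavor of why a dual-hierarchy structure can make permission checks cheap, here is a toy sketch in which both venues and user groups form trees and a grant made at an ancestor is inherited by all descendants; the data model is an assumption for illustration, not FACT's actual structure.

```python
# Toy dual-hierarchy check: a grant at (venue, group) covers every
# descendant venue and descendant user group, so a lookup only walks
# two ancestor chains instead of scanning per-device ACLs.
VENUE_PARENT = {"mall/floor2/cafe": "mall/floor2", "mall/floor2": "mall", "mall": None}
GROUP_PARENT = {"staff/baristas": "staff", "staff": None}
GRANTS = {("mall/floor2", "staff")}  # grant made once, high in both trees

def ancestors(node, parent):
    while node is not None:
        yield node
        node = parent[node]

def allowed(venue, group):
    return any((v, g) in GRANTS
               for v in ancestors(venue, VENUE_PARENT)
               for g in ancestors(group, GROUP_PARENT))

print(allowed("mall/floor2/cafe", "staff/baristas"))  # True via inheritance
```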
Computation of the Distance-Based Bound on Strong Structural Controllability in Networks
In this article, we study the problem of computing a tight lower bound on the dimension of the strong structurally controllable subspace (SSCS) in networks with Laplacian dynamics. The bound is based on a sequence of vectors containing the distances between leaders (nodes with external inputs) and followers (remaining nodes) in the underlying network graph. Such vectors are referred to as the distance-to-leaders vectors. We give exact and approximate algorithms to compute the longest sequences of distance-to-leaders vectors, which directly provide distance-based bounds on the dimension of SSCS. The distance-based bound is known to outperform the other known bounds (for instance, based on zero-forcing sets), especially when the network is partially strong structurally controllable. Using these results, we discuss an application of the distance-based bound in solving the leader selection problem for strong structural controllability. Further, we characterize strong structural controllability in path and cycle graphs with a given set of leader nodes using sequences of distance-to-leaders vectors. Finally, we numerically evaluate our results on various graphs.
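A small worked example helps make the distance-to-leaders vectors concrete; the sketch below computes them with networkx for a 5-node path graph with two leaders (an assumed toy instance).

```python
# Distance-to-leaders vectors: for each node, the vector of graph
# distances to every leader. Toy instance: a 5-node path, leaders 0 and 4.
import networkx as nx

G = nx.path_graph(5)
leaders = [0, 4]
dist = {l: nx.single_source_shortest_path_length(G, l) for l in leaders}

for v in G.nodes:
    print(v, tuple(dist[l][v] for l in leaders))
# Sequences of these vectors (with certain monotonicity properties)
# yield the lower bound on the dimension of the SSCS discussed above.
```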
Detection of deception attacks is pivotal to ensure the safe and reliable operation of cyber-physical systems (CPS). Detection of such attacks needs to consider time-series sequences and is very challenging especially for autonomous vehicles that rely on high-dimensional observations from camera sensors. The paper presents an approach to detect deception attacks in real-time utilizing sensor observations, with a special focus on high-dimensional observations. The approach is based on inductive conformal anomaly detection (ICAD) and utilizes a novel generative model which consists of a variational autoencoder (VAE) and a recurrent neural network (RNN) that is used to learn both spatial and temporal features of the normal dynamic behavior of the system. The model can be used to predict the observations for multiple time steps, and the predictions are then compared with actual observations to efficiently quantify the nonconformity of a sequence under attack relative to the expected normal behavior, thereby enabling real-time detection of attacks using high-dimensional sequential data. We evaluate the approach empirically using two simulation case studies of an advanced emergency braking system and an autonomous car racing example, as well as a real-world secure water treatment dataset. The experiments show that the proposed method outperforms other detection methods, and in most experiments, both false positive and false negative rates are less than 10%. Furthermore, execution times measured on both powerful cloud machines and embedded devices are relatively short, thereby enabling real-time detection.
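The ICAD step can be summarized with a short sketch: nonconformity scores (here, prediction errors) on a calibration set turn a new sequence's error into a p-value, which is smoothed over time to flag attacks. The score distribution, window, and threshold below are simplified assumptions, not the paper's VAE-RNN pipeline.

```python
# Inductive conformal anomaly detection on top of any predictor:
# p-value = fraction of calibration scores >= the test score.
import numpy as np

rng = np.random.default_rng(0)
calib_scores = rng.normal(1.0, 0.2, size=500)   # errors on held-out normal data

def p_value(score):
    return (np.sum(calib_scores >= score) + 1) / (len(calib_scores) + 1)

def detect(stream_scores, threshold=0.05, window=5):
    # Flag an attack when the average p-value over a short window drops.
    ps = np.array([p_value(s) for s in stream_scores])
    smoothed = np.convolve(ps, np.ones(window) / window, mode="valid")
    return smoothed < threshold

normal = rng.normal(1.0, 0.2, size=20)
attack = rng.normal(3.0, 0.2, size=20)          # large errors under attack
print(detect(np.concatenate([normal, attack])))
```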
Dataset Placement and Data Loading Optimizations for Cloud-Native Deep Learning Workloads
The primary challenge facing cloud-based deep learning systems is the need for efficient orchestration of large-scale datasets with diverse data formats and provisioning of high-performance data loading capabilities. To that end, we present DLCache, a cloud-native dataset management and runtime-aware data-loading solution for deep learning training jobs. DLCache supports the low-latency and high-throughput I/O requirements of DL training jobs using cloud buckets as persistent data storage and a dedicated computation cluster for training. DLCache comprises four layers: a control plane, a metadata plane, an operator plane, and a multi-tier storage plane, which are seamlessly integrated with the Kubernetes ecosystem, thereby providing ease of deployment, scalability, and self-healing. For efficient memory utilization, DLCache is designed with an on-the-fly and best-effort caching mechanism that can auto-scale the cache according to runtime configurations, resource constraints, and training speeds. DLCache considers both the frequency and freshness of data access, as well as data preparation costs, in making effective cache eviction decisions that result in reduced completion time for deep learning workloads. Results of evaluating DLCache on the ImageNet-ILSVRC and LibriSpeech datasets under various runtime configurations and simulated GPU computation time experiments showed up to a 147.49% and 156.67% improvement in data loading throughput, respectively, compared to the popular PyTorch framework.
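A hedged sketch of the eviction idea follows: each cached item gets a utility score combining access frequency, freshness, and the cost to re-prepare it, and the lowest-scoring item is evicted first. The weighting is an assumption for illustration; DLCache's actual policy may differ.

```python
# Toy cache-eviction score: prefer keeping items that are hot, fresh,
# and expensive to re-load/preprocess from the cloud bucket.
import time

def utility(item, now, w_freq=1.0, w_fresh=1.0, w_cost=1.0):
    freshness = 1.0 / (1.0 + now - item["last_access"])
    return (w_freq * item["hits"]
            + w_fresh * freshness
            + w_cost * item["prep_cost_s"])

def pick_victim(cache, now=None):
    now = now if now is not None else time.time()
    return min(cache, key=lambda k: utility(cache[k], now))

cache = {
    "shard-01": {"hits": 40, "last_access": 100.0, "prep_cost_s": 2.0},
    "shard-02": {"hits": 3,  "last_access": 90.0,  "prep_cost_s": 0.5},
}
print(pick_victim(cache, now=120.0))  # -> "shard-02"
```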
Synchrophasor Data Event Detection using Unsupervised Wavelet Convolutional Autoencoders
Timely and accurate detection of events affecting the stability and reliability of power transmission systems is crucial for safe grid operation. This paper presents an efficient unsupervised machine-learning algorithm for event detection using a combination of the discrete wavelet transform (DWT) and convolutional autoencoders (CAE) with synchrophasor measurements. These measurements are collected from a hardware-in-the-loop testbed setup equipped with a digital real-time simulator. Using DWT, the detail coefficients of the measurements are obtained. The decomposed data is then fed into the CAE, which captures the underlying structure of the transformed data. Anomalies are identified when significant errors are detected between input samples and their reconstructed outputs. We demonstrate our approach on the IEEE-14 bus system considering different events such as generator faults, line-to-line faults, line-to-ground faults, load shedding, and line outages simulated on a real-time digital simulator (RTDS). The proposed implementation achieves a classification accuracy of 97.7%, precision of 98.0%, recall of 99.5%, and F1 score of 98.7%, and proves to be efficient in both time and space requirements compared to baseline approaches.
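The DWT front-end is easy to reproduce; the sketch below extracts detail coefficients with PyWavelets and flags windows whose autoencoder reconstruction error is large. The wavelet choice, level, and the placeholder `cae_reconstruct` model are assumptions, not the paper's exact configuration.

```python
# DWT feature extraction + reconstruction-error anomaly rule.
import numpy as np
import pywt

def dwt_features(signal, wavelet="db4", level=3):
    # wavedec returns [approx, detail_L, ..., detail_1]; keep the details.
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.concatenate(coeffs[1:])

def is_event(window, cae_reconstruct, threshold):
    feats = dwt_features(window)
    err = np.mean((feats - cae_reconstruct(feats)) ** 2)
    return err > threshold

# Identity "model" stands in for a trained convolutional autoencoder.
print(is_event(np.sin(np.linspace(0, 8 * np.pi, 256)),
               cae_reconstruct=lambda x: x, threshold=1e-6))  # False
```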
2022
Syntheto: A Surface Language for APT and ACL2
Syntheto is a surface language for carrying out formally verified program synthesis by transformational refinement in ACL2 using the APT toolkit. Syntheto aims at providing more familiarity and automation, in order to make this technology more widely usable. Syntheto is a strongly statically typed functional language that includes both executable and non-executable constructs, including facilities to state and prove theorems and facilities to apply proof-generating transformations. Syntheto is integrated into an IDE with a notebook-style, interactive interface that translates Syntheto to ACL2 definitions and APT transformation invocations, and back-translates the prover's results to Syntheto; the bidirectional translation happens behind the scenes, with the user interacting solely with Syntheto.
START: A Framework for Trusted and Resilient Autonomous Vehicles (Practical Experience Report)
From delivering groceries and vital medical supplies to driving trucks and passenger vehicles, society is becoming increasingly reliant on autonomous vehicles (AVs). It is therefore vital that these systems be resilient to adversarial actions, perform mission-critical functions despite known and unknown vulnerabilities, and protect and repair themselves during or after operational failures and cyber-attacks. While techniques have been proposed to address individual aspects of software resilience, vulnerability assessment, automated repair, and invariant detection, there is no approach that provides end-to-end trusted and resilient mission operation and repair on AVs. In this paper, we describe our experience of building START (Software Techniques for Automated Resilience and Trust), a framework that provides increased resilience, accurate vulnerability assessment, and trustworthy post-repair operation in autonomous vehicles. We combine techniques from binary analysis and rewriting, runtime monitoring and verification, automated program repair, and invariant detection that cooperatively detect and eliminate a swath of software security vulnerabilities in cyber-physical systems. We evaluate our framework using an autonomous vehicle simulation platform, demonstrating its holistic applicability to AVs.
StrongBox: A GPU TEE on Arm Endpoints
A wide range of Arm endpoints leverage integrated and discrete GPUs to accelerate computation such as image processing and numerical processing applications. However, in spite of these important use cases, Arm GPU security has yet to be scrutinized by the community. By exploiting vulnerabilities in the kernel, attackers can directly access sensitive data used during GPU computing, such as personally identifiable image data in computer vision tasks. Existing work has used Trusted Execution Environments (TEEs) to address GPU security concerns on Intel-based platforms, but numerous architectural differences lead to novel technical challenges in deploying TEEs for Arm GPUs. In addition, extant Arm-based GPU defenses are intended for secure machine learning and lack generality. There is a need for generalizable and efficient Arm-based GPU security mechanisms. To address these problems, we present StrongBox, the first GPU TEE for secured general computation on Arm endpoints. During confidential computation on Arm GPUs, StrongBox provides an isolated execution environment by ensuring exclusive access to the GPU. Our approach is based in part on a dynamic, fine-grained memory protection policy, since Arm-based GPUs typically share a unified memory with the CPU, a stark contrast with Intel-based platforms. Furthermore, by characterizing GPU buffers as secure and non-secure, StrongBox reduces redundant security introspection operations to control access to sensitive data used by the GPU, ultimately reducing runtime overhead. Our design leverages the widely deployed Arm TrustZone and generic Arm features, without hardware modification or architectural changes. We prototyped StrongBox using an off-the-shelf Arm Mali GPU and performed an extensive evaluation. Our results show that StrongBox successfully ensures GPU computing security with a low (4.70%–15.26%) overhead across several indicative benchmarks.
Anonymous communication that is secure end-to-end and unlinkable plays a critical role in protecting user privacy by preventing service providers from using message metadata to discover communication links between any two users. Techniques such as Mix-nets, DC-nets, time delay, cover traffic, Secure Multiparty Computation (SMC), and Private Information Retrieval can be used to achieve anonymous communication. The SMC-based approach generally offers a stronger, simulation-based security guarantee. In this paper, we propose a simple and novel SMC approach to establishing anonymous communication, easily implementable with two non-colluding servers which have only communication- and storage-related capabilities. Our approach offers a stronger security guarantee against malicious adversaries without incurring a great deal of extra computation. To show its practicality, we implemented our solutions using Chameleon Cloud to simulate the interactions among a million users, and extensive simulations were conducted to show message latency with various group sizes. Our approach is efficient for smaller group sizes and sub-group communication while preserving message integrity. It also does not have the message collision problem.
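The two-server, non-colluding setting can be illustrated with a minimal XOR secret-sharing sketch: each server alone sees a random-looking share and learns nothing about the message; only the combination reveals the plaintext. This shows the general principle, not the authors' full protocol.

```python
# Minimal XOR secret sharing across two non-colluding servers.
import os

def split(msg: bytes):
    share_a = os.urandom(len(msg))                       # uniformly random
    share_b = bytes(m ^ a for m, a in zip(msg, share_a)) # msg XOR share_a
    return share_a, share_b                              # one per server

def combine(share_a: bytes, share_b: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(share_a, share_b))

a, b = split(b"meet at noon")
assert combine(a, b) == b"meet at noon"
# Either share alone is statistically independent of the message.
```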
Facial authentication has become more and more popular on personal devices. Due to its ease of use, it has great potential to be widely deployed for web-service authentication in the near future, whereby people can easily log on to online accounts from different devices without memorizing lengthy passwords. However, the growing number of attacks on machine learning, especially on the deep neural networks (DNNs) commonly used for facial recognition, poses big challenges to the successful roll-out of such web-service face authentication. Although there have been studies on defending against some machine learning attacks, we are not aware of any specific effort devoted to the web-service facial authentication setting. In this article, we first demonstrate a new data poisoning attack that does not require any knowledge of the server side and needs just a handful of malicious photo injections to enable an attacker to easily impersonate the victim in existing facial authentication systems. We then propose a novel defensive approach called DEFEAT that leverages deep learning techniques to automatically detect such attacks. We have conducted extensive experiments on real datasets, and our experimental results show that our defensive approach achieves more than 90 percent detection accuracy.
With the advances in autonomous vehicles and intelligent intersection management systems, traffic lights may be replaced by optimal travel plans calculated for each passing vehicle in the future. While these technological advancements are envisioned to greatly improve travel efficiency, they still face various challenging security hurdles, since even a single deviation of a vehicle from its assigned travel plan could cause a serious accident if the surrounding vehicles do not take necessary actions in a timely manner. In this paper, we propose a novel security mechanism, namely NWADE, which can be integrated into existing autonomous intersection management systems to help detect malicious vehicle behavior and generate evacuation plans. In the NWADE mechanism, we introduce the neighborhood watch concept, whereby each vehicle around the intersection serves as a watcher to report or verify the abnormal behavior of any nearby vehicle and of the intersection manager. We propose a blockchain-based verification framework to guarantee the integrity and trustworthiness of the individual travel plans optimized for the entire intersection. We have conducted extensive experimental studies on various traffic scenarios, and the experimental results demonstrate the practicality, effectiveness, and efficiency of our mechanism.
Spoof speech can be used to try to fool speaker verification systems that determine the identity of the speaker based on voice characteristics. This paper compares popular learnable front-ends on this task. We categorize the front-ends by defining two generic architectures and then analyze the filtering stages of both types in terms of learning constraints. We propose replacing fixed filterbanks with a learnable layer that can better adapt to anti-spoofing tasks. The proposed FastAudio front-end is then tested with two popular back-ends to measure the performance on the Logical Access track of the ASVspoof 2019 dataset. The FastAudio front-end achieves a relative improvement of 29.7% when compared with fixed front-ends, outperforming all other learnable front-ends on this task.
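To illustrate the learnable-filterbank idea, here is a minimal PyTorch sketch that replaces a fixed mel filterbank with a trainable linear projection over spectrogram bins; the shapes and design are assumptions for exposition, not FastAudio's exact architecture.

```python
# A trainable filterbank: spectrogram bins -> learned filters, trained
# end-to-end with the spoof-detection back-end instead of being fixed.
import torch
import torch.nn as nn

class LearnableFilterbank(nn.Module):
    def __init__(self, n_bins=257, n_filters=60):
        super().__init__()
        self.proj = nn.Linear(n_bins, n_filters, bias=False)

    def forward(self, spec):                      # spec: (batch, frames, n_bins)
        energy = self.proj(spec).clamp(min=1e-6)  # keep log's input positive
        return torch.log(energy)                  # log-compressed features

fb = LearnableFilterbank()
spec = torch.rand(2, 100, 257)                    # dummy magnitude spectrogram
print(fb(spec).shape)                             # torch.Size([2, 100, 60])
```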
An emerging trend in audio processing is capturing low-level speech representations from raw waveforms. These representations have shown promising results on a variety of tasks, such as speech recognition and speech separation. Compared to handcrafted features, learning speech features via backpropagation can potentially provide the model with greater flexibility in how it represents data for different tasks. However, results from empirical studies show that, in some tasks, such as spoof speech detection, handcrafted features still currently outperform learned features. Instead of evaluating handcrafted features and raw waveforms independently, this paper proposes an Auxiliary Rawnet model to complement handcrafted features with features learned from raw waveforms for spoof speech detection. A key benefit of the approach is that it can improve accuracy at a relatively low computational cost. The proposed Auxiliary Rawnet model is tested using the ASVspoof 2019 dataset, and the pooled EER and min t-DCF are 1.11% and 0.03645, respectively. Results from this dataset indicate that a lightweight waveform encoder can boost the performance of handcrafted-features-based encoders for 10 types of spoof attacks, including 3 challenging attacks, in exchange for a small amount of additional computational work.
As software development has shifted toward a “getting to market quickly” [4] philosophy by embracing fast iteration [2] paradigms offered by such practices as “agile”, ensuring strong security and verifiability characteristics has become increasingly difficult. One major contributing factor is the tension between getting to market and satisfying the internal quality requirements of the engineering team (often resulting in software released “too soon” from the perspective of the engineers). This paper describes a software development workflow whereby security and verifiability can be wholly or partially offloaded to a contract, written by security experts on (or partnering with) the development team, and an associated enforcement library. This contract can be used to reason about certain properties of the software externally from the running software itself and to enforce a subset of its capabilities at runtime, thus ensuring that, at the injection points, the software will behave in a predictable and modelable manner.
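A small sketch shows what runtime contract enforcement at an injection point might look like; the decorator-based design and the example contract are hypothetical, intended only to make the workflow concrete.

```python
# Hypothetical enforcement library: a contract declares pre/postconditions
# that are checked at runtime at each injection point.
from functools import wraps

def contract(pre=None, post=None):
    def wrap(fn):
        @wraps(fn)
        def checked(*args, **kwargs):
            if pre and not pre(*args, **kwargs):
                raise ValueError("precondition violated: %s" % fn.__name__)
            result = fn(*args, **kwargs)
            if post and not post(result):
                raise ValueError("postcondition violated: %s" % fn.__name__)
            return result
        return checked
    return wrap

# Security experts author the contract; developers just annotate.
@contract(pre=lambda amount: 0 < amount <= 10_000,
          post=lambda receipt: "txid" in receipt)
def transfer(amount):
    return {"txid": "abc123", "amount": amount}

print(transfer(100))  # passes both checks; transfer(-5) would raise
```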
In this paper, we study the resilient vector consensus problem in networks with adversarial agents and improve resilience guarantees of existing algorithms. A common approach to achieving resilient vector consensus is that every non-adversarial (or normal) agent in the network updates its state by moving towards a point in the convex hull of its normal neighbors’ states. Since an agent cannot distinguish between its normal and adversarial neighbors, computing such a point, often called a safe point, is a challenging task. To compute a safe point, we propose to use the notion of centerpoint, which is an extension of the median in higher dimensions, instead of the Tverberg partition of points, which is often used for this purpose. We discuss that the notion of centerpoint provides a complete characterization of safe points in R^d. In particular, we show that a safe point is essentially an interior centerpoint if the number of adversaries in the neighborhood of a normal agent i is less than N_i/(d+1), where d is the dimension of the state vector and N_i is the total number of agents in the neighborhood of i. Consequently, we obtain necessary and sufficient conditions on the number of adversarial agents to guarantee resilient vector consensus. Further, by considering the complexity of computing centerpoints, we discuss improvements in the resilience guarantees of vector consensus algorithms and compare with the other existing approaches. Finally, we numerically evaluate our approach.
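A brief worked sketch of the bound: with N_i neighbors in dimension d, a normal agent can tolerate strictly fewer than N_i/(d+1) adversaries, and in d = 1 a centerpoint is simply a median, so the safe point reduces to the median of neighbor states. The code below is an assumed toy illustration for d = 1, not the general-d algorithm.

```python
# Resilient consensus toy, d = 1: check the f < N_i/(d+1) condition and
# use the median of neighbor states as the safe (center)point.
import numpy as np

def tolerable(num_neighbors, num_adversaries, d=1):
    return num_adversaries < num_neighbors / (d + 1)

neighbor_states = np.array([0.9, 1.0, 1.1, 1.05, 50.0, -40.0])  # two outliers
print(tolerable(len(neighbor_states), 2, d=1))   # True: 2 < 6/2
print(np.median(neighbor_states))                # ~1.025, unmoved by attackers
```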
Graphics Peeping Unit: Exploiting EM Side-Channel Information of GPUs to Eavesdrop on Your Neighbors
As the popularity of graphics processing units (GPUs) has grown rapidly in recent years, it has become critical to study and understand the security implications they impose. In this paper, we show that modern GPUs can “broadcast” sensitive information over the air, making a number of attacks practical. Specifically, we present a new electromagnetic (EM) side-channel vulnerability that we have discovered in many GPUs from both NVIDIA and AMD. We show that this vulnerability can be exploited to mount realistic attacks through two case studies: website fingerprinting and keystroke timing inference attacks. Our investigation identifies the commonly used dynamic voltage and frequency scaling (DVFS) feature of GPUs as the root cause of this vulnerability. Nevertheless, we also show that simply disabling DVFS may not be an effective countermeasure, since it would introduce another highly exploitable EM side-channel vulnerability. To the best of our knowledge, this is the first work that studies realistic physical side-channel attacks on non-shared GPUs at a distance.