BARISTA: Efficient and Scalable Serverless Serving System for Deep Learning Prediction Services
Author
Abstract

Pre-trained deep learning models are increasingly being used to offer a variety of compute-intensive predictive analytics services such as fitness tracking and speech and image recognition. The stateless and highly parallelizable nature of deep learning models makes them well suited to the serverless computing paradigm. However, making effective resource management decisions for these services is a hard problem due to the dynamic workloads and the diverse set of available resource configurations, each of which has its own deployment and management costs.

Year of Publication
2019
Conference Name
IEEE International Conference on Cloud Engineering (IC2E)
Date Published
06/2019
Publisher
IEEE
Conference Location
Prague, Czech Republic
URL
https://doi.org/10.1109%2Fic2e.2019.00-10
DOI
10.1109/ic2e.2019.00-10