University of Minnesota
Dispersed Data-driven Computing

Reading List (Tentative)

Note: Some papers are available through the ACM or IEEE Digital Library. These can be accessed for free from within the campus network.

Some newer papers are not yet available online; links will be added as they become available. More papers may be added to the list later.

Background Papers

  1. (Data-Parallel Computing) MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI 2004.

  2. (Data-Parallel Computing) Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker and Ion Stoica. NSDI 2012.

  3. (Stream Processing) Storm@Twitter. Toshniwal et al. SIGMOD'14.

  4. (Graph Processing) Pregel: A System for Large-Scale Graph Processing. Grzegorz Malewicz, Matthew H. Austern, Aart J.C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. SIGMOD 2010.

  5. (Distributed Machine Learning) Scaling Distributed Machine Learning with the Parameter Server. Mu Li, David G. Andersen, Jun Woo Park, Alexander J. Smola, Amr Ahmed, Vanja Josifovski, James Long, Eugene J. Shekita, and Bor-Yiing Su. OSDI'14.

  6. (Edge Computing) (Short) The Emergence of Edge Computing. M. Satyanarayanan. IEEE Computer, vol. 50, no. 1, Jan. 2017.

  7. (Fog Computing) (Short) Finding Your Way in the Fog: Towards a Comprehensive Definition of Fog Computing. Luis M. Vaquero and Luis Rodero-Merino. SIGCOMM Comput. Commun. Rev. 2014.

Geo-distributed Analytics

  1. CLARINET: WAN-Aware Optimization for Analytics Queries. Raajay Viswanathan et al. OSDI 2016.

  2. Wide-Area Analytics with Multiple Resources. Chien-Chun Hung et al. EuroSys'18.

  3. Dynamic and Decentralized Global Analytics via Machine Learning. Hao Wang et al. SoCC'18. Related: (Short) Lube: Mitigating Bottlenecks in Wide Area Data Analytics. Hao Wang and Baochun Li. HotCloud'17.

  4. Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area. Ariel Rabkin, Matvey Arye, Siddhartha Sen, Vivek S. Pai, and Michael J. Freedman. NSDI 2014.

  5. AWStream: Adaptive Wide-Area Streaming Analytics. Ben Zhang et al. SIGCOMM'18.

  6. Multi-Query Optimization in Wide-Area Streaming Analytics. Albert Jonathan et al. SoCC’18.

  7. Gaia: Geo-Distributed Machine Learning Approaching LAN Speeds. Kevin Hsieh et al. NSDI'17.

  8. (Short) Bohr: Similarity Aware Geo-distributed Data Analytics. Hangyu Li et al. HotCloud’17.

  9. (Short) Monarch: Gaining Command on Geo-Distributed Graph Analytics. Anand Padmanabha Iyer et al. HotCloud'18.

Geo-distributed Storage and Networking

  1. Diamond: Automating Data Management and Storage for Wide-Area, Reactive Applications. Irene Zhang et al. OSDI'16.

  2. Giza: Erasure Coding Objects across Global Data Centers. Yu Lin Chen et al. ATC'17.

  3. Fast and Accurate Load Balancing for Geo-Distributed Storage Systems. Kirill L. Bogdanov et al. SoCC'18.

  4. Siphon: Expediting Inter-Datacenter Coflows in Wide-Area Data Analytics. Shuhao Liu et al. ATC'18.

  5. (Short) To Relay or Not to Relay for Inter-Cloud Transfers? Fan Lai et al. HotCloud'18.

Edge Computing and Storage

  1. ParaDrop: Enabling Lightweight Multi-tenancy at the Network's Extreme Edge. P. Liu et al. SEC’16.

  2. Fast, Scalable and Secure Onloading of Edge Functions Using AirBox. K. Bhardwaj et al. SEC’16.

  3. Workload Management for Dynamic Mobile Device Clusters in Edge Femtoclouds. Karim Habak et al. SEC’17.

  4. CloudPath: A Multi-Tier Cloud Computing Framework. Seyed Hossein Mortazavi et al. SEC’17.

  5. Portable Energy-Aware Cluster-Based Edge Computers. Thomas Rausch et al. SEC’18.

  6. From Cell Towers to Smart Street Lamps: Placing Cloudlets on Existing Urban Infrastructures. Julien Gedeon et al. SEC’18.

  7. Bolt: Data Management for Connected Homes. Trinabh Gupta et al. NSDI’14.

  8. (Short) Steel: Simplified Development and Deployment of Edge-Cloud Applications. Shadi A. Noghabi et al. HotCloud’18.

  9. (Short) Towards a Solution to the Red Wedding Problem. Christopher Meiklejohn et al. HotEdge’18.

  10. (Short) Edge Computing Resource Management System: a Critical Building Block! Initiating the debate via OpenStack. Ronan-Alexandre Cherrueau et al. HotEdge’18.

  11. (Short) Mobile Data Repositories at the Edge. Ioannis Psaras et al. HotEdge’18.

  12. (Short) Time-based Coordination in Geo-Distributed Cyber-Physical Systems. Sandeep D'souza et al. HotCloud’17.

Sensing, IoT and the Edge

  1. SOUL: An Edge-Cloud System for Mobile Applications in a Sensor-Rich World. M. Jang et al. SEC’16.

  2. Optimized On-Demand Data Streaming from Sensor Nodes. Jonas Traub et al. SoCC’17.

  3. FarmBeats: An IoT Platform for Data-Driven Agriculture. Deepak Vasisht et al. NSDI’17.

  4. VideoEdge: Processing Camera Streams using Hierarchical Clusters. Chien-Chun Hung et al. SEC’18.

Edge Analytics

  1. DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware. Akhil Mathur et al. MobiSys’17.

  2. Precog: Prefetching for Image Recognition Applications at the Edge. Utsav Drolia et al. SEC’17.

  3. IONN: Incremental Offloading of Neural Network Computations from Mobile Devices to Edge Servers. Hyuk-Jin Jeong et al. SoCC’18.

  4. Edge-based Discovery of Training Data for Machine Learning. Ziqiang Feng et al. SEC’18.

  5. (Short) pCAMP: Performance Comparison of Machine Learning Packages on the Edges. Xingzhou Zhang et al. HotEdge’18.

  6. (Short) eSGD: Communication Efficient Distributed Deep Learning on the Edge. Zeyi Tao and Qun Li. HotEdge’18.

  7. (Short) MODI: Mobile Deep Inference Made Efficient by Edge Computing. Samuel S. Ogden and Tian Guo. HotEdge’18.

  8. (Short) ECO: Harmonizing Edge and Cloud with ML/DL Orchestration. Nisha Talagala et al. HotEdge’18.