University of Minnesota
Introduction to Distributed Systems

Additional Reading and Resources

This page lists additional reading material and other resources, such as online tutorials. The relevant material for each lecture is indicated on the Schedule page, and you are strongly encouraged to read it in addition to the textbook. More material is provided for topics that the textbook does not cover comprehensively. Note: Some older papers are not available online; references are provided so that you can look for physical copies, e.g., in the library.

Note: It is strongly recommended that you refer to one of the following OS textbooks and networking references to cover the background material expected for the class.

Operating Systems Background (Textbooks)

  1. [OS-Easy-Pieces] Operating Systems: Three Easy Pieces, Remzi H. Arpaci-Dusseau and Andrea C. Arpaci-Dusseau.

  2. [OS-Concepts] Operating System Concepts Essentials (2nd Edition), Silberschatz, Galvin and Gagne.

  3. [MOS] Modern Operating Systems (4th Edition), Tanenbaum and Bos.

Networking Background

  1. [Guide] TCP/IP Guide.

  2. [Comer-TCP/IP] Douglas E. Comer, Internetworking with TCP/IP Vol. 1: Principles, Protocols, and Architecture (6th Edition), Prentice Hall, 2014. An excellent book on the fundamentals of TCP/IP.

  3. [Socket] Socket Programming Tutorial.


Communication

  1. [Com1] RPC Tutorial

  2. [Com2] Apache Thrift

Distributed Computing

  1. [DC1] J. K. Ousterhout, "Scheduling techniques for concurrent systems", 3rd Intl. Conf. on Distributed Computing Systems, Oct. 1982. (See Sec. 1-4 for the co-scheduling idea, which is related to gang scheduling.)

  2. [DC2] D. L. Eager, E. D. Lazowska, and J. Zahorjan. "Adaptive load sharing in homogeneous distributed systems". IEEE Transactions on Software Engineering, 12(5), 1986. (Read Sec. I-II.B.)

  3. [DC3] R. Raman, M. Livny and M. Solomon, "Matchmaking: An extensible framework for distributed resource management", Cluster Computing, 1999.

  4. [DC4] Benjamin Hindman et al. "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center", Proc of USENIX NSDI 2011. (Read Sec. 1-3.)

  5. [DC5] Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", Proceedings of OSDI, 2004.


Naming

  1. [Nam1] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications", ACM SIGCOMM 2001, San Diego, CA, August 2001, pp. 149-160. (Read Sec. 1-4.)

  2. [Nam2] Miguel Castro, Peter Druschel, Y. C. Hu, and A. Rowstron, "Topology-aware routing in structured peer-to-peer overlay networks", Technical Report MSR-TR-2002-82, 2002.

Clock Synchronization and Event Ordering

  1. [Syn1] D. Mills, "Improved Algorithms for Synchronizing Computer Network Clocks", IEEE/ACM Transactions on Networking, IEEE Communications Society, 1994.

  2. [Syn2] Leslie Lamport, "Time, Clocks, and the Ordering of Events in a Distributed System", Communications of the ACM 21, 7 (July 1978), 558-565.

Data Replication, Consistency Models and Web Caching

  1. [DR1] David Mosberger, "Memory consistency models", ACM SIGOPS Operating Systems Review, Volume 27, Issue 1, January 1993.

  2. [DR2] V. Duvvuri, P. Shenoy and R. Tewari, "Adaptive Leases: A Strong Consistency Mechanism for the World Wide Web", IEEE Transactions on Knowledge and Data Engineering (TKDE), 15(5), pages 1266-1276, September 2003. (Can skip Sec. 3 and 6.)

  3. [DR3] D. K. Gifford, "Weighted voting for replicated data", In Proceedings of the 7th Symposium on Operating Systems Principles (Asilomar, Calif., Dec. 1979), ACM, New York, 1979, 150-159. (Read Sections 1-3.)

  4. [DR4] H. Yu and A. Vahdat, "Design and Evaluation of a Continuous Consistency Model for Replicated Services", OSDI 2000. (Read Sections 1-3.)

Code Migration

  1. [CM1] Michael R. Hines, Umesh Deshpande, and Kartik Gopalan, "Post-Copy Live Migration of Virtual Machines", ACM SIGOPS Operating Systems Review, Volume 43, Issue 3, July 2009, Pages 14-26. (Read Sec. 1-3.1)

Fault Tolerance

  1. [FT1] Leslie Lamport, Marshall Pease, Robert Shostak, "The Byzantine Generals Problem", ACM Transactions on Programming Languages and Systems 4, 3 (July 1982), 382-401. (Read Sections 1-2).

  2. [FT2] E. A. Akkoyunlu, K. Ekanadham, and R. V. Huber, "Some constraints and tradeoffs in the design of network communications", ACM SIGOPS Operating Systems Review, Volume 9, Issue 5 (November 1975), 67-74. (Read the Appendix for the Two-Army Problem.)

  3. [FT3] Leslie Lamport, "Paxos Made Simple", ACM SIGACT News (Distributed Computing Column) 32, 4 (December 2001), 51-58. (Read Sections 1-2).

  4. [FT4] K. Birman and T. Joseph, "Exploiting virtual synchrony in distributed systems", Proc. of SOSP '87.

  5. [FT5] Eric Brewer, "CAP twelve years later: How the 'rules' have changed", IEEE Computer, Vol. 45 (2), 2012.

Distributed File Systems

  1. [DFS1] P. Braam, "The Coda Distributed File System", Linux Journal, #50, June 1998.

  2. [DFS2] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System", Proceedings of SOSP 2003.

  3. [DFS3] A. Rowstron and P. Druschel, "Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility", In Proc. of ACM SOSP, 2001.


Virtualization

  1. [VM1] G. Popek and R. Goldberg, "Formal requirements for virtualizable third generation architectures", Communications of the ACM, Vol. 17, No. 7, pp 412-421, July 1974.

  2. [VM2] M. Rosenblum and T. Garfinkel, "Virtual Machine Monitors: Current Technology and Future Trends", IEEE Computer, Vol. 38, No. 5, pp 39-47, May 2005.

Cloud and Mobile Computing

  1. [CC1] M. Armbrust et al., "Above the Clouds: A Berkeley View of Cloud Computing", UC Berkeley Tech. Report, 2009.

  2. [CC2] Abhishek Chandra, Jon Weissman and Benjamin Heintz, "Decentralized Edge Clouds", IEEE Internet Computing, Volume 17(5), pp 70-73, September-October 2013.

  3. [CC3] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, "The Case for VM-based Cloudlets in Mobile Computing", IEEE Pervasive Computing, Vol. 8, No. 4, October-December 2009.