Winter '18 CMPS 290S: Adv Topics:Comp Sys

General Information

Course Description (Distributed Data Management Systems for Modern Applications)
This is a graduate-level project-oriented seminar course in modern applications of distributed data management. The material of the course covers a number of modern applications of distributed data management systems such as geo-replication, large-scale machine learning, edge computing, and cryptocurrency. We will study how data management systems evolved to cater to these applications. Each topic will be studied by discussing one or more papers. The format of presenting the topics is to discuss the foundation and the used techniques. Following the foundation is a discussion of a recent system that realizes a solution to a modern application.

The course is project oriented. Each group of students select a topic (e.g., large-scale machine learning or geo-replication) and a recent systems paper in that topic. The group project has two main milestones: (1) Reproducing the main experimental evaluations of the system. This entails reimplementing the system. (2) Proposing and implementing an enhancement on the system. The group will write a short report on their evaluations and proposed enhancement. Also, they will do three presentations: (1) a short presentation on the paper they selected in the first half of the course, (2) a longer presentation on another paper in the topic, (3) a short final presentation about the project.

Students are expected to have a basic understanding of operations systems, databases, and distributed systems. For the project, students are expected to have strong systems/implementation skills.

Project: 60%
Presentations: 30%
Participation: 10%

Note: the schedule is subject to change according to the topics and papers selected by students.

Weeks 1 (Tuesday)Welcome and Distributed Systems introduction
Weeks 1 (Thursday)Global-Scale Data Management: Strong consistencyConsensus foundation
  • [reading] Paxos: Lamport, Leslie. "Paxos made simple." ACM Sigact News 32.4 (2001): 18-25.
  • Fast Paxos: Lamport, Leslie. "Fast paxos." Distributed Computing 19.2 (2006): 79-103.
  • Flexible Paxos: Howard, Heidi, Dahlia Malkhi, and Alexander Spiegelman. "Flexible paxos: Quorum intersection revisited." arXiv preprint arXiv:1608.06696 (2016). APA
Weeks 2 (Tuesday)Global-Scale Data Management: Strong consistencyTransactions and Atomic Commitment foundation
  • Serializability and One-Copy Equivalence: Bernstein, Philip A., and Nathan Goodman. "The failure and recovery problem for replicated databases." Proceedings of the second annual ACM symposium on Principles of distributed computing. ACM, 1983.
  • Two Phase Commit
  • [reading] Three Phase Commit: Skeen, Dale, and Michael Stonebraker. "A formal model of crash recovery in a distributed system." IEEE Transactions on Software Engineering 3 (1983): 219-228.
Weeks 2 (Thursday)Globa-Scale Data Management: Strong consistencyModern Systems (one of the following to be presented by students)
  • [reading] Spanner: Corbett, James C., et al. "Spanner: Google’s globally distributed database." ACM Transactions on Computer Systems (TOCS) 31.3 (2013): 8. APA
  • EPaxos: Moraru, Iulian, David G. Andersen, and Michael Kaminsky. "There is more consensus in egalitarian parliaments." SOSP, 2013.
  • MDCC: Kraska, Tim, et al. "MDCC: Multi-data center consistency." EuroSys, 2013.
Weeks 3 (Tuesday)Global-Scale Data Management: Relaxed TransactionsRelaxed Guarantees
  • [reading] Causality: Lamport, Leslie. "Time, clocks, and the ordering of events in a distributed system." Communications of the ACM 21.7 (1978): 558-565.
  • Snapshot Isolation: Berenson, Hal, et al. "A critique of ANSI SQL isolation levels." ACM SIGMOD Record. Vol. 24. No. 2. ACM, 1995.
  • Session consistency: Terry, Douglas B., et al. "Session guarantees for weakly consistent replicated data." Parallel and Distributed Information Systems, 1994., Proceedings of the Third International Conference on. IEEE, 1994.
Weeks 3 (Thursday)Global-Scale Data Management: Relaxed TransactionsModern systems (one of the following to be presented by students)
  • [reading] COPS: Lloyd, Wyatt, et al. "Don't settle for eventual: scalable causal consistency for wide-area storage with COPS." SOSP, 2011.
  • Walter: Sovran, Yair, et al. "Transactional storage for geo-replicated systems." SOSP, 2011.
  • Transaction Chains: Zhang, Yang, et al. "Transaction chains: achieving serializability with low latency in geo-distributed storage systems." SOSP, 2013.
Weeks 4 (Tuesday)Globa-Scale Data Management: Big Data AnalyticsFoundation
  • [reading] C-Store: Stonebraker, Mike, et al. "C-store: a column-oriented DBMS." VLDB, 2005.
  • Multi-versioning for read-only transactions: Satyanarayanan, O. T., and Divyakant Agrawal. "Efficient execution of read-only transactions in replicated multiversion databases." IEEE TKDE 1993.
Weeks 4 (Thursday)Globa-Scale Data Management: Big Data AnalyticsModern systems (one of the following to be presented by students)
  • [reading] Photon: Ananthanarayanan, Rajagopal, et al. "Photon: Fault-tolerant and scalable joining of continuous data streams." SIGMOD, 2013.
  • Iridium: Pu, Qifan, et al. "Low latency geo-distributed data analytics." ACM SIGCOMM 2015
  • WANalytics: Vulimiri, Ashish, et al. "WANalytics: Analytics for a Geo-Distributed Data-Intensive World." CIDR. 2015.
Weeks 5 (Tuesday)Byzantine Agreement
  • [reading] Practical Byzantine Fault Tolerance: Castro, Miguel, and Barbara Liskov. "Practical Byzantine fault tolerance." OSDI 1999.
  • Byzantine generals: Lamport, L.; Shostak, R.; Pease, M. (1982). "The Byzantine Generals Problem"
  • Proof of work: Dwork, Cynthia; Naor, Moni (1993). "Pricing via Processing, Or, Combatting Junk Mail, Advances in Cryptology". CRYPTO’92
Weeks 5 (Thursday)Byzantine Agreement
  • Bitcoin: Nakamoto, Satoshi. "Bitcoin: A peer-to-peer electronic cash system." 2008
  • [reading] Algorand: Gilad, Yossi, et al. "Algorand: Scaling byzantine agreements for cryptocurrencies." SOSP 2017
  • Bitcoin-NG: Eyal, Ittay, et al. "Bitcoin-NG: A Scalable Blockchain Protocol." NSDI, 2016
Weeks 6 (Tuesday)Learning Systems: Multi-core systems
  • Hogwild!: Recht, Benjamin, et al. "Hogwild: A lock-free approach to parallelizing stochastic gradient descent." NIPS 2011.
  • [reading] Cyclades: Pan, Xinghao, et al. "Cyclades: Conflict-free asynchronous machine learning." NIPS 2016.
  • COP: Nawab, Faisal, et al. "COP: Planning Conflicts for Faster Parallel Transactional Machine Learning." EDBT. 2017.
Weeks 6 (Thursday)Learning Systems: Distributed systems
  • [reading] Parameter Server: Li, Mu, et al. "Scaling Distributed Machine Learning with the Parameter Server." OSDI, 2014.
  • Distbelief: Dean, Jeffrey, et al. "Large scale distributed deep networks." Advances in neural information processing systems. 2012.
Weeks 7 (Tuesday)Learning Systems: Image processing systems
  • SVE: Huang, Qi, et al. "SVE: Distributed Video Processing at Facebook Scale." SOSP 2017.
  • NoScope: Kang, Daniel, et al. "NoScope: optimizing neural network queries over video at scale." VLDB, 2017.
  • [reading] NeuroSurgeon: Kang, Yiping, et al. "Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge." ASPLOS, 2017.
Weeks 7 (Thursday)Papers selected by students: Strong consistencyTBD
Weeks 8 (Tuesday)Papers selected by students: Relaxed transactionsTBD
Weeks 8 (Thursday)Papers selected by students: Big Data AnalyticsTBD
Weeks 9 (Tuesday)Papers selected by students: Byzantine agreementTBD
Weeks 9 (Thursday)Papers selected by students:learning systemsTBD
Weeks 10 (Tuesday)Final project presentations
Weeks 10 (Thursday)Final project presentations