Global-scale Data Management (GSDM)

Cloud service providers host web and cloud applications in datacenters around the world. The area of Global-scale Data Management explores deploying cloud applications across multiple datacenters for fault tolerance, availability, and cost-effectiveness. This page summarizes some of our work in this area and provides useful resources for researchers and practitioners interested in global-scale data management.

Talks

The Challenges of Global-scale Data Management

Description: A 3-hour tutorial on global-scale data management. Its main theme is the trade-off between request latency and the consistency guarantees of GSDM infrastructure solutions.
Slides: [pptx]
Presented at SIGMOD 2016 [paper]

Geo-replication: A Journey from the Simple to the Optimal

Description: A 1-hour talk about our GSDM work at UC Santa Barbara. A chronological narrative presents the initial motivation for this line of research and shows how each piece of work addresses the challenges and observations of earlier work.
Slides: [pptx]
Presented at HP Labs (2015) and UCLA (2015); an earlier version was presented at Microsoft Research (2013)

Demonstrations

DB-Risk: The Game of Global Database Placement

Description: The game simulates a data placement problem in GSDM. Participants are invited to place replicas of a database on a subset of the available datacenters with the objective of minimizing latency. The game encourages learning the trade-offs of data placement and the effectiveness of existing and recent performance optimizations.
Access to the game is available through this link
Presented at SIGMOD 2016 [paper]
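
The placement problem described above can be illustrated with a minimal sketch. It is not the game's actual scoring model: the datacenter names, the round-trip times, and the cost function (average time for each replica, acting as leader, to hear from a majority of replicas) are all assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical round-trip times in milliseconds between datacenters.
# These numbers are illustrative only, not measurements.
RTT_MS = {
    ("california", "virginia"): 70,
    ("california", "ireland"): 140,
    ("california", "singapore"): 170,
    ("virginia", "ireland"): 80,
    ("virginia", "singapore"): 230,
    ("ireland", "singapore"): 180,
}
DATACENTERS = {name for pair in RTT_MS for name in pair}


def rtt(a, b):
    """Symmetric RTT lookup; a datacenter reaches itself in 0 ms."""
    if a == b:
        return 0
    return RTT_MS[(a, b)] if (a, b) in RTT_MS else RTT_MS[(b, a)]


def quorum_latency(leader, replicas):
    """Time for `leader` to hear from a majority of `replicas`, itself included."""
    majority = len(replicas) // 2 + 1
    delays = sorted(rtt(leader, r) for r in replicas)
    return delays[majority - 1]


def placement_cost(replicas):
    """Assumed objective: mean majority-quorum latency over all possible leaders."""
    return sum(quorum_latency(r, replicas) for r in replicas) / len(replicas)


def best_placement(datacenters, k):
    """Exhaustively evaluate every k-subset and return the cheapest one."""
    return min(combinations(sorted(datacenters), k), key=placement_cost)


if __name__ == "__main__":
    choice = best_placement(DATACENTERS, 3)
    print(choice, round(placement_cost(choice), 1))
```

Exhaustive search is only practical for a handful of datacenters, which matches the scale of a game board; a different latency model (for example, one that weights client locations) would change which placement wins.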

Bibliography

The Challenges of Global-scale Data Management
(SIGMOD 2016 Tutorial)

DB-Risk: The Game of Global Database Placement
(SIGMOD 2016 Demo)

Minimizing Commit Latency of Transactions in Geo-Replicated Data Stores
(SIGMOD 2015)

Chariots: A Scalable Shared Log for Data Management in Multi-Datacenter Cloud Environments
(EDBT 2015)

Mind your Ps and Vs: A perspective on the challenges of big data management and privacy concerns
(BigComp 2015)

Message Futures: Fast Commitment of Transactions in Multi-datacenter Environments
(CIDR 2013)

Low-Latency Multi-Datacenter Databases using Replicated Commits
(VLDB 2013)

Managing Geo-replicated Data in Multi-datacenters
(Springer Databases in Networked Information Systems 2013)

Serializability, not Serial: Concurrency Control and Availability in Multi-Datacenter Datastores
(VLDB 2012)