My research lies at the intersection of Big Data management and distributed Cloud Computing systems. Specifically, I work on data management systems that accelerate and support data science and global connectivity, especially in the context of autonomous, mobile applications and the Internet of Things.

Teaching: CMPS 290S Winter 2018

Research Statement: Data processing is the driver of the sustained growth and impact of Internet services and Big Data analytics. The global nature of users and data invites a Global-Scale Data Management paradigm. In my work, I study and build global-scale systems with a focus on providing high-performance, easy-to-use database abstractions.

Looking forward, two trends in computing will drastically increase the demand for Global-Scale Data Management and ignite a transformation of data management systems. The first trend is the emergence of Device-driven systems for autonomous applications, the Internet of Things (IoT), and mobile agents (self-driving cars and robotics). The second trend is the emergence of Data-driven applications such as data science and machine learning. My ongoing work explores the opportunities and challenges in supporting the increasing demand of device-driven systems by augmenting data management systems with edge computing technology to increase and diversify resources. I also explore domain-specific system designs to support the complexity of emerging data-driven applications.

Projects and publications

Global-Scale Data Management: The global nature of data and users makes distributing databases and systems a natural step towards a better user experience and higher performance. It also provides higher levels of fault tolerance. My work in this area investigates the challenges in building global-scale data management systems.

[29] LPaxos: Managing Data Closer to Users for Low Latency and Mobile Applications
(SIGMOD 2018)

[28] Nomadic Datacenters at the Network Edge: Data Management Challenges for the Cloud with Mobile Infrastructure
(EDBT 2018)

[27] Global-Scale Placement of Transactional Data Stores
(EDBT 2018)

[26] A System Infrastructure for Strongly Consistent Transactions on Globally-Replicated Data
(IEEE Data Engineering Bulletin 2017)

[25] Janus: A Hybrid Scalable Multi-Representation Cloud Datastore
(IEEE TKDE 2017)

[24] Typhon: Consistency Semantics for Multi-Representation Data Processing

[23] Multi-Representation Based Data Processing Architecture for IoT Applications
(ICDCS 2017)

[22] COP: Planning Conflicts for Faster Parallel Transactional Machine Learning
(EDBT 2017)

[21] The Challenges of Global-scale Data Management
(SIGMOD 2016 Tutorial) [pptx]

[20] DB-Risk: The Game of Global Database Placement
(SIGMOD 2016 Demo) [demo]

[19] Minimizing Commit Latency of Transactions in Geo-Replicated Data Stores
(SIGMOD 2015)

[18] Chariots: A Scalable Shared Log for Data Management in Multi-Datacenter Cloud Environments
(EDBT 2015)

[17] Mind your Ps and Vs: A perspective on the challenges of big data management and privacy concerns
(BigComp 2015)

[16] Message Futures: Fast Commitment of Transactions in Multi-datacenter Environments
(CIDR 2013)

[15] Low-Latency Multi-Datacenter Databases using Replicated Commits
(VLDB 2013)

[14] Managing Geo-replicated Data in Multi-datacenters
(Springer Databases in Networked Information Systems 2013)

[13] Serializability, not Serial: Concurrency Control and Availability in Multi-Datacenter Datastores
(VLDB 2012)

Data processing on emerging memory technology: In collaboration with HP Labs, I worked on designing data stores for non-volatile memory architectures. I studied the implications of emerging flush-on-fail CPU technology on the durability cost of transactions. Also, as an intern at MSR Redmond, I worked on the Time-Split Bw-tree (TSBw-tree), which integrates the algorithms of the Time-Split B-tree within the lock-free implementation of the Bw-tree.

[12] Dali: A Periodically Persistent Hash Map
(NVMW 2018)

[11] Dali: A Periodically Persistent Hash Map
(DISC 2017)

[10] High Performance Temporal Indexing on Modern Hardware
(ICDE 2015)

[9] Procrastination Beats Prevention: Timely Sufficient Persistence for Efficient Crash Resilience
(EDBT 2015)

[8] Zero-Overhead NVM Crash Resilience
(FAST 2015 WiP Session + Poster session)

[7] Zero-Overhead NVM Crash Resilience
(NVMW 2015)

Fair resource allocation for Wireless Mesh Networks: This project tackles the problem of unfairness in Wireless Mesh Networks, where TCP flows experience different performance characteristics depending on their location in the network. A MAC-layer solution is developed to transparently improve TCP fairness. The proposed MAC layer, called TMAC, uses a timestamp-ordering technique to achieve fairness.

[6] Fair Packet Scheduling in Wireless Mesh Networks
(Elsevier Journal of Ad Hoc Networks 2014)

[5] MAC-Layer Protocol for TCP Fairness in Wireless Mesh Networks
(ICCC 2012)

[4] TMAC: Timestamp-ordered MAC for CSMA/CA Wireless Mesh Networks
(ICCCN 2011)

[3] TMAC: Timestamp-Ordered MAC Protocol for Wireless Mesh Networks
(MS Thesis 2011)

Other work on large-scale data processing

[2] Graph Summarization for Geo-correlated Trends Detection in Social Networks
(SIGMOD 2016 Undergraduate Research Poster Competition)

[1] MaaT: Effective and scalable coordination of distributed transactions in the cloud
(VLDB 2014)