My research lies at the intersection of Big Data management and distributed Cloud Computing systems.
Specifically, I work on data management systems that accelerate and support data science and global connectivity,
especially in the context of autonomous, mobile applications and the Internet of Things.
Profiles:
UCSC,
Google Scholar,
DBLP,
LinkedIn
Dissertation Summary,
Dissertation
Research Statement: Data processing is the driver of the sustained growth and impact of Internet services and Big Data analytics. The global nature of users and data invites a Global-Scale Data Management paradigm. In my work, I study and build global-scale systems with a focus on providing high-performance, easy-to-use database abstractions.
Looking forward, two trends in computing will increase the demand on Global-Scale
Data Management drastically and will ignite a transformation of data management
systems. The first trend is the emergence of Device-driven systems for
autonomous applications, Internet of Things (IoT), and mobile
agents (self-driving cars and robotics). The second trend is the emergence of Data-driven applications
such as data science and machine learning.
My ongoing work explores the opportunities and challenges in supporting the increasing demand of device-driven
systems by augmenting data management systems with edge computing technology to increase and diversify
resources. I also explore domain-specific system designs to support the complexity of
emerging data-driven applications.
Projects and publications
Global-Scale Data Management: The global nature of data and users makes distributing databases and systems a natural step towards a better user experience and higher performance. Additionally, it provides higher levels of fault-tolerance. My work in this area investigates the challenges in building global-scale data management systems.
[31]
Blockplane: A Global-Scale Byzantizing Middleware
(ICDE 2019)
[30]
Unifying Consensus and Atomic Commitment for Effective Cloud Data Management
(VLDB 2019)
[29]
DPaxos: Managing Data Closer to Users for Low Latency and Mobile Applications
(SIGMOD 2018)
[28]
Nomadic Datacenters at the Network Edge: Data Management Challenges for the Cloud with Mobile Infrastructure
(EDBT 2018)
[27]
Global-Scale Placement of Transactional Data Stores
(EDBT 2018)
[26]
A System Infrastructure for Strongly Consistent Transactions on Globally-Replicated Data
(IEEE Data Engineering Bulletin 2017)
[25]
Janus: A Hybrid Scalable Multi-Representation Cloud Datastore
(IEEE TKDE 2017)
[24]
Typhon: Consistency Semantics for Multi-Representation Data Processing
(IEEE CLOUD 2017)
[23]
Multi-Representation Based Data Processing Architecture for IoT Applications
(ICDCS 2017)
[22]
COP: Planning Conflicts for Faster Parallel Transactional Machine Learning
(EDBT 2017)
[21]
The Challenges of Global-scale Data Management
(SIGMOD 2016 Tutorial) [pptx]
[20]
DB-Risk: The Game of Global Database Placement
(SIGMOD 2016 Demo) [demo]
[19]
Minimizing Commit Latency of Transactions in Geo-Replicated Data Stores
(SIGMOD 2015)
[18]
Chariots: A Scalable Shared Log for Data Management in Multi-Datacenter Cloud Environments
(EDBT 2015)
[17]
Mind your Ps and Vs: A perspective on the challenges of big data management and privacy concerns
(BigComp 2015)
[16]
Message Futures: Fast Commitment of Transactions in Multi-datacenter Environments.
(CIDR 2013)
[15]
Low-Latency Multi-Datacenter Databases using Replicated Commits.
(VLDB 2013)
[14]
Managing Geo-replicated Data in Multi-datacenters.
(Springer Databases in Networked Information Systems 2013)
[13]
Serializability, not Serial: Concurrency Control and Availability in Multi-Datacenter Datastores.
(VLDB 2012)
Data processing on emerging memory technology: In collaboration with HP Labs, I worked on designing data stores for non-volatile memory architectures. I studied the implications of emerging flush-on-fail CPU technology on the durability cost of transactions. Also, as an intern at MSR Redmond, I worked on the Time-Split Bw-tree (TSBw-tree), which integrates the algorithms of the Time-Split B-tree into the lock-free implementation of the Bw-tree.
[12]
Dali: A Periodically Persistent Hash Map
(NVMW 2018)
[11]
Dali: A Periodically Persistent Hash Map
(DISC 2017)
[10]
High Performance Temporal Indexing on Modern Hardware
(ICDE 2015)
[9]
Procrastination Beats Prevention: Timely Sufficient Persistence for Efficient Crash Resilience
(EDBT 2015)
[8]
Zero-Overhead NVM Crash Resilience.
(FAST 2015 WiP Session + Poster Session)
[7]
Zero-Overhead NVM Crash Resilience
(NVMW 2015)
Fair resource allocation for Wireless Mesh Networks: This project tackles the problem of unfairness in Wireless Mesh Networks, where TCP flows experience different performance characteristics depending on their location in the network. A MAC-layer solution is developed to transparently improve TCP fairness. The proposed MAC layer, called TMAC, uses a timestamp-ordering technique to achieve fairness.
[6]
Fair Packet Scheduling in Wireless Mesh Networks.
(Elsevier Journal of Ad Hoc Networks 2014)
[5]
MAC-Layer Protocol for TCP Fairness in Wireless Mesh Networks.
(ICCC 2012)
[4]
TMAC: Timestamp-ordered MAC for CSMA/CA Wireless Mesh Networks.
(ICCCN 2011)
[3]
TMAC: Timestamp-Ordered MAC Protocol for Wireless Mesh Networks.
(MS Thesis 2011)
Other work on large-scale data processing
[2]
Graph Summarization for Geo-correlated Trends Detection in Social Networks
(SIGMOD 2016 Undergraduate Research Poster Competition)
[1]
MaaT: Effective and scalable coordination of distributed transactions in the cloud.
(VLDB 2014)