My research lies at the intersection of Big Data management and distributed Cloud Computing systems.
Specifically, I work on data management systems that accelerate and support data science and global connectivity,
especially in the context of autonomous, mobile applications and the Internet of Things.
Teaching: CMPS 290S Winter 2018
Research Statement: Data processing is the driver of the sustained growth and impact of Internet services and Big Data analytics. The global nature of users and data invites a Global-Scale Data Management paradigm. In my work, I study and build global-scale systems with a focus on providing high performance and easy-to-use database abstractions.
Looking forward, two trends in computing will drastically increase the demand for Global-Scale
Data Management and will ignite a transformation of data management
systems. The first trend is the emergence of device-driven systems for
autonomous applications, the Internet of Things (IoT), and mobile
agents (self-driving cars and robotics). The second trend is the emergence of data-driven applications
such as data science and machine learning.
My ongoing work explores the opportunities and challenges in supporting the increasing demand of device-driven systems
by augmenting data management systems with edge computing technology to increase and diversify
resources. I also explore domain-specific system designs to support the complexity of
emerging data-driven applications.
Projects and publications
Global-Scale Data Management: The global nature of data and users makes distributing databases and systems a natural step towards a better user experience and higher performance. Additionally, it provides higher levels of fault-tolerance. My work in this area investigates the challenges in building global-scale data management systems.
Global-Scale Placement of Transactional Data Stores
A System Infrastructure for Strongly Consistent Transactions on Globally-Replicated Data
(IEEE Data Engineering Bulletin 2017)
Janus: A Hybrid Scalable Multi-Representation Cloud Datastore
(IEEE TKDE 2017)
Typhon: Consistency Semantics for Multi-Representation Data Processing
(IEEE CLOUD 2017)
Multi-Representation Based Data Processing Architecture for IoT Applications
Low-Latency Multi-Datacenter Databases using Replicated Commits.
Managing Geo-replicated Data in Multi-datacenters.
(Springer Databases in Networked Information Systems 2013)
Data processing on emerging memory technology: In collaboration with HP Labs, I worked on designing data stores for non-volatile memory architectures. I studied the implications of emerging flush-on-fail CPU technology on the durability cost of transactions. Also, as an intern at MSR Redmond, I worked on the Time-Split Bw-tree (TSBw-tree), which integrates the algorithms of the Time-Split B-tree into the lock-free implementation of the Bw-tree.
Dali: A Periodically Persistent Hash Map
Fair resource allocation for Wireless Mesh Networks: This project tackles the problem of unfairness in Wireless Mesh Networks, where TCP flows experience different performance characteristics depending on their location in the network. A MAC-layer solution is developed to transparently improve TCP fairness. The proposed MAC layer, called TMAC, uses a timestamp-ordering technique to achieve fairness.
Fair Packet Scheduling in Wireless Mesh Networks.
(Elsevier Journal of Ad Hoc Networks 2014)
MAC-Layer Protocol for TCP Fairness in Wireless Mesh Networks.
TMAC: Timestamp-ordered MAC for CSMA/CA Wireless Mesh Networks.
TMAC: Timestamp-Ordered MAC Protocol for Wireless Mesh Networks.
(MS Thesis 2011)
Other work on large-scale data processing