All posts by Mosharaf

Confidentiality and Security in the Cloud

Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, Hari Balakrishnan, "CryptDB: Protecting Confidentiality with Encrypted Query Processing," SOSP, 2011. [PDF]

Thomas Ristenpart, Eran Tromer, Hovav Shacham, Stefan Savage, "Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds," CCS, 2009. [PDF]

Summary

With the increase in popularity of cloud …

Graph-parallel frameworks

Google, "Pregel: A System for Large-Scale Graph Processing," SIGMOD, 2010. [PDF]

Carnegie Mellon, "GraphLab: A New Framework for Parallel Machine Learning," arXiv:1006.4990, 2010. [PDF]

Summary

Data-parallel frameworks such as MapReduce and Dryad are good at performing embarrassingly parallel jobs. These frameworks are not ideal for iterative jobs and for jobs where data-dependencies across stages …

Datacenter transport layer protocols

Stanford and Microsoft, "DCTCP: Efficient Packet Transport for the Commoditized Data Center," SIGCOMM, 2010. [PDF]

Raiciu et al., "Improving Datacenter Performance and Robustness with Multipath TCP," SIGCOMM, 2011. [PDF]

MSR Asia, "ICTCP: Incast Congestion Control for TCP in Data Center Networks," CoNEXT, 2010. [PDF]

Summary

Datacenters pose a different set of challenges than …

Cloudy operating systems

MIT, "An Operating System for Multicore and Clouds: Mechanisms and Implementation," SOCC, 2010. [PDF]

Barret Rhoden, Kevin Klues, David (Yu) Zhu, Eric Brewer, "Improving Per-Node Efficiency in the Datacenter with New OS Abstractions," SOCC, 2011. [PDF]

Summary

Factored Operating System

The Factored Operating System (FOS) proposes an OS architecture where each core runs individual microkernels …

Multi-framework resource managers for datacenters

AMPLab, "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center," NSDI, 2011. [PDF]

Apache Software Foundation, "Hadoop NextGen," 2011. [LINK]

Summary

Traditional cluster resource schedulers fall into two broad categories: some do fine-grained management of resources for individual frameworks (e.g., in Hadoop), but this requires multiple frameworks to run on multiple isolated …

Distributed in-memory datasets

AMPLab, "Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing," UCB/EECS-2011-82, 2011. [PDF]

Russell Power, Jinyang Li, "Piccolo: Building Fast, Distributed Programs with Partitioned Tables," OSDI, 2010. [PDF]

Summary

MapReduce and similar frameworks, while widely applicable, are limited to directed acyclic data flow models, do not expose global state, and are generally slow due …

Cloud databases

MIT, "Relational Cloud: A Database-as-a-Service for the Cloud," CIDR, 2011. [PDF]

Divyakant Agrawal, Amr El Abbadi, Sudipto Das, Aaron J. Elmore, "Database Scalability, Elasticity, and Autonomy in the Cloud," DASFAA, 2011. [PDF]

Relational Cloud

The key idea of the Relational Cloud project is to define the concept of transactional Database-as-a-Service (DBaaS), identify the key challenges toward …

Declarative and finite state machine approaches to Cloud programming

Peter Alvaro, Tyson Condie, Neil Conway, Khaled Elmeleegy, Joseph M. Hellerstein, Russell Sears, "BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud," EuroSys, 2010. [PDF]

Joe Armstrong, "Erlang: A Survey of the Language and Its Industrial Applications," Ninth Exhibition and Symposium on Industrial Applications of Prolog, 1996. [PDF]

BOOM

BOOM, or Berkeley Orders-Of-Magnitude, adopts a …

Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS

Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen, "Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS," SOSP, 2011. [PDF]

Summary

This paper introduces a new consistency model, causal+, that extends the causal consistency model and lies between sequential and causal consistency models. The authors claim that causal+ is the …

PNUTS: Yahoo!’s Hosted Data Serving Platform

Yahoo! Research, "PNUTS: Yahoo!’s Hosted Data Serving Platform," PVLDB, 2008. [PDF]

Summary

PNUTS is a scalable, highly available, and geographically distributed (but low latency) data store used by most Yahoo! online properties. To achieve both availability and partition tolerance, it uses a novel notion of consistency called per-record timeline consistency; under this model, all replicas of …