Update: Camera-ready version is available here now!
Since introducing the coflow abstraction in 2012, we've been working hard to make it practical one step at a time. Over the years, we've worked on efficient coflow scheduling, removed clairvoyance requirements in coflow scheduling, and performed fair sharing among coexisting coflows. Throughout all these efforts, one requirement remained constant: all …
Tag Archives: Datacenter Networking
HUG Accepted to Appear at NSDI’2016
Update: Camera-ready version is available here now!
With the advent of cloud computing and datacenter-scale applications, simultaneously dealing with multiple resources is the new norm. When multiple parties have multi-resource demands, fairly dividing these resources (for some notion of fairness) is a core challenge in the resource allocation literature. Dominant Resource Fairness (DRF) in NSDI'2011 was the first work to …
Aalo Accepted to Appear at SIGCOMM’2015
Update: Camera-ready version is available here now!
Last SIGCOMM we introduced the coflow scheduling problem and presented Varys, which addressed its clairvoyant variation, i.e., when all the information about individual coflows is known a priori and there are no cluster or task scheduling dynamics. In many cases, these assumptions do not hold very well and left us with two primary …
Orchestra is the Default Broadcast Mechanism in Apache Spark
With its recent release, Apache Spark has promoted Cornet, the BitTorrent-like broadcast mechanism proposed in Orchestra (SIGCOMM'11), to become its default broadcast mechanism. It's great to see our research see the light of day in the real world! Many thanks to Reynold and others for making it happen.
MLlib, the machine learning library of Spark, will enjoy the biggest boost from this change because of the broadcast-heavy nature of …
Presented Varys at SIGCOMM’2014
Update: Slides from my talk are online now!
Just got back from a vibrant SIGCOMM! I presented our recent work on application-aware network scheduling using the coflow abstraction (Varys). This is my fifth time attending and third time giving a talk. Great pleasure as always in meeting old friends and making new ones! SIGCOMM was held in Chicago this …
Varys Developer Alpha Released!
We are glad to announce the first open-source release of Varys, an application-aware network scheduler for data-parallel clusters using the coflow abstraction. It's a stripped-down dev-alpha release for the experts, so please be patient with it!
A quick overview of the system can be found at varys.net. Here is a 30-second summary:
Varys is an open …
Varys accepted at SIGCOMM’2014: Coflow is coming!
Update 2: Varys Developer Alpha released!
Update 1: Latest version is available here now!
Coflow accepted at HotNets’2012
Update: Coflow camera-ready is available online! Tell us what you think!
Our position paper to address the lack of a networking abstraction for cluster applications, "Coflow: A Networking Abstraction for Cluster Applications," has been accepted at the latest workshop on hot topics in networking. We make the observation that thinking in terms of flows is …
Ahimsa accepted at HotCloud’2012
Update: Camera-ready is available online! Do let us know what you think in the comments section.
Our exploratory paper on the complexity of a transfer, "Redefining Network Fairness to Support Data Parallelism," has been accepted for publication at this year's HotCloud workshop! In Orchestra, we defined the notion of transfers in the context of cluster computing, …
“Surviving Failures in Bandwidth-Constrained Datacenters” at SIGCOMM’2012
Update: Camera-ready version is on my publications page!
My internship work from last summer has been accepted for publication at SIGCOMM'2012 as well; yay!! In this piece of work, we try to allocate machines for datacenter applications with bandwidth and fault-tolerance constraints, which are at odds: allocation for bandwidth tries to put machines closer, whereas a fault-tolerant …