Microsoft, "DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language," OSDI, 2008. [PDF]
Google, "FlumeJava: Easy, Efficient Data-Parallel Pipelines," PLDI, 2010. [LINK]
Google, "Dremel: Interactive Analysis of Web-Scale Datasets," VLDB, 2010. [PDF]
Amazon, "Dynamo: Amazon's Highly Available Key-value Store," SOSP, 2007. [PDF]
Google, "Bigtable: A Distributed Storage System for Structured Data," OSDI, 2006. [PDF]
Michael Armbrust, Armando Fox, David A. Patterson, Nick Lanham, Beth Trushkowsky, Jesse Trutna, Haruki Oh, "SCADS: Scale-Independent Storage for Social Computing Applications," CIDR, 2009. [PDF]
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins, "Pig Latin: A Not-So-Foreign Language for Data Processing," SIGMOD, 2008. [PDF]
Facebook Data Team, "Hive: Data Warehousing and Analytics on Hadoop." [LINK]
Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly, "Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks," EuroSys, 2007. [PDF]
Jeffrey Dean, Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," OSDI, 2004. [PDF]
Google, "Megastore: Providing Scalable, Highly Available Storage for Interactive Services," CIDR, 2011. [PDF]
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung, "The Google File System," SOSP, 2003. [PDF]