MapReduce is a framework for processing large data sets largely used in cloud computing. MapReduce implementations like Hadoop can tolerate crashes and file corruptions, but there is evidence that general arbitrary faults do occur and can affect the correctness of job executions. Furthermore, many individual cloud outages have been reported, raising concerns about depending on a single cloud. We present a MapReduce runtime that tolerates arbitrary faults and runs in a set of clouds at a reasonable cost in terms of computation and execution time. The main challenge is to avoid sending through the internet the huge amount of data that would normally be exchanged between map and reduce tasks. © 2012 IEEE.
|Original language||English (US)|
|Title of host publication||Proceedings of the IEEE Symposium on Reliable Distributed Systems|
|Number of pages||6|
|State||Published - Dec 1 2012|