
1. MapReduce Overview


Programmers only need to define the Map and Reduce functions; the MapReduce framework (MR) manages and hides all aspects of distribution.

2. Abstract View


Input1 -> Map -> a,1 b,1
Input2 -> Map ->     b,1
Input3 -> Map -> a,1     c,1
                  |   |   |
                  |   |   -> Reduce -> c,1
                  |   -----> Reduce -> b,2
                  ---------> Reduce -> a,2
  1. Input is (already) split into M files
  2. MR calls Map() once for each input file; each call produces a set of intermediate <k2,v2> pairs
  3. When the Maps are done
    1. MR gathers all intermediate v2's for a given k2,
    2. and passes each key and its values to a Reduce call
  4. Final output is the set of <k2,v3> pairs from the Reduce()s
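
The steps above can be sketched as a single-process word count that reproduces the diagram. This is only an illustration of the data flow, not how a real MR system is built: it runs everything in one process with no distribution or fault tolerance, and the names map_fn, reduce_fn, and run_mapreduce are hypothetical, chosen for this example.

```python
from collections import defaultdict


def map_fn(filename, contents):
    """Map(): emit an intermediate (k2, v2) pair of (word, 1) for every word."""
    return [(word, 1) for word in contents.split()]


def reduce_fn(key, values):
    """Reduce(): collapse all v2's for one k2 into a single v3 (the count)."""
    return sum(values)


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Hypothetical single-process driver (assumption: no real distribution)."""
    # Steps 1-2: call Map() once per input file and collect the (k2, v2) pairs.
    intermediate = defaultdict(list)
    for filename, contents in inputs.items():
        for k2, v2 in map_fn(filename, contents):
            # Step 3.1: gather all v2's that share the same k2.
            intermediate[k2].append(v2)
    # Steps 3.2 and 4: one Reduce() call per key; output is the set of (k2, v3).
    return {k2: reduce_fn(k2, v2s) for k2, v2s in intermediate.items()}


# Inputs chosen to match the diagram: Input1 has a and b, Input2 has b,
# Input3 has a and c.
inputs = {"Input1": "a b", "Input2": "b", "Input3": "a c"}
result = run_mapreduce(inputs, map_fn, reduce_fn)
print(result)
```

In this sketch the grouping step is an in-memory dictionary; in a distributed setting that step is where intermediate data moves between the machines running Maps and those running Reduces.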

3. Details


3.1 Hidden Details