MapReduce Lesson 1

MapReduce:-

    1) MapReduce is the execution model of the Hadoop framework.

    2) MapReduce is a batch process
       that is divided into two separate phases:

       i) Mapper Phase
       ii) Reducer Phase

    i) Mapper Phase:-

          From the raw input file, the mapper extracts the required output key and output value from each record.
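
The mapper step can be sketched in plain Python. This is only an in-memory illustration, not Hadoop API code: the function name map_line and the word-count task are assumptions chosen for the example.

```python
def map_line(line):
    """Emit one (output key, output value) pair per word in a raw input line."""
    for word in line.lower().split():
        # Key = the word, value = 1 (one occurrence seen).
        yield (word, 1)


# One raw input record goes in, several key/value pairs come out.
pairs = list(map_line("the quick the lazy"))
print(pairs)
```

Note that the mapper does no counting itself. It only decides what the key and value are; the same key may be emitted many times.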

    ii) Reducer Phase:-

       The mapper output is sent as input to the reducer.

       The reducer has two responsibilities:

        a) grouping data based on key

        b) aggregating (summarization).
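
The two reducer responsibilities can be sketched in plain Python as follows. This is an in-memory illustration under assumptions, not Hadoop API code: group_by_key and reduce_group are hypothetical names, and summing is just one possible aggregation.

```python
from collections import defaultdict


def group_by_key(pairs):
    """a) Grouping: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Sort by key so the reducer sees keys in a predictable order.
    return dict(sorted(groups.items()))


def reduce_group(key, values):
    """b) Aggregating: summarize the values of one key (here, a sum)."""
    return (key, sum(values))


mapper_output = [("the", 1), ("quick", 1), ("the", 1), ("lazy", 1)]
grouped = group_by_key(mapper_output)
result = [reduce_group(key, values) for key, values in grouped.items()]
print(result)
```

In Hadoop itself the grouping is carried out by the framework before the reduce function is called, so user code normally only writes the aggregation part.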

    In a distributed system (cluster), mapper and reducer tasks are executed on separate machines (slave nodes).

HDFS input ---- mapper ---- o/p ---- reducer ---- HDFS o/p

    The mapper output is called intermediate data or shuffled data.

The process of sending the mapper output to the reducer is called shuffling.
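
A minimal sketch of shuffling, assuming two reducers and a hash-based rule for choosing the reducer. The names partition and shuffle are hypothetical, and zlib.crc32 is used here only as a stable stand-in hash; Hadoop's default partitioner uses the key's own hash code instead.

```python
import zlib


def partition(key, num_reducers):
    """Pick the reducer for a key; the same key always maps to the same reducer."""
    # crc32 is a stand-in hash (assumption), chosen because it is stable across runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(mapper_output, num_reducers):
    """Send each (key, value) pair to the bucket of its reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in mapper_output:
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets


mapper_output = [("the", 1), ("quick", 1), ("the", 1), ("lazy", 1)]
buckets = shuffle(mapper_output, 2)
for reducer_id, bucket in enumerate(buckets):
    print(reducer_id, bucket)
```

Because the reducer is chosen from the key alone, every pair with the same key ends up at the same reducer, which is what makes grouping by key possible.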

Once the reducer output has been produced, the mapper output is deleted.
