{{{ #!html
Hadoop Streaming
}}}
[[PageOutline]]

Hadoop Streaming is a utility that ships with Hadoop. It lets users create and run a special kind of map/reduce job in which any executable or script acts as the mapper or the reducer.

Usage:
{{{
$ bin/hadoop jar contrib/streaming/hadoop-0.18.3-streaming.jar \
    -input $INPUT -output $OUTPUT -mapper $MAPPER -reducer $REDUCER
}}}

Command breakdown:
|| bin/hadoop || Invokes the hadoop program ||
|| jar contrib/streaming/hadoop-0.18.3-streaming.jar || Runs the streaming jar ||
|| -input $INPUT || Sets the input directory on HDFS ||
|| -output $OUTPUT || Sets the output directory on HDFS ||
|| -mapper $MAPPER || Sets the mapper program ||
|| -reducer $REDUCER || Sets the reducer program ||

= MapReduce with shell commands =

This example uses cat as the mapper and wc as the reducer.

 * Run the job:
{{{
$ bin/hadoop jar contrib/streaming/hadoop-0.18.3-streaming.jar \
    -input lab3_input -output stream-out1 -mapper /bin/cat -reducer /usr/bin/wc
}}}
 * View the result:
{{{
$ bin/hadoop fs -cat stream-out1/part-00000
}}}
|| Lines || Words || Characters ||
|| 1528 || 4612 || 48644 ||

= MapReduce with PHP =

 * [http://www.hadoop.tw/2008/09/php-hadoop.html 用 "單機" 跟 "PHP" 開發 Hadoop 程式] (Developing Hadoop programs on a "single machine" with "PHP") from Hadoop Taiwan User Group

= MapReduce with Python =

 * [http://www.cs.brandeis.edu/~cs147a/lab/hadoop-example/ Hadoop Example Program] from Brandeis University
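
As a complement to the link above, here is a minimal word-count sketch showing what a streaming mapper and reducer could look like in Python. It is not taken from the linked page. The file name `wordcount.py`, the function names `map_lines` and `reduce_lines`, and the `map`/`reduce` command-line argument are all assumptions made for illustration. The sketch relies only on the streaming convention that each record is a line of the form key, tab, value, and that the reducer receives its input sorted by key.

```python
#!/usr/bin/env python
"""Word count for Hadoop Streaming: one script, two roles.

Hypothetical file name: wordcount.py (not from the original page).
"""
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word


def reduce_lines(lines):
    """Reducer: sum the counts of each word.

    Assumes the input is already sorted by key, which is what the
    framework delivers between the map and reduce phases.
    """
    # Skip lines without a tab so malformed records do not crash the job.
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if "\t" in line)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


# Run only when a role ("map" or "reduce") is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The script can be checked locally without a cluster by emulating the shuffle with sort: `cat input.txt | python wordcount.py map | sort | python wordcount.py reduce`, where `input.txt` is any text file. To run it as a streaming job, the script would presumably be shipped to the task nodes with the streaming `-file` option and named in `-mapper` and `-reducer`; that invocation is untested here.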