Changes between Initial Version and Version 1 of III120825/Lab8


Timestamp: Aug 23, 2012, 11:57:56 PM (12 years ago)
Author: jazz

[[PageOutline]]

◢ <[wiki:III120825/Lab7 Lab 7]> | <[wiki:III120825 Back to Course Outline]> ▲ | <[wiki:III120825/Lab9 Lab 9]> ◣

= Lab 8 =

{{{
#!html
<div style="text-align: center;"><big style="font-weight: bold;"><big>Compiling a Hadoop MapReduce Java Program on a Hadoop Cluster (Fully Distributed Mode)</big></big></div>
}}}

{{{
#!text
For the following exercises, connect to hadoop.nchc.org.tw. In the examples below, hXXXX stands for your user name.
}}}

= Practice 1: Word Count (Basic) =

 * Upload the input data to HDFS
{{{
# Create two small input files locally
$ mkdir lab8_input
$ echo "I like NCTU Cloud Course." > lab8_input/input1
$ echo "I like nctu Cloud Course, and we enjoy this course." > lab8_input/input2
# Copy the directory to HDFS and list it to confirm the upload
$ hadoop fs -put lab8_input lab8_input
$ hadoop fs -ls lab8_input
Found 2 items
-rw-r--r--   2 hXXXX supergroup         26 2011-04-19 10:07 /user/hXXXX/lab8_input/input1
-rw-r--r--   2 hXXXX supergroup         52 2011-04-19 10:07 /user/hXXXX/lab8_input/input2
}}}

 * Download [http://hadoop.nchc.org.tw/WordCount.java WordCount.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount.java
}}}

 * Compile WordCount.java and run it with the '''hadoop jar''' command

{{{
# Link the Hadoop core jar so it can serve as the compile classpath
$ mkdir MyJava
$ ln -s /usr/lib/hadoop/hadoop-*-core.jar hadoop-core.jar
# Compile into MyJava and package the classes as wordcount.jar
$ javac -classpath hadoop-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
# Run the job (input directory first, then output directory) and print the result
$ hadoop jar wordcount.jar WordCount lab8_input/ lab8_out1/
$ hadoop fs -cat lab8_out1/part-00000
}}}

 * Contents of lab8_out1. You should see results like this:
{{{
#!text
Cloud   2
Course, 1
Course. 1
I       2
NCTU    1
and     1
course. 1
enjoy   1
like    2
nctu    1
this    1
we      1
}}}
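
The counts above can be cross-checked locally without Hadoop. The sketch below assumes the job splits each line on whitespace only, which is what the output suggests (`Course,` and `Course.` are counted as different words). The scratch directory and the `local_wordcount` function are illustrative names and not part of the lab.

```shell
#!/bin/sh
# Local cross-check of the Practice 1 counts; no Hadoop involved.
# Assumption: words are split on whitespace only, as the job output suggests.
workdir=$(mktemp -d)   # scratch directory (illustrative, not part of the lab)
echo "I like NCTU Cloud Course." > "$workdir/input1"
echo "I like nctu Cloud Course, and we enjoy this course." > "$workdir/input2"

local_wordcount() {
    # Put one word per line, sort in byte order (uppercase before lowercase),
    # count duplicates, then print "word<TAB>count" like part-00000.
    cat "$@" | tr -s ' ' '\n' \
        | LC_ALL=C sort | LC_ALL=C uniq -c | awk '{ print $2 "\t" $1 }'
}

result=$(local_wordcount "$workdir/input1" "$workdir/input2")
echo "$result"
rm -rf "$workdir"
```

Because punctuation stays attached to the words, `Course,`, `Course.` and `course.` each get their own line, which is the behavior Practice 2 refines.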
-----

= Practice 2: Word Count (Advanced) =

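{{{
# List the patterns to skip (a period and a comma), one per line
$ echo "\." >pattern.txt && echo "\," >>pattern.txt
# Upload the pattern file to your HDFS home directory
$ hadoop fs -put pattern.txt .
$ mkdir -p MyJava2
}}}
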
 * Download [http://hadoop.nchc.org.tw/WordCount2.java WordCount2.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount2.java
}}}

{{{
# Compile and package WordCount2, then run it while skipping the patterns in pattern.txt
$ javac -classpath hadoop-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ hadoop jar wordcount2.jar WordCount2 lab8_input lab8_out2 -skip pattern.txt
$ hadoop fs -cat lab8_out2/part-00000
}}}

 * Contents of lab8_out2. You should see results like this:
{{{
#!text
Cloud   2
Course  2
I       2
NCTU    1
and     1
course  1
enjoy   1
like    2
nctu    1
this    1
we      1
}}}
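
The effect of `-skip pattern.txt` can be reproduced locally as a sanity check. This is a minimal sketch that assumes the skip patterns are deleted from each line before it is split on whitespace, which is consistent with the output above. The `sed` expressions stand in for the two lines of pattern.txt, and `local_wordcount_skip` is an illustrative name.

```shell
#!/bin/sh
# Local cross-check of the Practice 2 counts; no Hadoop involved.
# Assumption: skip patterns are removed from each line before splitting on whitespace.
workdir=$(mktemp -d)   # scratch directory (illustrative, not part of the lab)
echo "I like NCTU Cloud Course." > "$workdir/input1"
echo "I like nctu Cloud Course, and we enjoy this course." > "$workdir/input2"

local_wordcount_skip() {
    # The two sed expressions mirror the two lines of pattern.txt (period, comma).
    cat "$@" | sed -e 's/\.//g' -e 's/,//g' | tr -s ' ' '\n' \
        | LC_ALL=C sort | LC_ALL=C uniq -c | awk '{ print $2 "\t" $1 }'
}

result=$(local_wordcount_skip "$workdir/input1" "$workdir/input2")
echo "$result"
rm -rf "$workdir"
```

With the punctuation gone, `Course,` and `Course.` collapse into a single `Course` entry, while `course` stays separate because matching is still case sensitive.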

 * Now run the same example with case-insensitive counting, still skipping the patterns
{{{
$ hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab8_input lab8_out3 -skip pattern.txt
$ hadoop fs -cat lab8_out3/part-00000
}}}

 * Contents of lab8_out3. You should see results like this:
{{{
#!text
and     1
cloud   2
course  3
enjoy   1
i       2
like    2
nctu    2
this    1
we      1
}}}
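
The case-insensitive run can also be checked locally. The sketch below assumes that `-Dwordcount.case.sensitive=false` lowercases each line before the period and comma are removed and the words are counted, which matches the all-lowercase output above. The function name `local_wordcount_nocase` is illustrative.

```shell
#!/bin/sh
# Local cross-check of the case-insensitive counts; no Hadoop involved.
# Assumption: with case sensitivity off, input is lowercased before counting.
workdir=$(mktemp -d)   # scratch directory (illustrative, not part of the lab)
echo "I like NCTU Cloud Course." > "$workdir/input1"
echo "I like nctu Cloud Course, and we enjoy this course." > "$workdir/input2"

local_wordcount_nocase() {
    # Lowercase everything, drop periods and commas, then count words.
    cat "$@" | tr '[:upper:]' '[:lower:]' | sed -e 's/\.//g' -e 's/,//g' \
        | tr -s ' ' '\n' \
        | LC_ALL=C sort | LC_ALL=C uniq -c | awk '{ print $2 "\t" $1 }'
}

result=$(local_wordcount_nocase "$workdir/input1" "$workdir/input2")
echo "$result"
rm -rf "$workdir"
```

Lowercasing merges `NCTU` with `nctu` and `Course` with `course`, so the list shrinks from eleven entries to nine.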