[[PageOutline]]
= Lab 8 =
{{{
#!html
Compiling a Hadoop MapReduce Java Program on a Fully Distributed Hadoop Cluster
}}}
{{{
#!text
For the following exercises, connect to hadoop.classcloud.org. In the examples below, hXXXX stands for your user name.
}}}
 * Alternatively, connect to https://hadoop.classcloud.org and work through the web-based shell.
= Practice 1 : Word Count (Basic) =
 * Create two small input files and upload them to HDFS
{{{
$ mkdir lab8_input
$ echo "I like NCTU Cloud Course." > lab8_input/input1
$ echo "I like nctu Cloud Course, and we enjoy this course." > lab8_input/input2
$ hadoop fs -put lab8_input lab8_input
$ hadoop fs -ls lab8_input
Found 2 items
-rw-r--r--   2 hXXXX supergroup         26 2011-04-19 10:07 /user/hXXXX/lab8_input/input1
-rw-r--r--   2 hXXXX supergroup         52 2011-04-19 10:07 /user/hXXXX/lab8_input/input2
}}}
 * Download [http://hadoop.nchc.org.tw/WordCount.java WordCount.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount.java
}}}
 * Compile WordCount.java, package it as a jar, and run it with the '''hadoop jar''' command
{{{
$ mkdir MyJava
$ ln -s /usr/lib/hadoop/hadoop-*-core.jar hadoop-core.jar
$ javac -classpath hadoop-core.jar -d MyJava WordCount.java
$ jar -cvf wordcount.jar -C MyJava .
$ hadoop jar wordcount.jar WordCount lab8_input/ lab8_out1/
$ hadoop fs -cat lab8_out1/part-00000
}}}
 * The output in lab8_out1 should look like this:
{{{
#!text
Cloud	2
Course,	1
Course.	1
I	2
NCTU	1
and	1
course.	1
enjoy	1
like	2
nctu	1
this	1
we	1
}}}
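The result above counts `Course,` and `Course.` as different words, which is what splitting on whitespace alone would produce. The following is a minimal local sketch of that behavior, assuming WordCount.java tokenizes the way the standard Hadoop example does (its source is not shown here). The class name `LocalWordCount` is hypothetical, no Hadoop classes are involved, and Java 8 or later is assumed.

```java
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Local illustration only: this is not the WordCount.java used in the lab.
final class LocalWordCount {

    // Count whitespace-separated tokens across all lines.
    // A TreeMap keeps the keys sorted, like the sorted keys in the job output.
    static Map<String, Integer> count(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            // StringTokenizer splits on whitespace only, so punctuation stays attached.
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(
            "I like NCTU Cloud Course.",
            "I like nctu Cloud Course, and we enjoy this course.");
        // Print one "word<TAB>count" pair per line.
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Because punctuation and letter case are left untouched, `Course,`, `Course.`, and `course.` end up as three separate keys.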
-----
= Practice 2 : Word Count (Advanced) =
 * Create a pattern file listing the punctuation to skip (a period and a comma), upload it to HDFS, and create a build directory
{{{
$ echo "\." >pattern.txt && echo "\," >>pattern.txt
$ hadoop fs -put pattern.txt .
$ mkdir -p MyJava2
}}}
 * Download [http://hadoop.nchc.org.tw/WordCount2.java WordCount2.java] and save it to your home directory
{{{
~$ wget http://hadoop.nchc.org.tw/WordCount2.java
}}}
 * Compile WordCount2.java, package it as a jar, and run it with the '''-skip''' option pointing at pattern.txt
{{{
$ javac -classpath hadoop-core.jar -d MyJava2 WordCount2.java
$ jar -cvf wordcount2.jar -C MyJava2 .
$ hadoop jar wordcount2.jar WordCount2 lab8_input lab8_out2 -skip pattern.txt
$ hadoop fs -cat lab8_out2/part-00000
}}}
 * The output in lab8_out2 should look like this:
{{{
#!text
Cloud	2
Course	2
I	2
NCTU	1
and	1
course	1
enjoy	1
like	2
nctu	1
this	1
we	1
}}}
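With the skip patterns applied, `Course,` and `Course.` collapse into a single `Course` entry. A rough local sketch of that step follows, assuming each line of pattern.txt is treated as a regular expression whose matches are deleted before tokenizing (as in the Hadoop tutorial's WordCount v2.0; the downloaded WordCount2.java may differ). The class name `LocalSkipPatterns` is hypothetical and Java 8 or later is assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Local illustration only: this is not the WordCount2.java used in the lab.
final class LocalSkipPatterns {

    // Delete every match of each skip pattern, then count whitespace-separated tokens.
    static Map<String, Integer> count(List<String> skipPatterns, String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String raw : lines) {
            String line = raw;
            for (String pattern : skipPatterns) {
                line = line.replaceAll(pattern, ""); // each pattern is a regex
            }
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Same two regexes that the lab writes into pattern.txt: \. and \,
        List<String> patterns = Arrays.asList("\\.", "\\,");
        Map<String, Integer> counts = count(patterns,
            "I like NCTU Cloud Course.",
            "I like nctu Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

The backslashes matter: an unescaped `.` is a regex wildcard and would delete every character on the line.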
 * Now run the same job with case-insensitive matching in addition to the skip patterns
{{{
$ hadoop jar wordcount2.jar WordCount2 -Dwordcount.case.sensitive=false lab8_input lab8_out3 -skip pattern.txt
$ hadoop fs -cat lab8_out3/part-00000
}}}
 * The output in lab8_out3 should look like this:
{{{
#!text
and	1
cloud	2
course	3
enjoy	1
i	2
like	2
nctu	2
this	1
we	1
}}}
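Setting `wordcount.case.sensitive=false` merges `NCTU` with `nctu` and `Course` with `course`. One way to picture this is lower-casing each line before the skip patterns are applied, sketched locally below. This assumes WordCount2.java behaves like the Hadoop tutorial's WordCount v2.0; the class name `LocalCaseInsensitiveCount` and its `caseSensitive` parameter are hypothetical stand-ins for the job's configuration property, and Java 8 or later is assumed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Local illustration only: this is not the WordCount2.java used in the lab.
final class LocalCaseInsensitiveCount {

    // Optionally lower-case each line, strip the skip patterns, then count tokens.
    static Map<String, Integer> count(boolean caseSensitive,
                                      List<String> skipPatterns,
                                      String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String raw : lines) {
            String line = caseSensitive ? raw : raw.toLowerCase(Locale.ROOT);
            for (String pattern : skipPatterns) {
                line = line.replaceAll(pattern, "");
            }
            StringTokenizer tokens = new StringTokenizer(line);
            while (tokens.hasMoreTokens()) {
                counts.merge(tokens.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(false, Arrays.asList("\\.", "\\,"),
            "I like NCTU Cloud Course.",
            "I like nctu Cloud Course, and we enjoy this course.");
        counts.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Passing `true` for `caseSensitive` gives the same counts as the lab8_out2 run.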