= Homework 1 =

 * Task: Starting from hadoop_labs/lab013, turn the program into a reverse index. Running !ReverseIndex should produce output of the form "keyword"\t"file names (comma-separated)".
 * Reference: Run it as described in [wiki: the linked execution method], ignoring periods (\.) and commas (\,), and ignoring case (case.sensitive=false).
 * Reference steps:
{{{
~$ mkdir hw1_input
~$ echo "I like NTU Course." > hw1_input/input1
~$ echo "I like ntu Course, and we enjoy this course." > hw1_input/input2
~$ hadoop fs -put hw1_input hw1_input
~$ echo "\." > pattern.txt && echo "\," >> pattern.txt
~$ hadoop fs -put pattern.txt .
~$ hadoop jar WordCount -Dwordcount.case.sensitive=false hw1_input hw1_out -skip pattern.txt
~$ hadoop fs -cat hw1_out/part-00000
}}}
 * The reference result should be as follows (the format of the file path is not restricted):
{{{
and	input2
course	input1,input2,input2
enjoy	input2
i	input1,input2
like	input1,input2
ntu	input1,input2
this	input2
we	input2
}}}
 * Deadline: Monday, May 2, 2016, 11:59 AM
 * Submission: Email the source code and the report as attachments to jazzwang@hadoopcon.org
   1. One copy of the source code, compressed and named ${StudentID}.zip
   1. One report, named ${StudentID}
 * Hint:
   * Change the (Key, Value) types of the Mapper output and of the Reducer input and output from the original (Text, !IntWritable) to (Text, Text).
 * Bonus (Extra):
   * Add the number of occurrences in each file to the result, so that the reference result becomes:
{{{
and	input2(1)
course	input1(1),input2(2)
enjoy	input2(1)
i	input1(1),input2(1)
like	input1(1),input2(1)
ntu	input1(1),input2(1)
this	input2(1)
we	input2(1)
}}}
 * Grading:
   * Source code for the standard task: 60%
   * Report: 20%
     * The report should include the following items:
       * Cover: your name and student ID
       * Execution result: the output of your program
   * Bonus: 20%
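
The data flow behind the hint (switching to (Text, Text) pairs) can be tried out without a cluster. The following is a minimal plain-Java sketch, not the lab013 code and not a Hadoop job: the class `ReverseIndexSketch` and its methods `map`, `shuffle`, `reducePlain`, and `reduceWithCounts` are hypothetical names introduced only for illustration, and the grouping is simulated in memory. In a real Mapper the file name would typically come from the input split rather than being passed in as an argument, and Hadoop does not guarantee the order in which values reach the Reducer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory simulation of the reverse-index data flow (no Hadoop classes).
public class ReverseIndexSketch {

    // "Map" step: drop '.' and ',', lowercase, and emit one (word, fileName) pair per token.
    public static List<String[]> map(String fileName, String line) {
        List<String[]> pairs = new ArrayList<>();
        String cleaned = line.replace(".", "").replace(",", "").toLowerCase(Locale.ROOT);
        for (String token : cleaned.split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new String[] {token, fileName});
            }
        }
        return pairs;
    }

    // "Shuffle" step: group the emitted file names by word, with words in sorted order.
    public static Map<String, List<String>> shuffle(List<String[]> pairs) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    // Standard "Reduce" step: join the values as received, one entry per occurrence.
    public static String reducePlain(List<String> files) {
        return String.join(",", files);
    }

    // Bonus "Reduce" step: count occurrences per file, e.g. "input1(1),input2(2)".
    public static String reduceWithCounts(List<String> files) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String file : files) {
            counts.merge(file, 1, Integer::sum);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('(').append(e.getValue()).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The two sample inputs from the reference steps.
        List<String[]> pairs = new ArrayList<>();
        pairs.addAll(map("input1", "I like NTU Course."));
        pairs.addAll(map("input2", "I like ntu Course, and we enjoy this course."));

        for (Map.Entry<String, List<String>> e : shuffle(pairs).entrySet()) {
            System.out.println(e.getKey() + "\t" + reducePlain(e.getValue())
                    + "\t" + reduceWithCounts(e.getValue()));
        }
    }
}
```

The plain reducer simply concatenates whatever values arrive, which is why a word that occurs twice in one file lists that file twice. The counting reducer collapses those repeats into a per-file count, which is the only change the bonus task needs on the Reducer side.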