- Timestamp: Aug 6, 2010, 11:29:59 AM
- Author: jazz
- Comment: --
v2 | v3 | |
6 | 6 | [[PageOutline]] |
7 | 7 | |
8 | | = 1 Hadoop Computing Command: grep = |
| 8 | = Sample 1: grep = |
9 | 9 | |
10 | | * The grep command extracts specific characters from documents. In the Hadoop examples, it extracts the strings in the documents that contain the specified text and counts them. |
| 10 | * grep is a command that extracts specific text from documents. In the Hadoop examples, it extracts the strings that match a regular expression and counts the matches. |
11 | 11 | |
12 | 12 | {{{ |
… |
… |
|
18 | 18 | }}} |
19 | 19 | |
20 | | The running output looks like this: |
| 20 | You should see output like this: |
21 | 21 | |
22 | 22 | {{{ |
… |
… |
|
52 | 52 | |
53 | 53 | |
54 | | * Next, check the result |
| 54 | * Next, check the computed '''grep''' result stored in HDFS: |
55 | 55 | |
56 | 56 | {{{ |
… |
… |
|
59 | 59 | }}} |
60 | 60 | |
61 | | The result is as follows: |
| 61 | You should see results like this: |
62 | 62 | |
63 | 63 | {{{ |
… |
… |
|
114 | 114 | }}} |
115 | 115 | |
116 | | = 2 Hadoop Computing Command: WordCount = |
| 116 | = Sample 2: WordCount = |
117 | 117 | |
118 | | * As the name suggests, WordCount counts every word and sorts the results from a to z. |
| 118 | * The WordCount example counts each word that appears in the documents and sorts the results from a to z. |
119 | 119 | |
120 | 120 | {{{ |
… |
… |
|
122 | 122 | }}} |
123 | 123 | |
124 | | Check the output the same way as before. |
| 124 | Check the computed '''wordcount''' result stored in HDFS: |
125 | 125 | |
126 | 126 | {{{ |
… |
… |
|
129 | 129 | }}} |
130 | 130 | |
131 | | = 3. Browsing Information via the Web GUI = |
| 131 | = Browsing MapReduce and HDFS via Web GUI = |
132 | 132 | |
133 | | * [http://localhost:50030 Check job status through the Map/Reduce Admin] |
| 133 | * [http://localhost:50030 JobTracker Web Interface] |
134 | 134 | |
135 | | * [http://localhost:50070 View the computed results through the NameNode] |
| 135 | * [http://localhost:50070 NameNode Web Interface] |
136 | 136 | |
137 | | = 4. More Computing Commands = |
| 137 | = More Examples = |
138 | 138 | |
139 | | List of runnable commands: |
| 139 | Here is a list of the Hadoop examples: |
140 | 140 | |
141 | 141 | || aggregatewordcount || An Aggregate based map/reduce program that counts the words in the input files. || |
… |
… |
|
153 | 153 | || wordcount || A map/reduce program that counts the words in the input files. || |
154 | 154 | |
155 | | Please refer to [http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/examples/package-summary.html org.apache.hadoop.examples] |
| 155 | You can find more details at [http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/examples/package-summary.html org.apache.hadoop.examples] |
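
To make the two samples on this page more concrete, here is a minimal local sketch in Python of what they compute: grep counts every string that matches a regular expression, and WordCount counts every whitespace-separated word and lists the words in sorted order. This is not the Hadoop implementation and does not run on HDFS. The function names `grep_count` and `word_count`, the sample lines, and the pattern `dfs[a-z.]+` are assumptions made up for illustration, and the ordering of the grep output (most frequent match first, ties broken alphabetically) is also an assumption.

```python
import re
from collections import Counter


def grep_count(lines, pattern):
    """Count each substring matching `pattern` across all lines.

    Hypothetical local stand-in for the grep sample: every match is
    tallied, then the tallies are listed with the most frequent first.
    Ties are broken alphabetically (an assumption of this sketch).
    """
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        for match in regex.finditer(line):
            counts[match.group(0)] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def word_count(lines):
    """Count whitespace-separated words and return them sorted by word.

    Hypothetical local stand-in for the WordCount sample. Python sorts
    strings by code point, so uppercase words come before lowercase ones.
    """
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return sorted(counts.items())


if __name__ == "__main__":
    # Made-up sample input; a real job would read its input from HDFS.
    sample = ["dfs.replication dfs.name.dir", "dfs.replication"]
    for text, count in grep_count(sample, r"dfs[a-z.]+"):
        print(count, text)
    for word, count in word_count(sample):
        print(word, count)
```

Running the same logic on a small local file is a quick way to sanity-check what the Hadoop job should report before comparing it with the result stored in HDFS.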