[[PageOutline]]

= Lab 3 =

{{{
#!html
<div style="text-align: center;"><big style="font-weight: bold;"><big>HDFS Local Mode in Practice</big></big></div>
}}}

== 0. Start Hadoop4Win ==

 * STEP 1: From the Start menu, open the following shortcuts in order.
   * [[BR]][[Image(Hadoop4Win:hadoop4win-installer_11.jpg)]]
 * STEP 2: First, click start-hadoop to start the Hadoop services (they run in a separate CMD window).
   * '''Note''': Startup is complete only once you see "Safe Mode is OFF".
   * [[BR]][[Image(Hadoop4Win:hadoop4win_29.jpg,width=800)]]
 * STEP 3: Next, click NameNode Web UI to open http://localhost:50070 in a browser and confirm that the NameNode is up. The page should look like this:
   * '''Note''': There must be one Live Node for the setup to count as healthy.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_10.jpg,width=800)]]
 * STEP 4: Then click JobTracker Web UI to open http://localhost:50030 in a browser and confirm that the JobTracker is up. The page should look like this:
   * '''Note''': The state must be RUNNING.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_11.jpg,width=800)]]
 * STEP 5: Finally, click hadoop4win to open the hadoop4win Cygwin window, where you will type the commands in the rest of this lab.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_20.jpg,width=800)]]

== 1. HDFS Command Practice ==

=== 1.1 Browse your HDFS directory ===

 * First, use hadoop fs -ls to list your HDFS home directory.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 1 items
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp
}}}

=== 1.2 Upload data to HDFS ===

 * Next, let's practice uploading data to HDFS. We use /opt/hadoop/conf as the source directory and /user/${username}/input as the destination directory.
 * '''Note''': The Windows build of Hadoop runs inside Cygwin, but Cygwin paths are virtual and the JRE (Java Runtime Environment) only understands Windows paths. If you see an error like the one below, use cygpath -w to convert the Cygwin path to a Windows path.
{{{
Jazz@human ~
$ hadoop fs -put /opt/hadoop/conf input
put: File /opt/hadoop/conf does not exist.
Jazz@human ~
$ hadoop fs -put $(cygpath -w /opt/hadoop/conf) input
}}}

 * Use hadoop fs -ls to check the files you just uploaded.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 11:45 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp

Jazz@human ~
$ hadoop fs -ls input
Found 13 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
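The cygpath conversion mentioned in this section can be wrapped in a small helper so that the same upload command works both inside and outside Cygwin. This is only a sketch: the function name to_java_path is hypothetical (it is not part of Hadoop4Win), and it assumes cygpath is on the PATH only when the shell is running under Cygwin.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop4Win): convert a local path into a
# form the JRE understands. Under Cygwin this is a Windows path; on other
# systems, where cygpath is not installed, the path is returned unchanged.
to_java_path() {
  if command -v cygpath >/dev/null 2>&1; then
    cygpath -w "$1"
  else
    printf '%s\n' "$1"
  fi
}

# Example usage (needs a running Hadoop, so it is left commented out):
#   hadoop fs -put "$(to_java_path /opt/hadoop/conf)" input
```

Quoting the command substitution keeps the upload working even if the converted Windows path contains spaces.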

=== 1.3 Download data from HDFS to a local directory ===

 * Next, let's practice downloading data from HDFS to a local directory.
{{{
Jazz@human ~
$ hadoop fs -get input fromHDFS
}}}

 * Use diff to check that the downloaded content matches what you uploaded. If diff prints nothing, the two directories are identical.
{{{
Jazz@human ~
$ diff -Naur fromHDFS/ /opt/hadoop/conf
}}}
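The diff check above can also be used in a script, because diff exits with status 0 when it finds no differences and with a non-zero status otherwise. The sketch below uses the same diff flags as the lab; the function name same_tree is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if two local directory trees have
# identical contents. The diff output is discarded; only the exit status is used.
same_tree() {
  diff -Naur "$1" "$2" >/dev/null
}

# Example usage after "hadoop fs -get input fromHDFS":
#   if same_tree fromHDFS /opt/hadoop/conf; then
#     echo "round trip OK"
#   else
#     echo "downloaded copy differs from the source"
#   fi
```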

=== 1.4 Delete files on HDFS ===

 * Use hadoop fs -rm to delete a single file on HDFS.
{{{
Jazz@human ~
$ hadoop fs -rm input/masters
Deleted hdfs://localhost:9000/user/Jazz/input/masters
}}}
 * To delete a directory, use hadoop fs -rmr instead.
{{{
Jazz@human ~
$ hadoop fs -rmr tmp
Deleted hdfs://localhost:9000/user/Jazz/tmp
}}}

=== 1.5 Dump the contents of a file on HDFS ===

 * If you only want to look at a file on HDFS, use hadoop fs -cat to dump its contents.
{{{
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
}}}

=== 1.6 More HDFS commands ===

 * Run hadoop fs without arguments to list every command that the HDFS shell supports:
{{{
Jazz@human ~
$ hadoop fs
Usage: java FsShell
[-ls <path>]
[-lsr <path>]
[-du <path>]
[-dus <path>]
[-count[-q] <path>]
[-mv <src> <dst>]
[-cp <src> <dst>]
[-rm [-skipTrash] <path>]
[-rmr [-skipTrash] <path>]
[-expunge]
[-put <localsrc> ... <dst>]
[-copyFromLocal <localsrc> ... <dst>]
[-moveFromLocal <localsrc> ... <dst>]
[-get [-ignoreCrc] [-crc] <src> <localdst>]
[-getmerge <src> <localdst> [addnl]]
[-cat <src>]
[-text <src>]
[-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
[-moveToLocal [-crc] <src> <localdst>]
[-mkdir <path>]
[-setrep [-R] [-w] <rep> <path/file>]
[-touchz <path>]
[-test -[ezd] <path>]
[-stat [format] <path>]
[-tail [-f] <file>]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-chgrp [-R] GROUP PATH...]
[-help [cmd]]

Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
}}}

== 2. Browse HDFS through the web interface ==

 * You can also open the [http://localhost:50070 NameNode] page to look up the files you just uploaded, along with their Block Size, File Size, Block Location, and Rack Location.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_30.jpg,width=800)]]
   * [[BR]][[Image(Hadoop4Win:hadoop4win_31.jpg,width=800)]]
   * [[BR]][[Image(Hadoop4Win:hadoop4win_32.jpg,width=800)]]

== 3. More HDFS shell usage ==

=== -ls ===

 * By default, -ls operates under /user/${username}/. In other words, a path such as input is a relative path that is resolved against /user/${username}.
{{{
Jazz@human ~
$ hadoop fs -ls input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
 * You can also give a full path, in the form '''hdfs://node:port/path'''.
{{{
Jazz@human ~
$ hadoop fs -ls hdfs://localhost:9000/user/${USER}/input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
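To make the relationship between relative paths and full paths concrete, the sketch below prints the full URI that a given path refers to. It rests on assumptions taken from the output in this lab: the NameNode is hdfs://localhost:9000 and the HDFS home directory is /user/<user name>. The function name hdfs_full_uri is hypothetical; Hadoop performs this resolution internally.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show the full URI that an HDFS path refers to.
# Assumptions: the default NameNode is hdfs://localhost:9000 and the HDFS
# home directory is /user/<user name>, as seen in the listings of this lab.
hdfs_full_uri() {
  local path="$1"
  local user="${2:-$USER}"
  local namenode="${3:-hdfs://localhost:9000}"
  case "$path" in
    hdfs://*) printf '%s\n' "$path" ;;                                 # already a full URI
    /*)       printf '%s%s\n' "$namenode" "$path" ;;                   # absolute path
    *)        printf '%s/user/%s/%s\n' "$namenode" "$user" "$path" ;;  # relative path
  esac
}

# Example usage:
#   hdfs_full_uri input Jazz
#   hdfs_full_uri /user/Jazz/input
```

Both example calls print hdfs://localhost:9000/user/Jazz/input, which is why the two -ls commands above list the same files.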

=== -cat ===

 * Writes the contents of the file at the given path to standard output (STDOUT).
{{{
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
}}}

=== -chgrp ===

 * Changes the group that a file belongs to.
{{{
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves

Jazz@human ~
$ hadoop fs -chgrp ${USERNAME} input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}

=== -chmod ===

 * Changes the permissions of a file.
 * Note that after -chmod 700 the listing below shows -rw------- rather than -rwx------. The Hadoop version used here appears to drop the execute bits on regular files.
{{{
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves

Jazz@human ~
$ hadoop fs -chmod 700 input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}
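The permission string printed by -ls maps digit by digit onto the octal mode passed to -chmod: 4 is read, 2 is write, 1 is execute. The sketch below reproduces that mapping. It also models one assumption that is inferred from the listing above and not stated anywhere in this lab: that this Hadoop version does not keep execute bits on regular files, which would explain why -chmod 700 shows up as -rw-------. The function name hdfs_ls_mode is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a three-digit octal mode into the permission
# string shown by "hadoop fs -ls". The second argument is "file" (default) or "dir".
hdfs_ls_mode() {
  local mode="$1" kind="${2:-file}" out digit i
  if [ "$kind" = "dir" ]; then out="d"; else out="-"; fi
  for (( i = 0; i < ${#mode}; i++ )); do
    digit="${mode:i:1}"
    # Assumption (inferred from the lab output): execute bits are dropped
    # on regular files, so only the read (4) and write (2) bits survive.
    if [ "$kind" != "dir" ]; then digit=$(( digit & 6 )); fi
    if (( digit & 4 )); then out+="r"; else out+="-"; fi
    if (( digit & 2 )); then out+="w"; else out+="-"; fi
    if (( digit & 1 )); then out+="x"; else out+="-"; fi
  done
  printf '%s\n' "$out"
}

# Example usage:
#   hdfs_ls_mode 700 file
#   hdfs_ls_mode 755 dir
```

If your Hadoop version does keep execute bits on files, remove the masking line and the function becomes a plain octal-to-rwx conversion.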

=== -chown ===

 * Changes the owner of a file.
{{{
Jazz@human ~
$ hadoop fs -chown hadoop input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}

=== -copyFromLocal, -put ===

 * Uploads files from the local machine to HDFS.
{{{
Jazz@human ~
$ hadoop fs -put fromHDFS dfs_input

Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
}}}

=== -copyToLocal, -get ===

 * Downloads files from HDFS to the local machine.
{{{
Jazz@human ~
$ hadoop fs -get dfs_input input1
}}}

=== -cp ===

 * Copies files from a source path on HDFS to a destination path on HDFS.
{{{
Jazz@human ~
$ hadoop fs -cp dfs_input input1

Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
}}}

=== -du ===

 * Shows the size of every file in a directory.
{{{
Jazz@human ~
$ hadoop fs -du input
Found 12 items
3936 hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
535 hdfs://localhost:9000/user/Jazz/input/configuration.xsl
326 hdfs://localhost:9000/user/Jazz/input/core-site.xml
2409 hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
1245 hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
4190 hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
196 hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
2815 hdfs://localhost:9000/user/Jazz/input/log4j.properties
212 hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
10 hdfs://localhost:9000/user/Jazz/input/slaves
1243 hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
1195 hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
}}}

=== -dus ===

 * Shows the total size of a directory or file.
{{{
Jazz@human ~
$ hadoop fs -dus input
hdfs://localhost:9000/user/Jazz/input 18312
}}}
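The total reported by -dus is the sum of the per-file sizes reported by -du. As a cross-check, the sketch below adds up the first column of -du output read from standard input. The function name sum_du is hypothetical, and the sketch assumes the output format shown in this lab (size first, then the URI).

```shell
#!/usr/bin/env bash
# Hypothetical helper: add up the size column of "hadoop fs -du" output read
# from standard input. Lines whose first field is not a number, such as the
# "Found 12 items" header, are skipped.
sum_du() {
  awk '$1 ~ /^[0-9]+$/ { total += $1 } END { print total + 0 }'
}

# Example usage (needs a running Hadoop, so it is left commented out):
#   hadoop fs -du input | sum_du
```

For the twelve files listed under -du in this lab, the sum is 18312, which matches the -dus output.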

=== -expunge ===

 * Empties the trash.
{{{
Jazz@human ~
$ hadoop fs -expunge
}}}

=== -getmerge ===

 * Concatenates all files under the source directory <src> into a single local file <localdst>.
 * Syntax: hadoop fs -getmerge <src> <localdst>
{{{
Jazz@human ~
$ mkdir -p in1

Jazz@human ~
$ echo "this is one; " > in1/input

Jazz@human ~
$ echo "this is two; " > in1/input2

Jazz@human ~
$ hadoop fs -put in1 in1

Jazz@human ~
$ hadoop fs -getmerge in1 merge.txt

Jazz@human ~
$ cat ./merge.txt
this is one;
this is two;
}}}
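To see what -getmerge produces without involving HDFS, the sketch below performs the same kind of concatenation on a local directory. It is only a local analogy: the function name merge_local_dir is hypothetical, and it assumes the files are merged in file-name order, which is what the merge.txt output above suggests.

```shell
#!/usr/bin/env bash
# Hypothetical local analogy of "hadoop fs -getmerge": concatenate every
# regular file directly under a directory into one destination file.
# Assumption: files are taken in the order the shell glob returns them
# (sorted by name).
merge_local_dir() {
  local src="$1" dst="$2" f
  : > "$dst"                      # create or truncate the destination
  for f in "$src"/*; do
    if [ -f "$f" ]; then
      cat "$f" >> "$dst"
    fi
  done
}

# Example usage with the files created in this section:
#   merge_local_dir in1 merge_local.txt
#   cat merge_local.txt
```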

=== -ls ===

 * Lists information about a file or directory.
 * For a file, the columns are: permissions, replica count, owner, group, file size, modification date, modification time, path.
 * For a directory, the columns are the same, except that the replica count is shown as - and the size as 0.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
}}}

=== -lsr ===

 * The recursive version of -ls.
{{{
Jazz@human ~
$ hadoop fs -lsr
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:33 /user/Jazz/dfs_input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:33 /user/Jazz/dfs_input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:33 /user/Jazz/dfs_input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:33 /user/Jazz/dfs_input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:33 /user/Jazz/dfs_input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:33 /user/Jazz/dfs_input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:34 /user/Jazz/input1/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:34 /user/Jazz/input1/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:34 /user/Jazz/input1/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:34 /user/Jazz/input1/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:34 /user/Jazz/input1/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:34 /user/Jazz/input1/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:34 /user/Jazz/input1/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:34 /user/Jazz/input1/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:34 /user/Jazz/input1/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:34 /user/Jazz/input1/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:34 /user/Jazz/input1/ssl-server.xml.example
}}}

=== -mkdir ===

 * Creates a directory.
{{{
Jazz@human ~
$ hadoop fs -mkdir tmp
Jazz@human ~
$ hadoop fs -ls
Found 5 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}

=== -moveFromLocal ===

 * Moves a local directory to HDFS, removing the local copy.
{{{
Jazz@human ~
$ hadoop fs -moveFromLocal in1 in2
Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}

=== -mv ===

 * Renames (moves) a file or directory on HDFS.
{{{
Jazz@human ~
$ hadoop fs -mv in2 in3

Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in3
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}

=== -rm ===

 * Deletes the specified file (it cannot be a directory).
{{{
Jazz@human ~
$ hadoop fs -rm in1/input
Deleted hdfs://localhost:9000/user/Jazz/in1/input
}}}

=== -rmr ===

 * Recursively deletes directories and everything inside them. Several directories can be given at once.
{{{
Jazz@human ~
$ hadoop fs -rmr dfs_input in1 in3 input1
Deleted hdfs://localhost:9000/user/Jazz/dfs_input
Deleted hdfs://localhost:9000/user/Jazz/in1
Deleted hdfs://localhost:9000/user/Jazz/in3
Deleted hdfs://localhost:9000/user/Jazz/input1
}}}

=== -setrep ===

 * Sets the replication factor.
 * Syntax: hadoop fs -setrep [-R] [-w] <rep> <path/file>
{{{
Jazz@human ~
$ hadoop fs -setrep -w 1 -R input
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/configuration.xsl
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/core-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/log4j.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/slaves
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
Waiting for hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/configuration.xsl ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/core-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/log4j.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/mapred-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/slaves ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example ... done
$ bin/hadoop fs -setrep -w 2 -R input
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done
}}}

=== -stat ===

 * Prints time information (the modification time) for the given path.
{{{
Jazz@human ~
$ hadoop fs -stat input
2011-10-21 04:00:44
}}}

=== -tail ===

 * Prints the last 1 KB of a file.
 * Usage: hadoop fs -tail [-f] <file> (with -f, content appended to the file as it grows is printed as well)
{{{
Jazz@human ~
$ hadoop fs -tail input/log4j.properties
g4j.RollingFileAppender
#log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}

# Logfile size and and 30-day backups
#log4j.appender.RFA.MaxFileSize=1MB
#log4j.appender.RFA.MaxBackupIndex=30

#log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n

#
# FSNamesystem Audit logging
# All audit events are logged at INFO level
#
log4j.logger.org.apache.hadoop.fs.FSNamesystem.audit=WARN

# Custom Logging levels

#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG

# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR

#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
}}}

=== -test ===

 * Tests a path: -e checks whether the path exists, -z checks whether the file has zero length, and -d checks whether the path is a directory. In every case the exit status is 0 when the test is true and 1 when it is false.
 * Use echo $? to see the exit status (0 or 1).
 * Usage: hadoop fs -test -[ezd] URI
{{{
########## -e tests whether the file exists: exit status 0 means true, 1 means false ##########

Jazz@human ~
$ hadoop fs -test -e input/slaves

Jazz@human ~
$ echo $?
0

Jazz@human ~
$ hadoop fs -test -e input/masters

Jazz@human ~
$ echo $?
1

########## -z tests whether the file size is zero: exit status 0 means true, 1 means false ##########

Jazz@human ~
$ hadoop fs -test -z input/slaves

Jazz@human ~
$ echo $?
1

Jazz@human ~
$ hadoop fs -test -z input/masters
test: File does not exist: input/masters

########## -d tests whether the path is a directory: exit status 0 means true, 1 means false ##########

Jazz@human ~
$ hadoop fs -test -d input/slaves

Jazz@human ~
$ echo $?
1

Jazz@human ~
$ hadoop fs -test -d input
Jazz@human ~
$ echo $?
0
}}}
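Because -test reports its result only through the exit status, it is most useful inside shell conditionals. The sketch below combines the three flags to classify a path. The function name hdfs_kind and the HADOOP_CMD variable are hypothetical; HADOOP_CMD exists only so that the launcher can be swapped out, for example to try the function without a running cluster.

```shell
#!/usr/bin/env bash
# Assumption: HADOOP_CMD names the hadoop launcher. It defaults to "hadoop"
# and can be overridden, e.g. with a stub for testing.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Hypothetical helper: classify an HDFS path using only the exit status of
# "hadoop fs -test" (0 means the test is true).
hdfs_kind() {
  local path="$1"
  if ! "$HADOOP_CMD" fs -test -e "$path"; then
    echo "missing"
  elif "$HADOOP_CMD" fs -test -d "$path"; then
    echo "directory"
  elif "$HADOOP_CMD" fs -test -z "$path"; then
    echo "empty file"
  else
    echo "file"
  fi
}

# Example usage (needs a running Hadoop, so it is left commented out):
#   hdfs_kind input
#   hdfs_kind input/slaves
#   hdfs_kind input/masters
```

Checking -e first avoids the "File does not exist" error that -z prints for a missing path, as seen in the transcript above.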

=== -text ===

 * Outputs a file (for example a compressed file or a TextRecordInputStream) as plain text.
 * Syntax: hadoop fs -text <src>
{{{
Jazz@human ~
$ tar zcvf input.tar.gz input1
input1/
input1/capacity-scheduler.xml
input1/configuration.xsl
input1/core-site.xml
input1/hadoop-env.sh
input1/hadoop-metrics.properties
input1/hadoop-policy.xml
input1/hdfs-site.xml
input1/log4j.properties
input1/mapred-site.xml
input1/masters
input1/slaves
input1/ssl-client.xml.example
input1/ssl-server.xml.example
Jazz@human ~
$ hadoop fs -put input.tar.gz .
Jazz@human ~
$ hadoop fs -text input.tar.gz
<omitted>
}}}
 * Note: zip archives are not currently supported, so the raw archive bytes are printed instead.
{{{
Jazz@human ~
$ zip -r input1.zip input1/
updating: input1/ (stored 0%)
adding: input1/capacity-scheduler.xml (deflated 71%)
adding: input1/configuration.xsl (deflated 50%)
adding: input1/core-site.xml (deflated 46%)
adding: input1/hadoop-env.sh (deflated 58%)
adding: input1/hadoop-metrics.properties (deflated 78%)
adding: input1/hadoop-policy.xml (deflated 83%)
adding: input1/hdfs-site.xml (deflated 35%)
adding: input1/log4j.properties (deflated 67%)
adding: input1/mapred-site.xml (deflated 34%)
adding: input1/masters (stored 0%)
adding: input1/slaves (stored 0%)
adding: input1/ssl-client.xml.example (deflated 79%)
adding: input1/ssl-server.xml.example (deflated 78%)
Jazz@human ~
$ hadoop fs -put input1.zip .
Jazz@human ~
$ hadoop fs -text input1.zip
PK
<omitted>
}}}

=== -touchz ===

 * Creates an empty file.
{{{
Jazz@human ~
$ hadoop fs -touchz empty

Jazz@human ~
$ hadoop fs -test -z empty ; echo $?
0
}}}