{{{ #!html
Lab 2: HDFS Shell Practice
}}}
[[PageOutline]]

== Introduction ==

 * This part continues from Lab 1

= Content 1. Basic HDFS Shell Operations =

== 1.1 Browse your HDFS directory ==

{{{
/opt/hadoop$ bin/hadoop fs -ls
}}}

== 1.2 Upload data to your HDFS directory ==

 * Upload
{{{
/opt/hadoop$ bin/hadoop fs -put conf input
}}}
 * Verify
{{{
/opt/hadoop$ bin/hadoop fs -ls
/opt/hadoop$ bin/hadoop fs -ls input
}}}

== 1.3 Download data from HDFS to a local directory ==

 * Download
{{{
/opt/hadoop$ bin/hadoop fs -get input fromHDFS
}}}
 * Verify
{{{
/opt/hadoop$ ls -al | grep fromHDFS
/opt/hadoop$ ls -al fromHDFS
}}}

== 1.4 Delete files ==

{{{
/opt/hadoop$ bin/hadoop fs -ls input
/opt/hadoop$ bin/hadoop fs -rm input/masters
}}}

== 1.5 View a file directly ==

{{{
/opt/hadoop$ bin/hadoop fs -ls input
/opt/hadoop$ bin/hadoop fs -cat input/slaves
}}}

== 1.6 More commands ==

{{{
hadooper@vPro:/opt/hadoop$ bin/hadoop fs

Usage: java FsShell
           [-ls <path>]
           [-lsr <path>]
           [-du <path>]
           [-dus <path>]
           [-count[-q] <path>]
           [-mv <src> <dst>]
           [-cp <src> <dst>]
           [-rm <path>]
           [-rmr <path>]
           [-expunge]
           [-put <localsrc> ... <dst>]
           [-copyFromLocal <localsrc> ... <dst>]
           [-moveFromLocal <localsrc> ... <dst>]
           [-get [-ignoreCrc] [-crc] <src> <localdst>]
           [-getmerge <src> <localdst> [addnl]]
           [-cat <src>]
           [-text <src>]
           [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
           [-moveToLocal [-crc] <src> <localdst>]
           [-mkdir <path>]
           [-setrep [-R] [-w] <rep> <path/file>]
           [-touchz <path>]
           [-test -[ezd] <path>]
           [-stat [format] <path>]
           [-tail [-f] <file>]
           [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
           [-chown [-R] [OWNER][:[GROUP]] PATH...]
           [-chgrp [-R] GROUP PATH...]
           [-help [cmd]]

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
}}}

= Content 2. Browsing Information with the Web GUI =

 * [http://localhost:50030 Map/Reduce Administration]
 * [http://localhost:50070 NameNode]

= Content 3. More HDFS Shell Usage =

 * The general form is '''bin/hadoop fs <args>'''; the usage of each command is listed below
 * By default, these operations work relative to the directory /user/<$username>/
{{{
$ bin/hadoop fs -ls input
Found 4 items
-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt
-rw-r--r--   2 hadooper supergroup     987864 2009-04-02 11:51 /user/hadooper/input/2.txt
-rw-r--r--   2 hadooper supergroup    1573048 2009-04-02 11:51 /user/hadooper/input/3.txt
-rw-r--r--   2 hadooper supergroup   25844527 2009-04-02 11:51 /user/hadooper/input/4.txt
}}}
 * The fully qualified form of a path is '''hdfs://node:port/path''', for example:
{{{
$ bin/hadoop fs -ls hdfs://gm1.nchc.org.tw:9000/user/hadooper/input
Found 4 items
-rw-r--r--   2 hadooper supergroup  115045564 2009-04-02 11:51 /user/hadooper/input/1.txt
-rw-r--r--   2 hadooper supergroup     987864 2009-04-02 11:51 /user/hadooper/input/2.txt
-rw-r--r--   2 hadooper supergroup    1573048 2009-04-02 11:51 /user/hadooper/input/3.txt
-rw-r--r--   2 hadooper supergroup   25844527 2009-04-02 11:51 /user/hadooper/input/4.txt
}}}

== -cat ==

 * Print the contents of the given file to stdout
{{{
$ bin/hadoop fs -cat quota/hadoop-env.sh
}}}

== -chgrp ==

 * Change the group a file belongs to
{{{
$ bin/hadoop fs -chgrp -R hadooper own
}}}

== -chmod ==

 * Change the permissions of a file
{{{
$ bin/hadoop fs -chmod -R 755 own
}}}

== -chown ==

 * Change the owner of a file
{{{
$ bin/hadoop fs -chown -R hadooper own
}}}

== -copyFromLocal, -put ==

 * Copy files from the local filesystem to HDFS
{{{
$ bin/hadoop fs -put input dfs_input
}}}

== -copyToLocal, -get ==

 * Download files from HDFS to the local filesystem
{{{
$ bin/hadoop fs -get dfs_input input1
}}}

== -cp ==

 * Copy a file from one HDFS path to another HDFS path
{{{
$ bin/hadoop fs -cp own hadooper
}}}

== -du ==

 * Show the size of every file in a directory
{{{
$ bin/hadoop fs -du input
Found 4 items
115045564   hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
987864      hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
1573048     hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
25844527    hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
}}}

== -dus ==

 * Show the total size of a directory or file
{{{
$ bin/hadoop fs -dus input
hdfs://gm1.nchc.org.tw:9000/user/hadooper/input 143451003
}}}

== -expunge ==

 * Empty the trash
{{{
$ bin/hadoop fs -expunge
}}}

== -getmerge ==

 * Concatenate all files under the source directory into a single local file
 * bin/hadoop fs -getmerge <src> <localdst>
{{{
$ echo "this is one; " >> in1/input
$ echo "this is two; " >> in1/input2
$ bin/hadoop fs -put in1 in1
$ bin/hadoop fs -getmerge in1 merge.txt
$ cat ./merge.txt
}}}

== -ls ==

 * List information about files and directories
 * For a file: filename <replicas> size modification_date modification_time permissions user_id group_id
 * For a directory: dirname modification_date modification_time permissions user_id group_id
{{{
$ bin/hadoop fs -ls
}}}

== -lsr ==

 * Recursive version of the ls command
{{{
$ bin/hadoop fs -lsr /
}}}

== -mkdir ==

 * Create directories
{{{
$ bin/hadoop fs -mkdir a b c
}}}

== -moveFromLocal ==

 * Move (cut and paste) a local directory onto HDFS
{{{
$ bin/hadoop fs -moveFromLocal in1 in2
}}}

== -mv ==

 * Rename (move) data
{{{
$ bin/hadoop fs -mv in2 in3
}}}

== -rm ==

 * Delete the specified file (not a directory)
{{{
$ bin/hadoop fs -rm in1/input
}}}

== -rmr ==

 * Recursively delete a directory, including every file inside it
{{{
$ bin/hadoop fs -rmr in1
}}}

== -setrep ==

 * Set the replication factor
 * bin/hadoop fs -setrep [-R] [-w] <rep> <path/file>
{{{
$ bin/hadoop fs -setrep -w 2 -R input
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done
}}}

== -stat ==

 * Print time information for a path
{{{
$ bin/hadoop fs -stat input
2009-04-02 03:51:29
}}}

== -tail ==

 * Output the last 1 KB of a file
 * Usage: bin/hadoop fs -tail [-f] file (with -f, content appended to a growing file is also shown)
{{{
$ bin/hadoop fs -tail input/1.txt
}}}

== -test ==

 * Test a file: -e checks whether it exists (1 = exists, 0 = not), -z checks whether it is empty (1 = empty, 0 = not empty), -d checks whether it is a directory (1 = yes, 0 = no)
 * Use echo $? to read the return value, 0 or 1
 * Usage: bin/hadoop fs -test -[ezd] URI
{{{
$ bin/hadoop fs -test -e /user/hadooper/input/5.txt
$ bin/hadoop fs -test -z /user/hadooper/input/5.txt
test: File does not exist: /user/hadooper/input/5.txt
$ bin/hadoop fs -test -d /user/hadooper/input/5.txt
test: File does not exist: /user/hadooper/input/5.txt
}}}

== -text ==

 * Output a file (e.g. a compressed file or a TextRecordInputStream) as plain text
 * hadoop fs -text <src>
{{{
$ hadoop fs -text macadr-eth1.txt.gz
00:1b:fc:61:75:b1
00:1b:fc:58:9c:23
}}}
 * Note: there is currently no library support for zip, so zip files come out garbled:
{{{
$ bin/hadoop fs -text b/a.txt.zip
PK
���:��H{
a.txtUT b��Ib��IUx��sssss
test
PK
���:��H{
��a.txtUT b��IUxPK@C
}}}

== -touchz ==

 * Create an empty (zero-length) file
{{{
$ bin/hadoop fs -touchz b/kk
$ bin/hadoop fs -test -z b/kk
$ echo $?
1
$ bin/hadoop fs -test -z b/a.txt.zip
$ echo $?
0
}}}
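= Appendix: Locally Runnable Sketches =

The octal mode 755 used with -chmod above decodes to rwxr-xr-x: owner read/write/execute, group and others read/execute. HDFS uses the same permission convention as a local Unix filesystem, so the decoding can be illustrated without a cluster using the standard chmod and ls utilities on a throwaway temp file (the temp file is purely illustrative, not part of the lab data):

```shell
# Create a scratch file and give it mode 755, as
# `hadoop fs -chmod -R 755 own` would do for every file under `own` on HDFS
f="$(mktemp)"
chmod 755 "$f"

# The first 10 characters of `ls -l` show the file type and permission bits
ls -l "$f" | cut -c1-10   # -> -rwxr-xr-x

rm -f "$f"
```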
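As a sanity check on the listings above, the single number printed by -dus is simply the sum of the per-file sizes printed by -du. Plain shell arithmetic over the four sizes from the example listing confirms it:

```shell
# File sizes of input/1.txt .. input/4.txt as reported by
# `hadoop fs -du input` in the example above
total=$((115045564 + 987864 + 1573048 + 25844527))

echo "$total"   # -> 143451003, the total reported by `hadoop fs -dus input`
```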
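The -test and -touchz examples above are meant for scripting: you run a check, then branch on the value left in $?. A minimal, locally runnable sketch of that pattern uses the POSIX test utility in place of hadoop fs -test (an analogue only; no cluster is needed). Be aware the conventions differ: POSIX test exits with 0 when the check is true, whereas the -test values documented above use 1 for true.

```shell
# Create a zero-length scratch file, like `hadoop fs -touchz` does on HDFS
tmpfile="$(mktemp)"

# Branch on the exit status: POSIX `test -e` exits 0 when the path exists
if test -e "$tmpfile"; then
    echo "exists"
fi

# `test -s` is true only for a NON-empty file, so it fails (exit 1) here
test -s "$tmpfile"
echo "test -s on an empty file returned: $?"

rm -f "$tmpfile"
```

The same if/else and $? patterns carry over directly once `test` is replaced with `bin/hadoop fs -test` on a live cluster, keeping the inverted truth convention of this Hadoop version in mind.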