[[PageOutline]]

◢ <[wiki:TREND120929/Lab2 Lab 2]> | <[wiki:TREND120929 Back to course outline]> ▲ | <[wiki:TREND120929/Lab4 Lab 4]> ◣

= Lab 3 =

{{{
#!html
HDFS Local Mode in Practice
}}}

== 0. Start Hadoop4Win ==

 * STEP 1: Open the Start menu and click the following shortcuts in order.
   * [[BR]][[Image(Hadoop4Win:hadoop4win-installer_11.jpg)]]
 * STEP 2: First click start-hadoop to start the Hadoop services (they run in a separate CMD window).
   * '''Note''': startup is complete only once you see "Safe Mode is OFF".
   * [[BR]][[Image(Hadoop4Win:hadoop4win_29.jpg,width=800)]]
 * STEP 3: Next click NameNode Web UI to open http://localhost:50070 in a browser and confirm that the NameNode is up. The page should look like this:
   * '''Note''': there must be one Live Node for the setup to be healthy.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_10.jpg,width=800)]]
 * STEP 4: Then click JobTracker Web UI to open http://localhost:50030 in a browser and confirm that the JobTracker is up. The page should look like this:
   * '''Note''': the state must be RUNNING.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_11.jpg,width=800)]]
 * STEP 5: Finally click hadoop4win to open the hadoop4win Cygwin window, where you will type the commands that follow.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_20.jpg,width=800)]]

== 1. HDFS command exercises ==

=== 1.1 Browse your HDFS directory ===

 * First, use hadoop fs -ls to list your HDFS home directory.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 1 items
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp
}}}

=== 1.2 Upload data to HDFS ===

 * Next, practice uploading data to HDFS. We use /opt/hadoop/conf as the source directory and /user/${username}/input as the destination directory.
 * '''Note''': Hadoop on Windows runs inside Cygwin, but Cygwin paths are virtual and the JRE (Java Runtime Environment) only understands Windows paths. If you hit an error like the one below, wrap the path in cygpath -w to convert the Cygwin path to a Windows path.
{{{
Jazz@human ~
$ hadoop fs -put /opt/hadoop/conf input
put: File /opt/hadoop/conf does not exist.

Jazz@human ~
$ hadoop fs -put $(cygpath -w /opt/hadoop/conf) input
}}}
 * Use hadoop fs -ls to check the files you just uploaded.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 11:45 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-09-14 12:50 /user/Jazz/tmp

Jazz@human ~
$ hadoop fs -ls input
Found 13 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}

=== 1.3 Download data from HDFS to a local directory ===

 * Next, practice downloading data from HDFS to a local directory from the command line.
{{{
Jazz@human ~
$ hadoop fs -get input fromHDFS
}}}
 * Use diff to check that what you downloaded matches what you uploaded.
{{{
Jazz@human ~
$ diff -Naur fromHDFS/ /opt/hadoop/conf
}}}

=== 1.4 Delete files on HDFS ===

 * Use hadoop fs -rm to delete a single file on HDFS.
{{{
Jazz@human ~
$ hadoop fs -rm input/masters
Deleted hdfs://localhost:9000/user/Jazz/input/masters
}}}
 * To delete a directory, use hadoop fs -rmr instead.
{{{
Jazz@human ~
$ hadoop fs -rmr tmp
Deleted hdfs://localhost:9000/user/Jazz/tmp
}}}

=== 1.5 Dump the contents of a file on HDFS ===

 * Sometimes you only want to read a file stored on HDFS. Use hadoop fs -cat to dump its contents.
{{{
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
}}}

=== 1.6 More HDFS commands ===

 * Running hadoop fs with no arguments lists every command the HDFS shell supports:
{{{
Jazz@human ~
$ hadoop fs
Usage: java FsShell
 [-ls <path>]
 [-lsr <path>]
 [-du <path>]
 [-dus <path>]
 [-count[-q] <path>]
 [-mv <src> <dst>]
 [-cp <src> <dst>]
 [-rm [-skipTrash] <path>]
 [-rmr [-skipTrash] <path>]
 [-expunge]
 [-put <localsrc> ... <dst>]
 [-copyFromLocal <localsrc> ... <dst>]
 [-moveFromLocal <localsrc> ... <dst>]
 [-get [-ignoreCrc] [-crc] <src> <localdst>]
 [-getmerge <src> <localdst> [addnl]]
 [-cat <src>]
 [-text <src>]
 [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
 [-moveToLocal [-crc] <src> <localdst>]
 [-mkdir <path>]
 [-setrep [-R] [-w] <rep> <path/file>]
 [-touchz <path>]
 [-test -[ezd] <path>]
 [-stat [format] <path>]
 [-tail [-f] <file>]
 [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
 [-chown [-R] [OWNER][:[GROUP]] PATH...]
 [-chgrp [-R] GROUP PATH...]
 [-help [cmd]]

Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
}}}

== 2. Browse HDFS through the web interface ==

 * You can also open the [http://localhost:50070 NameNode] page to look up the files you just uploaded, together with their Block Size, File Size, Block Location and Rack Location.
   * [[BR]][[Image(Hadoop4Win:hadoop4win_30.jpg,width=800)]]
   * [[BR]][[Image(Hadoop4Win:hadoop4win_31.jpg,width=800)]]
   * [[BR]][[Image(Hadoop4Win:hadoop4win_32.jpg,width=800)]]

== 3. 
More HDFS shell usage ==

=== -ls ===

 * By default -ls operates under /user/${username}/. In other words, a path without a leading slash is a "relative path" resolved against /user/${username}.
{{{
Jazz@human ~
$ hadoop fs -ls input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}
 * You can also give a "full path" in the form '''hdfs://node:port/path'''.
{{{
Jazz@human ~
$ hadoop fs -ls hdfs://localhost:9000/user/${USER}/input
Found 12 items
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 11:45 
/user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
}}}

=== -cat ===

 * Write the contents of the file at the given path to standard output (STDOUT).
{{{
Jazz@human ~
$ hadoop fs -cat input/slaves
localhost
}}}

=== -chgrp ===

 * Change the group a file belongs to.
{{{
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 11:45 /user/Jazz/input/slaves

Jazz@human ~
$ hadoop fs -chgrp ${USERNAME} input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}

=== -chmod ===

 * Change the permissions of a file.
 * The listing below shows rw------- rather than rwx------ after -chmod 700. The HDFS release used here does not appear to keep the execute bit on regular files.
{{{
Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw-r--r-- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves

Jazz@human ~
$ hadoop fs -chmod 700 input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 Jazz Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}

=== -chown ===

 * Change the owner of a file.
{{{
Jazz@human ~
$ hadoop fs -chown hadoop input/slaves

Jazz@human ~
$ hadoop fs -ls input/slaves
Found 1 items
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
}}}

=== -copyFromLocal, -put ===

 * Upload files from the local machine to HDFS.
{{{
Jazz@human ~
$ hadoop fs -put fromHDFS dfs_input

Jazz@human ~
$ hadoop fs -ls
Found 2 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
}}}

=== -copyToLocal, -get ===

 * Download files from HDFS to the local machine.
{{{
Jazz@human ~
$ hadoop fs -get dfs_input input1
}}}

=== -cp ===

 * Copy files from a source path on HDFS to a destination path on HDFS.
{{{
Jazz@human ~
$ hadoop fs -cp dfs_input input1

Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 
2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
}}}

=== -du ===

 * Show the size of every file in a directory.
{{{
Jazz@human ~
$ hadoop fs -du input
Found 12 items
3936 hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
535 hdfs://localhost:9000/user/Jazz/input/configuration.xsl
326 hdfs://localhost:9000/user/Jazz/input/core-site.xml
2409 hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
1245 hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
4190 hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
196 hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
2815 hdfs://localhost:9000/user/Jazz/input/log4j.properties
212 hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
10 hdfs://localhost:9000/user/Jazz/input/slaves
1243 hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
1195 hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
}}}

=== -dus ===

 * Show the total size of a directory or file.
{{{
Jazz@human ~
$ hadoop fs -dus input
hdfs://localhost:9000/user/Jazz/input 18312
}}}

=== -expunge ===

 * Empty the trash.
{{{
Jazz@human ~
$ hadoop fs -expunge
}}}

=== -getmerge ===

 * Concatenate every file under the source directory <src> into a single local file <localdst>.
 * Syntax: hadoop fs -getmerge <src> <localdst>
{{{
Jazz@human ~
$ mkdir -p in1

Jazz@human ~
$ echo "this is one; " > in1/input

Jazz@human ~
$ echo "this is two; " > in1/input2

Jazz@human ~
$ hadoop fs -put in1 in1

Jazz@human ~
$ hadoop fs -getmerge in1 merge.txt

Jazz@human ~
$ cat ./merge.txt
this is one; 
this is two; 
}}}

=== -ls ===

 * List information about files or directories.
 * For a file, the columns are: permissions, replica count, owner, group, file size, modification date, modification time, path.
 * For a directory, the columns are the same, with - in place of the replica count and 0 as the size.
{{{
Jazz@human ~
$ hadoop fs -ls
Found 3 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
}}}

=== -lsr ===

 * The recursive version of ls.
{{{
Jazz@human ~
$ hadoop fs -lsr
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:33 
/user/Jazz/dfs_input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:33 /user/Jazz/dfs_input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:33 /user/Jazz/dfs_input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:33 /user/Jazz/dfs_input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:33 /user/Jazz/dfs_input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:33 /user/Jazz/dfs_input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:33 /user/Jazz/dfs_input/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:33 /user/Jazz/dfs_input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:33 /user/Jazz/dfs_input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input
-rw-r--r-- 1 Jazz supergroup 14 2011-10-21 12:40 /user/Jazz/in1/input2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 11:45 /user/Jazz/input/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 11:45 /user/Jazz/input/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 11:45 /user/Jazz/input/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 11:45 /user/Jazz/input/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 11:45 /user/Jazz/input/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 11:45 /user/Jazz/input/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 11:45 /user/Jazz/input/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 
2815 2011-10-21 11:45 /user/Jazz/input/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 11:45 /user/Jazz/input/mapred-site.xml
-rw------- 1 hadoop Jazz 10 2011-10-21 11:45 /user/Jazz/input/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 11:45 /user/Jazz/input/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 11:45 /user/Jazz/input/ssl-server.xml.example
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
-rw-r--r-- 1 Jazz supergroup 3936 2011-10-21 12:34 /user/Jazz/input1/capacity-scheduler.xml
-rw-r--r-- 1 Jazz supergroup 535 2011-10-21 12:34 /user/Jazz/input1/configuration.xsl
-rw-r--r-- 1 Jazz supergroup 326 2011-10-21 12:34 /user/Jazz/input1/core-site.xml
-rw-r--r-- 1 Jazz supergroup 2409 2011-10-21 12:34 /user/Jazz/input1/hadoop-env.sh
-rw-r--r-- 1 Jazz supergroup 1245 2011-10-21 12:34 /user/Jazz/input1/hadoop-metrics.properties
-rw-r--r-- 1 Jazz supergroup 4190 2011-10-21 12:34 /user/Jazz/input1/hadoop-policy.xml
-rw-r--r-- 1 Jazz supergroup 196 2011-10-21 12:34 /user/Jazz/input1/hdfs-site.xml
-rw-r--r-- 1 Jazz supergroup 2815 2011-10-21 12:34 /user/Jazz/input1/log4j.properties
-rw-r--r-- 1 Jazz supergroup 212 2011-10-21 12:34 /user/Jazz/input1/mapred-site.xml
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/masters
-rw-r--r-- 1 Jazz supergroup 10 2011-10-21 12:34 /user/Jazz/input1/slaves
-rw-r--r-- 1 Jazz supergroup 1243 2011-10-21 12:34 /user/Jazz/input1/ssl-client.xml.example
-rw-r--r-- 1 Jazz supergroup 1195 2011-10-21 12:34 /user/Jazz/input1/ssl-server.xml.example
}}}

=== -mkdir ===

 * Create a directory.
{{{
Jazz@human ~
$ hadoop fs -mkdir tmp

Jazz@human ~
$ hadoop fs -ls
Found 5 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 
/user/Jazz/tmp
}}}

=== -moveFromLocal ===

 * Move a local directory to HDFS (the local copy is removed).
{{{
Jazz@human ~
$ hadoop fs -moveFromLocal in1 in2

Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in2
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}

=== -mv ===

 * Rename a file or directory.
{{{
Jazz@human ~
$ hadoop fs -mv in2 in3

Jazz@human ~
$ hadoop fs -ls
Found 6 items
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:33 /user/Jazz/dfs_input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:40 /user/Jazz/in1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:44 /user/Jazz/in3
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:00 /user/Jazz/input
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:34 /user/Jazz/input1
drwxr-xr-x - Jazz supergroup 0 2011-10-21 12:43 /user/Jazz/tmp
}}}

=== -rm ===

 * Delete the given file (it must not be a directory).
{{{
Jazz@human ~
$ hadoop fs -rm in1/input
Deleted hdfs://localhost:9000/user/Jazz/in1/input
}}}

=== -rmr ===

 * Recursively delete a directory and every file in it. Several directories may be given at once.
{{{
Jazz@human ~
$ hadoop fs -rmr dfs_input in1 in3 input1
Deleted hdfs://localhost:9000/user/Jazz/dfs_input
Deleted hdfs://localhost:9000/user/Jazz/in1
Deleted hdfs://localhost:9000/user/Jazz/in3
Deleted hdfs://localhost:9000/user/Jazz/input1
}}}

=== -setrep ===

 * Set the replication factor.
 * Syntax: hadoop fs -setrep [-R] [-w] <rep> <path>
{{{
Jazz@human ~
$ hadoop fs -setrep -w 1 -R input
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/configuration.xsl
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/core-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties
Replication 1 set: 
hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/log4j.properties
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/mapred-site.xml
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/slaves
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example
Replication 1 set: hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example
Waiting for hdfs://localhost:9000/user/Jazz/input/capacity-scheduler.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/configuration.xsl ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/core-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-env.sh ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-metrics.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hadoop-policy.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/hdfs-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/log4j.properties ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/mapred-site.xml ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/slaves ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-client.xml.example ... done
Waiting for hdfs://localhost:9000/user/Jazz/input/ssl-server.xml.example ... done

$ bin/hadoop fs -setrep -w 2 -R input
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt
Replication 2 set: hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/1.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/2.txt ... done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/3.txt ... 
done
Waiting for hdfs://gm1.nchc.org.tw:9000/user/hadooper/input/4.txt ... done
}}}

=== -stat ===

 * Print time information for a path. The value appears to be the modification time in UTC, which would explain the 8-hour offset from the listings above.
{{{
Jazz@human ~
$ hadoop fs -stat input
2011-10-21 04:00:44
}}}

=== -tail ===

 * Print the last 1 KB of a file.
 * Usage: hadoop fs -tail [-f] <file> (with -f, the command keeps running and prints whatever is appended as the file grows)
{{{
Jazz@human ~
$ hadoop fs -tail input/log4j.properties
g4j.RollingFileAppender
#log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Logfile size and and 30-day backups
#log4j.appender.RFA.MaxFileSize=1MB
#log4j.appender.RFA.MaxBackupIndex=30
#log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
#log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n
#
# FSNamesystem Audit logging
# All audit events are logged at INFO level
#
log4j.logger.org.apache.hadoop.fs.FSNamesystem.audit=WARN
# Custom Logging levels
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG
# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR
#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.metrics.jvm.EventCounter
}}}

=== -test ===

 * Test a path. -e checks whether it exists, -z checks whether the file is empty (zero length), and -d checks whether it is a directory. In each case exit status 0 means true and 1 means false.
 * The command prints nothing on success, so use echo $? to read the exit status (0 or 1).
 * Usage: bin/hadoop fs -test -[ezd] URI
{{{
########## -e tests whether the file exists: 0 means true, 1 means false ##########
Jazz@human ~
$ hadoop fs -test -e input/slaves

Jazz@human ~
$ echo $?
0

Jazz@human ~
$ hadoop fs -test -e input/masters

Jazz@human ~
$ echo $?
1

########## -z tests whether the file size is zero: 0 means true, 1 means false ##########
Jazz@human ~
$ hadoop fs -test -z input/slaves

Jazz@human ~
$ echo $?
1

Jazz@human ~
$ hadoop fs -test -z input/masters
test: File does not exist: input/masters

########## -d tests whether the path is a directory: 0 means true, 1 means false ##########
Jazz@human ~
$ hadoop fs -test -d input/slaves

Jazz@human ~
$ echo $?
1

Jazz@human ~
$ hadoop fs -test -d input

Jazz@human ~
$ echo $?
0
}}}

=== -text ===

 * Print a file (for example a compressed file or a TextRecordInputStream) as plain text.
 * Syntax: hadoop fs -text <src>
{{{
Jazz@human ~
$ tar zcvf input.tar.gz input1
input1/
input1/capacity-scheduler.xml
input1/configuration.xsl
input1/core-site.xml
input1/hadoop-env.sh
input1/hadoop-metrics.properties
input1/hadoop-policy.xml
input1/hdfs-site.xml
input1/log4j.properties
input1/mapred-site.xml
input1/masters
input1/slaves
input1/ssl-client.xml.example
input1/ssl-server.xml.example

Jazz@human ~
$ hadoop fs -put input.tar.gz .

Jazz@human ~
$ hadoop fs -text input.tar.gz
<output omitted>
}}}
 * Note: zip is not currently supported, so -text prints the raw archive bytes (starting with the PK signature).
{{{
Jazz@human ~
$ zip -r input1.zip input1/
updating: input1/ (stored 0%)
 adding: input1/capacity-scheduler.xml (deflated 71%)
 adding: input1/configuration.xsl (deflated 50%)
 adding: input1/core-site.xml (deflated 46%)
 adding: input1/hadoop-env.sh (deflated 58%)
 adding: input1/hadoop-metrics.properties (deflated 78%)
 adding: input1/hadoop-policy.xml (deflated 83%)
 adding: input1/hdfs-site.xml (deflated 35%)
 adding: input1/log4j.properties (deflated 67%)
 adding: input1/mapred-site.xml (deflated 34%)
 adding: input1/masters (stored 0%)
 adding: input1/slaves (stored 0%)
 adding: input1/ssl-client.xml.example (deflated 79%)
 adding: input1/ssl-server.xml.example (deflated 78%)

Jazz@human ~
$ hadoop fs -put input1.zip .

Jazz@human ~
$ hadoop fs -text input1.zip
PK
<output omitted>
}}}

=== -touchz ===

 * Create an empty file.
{{{
Jazz@human ~
$ hadoop fs -touchz empty

Jazz@human ~
$ hadoop fs -test -z empty ; echo $?
0
}}}
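
Two details from this lab are easy to trip over in scripts: Cygwin paths must be converted with cygpath -w before Hadoop sees them (section 1.2), and hadoop fs -test reports its result only through the exit status. The sketch below wraps both in small shell helpers. The function names to_java_path and hdfs_check and the HADOOP variable are hypothetical and are not part of Hadoop or Hadoop4Win; only cygpath -w and hadoop fs -test -[ezd] come from the exercises above.

```shell
#!/bin/sh
# Hypothetical helpers; the names below are illustrative, not part of Hadoop.

# Command used to reach the HDFS shell; override it to point at another binary.
HADOOP="${HADOOP:-hadoop}"

# Convert a Cygwin path to a Windows path when cygpath is available.
# On systems without cygpath the path is printed unchanged.
to_java_path() {
  if command -v cygpath >/dev/null 2>&1; then
    cygpath -w "$1"
  else
    printf '%s\n' "$1"
  fi
}

# Run "hadoop fs -test -<flag> <path>" and print yes or no.
# Exit status 0 means the test is true; any other status is treated as false.
hdfs_check() {
  if "$HADOOP" fs -test "-$1" "$2"; then
    echo yes
  else
    echo no
  fi
}
```

With these loaded, an upload could be written as hadoop fs -put "$(to_java_path /opt/hadoop/conf)" input, and hdfs_check e input/slaves would print yes when the file exists. A path that does not exist makes every flag report no, because -test then returns a non-zero status.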