Changes between Version 1 and Version 2 of YZU130807/Lab5


Timestamp:
Aug 12, 2013, 12:12:29 PM (11 years ago)
Author:
jazz
Comment:

--

}}}

== Prepare a big data set ==

 * First, let's generate a 200 MB file.
{{{
h998@hadoop:~$ dd if=/dev/zero of=200mb.img bs=1M count=200
200+0 records in
200+0 records out
209715200 bytes (210 MB) copied, 0.239545 s, 875 MB/s
}}}
 * Verify the file size.
{{{
h998@hadoop:~$ du -sh 200mb.img
200M    200mb.img
}}}
 * Upload 200mb.img to HDFS.
{{{
h998@hadoop:~$ hadoop fs -put 200mb.img 200mb.img
}}}
 * Verify that the upload succeeded.
{{{
h998@hadoop:~$ hadoop fs -ls 200mb.img
Found 1 items
-rw-r--r--   2 h998 supergroup  209715200 2013-08-12 12:06 /user/h998/200mb.img
}}}
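 * As a cross-check on the sizes reported above: `bs=1M count=200` writes 200 x 1,048,576 bytes. dd prints that total in decimal megabytes (210 MB), while `du -sh` rounds in binary units (200M). A minimal sketch of the arithmetic; the variable names are illustrative only, and the rounding is an approximation of what the tools do:

```shell
#!/bin/sh
# Block size and count as passed to dd in the step above.
bs_bytes=$((1024 * 1024))   # bs=1M
count=200                   # count=200

total_bytes=$((bs_bytes * count))
echo "total bytes: ${total_bytes}"

# Binary mebibytes, the unit du -h rounds to.
echo "MiB: $((total_bytes / 1024 / 1024))"

# Decimal megabytes rounded to the nearest whole number,
# approximating the figure dd prints in parentheses.
echo "MB: $(( (total_bytes + 500000) / 1000000 ))"
```

The same byte count, 209715200, is what `hadoop fs -ls` shows in its size column after the upload.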

== fsck ==
== File system check ==

 * First, let's learn the basic usage of fsck.
{{{
#!sh
~$ hadoop fsck
Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
        <path>  start checking from this path
        -move   move corrupted files to /lost+found
        -delete delete corrupted files
        -files  print out files being checked
        -openforwrite   print out files opened for write
        -blocks print out block report
        -locations      print out locations for every block
        -racks  print out network topology for data-node locations
                By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually  tagged CORRUPT or HEALTHY depending on their block allocation status
}}}
 * Start without any options: pass only the absolute path and look at the result.
{{{
h998@hadoop:~$ hadoop fsck /user/${USER}/200mb.img
.Status: HEALTHY
 Total size:    209715200 B
 Total dirs:    0
 Total files:   1
 Total blocks (validated):      2 (avg. block size 104857600 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          12
 Number of racks:               1


The filesystem under path '/user/h998/200mb.img' is HEALTHY
}}}
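 * When fsck is run from a script, the status line is the simplest thing to test. The sketch below assumes the report format shown above. `fsck_is_healthy` is a hypothetical helper name, and the sample text is copied from the output above rather than read from a live cluster:

```shell
#!/bin/sh
# Hypothetical helper: reads an fsck report on stdin and succeeds
# only if it contains a "Status: HEALTHY" line.
fsck_is_healthy() {
    grep -q 'Status: HEALTHY'
}

# Sample report lines copied from the run above (not a live cluster).
sample_report='.Status: HEALTHY
 Total size:    209715200 B
 Corrupt blocks:                0'

if printf '%s\n' "$sample_report" | fsck_is_healthy; then
    echo "healthy"
else
    echo "not healthy"
fi
```

On a real cluster the same function could be fed from the command used above, for example `hadoop fsck /user/${USER}/200mb.img | fsck_is_healthy`.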

 * Next, use the fsck options to see how many blocks 200mb.img consists of and which machines store each block.
{{{
h998@hadoop:~$ hadoop fsck /user/${USER}/200mb.img -files -blocks -locations -racks
/user/h998/200mb.img 209715200 bytes, 2 block(s):  OK
0. blk_-6674004733773524889_19333928 len=134217728 repl=2 [/default-rack/192.168.1.4:50010, /default-rack/192.168.1.8:50010]
1. blk_-2951307914939094717_19333928 len=75497472 repl=2 [/default-rack/192.168.1.14:50010, /default-rack/192.168.1.2:50010]

Status: HEALTHY
 Total size:    209715200 B
 Total dirs:    0
 Total files:   1
 Total blocks (validated):      2 (avg. block size 104857600 B)
 Minimally replicated blocks:   2 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    2
 Average block replication:     2.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          12
 Number of racks:               1


The filesystem under path '/user/h998/200mb.img' is HEALTHY
}}}
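 * The two block lengths in the listing follow from the block size. The first block has len=134217728 (128 MB), so this sketch assumes a 128 MB block size; that value is inferred from the output above, not read from the cluster configuration. `block_lengths` is a hypothetical helper written for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print the length of each block for a file of
# $1 bytes stored with a block size of $2 bytes.
block_lengths() {
    remaining=$1
    block_size=$2
    index=0
    while [ "$remaining" -gt 0 ]; do
        # Every block is full except possibly the last one.
        if [ "$remaining" -gt "$block_size" ]; then
            len=$block_size
        else
            len=$remaining
        fi
        echo "${index}. len=${len}"
        remaining=$((remaining - len))
        index=$((index + 1))
    done
}

# 200 MB file, 128 MB blocks (block size assumed from len=134217728 above).
block_lengths 209715200 $((128 * 1024 * 1024))
```

The file therefore needs one full 128 MB block plus a 72 MB remainder (75497472 bytes), and the reported average block size of 104857600 B is simply 209715200 / 2.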