Hadoop Hands-on Labs (1)
Basic DFS commands / Setting up a basic Hadoop DFS test environment
 - Download hadoop-0.18.2
$ cd ~
$ wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/hadoop-0.18.2/hadoop-0.18.2.tar.gz
$ tar zxvf hadoop-0.18.2.tar.gz
 - Hadoop uses SSH for its internal connections, so an SSH key exchange is needed (a quick login check follows these commands)
~$ ssh-keygen
~$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
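 - (Optional check, a minimal sketch assuming the default OpenSSH setup) Confirm that passwordless login to localhost works before starting Hadoop:
~$ chmod 600 ~/.ssh/authorized_keys   # tighten permissions in case sshd rejects the key
~$ ssh localhost                      # should log in without prompting for a password
~$ exit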
 - The JAVA_HOME environment variable must be set before hadoop namenode can run (a quick check follows)
$ echo "export JAVA_HOME=/usr/lib/jvm/java-6-sun" >> ~/.bash_profile
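 - (Optional check) Reload the profile and confirm that JAVA_HOME points at a working JDK (the java-6-sun path is the one used in this lab):
$ source ~/.bash_profile
$ echo $JAVA_HOME
/usr/lib/jvm/java-6-sun
$ $JAVA_HOME/bin/java -version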
$ cd ~/hadoop-0.18.2
 - Edit conf/hadoop-env.sh (set HADOOP_HOME to your Hadoop installation directory)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
export HADOOP_HOME=/home/jazz/hadoop-0.18.2/
export HADOOP_CONF_DIR=$HADOOP_HOME/conf
 - Edit conf/hadoop-site.xml and add the following settings inside the <configuration> section (a sketch of the complete file follows the two properties)
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000/</value>
  <description>
    The name of the default file system. Either the literal string
    "local" or a host:port for NDFS.
  </description>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
  <description>
    The host and port that the MapReduce job tracker runs at. If
    "local", then jobs are run in-process as a single map and
    reduce task.
  </description>
</property>
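 - After editing, the whole file should look roughly like this (a minimal sketch with the descriptions omitted; the XML header and the outer <configuration> element are already present in the stock conf/hadoop-site.xml):
~/hadoop-0.18.2$ cat conf/hadoop-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000/</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>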
 - The two commands that start Hadoop (a quick status check follows)
~/hadoop-0.18.2$ bin/hadoop namenode -format
~/hadoop-0.18.2$ bin/start-all.sh
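 - (Optional check) Two quick ways to confirm the daemons came up (jps ships with the JDK; dfsadmin queries the namenode):
~/hadoop-0.18.2$ jps                          # should list NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker
~/hadoop-0.18.2$ bin/hadoop dfsadmin -report  # capacity and datanode report from HDFS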
 - Once everything is up, you should be able to reach the following three web pages (the default ports in Hadoop 0.18):
   - NameNode web UI: http://localhost:50070/
   - JobTracker web UI: http://localhost:50030/
   - TaskTracker web UI: http://localhost:50060/

- You can also put a few files onto HDFS to try it out (a read-back check follows the listing below)
~/hadoop-0.18.2$ bin/hadoop dfs -put conf conf
~/hadoop-0.18.2$ bin/hadoop dfs -ls
Found 1 items
drwxr-xr-x   - jazz supergroup          0 2008-11-04 15:56 /user/jazz/conf
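 - (Optional) Read one of the uploaded files back to confirm HDFS is serving data, e.g.:
~/hadoop-0.18.2$ bin/hadoop dfs -cat conf/hadoop-env.sh | head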
 
Hadoop Hands-on Labs (2)
- Run the WordCount example
~/hadoop-0.18.2$ bin/hadoop fs -put conf conf
~/hadoop-0.18.2$ bin/hadoop fs -ls
Found 1 items
drwxr-xr-x   - jazz supergroup          0 2008-11-05 19:34 /user/jazz/conf
~/hadoop-0.18.2$ bin/hadoop jar /home/jazz/hadoop-0.18.2/hadoop-0.18.2-examples.jar wordcount
ERROR: Wrong number of parameters: 0 instead of 2.
wordcount [-m <maps>] [-r <reduces>] <input> <output>
Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
~/hadoop-0.18.2$ bin/hadoop jar /home/jazz/hadoop-0.18.2/hadoop-0.18.2-examples.jar wordcount conf output
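 - When the job finishes, the counts are written under output/ on HDFS; a minimal sketch for inspecting them (part-00000 assumes the default single reduce task):
~/hadoop-0.18.2$ bin/hadoop fs -ls output
~/hadoop-0.18.2$ bin/hadoop fs -cat output/part-00000 | head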
 
Methods for large-scale Hadoop deployment