Changes between Version 27 and Version 28 of MR_manual


Timestamp: Sep 3, 2008, 1:17:30 PM
Author: waue
Comment: --

  • MR_manual (v27 → v28)
  [http://tech.ccidnet.com/art/5833/20080318/1393525_1.html copied from "HBase Explained in Detail"]
  = 2. Environment Setup =
- == 2.1 Prepare ==
- System :
-  * Ubuntu 7.10
-  * Hadoop 0.16
-  * Hbase 0.1.3
- ps : hbase 0.1.4 <--> hadoop 0.2.0
- Requirement :
-  * Eclipse (3.2.2)
- {{{
- $ apt-get install eclipse
- }}}
- java 6
- {{{
- $ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin
- }}}
- suggest to remove the default java compiler 「 gcj 」
- {{{
- $ apt-get purge java-gcj-compat
- }}}
- Append two codes to /etc/bash.bashrc to setup Java Class path
- {{{
- export JAVA_HOME=/usr/lib/jvm/java-6-sun
- export HADOOP_HOME=/home/waue/workspace/hadoop/
- export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
- }}}
- Building UP Path
+ The corresponding paths are:
  || Name || Path ||
  || Java Home || /usr/lib/jvm/java-6-sun ||
  || Hadoop Home || /home/waue/workspace/hadoop/ ||
  || Hbase Home || /home/waue/workspace/hbase/ ||
- Nodes set
+ Nodes:
  || node name || server ||
  || cloud1 || v ||
  || cloud2 ||  ||
  || cloudn ||  ||
+ == 2.1 Prepare ==
+ System :
+  * Ubuntu 7.10
+  * Hadoop 0.16
+  * Hbase 0.1.3
+ ps : to upgrade, both must be upgraded together: hbase 0.1.4 <--> hadoop 0.2.0
+  * Eclipse (3.2.2)
+ {{{
+ $ apt-get install eclipse
+ }}}
+ java 6
+ {{{
+ $ apt-get install sun-java6-bin sun-java6-jdk sun-java6-jre sun-java6-plugin
+ }}}
+ it is recommended to remove the original compiler 「 gcj 」
+ {{{
+ $ apt-get purge java-gcj-compat
+ }}}
+ add the following lines to /etc/bash.bashrc
+ {{{
+ export JAVA_HOME=/usr/lib/jvm/java-6-sun
+ export HADOOP_HOME=/home/waue/workspace/hadoop/
+ export HBASE_HOME=/home/waue/workspace/hbase/
+ export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
+ }}}
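To confirm the new variables are picked up, a minimal check (assuming an interactive bash shell; the echoed value simply mirrors the export above):
{{{
$ source /etc/bash.bashrc
$ echo $JAVA_HOME
/usr/lib/jvm/java-6-sun
$ java -version
}}}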
  == 2.2 Hadoop Setup ==
  === 2.2.1. Generate an SSH key for the user ===
  {{{
  $ ssh-keygen -t rsa -P ""
  $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
- $ ssh localhost
+ $ ssh cloud1
  $ exit
  }}}
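With more than one node, the same public key must also end up in ~/.ssh/authorized_keys on the other machines from the node table. One way to push it there (a sketch; cloud2 stands in for each remaining node, and the same username is assumed on every machine):
{{{
$ scp ~/.ssh/id_rsa.pub cloud2:~/
$ ssh cloud2 "mkdir -p ~/.ssh && cat ~/id_rsa.pub >> ~/.ssh/authorized_keys"
}}}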
     
  {{{
  $ cd /home/waue/workspace
- $ sudo tar xzf hadoop-0.16.0.tar.gz
- $ sudo mv hadoop-0.16.0 hadoop
+ $ sudo tar xzf hadoop-0.16.3.tar.gz
+ $ sudo ln -sf hadoop-0.16.3 hadoop
  $ sudo chown -R waue:waue hadoop
  $ cd hadoop
…
  export JAVA_HOME=/usr/lib/jvm/java-6-sun
  export HADOOP_HOME=/home/waue/workspace/hadoop
+ export HBASE_HOME=/home/waue/workspace/hbase
  export HADOOP_LOG_DIR=$HADOOP_HOME/logs
  export HADOOP_SLAVES=$HADOOP_HOME/conf/slaves
- }}}
+ export HADOOP_CLASSPATH=$HBASE_HOME/hbase-0.1.3.jar:$HBASE_HOME/conf
+ }}}
+ ps. HADOOP_CLASSPATH must be set to the hbase environment (and HBASE_CLASSPATH to the hadoop environment); adding this line resolves the run-time errors that otherwise appear when compiling hbase programs.
+ 
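As an illustration of what that classpath fixes, compiling a small HBase client could look like this (a sketch only: MyHBase.java is a hypothetical source file, and the hadoop core jar name assumes the 0.16.3 install used in this manual):
{{{
$ javac -classpath $HADOOP_HOME/hadoop-0.16.3-core.jar:$HBASE_HOME/hbase-0.1.3.jar MyHBase.java
}}}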
  2. hadoop-site.xml ($HADOOP_HOME/conf/)[[BR]]
  modify the contents of conf/hadoop-site.xml as below
…
  <property>
    <name>mapred.map.tasks</name>
-   <value>1</value>
+   <value>9</value>
    <description>
      define mapred.map tasks to be number of slave hosts
…
  <property>
    <name>mapred.reduce.tasks</name>
-   <value>1</value>
+   <value>9</value>
    <description>
      define mapred.reduce tasks to be number of slave hosts
…
  }}}
   * Use hbase to connect to the hadoop DFS
-    * Edit the conf/hbase-site.xml file as follows
+    * Edit the conf/hbase-site.xml file as follows, and also copy it into $HADOOP_HOME/conf (one way to do the copy is shown after the block below)
  {{{
  <configuration>
    <property>
        <name>hbase.master</name>
-       <value>localhost:60000</value>
+       <value>cloud1:60000</value>
    </property>
    <property>
        <name>hbase.master.info.bindAddress</name>
-       <value>localhost</value>
+       <value>cloud1</value>
        <description>The address for the hbase master web UI</description>
    </property>
  <property>
     <name>hbase.regionserver.info.bindAddress</name>
-    <value>localhost</value>
+    <value>cloud1</value>
     <description>The address for the hbase regionserver web UI
    </description>
…
     <name>hbase.rootdir</name>
     <value>file:///tmp/hbase-${user.home}/hbase</value>
-    <value>hdfs://localhost:9000/hbase</value>
+    <value>hdfs://cloud1:9000/hbase</value>
     <description>
         The directory shared by region servers.
…
  
  }}}
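The copy into $HADOOP_HOME/conf mentioned above can be done with the paths used throughout this manual:
{{{
$ cp $HBASE_HOME/conf/hbase-site.xml $HADOOP_HOME/conf/
}}}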
-    * Edit the conf/hbase-site.xml file as follows
+    * For multi-node mode, edit the conf/regionservers file as follows
  {{{
  cloud1
…
  {{{
  starting namenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-namenode-Dx7200.out
- localhost: starting datanode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-datanode-Dx7200.out
- localhost: starting secondarynamenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-secondarynamenode-Dx7200.out
+ cloud1: starting datanode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-datanode-Dx7200.out
+ cloud1: starting secondarynamenode, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-secondarynamenode-Dx7200.out
  starting jobtracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-jobtracker-Dx7200.out
- localhost: starting tasktracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-tasktracker-Dx7200.out
- }}}
-  * Then make sure http://localhost:50030/ is up in your browser.  [[br]]
+ cloud1: starting tasktracker, logging to /home/waue/workspace/hadoop/logs/hadoop-waue-tasktracker-Dx7200.out
+ }}}
+  * Then make sure http://cloud1:50030/ is up in your browser.  [[br]]
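Besides the web UI, the DFS status can also be checked from the command line (a standard Hadoop admin command; run it from $HADOOP_HOME):
{{{
$ bin/hadoop dfsadmin -report
}}}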
  
   * PS : if your system shows errors after a restart, you can do the following to clean up and start fresh, then repeat from 「4. start up Hadoop」
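A typical clean-up sequence for that case (a sketch: the /tmp path assumes Hadoop's default hadoop.tmp.dir of /tmp/hadoop-${user.name}, which this manual does not override, and reformatting the namenode erases all HDFS data):
{{{
$ bin/stop-all.sh
$ rm -rf /tmp/hadoop-waue*
$ bin/hadoop namenode -format
$ bin/start-all.sh
}}}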
     
  Click the blue elephant icon to add a new MapReduce server location.
  Server name : any_you_want
- Hostname : localhost
+ Hostname : cloud1
  Installation directory : /home/waue/workspace/nutch/
  Username : waue
     
   * A 「console」 tab will appear beside the 「!MapReduce Server」 tab.
  
-  * While Map Reduce is running, you can visit http://localhost:50030/ to watch Hadoop dispatching the Map Reduce jobs.
-  * After it finishes, you can go to http://localhost:50060/ to see the result.
+  * While Map Reduce is running, you can visit http://cloud1:50030/ to watch Hadoop dispatching the Map Reduce jobs.
+  * After it finishes, you can go to http://cloud1:50060/ to see the result.
  
     
  
  = 7. Reference =