= NutchEZ Installation Procedure =

== Assumptions ==
 * JAVA_HOME=/usr/lib/jvm/java-6-sun
 * User: nutchuser
 * Nutch archive path: /home/nutchuser/nutch-1.0.tar.gz
 * Tomcat archive path: /home/nutchuser/apache-tomcat-6.0.18.tar.gz
 * NutchEZ install path: /opt/nutchEZ
 * Tomcat install path: /opt/nutchEZ/tomcat
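
For an install script, the assumptions listed here could be captured as shell variables. This is only a sketch; the variable names are illustrative and are not defined by NutchEZ itself.

{{{
#!sh
# Assumed locations, copied from the list of assumptions.
# All variable names except JAVA_HOME are hypothetical.
JAVA_HOME=/usr/lib/jvm/java-6-sun
NUTCH_USER=nutchuser
NUTCH_TARBALL=/home/$NUTCH_USER/nutch-1.0.tar.gz
TOMCAT_TARBALL=/home/$NUTCH_USER/apache-tomcat-6.0.18.tar.gz
NUTCHEZ_HOME=/opt/nutchEZ
TOMCAT_HOME=$NUTCHEZ_HOME/tomcat
export JAVA_HOME
}}}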

== Installation ==
=== Collect user and site information ===
 * Admin e-mail
 * DNS name
 * Master IP (set by the installer)
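
A minimal sketch of this step, assuming the installer prompts for the e-mail address and DNS name and receives the master IP from its own detection logic. The function name ask_install_info and the variable names are hypothetical.

{{{
#!sh
# Hypothetical helper: reads two answers from standard input and takes
# the master IP as an argument, since the installer sets it itself.
ask_install_info() {
    MASTER_IP=$1
    printf 'Admin e-mail: ' >&2
    IFS= read -r ADMIN_EMAIL
    printf 'DNS name: ' >&2
    IFS= read -r MASTER_DNS
}
}}}

It would be called as ask_install_info 192.168.0.1, after which ADMIN_EMAIL, MASTER_DNS, and MASTER_IP hold the collected values.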

=== Install Nutch ===
==== Unpack, rename the directory, and set the owner ====
 * tar zxvf /home/nutchuser/nutch-1.0.tar.gz -C /opt
 * mv /opt/nutch-1.0 /opt/nutchEZ
 * chown -R nutchuser:nutchuser /opt/nutchEZ
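
The same unpack, rename, and chown sequence is used again for Tomcat, so it could be wrapped in one helper. The sketch below is an assumption about how an installer might do this; unpack_and_rename is a hypothetical name, and the chown step is skipped when no owner is given because it normally requires root.

{{{
#!sh
# Hypothetical helper: unpack ARCHIVE under PARENT, rename the top-level
# directory OLD to NEW, and optionally give the tree to OWNER.
unpack_and_rename() {
    archive=$1 parent=$2 old=$3 new=$4 owner=${5:-}
    mkdir -p "$parent" || return 1
    tar zxf "$archive" -C "$parent" || return 1
    mv "$parent/$old" "$parent/$new" || return 1
    if [ -n "$owner" ]; then
        chown -R "$owner" "$parent/$new" || return 1
    fi
}
}}}

For Nutch this would be called as unpack_and_rename /home/nutchuser/nutch-1.0.tar.gz /opt nutch-1.0 nutchEZ nutchuser:nutchuser.
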
===== Write the settings into the configuration files =====
 * hadoop-env.sh
 * hadoop-site.xml ($MasterDNS)
 * nutch-site.xml ($Admin)
 * slaves (the cluster client_install must modify this file)
 * crawl-urlfilter.txt (crawl rules)
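
As an illustration of how $MasterDNS and $Admin might be substituted into these files, here is a sketch that writes three of them. The property names fs.default.name, mapred.job.tracker, and http.agent.email are standard Hadoop and Nutch settings, but the port numbers, the single-node slaves file, and the function name write_site_conf are assumptions; the real installer may write different values.

{{{
#!sh
# Hypothetical helper: write hadoop-site.xml, nutch-site.xml and slaves
# into CONF_DIR. Ports 9000 and 9001 are assumed, not taken from NutchEZ.
write_site_conf() {
    conf_dir=$1 master_dns=$2 admin_email=$3
    cat > "$conf_dir/hadoop-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$master_dns:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>$master_dns:9001</value>
  </property>
</configuration>
EOF
    cat > "$conf_dir/nutch-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.email</name>
    <value>$admin_email</value>
  </property>
</configuration>
EOF
    # Single-node default; a cluster install would list every worker here.
    echo "$master_dns" > "$conf_dir/slaves"
}
}}}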

=== Start Nutch ===
 * Format HDFS
 * Start Nutch
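
With the Hadoop scripts bundled in the Nutch 1.0 distribution, these two steps would typically look like the sketch below. The function name start_nutch is hypothetical, and it assumes the scripts live under bin/ of the install directory.

{{{
#!sh
# Hypothetical helper: format the HDFS namenode, then start all Hadoop
# daemons. Formatting erases existing HDFS metadata, so run it only once.
start_nutch() {
    nutch_home=$1
    "$nutch_home/bin/hadoop" namenode -format || return 1
    "$nutch_home/bin/start-all.sh"
}
}}}

It would be called as start_nutch /opt/nutchEZ by the nutchuser account.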

=== Install Tomcat ===
==== Unpack, rename the directory, and set the owner ====
 * tar zxvf /home/nutchuser/apache-tomcat-6.0.18.tar.gz -C /opt/nutchEZ/
 * mv /opt/nutchEZ/apache-tomcat-6.0.18 /opt/nutchEZ/tomcat
 * chown -R nutchuser:nutchuser /opt/nutchEZ/

==== Environment setup ====
{{{
# Unpack the Nutch web application into a temporary directory
$ cd /opt/nutchEZ
$ mkdir web
$ cd web
$ jar -xvf ../nutch-1.0.war
$ rm ../nutch-1.0.war
# Keep Tomcat's default ROOT application as a backup
$ mv /opt/nutchEZ/tomcat/webapps/ROOT /opt/nutchEZ/tomcat/webapps/ROOT-ori
# Install the unpacked application as the new ROOT
$ cd /opt/nutchEZ
$ mv /opt/nutchEZ/web /opt/nutchEZ/tomcat/webapps/ROOT
# Create the search directory
$ mkdir /opt/nutchEZ/search
}}}

==== Modify the configuration files ====
===== /opt/nutchEZ/tomcat/conf/server.xml =====
===== /opt/nutchEZ/tomcat/webapps/ROOT/WEB-INF/classes/nutch-site.xml =====
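
This page does not say what to change in these files. One plausible edit for the web application's nutch-site.xml, assuming /opt/nutchEZ/search is where the crawl results end up, is to point Nutch's searcher.dir property at that directory. The sketch below is a guess at that step, not the installer's actual behaviour; write_searcher_conf is a hypothetical name, and it overwrites the file rather than merging with existing settings.

{{{
#!sh
# Hypothetical helper: write a nutch-site.xml into CLASSES_DIR that sets
# searcher.dir to SEARCH_DIR. Any existing file is replaced.
write_searcher_conf() {
    classes_dir=$1 search_dir=$2
    cat > "$classes_dir/nutch-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>searcher.dir</name>
    <value>$search_dir</value>
  </property>
</configuration>
EOF
}
}}}

It would be called as write_searcher_conf /opt/nutchEZ/tomcat/webapps/ROOT/WEB-INF/classes /opt/nutchEZ/search.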

==== Start Tomcat ====

== Runtime ==
 * Crawl
 * Move the files
 * Restart Tomcat
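
The three runtime steps could be chained as in the sketch below. It assumes the seed URLs are in an HDFS directory named urls, that the crawl output goes to an HDFS directory named crawl, and that "moving the files" means copying that output to the local search directory; none of these names come from NutchEZ, and crawl_and_publish is a hypothetical function.

{{{
#!sh
# Hypothetical helper: crawl, copy the result out of HDFS, restart Tomcat.
# The directory names "urls" and "crawl" are assumptions.
crawl_and_publish() {
    home=$1 depth=${2:-3}
    "$home/bin/nutch" crawl urls -dir crawl -depth "$depth" || return 1
    "$home/bin/hadoop" dfs -get crawl "$home/search" || return 1
    # shutdown.sh fails if Tomcat is not running, so its status is ignored.
    "$home/tomcat/bin/shutdown.sh"
    "$home/tomcat/bin/startup.sh"
}
}}}

It would be called as crawl_and_publish /opt/nutchEZ 3, where the second argument is the crawl depth.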