= NutchEZ Installation Procedure =

== Assumptions ==
 * JAVA_HOME=/usr/lib/jvm/java-6-sun
 * User: nutchuser
 * Nutch source archive: /home/nutchuser/nutch-1.0.tar.gz
 * Tomcat source archive: /home/nutchuser/apache-tomcat-6.0.18.tar.gz
 * NutchEZ installation path: /opt/nutchEZ
 * Tomcat installation path: /opt/nutchEZ/tomcat
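
The assumptions above could be captured as shell variables at the top of an install script. This is only a sketch: the variable names and the `check_prereqs` helper are hypothetical and do not come from NutchEZ itself.

```shell
#!/bin/sh
# Values copied from the assumptions listed above; the variable names are illustrative.
JAVA_HOME=/usr/lib/jvm/java-6-sun
NUTCH_USER=nutchuser
NUTCH_TARBALL=/home/nutchuser/nutch-1.0.tar.gz
TOMCAT_TARBALL=/home/nutchuser/apache-tomcat-6.0.18.tar.gz
NUTCHEZ_HOME=/opt/nutchEZ
TOMCAT_HOME="$NUTCHEZ_HOME/tomcat"

# check_prereqs (hypothetical helper): print every missing prerequisite
# and return non-zero if at least one is absent.
check_prereqs() {
    missing=0
    [ -d "$JAVA_HOME" ] || { echo "missing JAVA_HOME: $JAVA_HOME"; missing=1; }
    [ -f "$NUTCH_TARBALL" ] || { echo "missing archive: $NUTCH_TARBALL"; missing=1; }
    [ -f "$TOMCAT_TARBALL" ] || { echo "missing archive: $TOMCAT_TARBALL"; missing=1; }
    id "$NUTCH_USER" >/dev/null 2>&1 || { echo "missing user: $NUTCH_USER"; missing=1; }
    return "$missing"
}
```

Calling `check_prereqs || exit 1` before any other step would stop the installation early when an archive, the JDK, or the user account is not in place.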

== Installation ==
=== Collect user and other information ===
 * Admin e-mail
 * DNS name
 * Master IP (set by the program)
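
One possible way to collect these values is sketched below. The helper names (`ask`, `collect_settings`) and the variable names are hypothetical, and using `hostname -i` to determine the master IP is an assumption about how "set by the program" might be implemented on a Linux host.

```shell
# ask VAR PROMPT (hypothetical helper): show PROMPT on stderr,
# read one line from stdin and store it in the variable named VAR.
ask() {
    printf '%s: ' "$2" >&2
    IFS= read -r "$1"
}

# collect_settings: fill ADMIN_EMAIL, MASTER_DNS and MASTER_IP.
collect_settings() {
    ask ADMIN_EMAIL "Admin e-mail"
    ask MASTER_DNS "DNS name"
    # The master IP is not asked for; it is derived by the script.
    # Assumption: hostname -i is available and returns the wanted address first.
    MASTER_IP=$(hostname -i 2>/dev/null || true)
    MASTER_IP=${MASTER_IP%% *}
}
```

After `collect_settings` returns, the three variables can be substituted into the configuration files written in the following steps.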

=== Install Nutch ===
==== Extract, rename the directory, set the owner ====
 * tar zxvf /home/nutchuser/nutch-1.0.tar.gz -C /opt
 * mv /opt/nutch-1.0 /opt/nutchEZ
 * chown -R nutchuser:nutchuser /opt/nutchEZ
==== Write the settings to the configuration files ====
 * hadoop-env.sh
 * hadoop-site.xml ($MasterDNS)
 * nutch-site.xml ($Admin)
 * slaves (the cluster client_install must modify this file)
 * crawl-urlfilter.txt (crawl rules)
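
As a rough illustration of this step, the functions below write three of the listed files into a given conf directory. The function names are hypothetical, and the port numbers (9000, 9001) and the agent name are assumptions chosen for the example; `fs.default.name`, `mapred.job.tracker`, `http.agent.name` and `http.agent.email` are the property names used by the Hadoop and Nutch releases of that era.

```shell
# set_java_home CONF_DIR JAVA_HOME: append the JAVA_HOME export to hadoop-env.sh.
set_java_home() {
    echo "export JAVA_HOME=$2" >> "$1/hadoop-env.sh"
}

# write_hadoop_site CONF_DIR MASTER_DNS: point HDFS and the JobTracker at the master.
# Ports 9000 and 9001 are illustrative assumptions.
write_hadoop_site() {
    cat > "$1/hadoop-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://$2:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>$2:9001</value>
  </property>
</configuration>
EOF
}

# write_nutch_site CONF_DIR ADMIN_EMAIL: identify the crawler and its administrator.
# The agent name "nutch" is an illustrative assumption.
write_nutch_site() {
    cat > "$1/nutch-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>http.agent.name</name>
    <value>nutch</value>
  </property>
  <property>
    <name>http.agent.email</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}
```

For the layout assumed on this page the conf directory would be `/opt/nutchEZ/conf`, for example `write_hadoop_site /opt/nutchEZ/conf "$MASTER_DNS"`.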

=== Start Nutch ===
 * Format HDFS
 * Start Nutch
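
The two steps above might look like the following sketch. It assumes the Hadoop scripts shipped in the `bin` directory of the Nutch 1.0 archive; the `run` and `start_nutch` helpers and the `DRY_RUN` switch are hypothetical additions so that the commands can be previewed before they are executed.

```shell
NUTCHEZ_HOME=${NUTCHEZ_HOME:-/opt/nutchEZ}

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# start_nutch: format HDFS, then start the Hadoop daemons that Nutch relies on.
start_nutch() {
    run "$NUTCHEZ_HOME/bin/hadoop" namenode -format
    run "$NUTCHEZ_HOME/bin/start-all.sh"
}
```

By default `start_nutch` only prints the two commands; `DRY_RUN=0 start_nutch` runs them. Formatting the NameNode erases existing HDFS metadata, so it belongs only in a first-time installation.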

=== Install Tomcat ===
==== Extract, rename the directory, set the owner ====
 * tar zxvf /home/nutchuser/apache-tomcat-6.0.18.tar.gz -C /opt/nutchEZ/
 * mv /opt/nutchEZ/apache-tomcat-6.0.18 /opt/nutchEZ/tomcat
 * chown -R nutchuser:nutchuser /opt/nutchEZ/

==== Environment setup ====
{{{
# Unpack the Nutch web application into a temporary "web" directory
$ cd /opt/nutchEZ
$ mkdir web
$ cd web
$ jar -xvf ../nutch-1.0.war
$ rm ../nutch-1.0.war
# Keep Tomcat's original ROOT application as a backup
$ mv /opt/nutchEZ/tomcat/webapps/ROOT /opt/nutchEZ/tomcat/webapps/ROOT-ori
# Install the Nutch web application as the new ROOT
$ cd /opt/nutchEZ
$ mv /opt/nutchEZ/web /opt/nutchEZ/tomcat/webapps/ROOT
# Local directory that will hold the search data
$ mkdir /opt/nutchEZ/search
}}}

==== Edit the configuration files ====
===== /opt/nutchEZ/tomcat/conf/server.xml =====
===== /opt/nutchEZ/tomcat/webapps/ROOT/WEB-INF/classes/nutch-site.xml =====
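
This page does not state which settings are changed in these two files. For the web application's nutch-site.xml, one plausible edit is to point Nutch's `searcher.dir` property at the local search directory created earlier. The sketch below assumes exactly that; the function name is hypothetical.

```shell
# write_search_site CLASSES_DIR SEARCH_DIR: tell the Nutch web application
# where the crawl data lives on the local file system.
# Assumption: searcher.dir is the setting that needs to change here.
write_search_site() {
    cat > "$1/nutch-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>searcher.dir</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}
```

With the paths used on this page the call would be `write_search_site /opt/nutchEZ/tomcat/webapps/ROOT/WEB-INF/classes /opt/nutchEZ/search`.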

==== Start Tomcat ====

== Runtime ==
 * Crawl
 * Move the files
 * Restart Tomcat
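
The three runtime steps could be chained as in the sketch below. Several things here are assumptions rather than part of the original procedure: the seed list is taken to live in an HDFS directory named `urls`, the crawl output directory is named `search`, the default depth is 3, and the `run` and `crawl_and_publish` helpers with their `DRY_RUN` switch are hypothetical.

```shell
NUTCHEZ_HOME=${NUTCHEZ_HOME:-/opt/nutchEZ}
TOMCAT_HOME=${TOMCAT_HOME:-$NUTCHEZ_HOME/tomcat}

# run CMD...: print the command; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-1}" = "1" ] || "$@"
}

# crawl_and_publish [DEPTH]: crawl, copy the result out of HDFS, restart Tomcat.
crawl_and_publish() {
    # 1. Crawl the seed URLs stored in the HDFS directory "urls" (assumed name).
    run "$NUTCHEZ_HOME/bin/nutch" crawl urls -dir search -depth "${1:-3}"
    # 2. Copy the crawl output from HDFS to the local file system.
    #    The data must end up where the web application's searcher.dir points.
    run "$NUTCHEZ_HOME/bin/hadoop" dfs -get search "$NUTCHEZ_HOME/search"
    # 3. Restart Tomcat so the web application picks up the new index.
    run "$TOMCAT_HOME/bin/shutdown.sh"
    run "$TOMCAT_HOME/bin/startup.sh"
}
```

`crawl_and_publish 2` prints the four commands for a depth-2 crawl; `DRY_RUN=0 crawl_and_publish 2` executes them.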