= NutchEZ Install Test =

== Steps ==
 * Place apache-tomcat-6.0.18.tar.gz and nutch-1.0.tar.gz in the /opt/ directory
 * Run install.sh

== Installation test results (virtual machine compared with the 166 machine) ==
||nutch||diff hadoop-env.sh||done||
||nutch||diff hadoop-site.xml||done||
||nutch||diff nutch-site.xml||***||
||nutch||diff slaves||client_install modifies this file||
||nutch||diff crawl-urlfilter.txt||done||
||tomcat||diff server.xml||done||
||tomcat||diff nutch-site.xml||***||

 * If users enter the DNS name themselves, a typo may prevent the namenode or other services from starting properly

== Execution ==

=== Upload urls ===
 * bin/hadoop dfs -put urls urls
{{{
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /tmp/NutchEZ/logs/hadoop.log (Permission denied)
...something message...
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
put: org.apache.hadoop.security.AccessControlException: Permission denied: user=nutchuser, access=WRITE, inode="":root:supergroup:rwxr-xr-x
}}}
 * Temporarily switched to root to continue testing

=== Crawl ===
 * bin/nutch crawl urls -dir search -threads 2 -depth 3 -topN 100000

== To do ==
 * Runtime tests: crawling, searching files, and so on
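
Regarding the note in the installation test results about users typing the DNS name by hand: one way to reduce the risk is to check the input before it is written into any configuration file. The following is a minimal sketch, not something install.sh is known to contain. The function name `is_valid_hostname` is hypothetical, and the check is purely syntactic (it does not confirm that the name resolves or that a namenode is reachable there).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original install.sh):
# returns 0 if the argument looks like a syntactically valid host name,
# non-zero otherwise. It does NOT check that the name resolves.

is_valid_hostname() {
    name=$1

    # Reject empty names and names longer than 253 characters.
    [ -n "$name" ] || return 1
    [ "${#name}" -le 253 ] || return 1

    # Reject any character other than letters, digits, dots and hyphens
    # (this also rejects spaces, underscores and newlines).
    case $name in
        *[!A-Za-z0-9.-]*) return 1 ;;
    esac

    # Each dot-separated label must be 1-63 characters long and must not
    # start or end with a hyphen. Dotted IPv4 addresses also pass this test.
    printf '%s\n' "$name" | grep -Eq \
        '^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?(\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*$'
}
```

An installer could call `is_valid_hostname "$input"` in a loop and prompt again until it succeeds. That wiring is also an assumption, and a resolution check would still be needed to catch names that are well-formed but wrong.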