= NutchEZ Install Test =

== Steps ==
 * Place the install shell script and the *.tar.gz file in the same directory.
 * Run install.sh.

== Post-installation checks ==
||Path||What to check||
||/home/nutchuser/nutchez/source||Client install file (check the IP and hostname) and the client archive||
||/etc/hosts||Duplicate hostname entries must be commented out||

== Tests ==
=== Ubuntu10.04 ===
 * The Java check could print the following message to remind the user of the troubleshooting steps:
{{{
add-apt-repository "deb http://archive.canonical.com/ lucid partner"
apt-get update
apt-get install sun-java6-jdk sun-java6-plugin
update-java-alternatives -s java-6-sun
}}}

=== Ubuntu9.10 ===
 * The Java check could print the following message to remind the user of the troubleshooting steps:
{{{
apt-get install sun-java6-jdk sun-java6-plugin
}}}

== Running ==
=== Uploading urls ===
 * bin/hadoop dfs -put urls urls
 * Run as nutchuser, the upload fails with a permission error:
{{{
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException: /tmp/NutchEZ/logs/hadoop.log (Permission denied)
...something message...
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
put: org.apache.hadoop.security.AccessControlException: Permission denied: user=nutchuser, access=WRITE, inode="":root:supergroup:rwxr-xr-x
}}}
 * As a workaround, temporarily switched to root for testing.

=== Crawling ===
 * bin/nutch crawl urls -dir search -threads 2 -depth 3 -topN 100000

== To do ==
 * Run-time tests such as crawling and searching files.
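
The /etc/hosts item in the post-installation checks could be automated along the following lines. This is only a minimal sketch: the function name `find_duplicate_hostnames` is hypothetical and not part of NutchEZ or install.sh, and it assumes the standard hosts layout of one address followed by one or more names per line.

```shell
#!/bin/sh
# Hypothetical helper (not part of NutchEZ): print every hostname that is
# listed more than once among the active (uncommented) entries of a hosts file.
find_duplicate_hostnames() {
    hosts_file="${1:-/etc/hosts}"
    # Drop comments, skip lines without a name field, emit each name
    # (fields 2..NF), then keep only names that occur more than once.
    sed 's/#.*//' "$hosts_file" |
        awk 'NF >= 2 { for (i = 2; i <= NF; i++) print $i }' |
        sort | uniq -d
}
```

Calling `find_duplicate_hostnames` with no argument reads /etc/hosts. Any name it prints is a candidate for commenting out, although names that are legitimately listed for both an IPv4 and an IPv6 address (often `localhost`) are reported as well and should be reviewed by hand.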