Changes between Version 9 and Version 10 of waue/2009/nutch_install
- Timestamp: Apr 24, 2009, 6:12:05 PM
waue/2009/nutch_install
--- v9
+++ v10
@@ -47,6 +47,11 @@
 == 2.2 Deploy the hadoop and nutch directory structure ==
 {{{
-$ cp -rf hadoop/* nutch
+$ cp -rf /opt/hadoop/* /opt/nutch
+}}}
+
+== 2.3 Copy the library files ==
+{{{
 $ cd nutch
+$ cp -rf *.jar lib/
 }}}
 
@@ -73,5 +78,5 @@
 
 
-== 3.3 conf/nutch-site.xml ==
+== 3.2 conf/nutch-site.xml ==
  * An important configuration file; the required settings have already been added to it. For details on the other parameters, see nutch-default.xml.
 {{{
@@ -150,5 +155,5 @@
 }}}
 
-== 3.5 crawl-urlfilter.txt ==
+== 3.3 crawl-urlfilter.txt ==
  * Re-edit the crawl rules. This file matters because, if it is configured badly, the crawl returns almost nothing, which means your search engine will not find any data.
 {{{
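The copy steps that version 10 introduces in sections 2.2 and 2.3 could be wrapped in a small helper so they can be tried on a scratch directory before touching /opt. This is only a sketch: the function name `deploy_nutch`, its two arguments, the existence checks, and the `mkdir -p` call are assumptions added for illustration and are not part of the wiki page.

```shell
#!/bin/sh
# Sketch of the v10 steps from sections 2.2 and 2.3.
# deploy_nutch and its arguments are hypothetical; the page itself
# runs the cp commands directly against /opt/hadoop and /opt/nutch.
deploy_nutch() {
    hadoop_dir=$1
    nutch_dir=$2

    # Assumption: fail early if either directory is missing.
    [ -d "$hadoop_dir" ] || { echo "missing: $hadoop_dir" >&2; return 1; }
    [ -d "$nutch_dir" ] || { echo "missing: $nutch_dir" >&2; return 1; }

    # 2.2: merge the Hadoop tree into the Nutch tree.
    cp -rf "$hadoop_dir"/* "$nutch_dir" || return 1

    # 2.3: copy the top-level jars into lib/.
    # Assumption: create lib/ if it does not exist yet.
    mkdir -p "$nutch_dir/lib"
    ( cd "$nutch_dir" && cp -rf ./*.jar lib/ )
}
```

With the paths used on the page, the call would be `deploy_nutch /opt/hadoop /opt/nutch`. The subshell around `cd` keeps the caller's working directory unchanged.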