- Timestamp: Apr 26, 2009, 12:07:04 AM
- Author: waue
- Comment: --
| v25 | v26 | |
| 4 | 4 | #!html |
| 5 | 5 | <div style="text-align: center;"><big |
| 6 | | style="font-weight: bold;"><big><big>The Complete Nutch Guide</big></big></big></div> |
| 7 | | }}} |
| 8 | | |
| | 6 | style="font-weight: bold;"><big><big>Lab 7: Installing and Using Nutch</big></big></big></div> |
| | 7 | }}} |
| 9 | 8 | |
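| 10 | 9 | = Preface = |
| 11 | | * Although this has been tested before, and many people online have shared their successful experiences, the focus of this article is |
| 12 | | * A complete installation of Nutch, including a fix for garbled Chinese text |
| | 10 | * Having finished the previous labs, you already have some hands-on experience with Hadoop. You may still be wondering what Hadoop can actually be used for, so this lab introduces one Hadoop application: the search engine Nutch |
| | 11 | * This lab focuses on |
| | 12 | * A complete installation of Nutch |
| 13 | 13 | * Setting up Nutch from a Hadoop perspective |
| | 14 | * Fixing garbled Chinese text |
| 14 | 15 | * A search engine does not only find content inside web pages; it can also crawl files linked from those pages (such as PDF and MS Word) |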
| 15 | 16 | |