| | 17 | == Execution == |
| | 18 | |
| | 19 | === Upload urls === |
| | 20 | * bin/hadoop dfs -put urls urls |
| | 21 | {{{ |
| | 22 | log4j:ERROR setFile(null,true) call failed. |
| | 23 | java.io.FileNotFoundException: /tmp/NutchEZ/logs/hadoop.log (Permission denied) |
| | 24 | ...something message... |
| | 25 | log4j:ERROR Either File or DatePattern options are not set for appender [DRFA]. |
| | 26 | put: org.apache.hadoop.security.AccessControlException: Permission denied: user=nutchuser, access=WRITE, inode="":root:supergroup:rwxr-xr-x |
| | 27 | }}} |
| | 28 | * Temporarily switch to root for testing |
| | 29 | |
| | 30 | === Crawl === |
| | 31 | * bin/nutch crawl urls -dir search -threads 2 -depth 3 -topN 100000 |
| | 32 | |
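
The upload log above shows two separate permission problems: `nutchuser` cannot write the local log file `/tmp/NutchEZ/logs/hadoop.log`, and it cannot write to HDFS, whose root directory is owned by `root:supergroup`. Rather than switching to root for every run, one possible fix is to give `nutchuser` its own HDFS home directory and ownership of the log directory. The following is only a sketch under stated assumptions: that the commands are run from the Hadoop install directory by the account that owns HDFS (root, judging from the log), that the relative path `urls` resolves to `/user/nutchuser/urls`, and that changing ownership of `/tmp/NutchEZ/logs` is acceptable on this machine. The `run` helper and the `DRY_RUN` variable are hypothetical names introduced for illustration; by default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch: let nutchuser upload to HDFS and write the local Hadoop log.
# Assumption: run as the HDFS superuser (root in the log above),
# from the Hadoop install directory.

NUTCH_USER="${NUTCH_USER:-nutchuser}"
LOG_DIR="${LOG_DIR:-/tmp/NutchEZ/logs}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Create an HDFS home directory for the user and hand over ownership,
# so that "bin/hadoop dfs -put urls urls" no longer needs root.
run bin/hadoop dfs -mkdir "/user/$NUTCH_USER"
run bin/hadoop dfs -chown "$NUTCH_USER" "/user/$NUTCH_USER"

# Give the user write access to the local log directory
# (fixes the log4j FileNotFoundException).
run chown -R "$NUTCH_USER" "$LOG_DIR"
```

Running it with `DRY_RUN=0` as root would apply the changes; afterwards the upload and crawl commands can be retried as `nutchuser`.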