Changes between Version 8 and Version 9 of shunfa/2010/0524_NutchEZ_InstallTest


Timestamp:
Jun 10, 2010, 5:02:18 PM (14 years ago)
Author:
shunfa

  • shunfa/2010/0524_NutchEZ_InstallTest

    v8 v9  
    2828== Execution ==
    2929
    30 === Upload urls ===
    31  * bin/hadoop dfs -put urls urls
     30=== 2010/06/10 ===
    3231{{{
    33 log4j:ERROR setFile(null,true) call failed.
    34 java.io.FileNotFoundException: /tmp/NutchEZ/logs/hadoop.log (Permission denied)
    35 ...something message...
    36 log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
    37 put: org.apache.hadoop.security.AccessControlException: Permission denied: user=nutchuser, access=WRITE, inode="":root:supergroup:rwxr-xr-x
     3210/06/10 16:58:42 INFO mapred.JobClient: Task Id : attempt_201006091555_0003_r_000000_0, Status : FAILED
     33Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
     3410/06/10 16:58:53 INFO mapred.JobClient: Task Id : attempt_201006091555_0003_r_000000_1, Status : FAILED
     35Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
     3610/06/10 16:59:05 INFO mapred.JobClient: Task Id : attempt_201006091555_0003_r_000000_2, Status : FAILED
     37Shuffle Error: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
     38Exception in thread "main" java.io.IOException: Job failed!
     39        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
     40        at org.apache.nutch.crawl.Generator.generate(Generator.java:472)
     41        at org.apache.nutch.crawl.Generator.generate(Generator.java:409)
     42        at org.apache.nutch.crawl.Crawl.main(Crawl.java:116)
     43nutch crawl is error
    3844}}}
    39  * Temporarily switch to root for testing
    40 
    41 === Crawl ===
    42  * bin/nutch crawl urls -dir search -threads 2 -depth 3 -topN 100000
    43 
    44 == To-do ==
    45  * Runtime tests: crawling, file search, etc.
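
The log added in v9 (2010/06/10) shows three attempts of the same reduce task failing with a shuffle error before the job aborts. When such output is long, it can help to pull out only the failed attempt IDs. The following is a minimal sketch, not part of the original notes: the function name `failed_attempts` and the file name `crawl.log` are hypothetical, and it assumes the console output of `bin/nutch crawl` was saved to that file.

```shell
#!/bin/sh
# Hypothetical helper (not from the original notes).
# Reads Hadoop JobClient console output on stdin and prints the ID of
# every task attempt reported with "Status : FAILED", one per line.
failed_attempts() {
    sed -n 's/.*Task Id : \(attempt_[0-9A-Za-z_]*\), Status : FAILED.*/\1/p'
}
```

Usage, assuming the saved log file exists: `failed_attempts < crawl.log`. Lines that do not report a failed attempt, such as the stack trace, are ignored.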