- Timestamp: Oct 29, 2010, 3:26:50 PM (15 years ago)
- Author: waue
- Comment: --
v3 | v4 |
32 | 32 | {{{ |
33 | 33 | cd /opt/crawlzilla/nutch |
34 | | |
35 | 34 | }}} |
36 | 35 | |
… | … |
66 | 65 | * dedup 2: content by hash 100.00% |
67 | 66 | * dedup 3: delete from index(es) |
| 67 | |
| 68 | {{{ |
| 69 | #java |
| 70 | Usage: DeleteDuplicates <indexes> ... |
| 71 | }}} |
| 72 | |
68 | 73 | {{{ |
69 | 74 | /opt/crawlzilla/nutch/bin/nutch dedup /user/crawler/cw_yahoo_5/index |