Changes between Version 5 and Version 6 of waue/2009/0408


Timestamp:
Apr 8, 2009, 4:12:27 PM (15 years ago)
Author:
waue

  • waue/2009/0408

    v5  v6
    49  49                  <name>plugin.includes</name> <value>protocol-file|protocol-http|parse-(text|html)|index-basic|query-(basic|site|url)</value>
    50  50          </property>
        51
        52
        53  What is happening?
        54  By default, the size of the documents downloaded by Nutch is limited (to 65536 bytes). To allow Nutch to download larger files (via HTTP), modify nutch-site.xml and add an entry like this:
        55              <property>
        56                    <name>http.content.limit</name> <value>150000</value>
        57              </property>
        58        If you do not want to limit the size of downloaded documents, set http.content.limit to a negative value:
        59              <property>
        60                    <name>http.content.limit</name> <value>-1</value>
        61              </property>
        62
        63
    51  64  }}}
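
For context, the added property only takes effect when it sits inside the root element of nutch-site.xml. The following is a minimal sketch of a complete file, assuming the usual `<configuration>` wrapper and no other overrides; the XML declaration and the comments are illustrative and are not part of this revision.

```xml
<?xml version="1.0"?>
<!-- Sketch of a minimal nutch-site.xml; values here override nutch-default.xml. -->
<configuration>
  <property>
    <!-- Maximum number of bytes fetched per document over HTTP.
         The default is 65536; a negative value disables truncation. -->
    <name>http.content.limit</name>
    <value>-1</value>
  </property>
</configuration>
```

A positive value such as 150000 caps downloads at that many bytes instead; content beyond the limit is truncated rather than rejected, so very large pages may be parsed incompletely.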