* [https://issues.apache.org/jira/browse/NUTCH-427]

A. Introduction

The protocol-smb plugin allows you to crawl Microsoft Windows shares. It implements
the CIFS/SMB protocol, which is commonly used on Microsoft operating systems. The plugin
replicates the behaviour of protocol-file over the CIFS/SMB protocol. It uses the JCIFS
library and supports all of that library's properties.

You can find more information about JCIFS at: http://jcifs.samba.org/
The SMB URL syntax for crawling is as follows: smb://xxxxx (e.g. smb://server/share).

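To make the smb://server/share syntax concrete, the following minimal sketch splits such a URL into server, share and remaining path using only java.net.URI. The class SmbUrlSketch and its split method are hypothetical names made up for illustration; they are not part of protocol-smb or JCIFS, and the plugin may parse URLs differently.

```java
import java.net.URI;

/** Hypothetical helper for illustration only; not part of protocol-smb or JCIFS. */
final class SmbUrlSketch {

    /** Returns { server, share, pathInsideShare } for an smb:// URL. */
    static String[] split(String url) {
        URI uri = URI.create(url);
        if (!"smb".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not an smb:// URL: " + url);
        }
        // java.net.URI reports a null host when it cannot parse the
        // authority as a hostname, so that case is rejected here.
        String host = uri.getHost();
        if (host == null) {
            throw new IllegalArgumentException("missing or unparsable server name: " + url);
        }
        String path = uri.getPath() == null ? "" : uri.getPath();
        // Drop the leading slash, then cut at the first remaining slash:
        // the first segment is the share, the rest is the path inside it.
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        int slash = trimmed.indexOf('/');
        String share = slash < 0 ? trimmed : trimmed.substring(0, slash);
        String rest = slash < 0 ? "" : trimmed.substring(slash + 1);
        return new String[] { host, share, rest };
    }

    public static void main(String[] args) {
        String[] parts = split("smb://server/share/docs/report.txt");
        System.out.println(String.join(" | ", parts));
    }
}
```

Running main on the sample URL prints the three parts separated by " | ". Real share paths may contain spaces or other characters that must be percent-encoded before java.net.URI will accept them.
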
B. Installation

1) Binaries only:

The protocol-smb files can be found in the ../plugins directory.

Copy "protocol-smb" to the NUTCHHOME/build/plugins directory.

Put the "smb.properties" file in the NUTCHHOME/conf directory.

Configure the properties in the "smb.properties" file.

Enable the plugin by updating the "nutch-site.xml" file found in the NUTCHHOME/conf directory,

e.g. <property>
       <name>plugin.includes</name>
       <value>protocol-smb| other plugins...</value>
       <description>
       </description>
     </property>
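The plugin.includes value is a regular expression over plugin names, so a quick way to sanity-check an edited value is to test it against the ids you expect to load. The sketch below assumes the value must match a plugin id in full; the class PluginIncludesSketch, its isIncluded method and the sample value are hypothetical and only illustrate the idea.

```java
import java.util.regex.Pattern;

/** Hypothetical checker for illustration only; not part of Nutch. */
final class PluginIncludesSketch {

    /** True if the whole plugin id matches the plugin.includes expression. */
    static boolean isIncluded(String pluginIncludes, String pluginId) {
        // Assumption: the expression is applied as a full match, not a search.
        return Pattern.compile(pluginIncludes).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // Illustrative value; substitute the plugins your crawl actually needs.
        String value = "protocol-smb|urlfilter-regex|parse-(text|html)|index-basic";
        String[] ids = { "protocol-smb", "protocol-file", "parse-html" };
        for (String id : ids) {
            System.out.println(id + " -> " + isIncluded(value, id));
        }
    }
}
```

Under that full-match assumption, stray whitespace around a "|" becomes part of the neighbouring alternative, so "protocol-smb| parse-html" would not match "parse-html". It is safer to write the real value without spaces.
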