Plugins | |
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.indexer.basic | A basic indexing plugin. |
org.apache.nutch.indexer.more | A more indexing plugin. |
org.apache.nutch.parse.html | An HTML document parsing plugin. |
org.apache.nutch.parse.js | |
org.apache.nutch.parse.msword | A Word document parsing plugin. |
org.apache.nutch.parse.msword.chp | |
org.apache.nutch.parse.pdf | A PDF parsing plugin. |
org.apache.nutch.parse.text | A plain text parsing plugin. |
org.apache.nutch.protocol.file | Protocol plugin which supports retrieving local file resources. |
org.apache.nutch.protocol.ftp | Protocol plugin which supports retrieving documents via the FTP protocol. |
org.apache.nutch.protocol.http | Protocol plugin which supports retrieving documents via the HTTP protocol. |
org.apache.nutch.protocol.httpclient | Protocol plugin which supports retrieving documents via the HTTP protocol. |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Nutch is an open-source search engine.