Package net.nutch.tools

Interface Summary
PruneIndexTool.PruneChecker This interface can be used to implement additional checking on matching documents.
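
As an illustration of the "additional checking" idea, the following is a minimal, self-contained sketch and not the actual Nutch API. The names DocChecker, UrlRecordingChecker, ProtectedPrefixChecker and prune are hypothetical, documents are modelled as plain field maps, and the rule that a matching document is deleted only if every checker agrees is an assumption made for this example. Consult the PruneIndexTool.PruneChecker page for the real method signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class PruneCheckSketch {

    /** Hypothetical stand-in for a checker; NOT the real PruneIndexTool.PruneChecker signature. */
    interface DocChecker {
        /** Returns true if the matching document may be deleted. */
        boolean isPrunable(Map<String, String> fields);
    }

    /** Records the URL of every document it is asked about, then allows deletion. */
    static class UrlRecordingChecker implements DocChecker {
        final List<String> urls = new ArrayList<>();

        public boolean isPrunable(Map<String, String> fields) {
            urls.add(fields.get("url"));
            return true;
        }
    }

    /** Vetoes deletion of documents whose URL starts with a protected prefix. */
    static class ProtectedPrefixChecker implements DocChecker {
        private final String prefix;

        ProtectedPrefixChecker(String prefix) {
            this.prefix = prefix;
        }

        public boolean isPrunable(Map<String, String> fields) {
            String url = fields.get("url");
            return url == null || !url.startsWith(prefix);
        }
    }

    /**
     * Returns the matching documents that survive pruning.
     * Assumption: a match is deleted only if every checker returns true,
     * and checking stops at the first veto.
     */
    static List<Map<String, String>> prune(List<Map<String, String>> matches,
                                           List<DocChecker> checkers) {
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> doc : matches) {
            boolean delete = true;
            for (DocChecker checker : checkers) {
                if (!checker.isPrunable(doc)) {
                    delete = false;
                    break;
                }
            }
            if (!delete) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Map<String, String>> matches = List.of(
                Map.of("url", "http://example.com/spam"),
                Map.of("url", "http://example.org/keep"));
        UrlRecordingChecker recorder = new UrlRecordingChecker();
        List<Map<String, String>> kept = prune(matches,
                List.of(new ProtectedPrefixChecker("http://example.org/"), recorder));
        System.out.println("kept: " + kept.size()
                + ", recorded for deletion: " + recorder.urls);
    }
}
```

Placing the recording checker last means it only sees documents that earlier checkers have already approved for deletion, which mirrors the role described for PruneIndexTool.StoreUrlsChecker in the Class Summary.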
 

Class Summary
CrawlTool  
DistributedAnalysisTool DistributedAnalysisTool performs link analysis by reading exclusively from an IWebDBReader and writing to an IWebDBWriter.
FetchListTool This class takes an IWebDBReader, computes a relevant subset, and then emits the subset.
FetchListTool.SortableScore SortableScore is simply a WritableComparable Float.
LinkAnalysisTool LinkAnalysisTool performs link analysis by using the DistributedAnalysisTool.
ParseSegment Parse contents in one segment.
PruneIndexTool This tool prunes existing Nutch indexes of unwanted content.
PruneIndexTool.PrintFieldsChecker This checker prints selected field values from each document just before the document is deleted.
PruneIndexTool.StoreUrlsChecker This checker stores the URL of each document to be deleted in a text file.
SegmentMergeTool This class cleans up accumulated segment data and merges it into a single segment (or, optionally, multiple segments) containing no duplicates.
SegmentMergeTool.SegmentMergeStatus  
UpdateDatabaseTool This class takes the output of the fetcher and updates the page and link DBs accordingly.
WebDBAdminTool The WebDBAdminTool is for Nutch administrators who need special access to the webdb.

Copyright © 2005 The Nutch Organization.