Package net.nutch.net

Interface Summary
URLFilter Interface used to limit which URLs enter Nutch.
UrlNormalizer Interface used to convert URLs to normal form and, optionally, to perform regex substitutions.
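
The two roles described above, normalizing a URL and then deciding whether to admit it, can be sketched as follows. This is a minimal illustration, not the Nutch API: the nested Filter and Normalizer interfaces, their method signatures, the "return null to reject" convention, and the normalize-then-filter order are all assumptions made for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class UrlPipelineSketch {
    /** Hypothetical filter contract: return the URL to accept it, or null to reject it. */
    interface Filter {
        String filter(String url);
    }

    /** Hypothetical normalizer contract: return the URL in a canonical form. */
    interface Normalizer {
        String normalize(String url) throws URISyntaxException;
    }

    /** Example filter (an assumption): only admit http and https URLs. */
    static final Filter HTTP_ONLY = url ->
            (url.startsWith("http://") || url.startsWith("https://")) ? url : null;

    /** Example normalizer (an assumption): lowercase the scheme and host only. */
    static final Normalizer LOWERCASE_SCHEME_AND_HOST = url -> {
        URI u = new URI(url);
        if (u.getScheme() == null || u.getHost() == null) {
            return url; // nothing to normalize in this sketch
        }
        return new URI(
                u.getScheme().toLowerCase(Locale.ROOT),
                u.getUserInfo(),
                u.getHost().toLowerCase(Locale.ROOT),
                u.getPort(),
                u.getPath(),
                u.getQuery(),
                u.getFragment()).toString();
    };

    /** Normalizes first, then filters; malformed URLs are treated as rejected. */
    static String process(String url) {
        try {
            return HTTP_ONLY.filter(LOWERCASE_SCHEME_AND_HOST.normalize(url));
        } catch (URISyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(process("HTTP://Example.COM/Index.html")); // prints http://example.com/Index.html
        System.out.println(process("ftp://example.com/file"));        // prints null
    }
}
```

Normalizing before filtering means that filter rules only need to match the canonical form of a URL; whether Nutch applies the two steps in this order is not stated on this page.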
 

Class Summary
BasicUrlNormalizer Converts URLs to a normal form.
PrefixURLFilter Filters URLs based on a file of URL prefixes.
RegexURLFilter Filters URLs based on a file of regular expressions.
RegexUrlNormalizer Allows users to apply regex substitutions to any or all URLs that are encountered, which is useful for stripping session IDs from URLs.
URLFilterFactory Factory to create a URLFilter from the "urlfilter.class" config property.
UrlNormalizerFactory Factory to create a UrlNormalizer from the "urlnormalizer.class" config property.
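
As an illustration of the session-ID use case mentioned for RegexUrlNormalizer, the sketch below strips a ";jsessionid=..." path parameter with a single regex substitution. It uses only java.util.regex and does not reproduce the Nutch class or its rule-file format; the class name SessionIdStripper and the pattern itself are assumptions chosen for the example.

```java
import java.util.regex.Pattern;

final class SessionIdStripper {
    // Hypothetical rule: match ";jsessionid=<value>" up to (not including)
    // the start of the query string or fragment.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#]*", Pattern.CASE_INSENSITIVE);

    /** Returns the URL with any jsessionid path parameter removed. */
    static String normalize(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    public static void main(String[] args) {
        // prints http://example.com/cart?item=42
        System.out.println(normalize("http://example.com/cart;jsessionid=ABC123?item=42"));
    }
}
```

Removing the session ID lets two URLs that differ only in that parameter collapse to the same string, so the same page is not treated as several distinct URLs.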
 



Copyright © 2005 The Nutch Organization.