Packages that use IndexingException | |
---|---|
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.indexer | Maintain Lucene full-text indexes. |
org.apache.nutch.indexer.anchor | An indexing plugin for inbound anchor text. |
org.apache.nutch.indexer.basic | A basic indexing plugin. |
org.apache.nutch.indexer.feed | |
org.apache.nutch.indexer.more | A more indexing plugin. |
org.apache.nutch.indexer.subcollection | |
org.apache.nutch.indexer.tld | Top Level Domain Indexing plugin. |
org.apache.nutch.microformats.reltag | A microformats Rel-Tag Parser/Indexer/Querier plugin. |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Uses of IndexingException in org.apache.nutch.analysis.lang |
---|
Methods in org.apache.nutch.analysis.lang that throw IndexingException | |
---|---|
NutchDocument |
LanguageIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
|
Uses of IndexingException in org.apache.nutch.indexer |
---|
Methods in org.apache.nutch.indexer that throw IndexingException | |
---|---|
NutchDocument |
IndexingFilters.filter(NutchDocument doc,
String url,
WebPage page)
Run all defined filters. |
NutchDocument |
IndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
Adds fields or otherwise modifies the document that will be indexed for a parse. |
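
The two methods above describe a chain: each IndexingFilter adds to or modifies the document, and IndexingFilters runs every configured filter in turn. The sketch below illustrates that pattern only. None of its types are the real Nutch classes: `SketchDocument`, `SketchIndexingException`, `SketchFilter`, `SketchFilterChain` and `SketchDemo` are hypothetical stand-ins, the `WebPage` argument is omitted, and treating a `null` return as "skip this document" is an assumption, not something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for IndexingException (not the Nutch class).
class SketchIndexingException extends Exception {
    SketchIndexingException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for NutchDocument: named fields with multiple values.
class SketchDocument {
    final Map<String, List<String>> fields = new LinkedHashMap<>();

    void add(String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }
}

// Hypothetical stand-in for the IndexingFilter contract (WebPage omitted).
interface SketchFilter {
    SketchDocument filter(SketchDocument doc, String url) throws SketchIndexingException;
}

// Hypothetical stand-in for IndexingFilters: runs all filters in order.
class SketchFilterChain {
    private final List<SketchFilter> filters;

    SketchFilterChain(List<SketchFilter> filters) {
        this.filters = filters;
    }

    SketchDocument filter(SketchDocument doc, String url) throws SketchIndexingException {
        for (SketchFilter f : filters) {
            doc = f.filter(doc, url);
            // Assumption: a null result means the document should not be indexed.
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }
}

class SketchDemo {
    static SketchDocument run(String url) throws SketchIndexingException {
        // First filter adds a field; second rejects URLs it cannot handle.
        SketchFilter addUrl = (doc, u) -> {
            doc.add("url", u);
            return doc;
        };
        SketchFilter requireHttp = (doc, u) -> {
            if (!u.startsWith("http")) {
                throw new SketchIndexingException("unsupported URL: " + u);
            }
            return doc;
        };
        SketchFilterChain chain = new SketchFilterChain(Arrays.asList(addUrl, requireHttp));
        return chain.filter(new SketchDocument(), url);
    }
}
```

Calling `SketchDemo.run` with an `http` URL returns a document carrying a `url` field, while any other scheme surfaces as a checked exception that the caller must handle, which is the role IndexingException plays in the signatures listed on this page.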
Uses of IndexingException in org.apache.nutch.indexer.anchor |
---|
Methods in org.apache.nutch.indexer.anchor that throw IndexingException | |
---|---|
NutchDocument |
AnchorIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
The AnchorIndexingFilter filter object, which supports a boolean
configuration setting for the deduplication of anchors. |
Uses of IndexingException in org.apache.nutch.indexer.basic |
---|
Methods in org.apache.nutch.indexer.basic that throw IndexingException | |
---|---|
NutchDocument |
BasicIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
The BasicIndexingFilter filter object, which supports a configurable
maximum length, in characters, for the title;
see indexer.max.title.length in nutch-default.xml. |
Uses of IndexingException in org.apache.nutch.indexer.feed |
---|
Methods in org.apache.nutch.indexer.feed that throw IndexingException | |
---|---|
NutchDocument |
FeedIndexingFilter.filter(NutchDocument doc,
Parse parse,
org.apache.hadoop.io.Text url,
CrawlDatum datum,
Inlinks inlinks)
Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch
index. |
Uses of IndexingException in org.apache.nutch.indexer.more |
---|
Methods in org.apache.nutch.indexer.more that throw IndexingException | |
---|---|
NutchDocument |
MoreIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
|
Uses of IndexingException in org.apache.nutch.indexer.subcollection |
---|
Methods in org.apache.nutch.indexer.subcollection that throw IndexingException | |
---|---|
NutchDocument |
SubcollectionIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
|
Uses of IndexingException in org.apache.nutch.indexer.tld |
---|
Methods in org.apache.nutch.indexer.tld that throw IndexingException | |
---|---|
NutchDocument |
TLDIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
|
Uses of IndexingException in org.apache.nutch.microformats.reltag |
---|
Methods in org.apache.nutch.microformats.reltag that throw IndexingException | |
---|---|
NutchDocument |
RelTagIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
The RelTagIndexingFilter filter object. |
Uses of IndexingException in org.creativecommons.nutch |
---|
Methods in org.creativecommons.nutch that throw IndexingException | |
---|---|
NutchDocument |
CCIndexingFilter.filter(NutchDocument doc,
String url,
WebPage page)
|