Packages that use IndexingException | |
---|---|
org.apache.nutch.analysis.lang | Text document language identifier. |
org.apache.nutch.indexer | Maintain Lucene full-text indexes. |
org.apache.nutch.indexer.anchor | An indexing plugin for inbound anchor text. |
org.apache.nutch.indexer.basic | A basic indexing plugin. |
org.apache.nutch.indexer.feed | |
org.apache.nutch.indexer.metadata | |
org.apache.nutch.indexer.more | A more indexing plugin. |
org.apache.nutch.indexer.staticfield | A simple plugin called at indexing that adds fields with static data. |
org.apache.nutch.indexer.subcollection | |
org.apache.nutch.indexer.tld | Top Level Domain Indexing plugin. |
org.apache.nutch.indexer.urlmeta | URL Meta Tag Indexing plugin. |
org.apache.nutch.microformats.reltag | A microformats Rel-Tag Parser/Indexer/Querier plugin. |
org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Uses of IndexingException in org.apache.nutch.analysis.lang |
---|
Methods in org.apache.nutch.analysis.lang that throw IndexingException | |
---|---|
NutchDocument | LanguageIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer |
---|
Methods in org.apache.nutch.indexer that throw IndexingException | |
---|---|
NutchDocument | IndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Adds fields or otherwise modifies the document that will be indexed for a parse. |
NutchDocument | IndexingFilters.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Run all defined filters. |
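
The `filter` contract listed above (each filter adds fields or otherwise modifies the document, and `IndexingFilters` runs all defined filters) can be illustrated with a minimal, self-contained sketch. The types below (`Doc`, `Filter`, `runAll`, `describe`, and a local `IndexingException`) are simplified stand-ins invented for illustration, not the Nutch classes; the real method also receives `Parse`, `CrawlDatum`, and `Inlinks`. Treating a `null` return as "discard the document" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class IndexingFilterSketch {

    /** Stand-in for the checked exception; NOT org.apache.nutch.indexer.IndexingException. */
    static class IndexingException extends Exception {
        IndexingException(String message) { super(message); }
    }

    /** Stand-in for NutchDocument: field name mapped to its values. */
    static class Doc {
        final Map<String, List<String>> fields = new LinkedHashMap<>();
        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    /** Simplified filter contract (hypothetical): only the document and its URL. */
    interface Filter {
        Doc filter(Doc doc, String url) throws IndexingException;
    }

    /** Runs every filter in order; stops early if one returns null (assumed "discard"). */
    static Doc runAll(List<Filter> filters, Doc doc, String url) throws IndexingException {
        for (Filter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null;
            }
        }
        return doc;
    }

    /** Builds a three-filter chain and reports what happened to the document. */
    static String describe(String url) {
        Filter requireUrl = (doc, u) -> {
            if (u == null || u.isEmpty()) {
                throw new IndexingException("missing URL");
            }
            return doc;
        };
        Filter skipPrivate = (doc, u) -> u.contains("/private/") ? null : doc;
        Filter addUrlField = (doc, u) -> {
            doc.add("url", u);
            return doc;
        };
        try {
            Doc result = runAll(Arrays.asList(requireUrl, skipPrivate, addUrlField), new Doc(), url);
            return result == null ? "discarded" : "indexed " + result.fields;
        } catch (IndexingException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("http://example.org/"));
        System.out.println(describe("http://example.org/private/x"));
        System.out.println(describe(""));
    }
}
```

Declaring the exception on the filter interface lets a single failing filter abort the whole chain, while the caller decides whether to skip the document or fail the job.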
Uses of IndexingException in org.apache.nutch.indexer.anchor |
---|
Methods in org.apache.nutch.indexer.anchor that throw IndexingException | |
---|---|
NutchDocument | AnchorIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Indexes inbound anchor text; a boolean configuration setting controls whether duplicate anchors are removed. |
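
As a rough illustration of boolean-controlled anchor deduplication, the sketch below keeps only the first occurrence of each anchor when the flag is set. The class `AnchorDedupSketch`, the method `anchorsToIndex`, and the case-insensitive comparison are all assumptions made for this example; the source does not state how AnchorIndexingFilter compares anchors or what its configuration property is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class AnchorDedupSketch {

    /**
     * Returns the anchors to index. When deduplicate is true, an anchor is kept
     * only the first time its lower-cased text is seen (an assumed comparison rule).
     */
    static List<String> anchorsToIndex(List<String> inboundAnchors, boolean deduplicate) {
        List<String> kept = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (String anchor : inboundAnchors) {
            if (deduplicate && !seen.add(anchor.toLowerCase(Locale.ROOT))) {
                continue; // an equivalent anchor was already kept
            }
            kept.add(anchor);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> anchors = Arrays.asList("Nutch", "nutch", "Apache Nutch", "Nutch");
        System.out.println(anchorsToIndex(anchors, true));  // [Nutch, Apache Nutch]
        System.out.println(anchorsToIndex(anchors, false)); // [Nutch, nutch, Apache Nutch, Nutch]
    }
}
```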
Uses of IndexingException in org.apache.nutch.indexer.basic |
---|
Methods in org.apache.nutch.indexer.basic that throw IndexingException | |
---|---|
NutchDocument | BasicIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer.feed |
---|
Methods in org.apache.nutch.indexer.feed that throw IndexingException | |
---|---|
NutchDocument | FeedIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Extracts the relevant fields (FEED_AUTHOR, FEED_TAGS, FEED_PUBLISHED, FEED_UPDATED, FEED) and sends them to the Indexer for indexing within the Nutch index. |
Uses of IndexingException in org.apache.nutch.indexer.metadata |
---|
Methods in org.apache.nutch.indexer.metadata that throw IndexingException | |
---|---|
NutchDocument | MetadataIndexer.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer.more |
---|
Methods in org.apache.nutch.indexer.more that throw IndexingException | |
---|---|
NutchDocument | MoreIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer.staticfield |
---|
Methods in org.apache.nutch.indexer.staticfield that throw IndexingException | |
---|---|
NutchDocument | StaticFieldIndexer.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer.subcollection |
---|
Methods in org.apache.nutch.indexer.subcollection that throw IndexingException | |
---|---|
NutchDocument | SubcollectionIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer.tld |
---|
Methods in org.apache.nutch.indexer.tld that throw IndexingException | |
---|---|
NutchDocument | TLDIndexingFilter.filter(NutchDocument doc, Parse parse, Text urlText, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.apache.nutch.indexer.urlmeta |
---|
Methods in org.apache.nutch.indexer.urlmeta that throw IndexingException | |
---|---|
NutchDocument | URLMetaIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) Takes the metatags listed in the "urlmeta.tags" property and looks for them inside the CrawlDatum object. |
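
The lookup described for `URLMetaIndexingFilter` can be sketched roughly as follows: read the tag names from the property value, then copy each tag found in the datum's metadata into the fields to index. `UrlMetaSketch` and `fieldsFor` are hypothetical names, plain `Map`s stand in for the `CrawlDatum` metadata and the document, and the delimiter (commas and/or whitespace) is an assumption rather than something the source specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UrlMetaSketch {

    /** Copies each listed tag that is present in the datum metadata into the index fields. */
    static Map<String, String> fieldsFor(String tagsProperty, Map<String, String> datumMetadata) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Assumption: tag names are separated by commas and/or whitespace.
        for (String tag : tagsProperty.trim().split("[,\\s]+")) {
            String value = datumMetadata.get(tag);
            if (!tag.isEmpty() && value != null) {
                fields.put(tag, value); // tag is configured and present on the datum
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("author", "jane");
        metadata.put("lang", "en");
        // "topic" is configured but absent; "lang" is present but not configured.
        System.out.println(fieldsFor("author, topic", metadata)); // {author=jane}
    }
}
```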
Uses of IndexingException in org.apache.nutch.microformats.reltag |
---|
Methods in org.apache.nutch.microformats.reltag that throw IndexingException | |
---|---|
NutchDocument | RelTagIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |
Uses of IndexingException in org.creativecommons.nutch |
---|
Methods in org.creativecommons.nutch that throw IndexingException | |
---|---|
NutchDocument | CCIndexingFilter.filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) |