InputFormat.
OutputFormat.
item, with the supplied priority.
String can be decoded in reverse and the first character is represented by a terminal node.
String can be decoded and the last character is represented by a terminal node.
Writable class.
OnlineClusterer extension using clustering components of the Carrot2 project (http://carrot2.sourceforge.net).
param, to the IPC server running at address, returning the value.
HitDetails objects) and their previously extracted summaries (Strings).
OnlineClusterer for documentation.
JobConf.
true if item exists in this FibonacciHeap, false otherwise.
application/octet-stream MimeType
Thread.setDaemon(boolean) with true.
DataInput implementation that reads from an in-memory buffer.
DataOutput implementation that writes to an in-memory buffer.
priority value associated with item.
WritableComparable implementation.
Extension is a kind of listener descriptor that will be installed on a concrete ExtensionPoint that acts as a kind of Publisher.
ExtensionPoint provides meta information about an extension point.
o is a FloatWritable with the same value.
o is an IntWritable with the same value.
o is a LongWritable with the same value.
o is an MD5Hash whose digest contains the same values.
o is a UTF8 with the same contents.
HitSummarizer and HitContent for a set of fetched segments.
FibonacciHeap.
nth value in the file.
key.
WritableComparable implementation.
InputFormat.
OutputFormat.
name property, or null if no such property exists.
name property.
name property as a boolean.
name property as a Class.
name property as a Class.
null otherwise.
name.
name.
ith field.
name property as a float.
ith hit in this list.
PluginRepository
name property as an integer.
name property as a long.
robotsMeta to appropriate values, based on any META tags found under the given node.
NutchFileSystem.getNamed(String).
Outlink from given plain text.
Outlink from given plain text and adds anchor to the extracted Outlinks
node, and creates appropriate Outlink records for each (relative to the supplied base URL), and adds them to the outlinks ArrayList.
Parser implementation given a content type and url.
Object.hashCode() to partition.
Plugin class.
null.
MapOutputProtocol connections.
Protocol implementation for a url.
Content for a url.
Content for a fetchlist entry.
RecordReader for a FileSplit.
RecordWriter.
name property as an array of strings.
StringBuffer and a DOM Node, and will append all the content text found beneath the DOM node to the StringBuffer.
getText(sb, node, false).
StringBuffer and a DOM Node, and will append the content text found beneath the first title node to the StringBuffer.
ith field.
HtmlParseFilter that looks for possible indications of content language.
Object.hashCode().
RawCluster interface to HitsCluster interface.
HtmlParseFilter implementing plugins.
true if this LogFormatter has logged something at Level.SEVERE
Searcher and HitDetailer for either a single merged index, or for a set of individual segment indexes.
IndexingFilter implementing plugins.
InputFormat.
InputFormats.
Mapper that swaps keys and values.
false if the robots.txt file prohibits us from accessing the given path, or true otherwise.
false if the robots.txt file prohibits us from accessing the given path, or true otherwise.
true if this cluster contains documents that did not fit anywhere else (presentation layer may discard such clusters).
ArrayFile.Reader.seek(long), ArrayFile.Reader.next(Writable), or ArrayFile.Reader.get(long,Writable).
IndexingFilter that adds a lang (language) field to the document.
QueryFilter that handles "lang:" query clauses.
Reducer that sums long values.
s padded with leading spaces so that its length is length.
input that is matched,
or null if no match exists.
- longestMatch(String) -
Method in class org.apache.nutch.util.SuffixStringMatcher
- Returns the longest suffix of
input that is matched,
or null if no match exists.
- longestMatch(String) -
Method in class org.apache.nutch.util.TrieStringMatcher
- Returns the longest substring of
input that is
matched by a pattern in the trie, or null if no match
exists.
- lookingAhead -
Variable in class org.apache.nutch.analysis.NutchAnalysis
-
- ls(String) -
Method in class org.apache.nutch.fs.TestClient
- Get a listing of all files in NDFS at the indicated name
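The longestMatch(String) entries above describe returning the longest stored pattern that matches the input, or null if none does. A minimal sketch of that idea for prefixes, assuming a simple character trie with terminal-node flags. The class PrefixTrieSketch and its internals are hypothetical illustrations, not the Nutch TrieStringMatcher source.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not the org.apache.nutch.util implementation. */
public class PrefixTrieSketch {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if a stored prefix ends at this node
    }

    private final Node root = new Node();

    /** Builds a trie containing every supplied prefix. */
    public PrefixTrieSketch(String... prefixes) {
        for (String p : prefixes) {
            Node n = root;
            for (int i = 0; i < p.length(); i++) {
                n = n.children.computeIfAbsent(p.charAt(i), c -> new Node());
            }
            n.terminal = true;
        }
    }

    /** Returns the longest stored prefix of input, or null if no prefix matches. */
    public String longestMatch(String input) {
        Node n = root;
        int best = -1; // length of the longest match seen so far
        for (int i = 0; i < input.length(); i++) {
            n = n.children.get(input.charAt(i));
            if (n == null) {
                break; // no stored prefix continues with this character
            }
            if (n.terminal) {
                best = i + 1;
            }
        }
        return best < 0 ? null : input.substring(0, best);
    }

    public static void main(String[] args) {
        PrefixTrieSketch m = new PrefixTrieSketch("http", "http://", "ftp");
        System.out.println(m.longestMatch("http://example.org/")); // prints "http://"
        System.out.println(m.longestMatch("mailto:a@b"));          // prints "null"
    }
}
```

Walking the trie once and remembering the last terminal node keeps the lookup proportional to the input length, regardless of how many prefixes are stored.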
MapOutputProtocol.
InterTrackerProtocol.
Properties which allows multiple values for a single key.
TrieStringMatcher.TrieNode visited, given that you are at node, and the next character in the input is the idx'th character of s.
String is matched by a prefix in the trie
String is matched by a suffix in the trie
String is matched by a pattern in the trie
NFSInputStream in a DataInputStream and buffers input through a BufferedInputStream.
NFSOutputStream in a DataOutputStream and buffers output through a BufferedOutputStream.
RawDocument for Carrot2.
summary and wrapping a details hit details.
WritableComparable instance.
key and val.
key, skipping its value.
key and val.
buffer.
key.
OnlineClusterer extensions.
Ontology extensions.
Outlinks / URLs from plain text using Regular Expressions.
Mapper and Reducer implementations to collect output data.
OutputFormats.
Page.readFields(DataInput).
Protocol implementation.
Parser plugins.
PluginClassLoader contains only classes of the runtime libraries set up in the plugin manifest file and the exported libraries of required plugins.
PluginDescriptor provides access to all meta information of a nutch-plugin, as well as to the internationalizable resources and the plugin's own classloader.
PluginManifestParser just parses the manifest file in all plugin directories.
PluginRuntimeException will be thrown when an exception occurs in plugin management.
Strings against a set of prefixes.
PrefixStringMatcher which will match Strings with any prefix in the supplied array.
PrefixStringMatcher which will match Strings with any prefix in the supplied Collection.
Protocol plugins.
FibonacciHeap.popMin() would, without removing it.
QueryFilter implementing plugins.
FileSplit.
Mapper that extracts text matching a regular expression.
robots.txt files.
RobotRulesParser which will use the supplied robotNames when choosing which stanza to follow in robots.txt files.
robots.txt files.
RobotRulesParser which will use the supplied robotNames when choosing which stanza to follow in robots.txt files.
in.
false.
s padded with trailing spaces so that its length is length.
WritableComparator.
InputFormat for plain text files.
Strings against a set of suffixes.
SuffixStringMatcher which will match Strings with any suffix in the supplied array.
SuffixStringMatcher which will match Strings with any suffix in the supplied Collection
nth value.
name property.
baseHref.
name property to the name of a class.
name property to an integer.
noCache to true.
noFollow to true.
noIndex to true.
refresh to the supplied value.
refreshHref.
refreshTime.
Hits.totalIsExact().
input that is matched,
or null if no match exists.
- shortestMatch(String) -
Method in class org.apache.nutch.util.SuffixStringMatcher
- Returns the shortest suffix of
input that is matched,
or null if no match exists.
- shortestMatch(String) -
Method in class org.apache.nutch.util.TrieStringMatcher
- Returns the shortest substring of
input that is
matched by a pattern in the trie, or null if no match
exists.
- showTime(boolean) -
Static method in class org.apache.nutch.util.LogFormatter
- When true, time is logged with each entry.
- shutDown() -
Method in class org.apache.nutch.plugin.Plugin
- Shutdown the plugin.
- shutdown() -
Method in class org.apache.nutch.util.ThreadPool
- Turn off the pool.
- size -
Variable in class org.apache.nutch.segment.SegmentReader
-
- size -
Variable in class org.apache.nutch.segment.SegmentWriter
-
- size() -
Method in class org.apache.nutch.util.FibonacciHeap
- Returns the number of objects in the heap.
- skip(DataInput) -
Static method in class org.apache.nutch.io.UTF8
- Skips over one UTF8 in the input.
- skip(DataInput) -
Static method in class org.apache.nutch.parse.Outlink
- Skips over one Outlink in the input.
- skipCompressedByteArray(DataInput) -
Static method in class org.apache.nutch.io.WritableUtils
-
- skippedEntity(String) -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Receive notification of a skipped entity.
- sort(String, String) -
Method in class org.apache.nutch.io.SequenceFile.Sorter
- Perform a file sort.
- sort() -
Method in class org.apache.nutch.tools.ParseSegment
- Sort ParserOutput
- specialConstructor -
Variable in class org.apache.nutch.quality.dynamic.ParseException
- This variable determines which constructor was used to create
this object and thereby affects the semantics of the
"getMessage" method (see below).
- specialToken -
Variable in class org.apache.nutch.quality.dynamic.Token
- This field is used to access special tokens that occur prior to this
token, but after the immediately preceding regular (non-special) token.
- stage -
Variable in class org.apache.nutch.tools.SegmentMergeTool.SegmentMergeStatus
-
- stages -
Static variable in class org.apache.nutch.tools.SegmentMergeTool.SegmentMergeStatus
-
- start() -
Method in class org.apache.nutch.ipc.Server
- Starts the service.
- start() -
Method in class org.apache.nutch.mapReduce.JobTrackerInfoServer
- Launch the HTTP server
- startBlock(Block) -
Method in class org.apache.nutch.ndfs.FSDataset
- A Block b will be coming soon!
- startCDATA() -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Report the start of a CDATA section.
- startDTD(String, String, String) -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Report the start of DTD declarations, if any.
- startDocument() -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Receive notification of the beginning of a document.
- startElement(String, String, String, Attributes) -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Receive notification of the beginning of an element.
- startEntity(String) -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Report the beginning of an entity.
- startFile(UTF8, UTF8, boolean) -
Method in class org.apache.nutch.ndfs.FSNamesystem
- The client would like to create a new block for the indicated
filename.
- startLocalInput(File, File) -
Method in class org.apache.nutch.fs.LocalFileSystem
- We can read directly from the real local fs.
- startLocalInput(File, File) -
Method in class org.apache.nutch.fs.NDFSFileSystem
- Fetch remote NDFS file, place at tmpLocalFile
- startLocalInput(File, File) -
Method in class org.apache.nutch.fs.NutchFileSystem
- Returns a local File that the user can read from.
- startLocalOutput(File, File) -
Method in class org.apache.nutch.fs.LocalFileSystem
- We can write output directly to the final location
- startLocalOutput(File, File) -
Method in class org.apache.nutch.fs.NDFSFileSystem
- Output will go to the tmp working area.
- startLocalOutput(File, File) -
Method in class org.apache.nutch.fs.NutchFileSystem
- Returns a local File that the user can write output to.
- startPrefixMapping(String, String) -
Method in class org.apache.nutch.parse.html.DOMBuilder
- Begin the scope of a prefix-URI Namespace mapping.
- startProcessing(RequestContext) -
Method in class org.apache.nutch.clustering.carrot2.LocalNutchInputComponent
- A callback hook that starts the processing.
- startTime -
Variable in class org.apache.nutch.tools.SegmentMergeTool.SegmentMergeStatus
-
- startUp() -
Method in class org.apache.nutch.plugin.Plugin
- Will be invoked when the plugin starts up.
- started -
Variable in class org.apache.nutch.segment.SegmentReader
- The time when fetching of this segment started, as recorded
in fetcher output data.
- staticFlag -
Static variable in class org.apache.nutch.quality.dynamic.SimpleCharStream
-
- status() -
Method in class org.apache.nutch.fetcher.Fetcher
- Display the status of the fetcher run.
- status() -
Method in class org.apache.nutch.tools.ParseSegment
- Display the status of the parser run.
- stop() -
Method in class org.apache.nutch.ipc.Client
- Stop all threads related to this client.
- stop() -
Method in class org.apache.nutch.ipc.Server
- Stops the service.
- stop() -
Method in class org.apache.nutch.mapReduce.JobTrackerInfoServer
- Stop the HTTP server
- subclasses(String) -
Method in interface org.apache.nutch.ontology.Ontology
-
- subclasses(String) -
Method in class org.apache.nutch.ontology.OntologyImpl
- retrieve all subclasses of entity(ies) hashed to searchTerm
- submitJob(String) -
Method in class org.apache.nutch.mapReduce.JobClient
- Submit a job to the MR system
- submitJob(JobConf) -
Method in class org.apache.nutch.mapReduce.JobClient
- Submit a job to the MR system
- submitJob(String) -
Method in interface org.apache.nutch.mapReduce.JobSubmissionProtocol
- Submit a Job for execution.
- submitJob(String) -
Method in class org.apache.nutch.mapReduce.JobTracker
-
- success() -
Method in class org.apache.nutch.ndfs.FSResults
- Whether the call worked.
- sync(long) -
Method in class org.apache.nutch.io.SequenceFile.Reader
- Seek to the next sync mark past a given position.
- syncSeen() -
Method in class org.apache.nutch.io.SequenceFile.Reader
- Returns true iff the previous call to next passed a sync mark.
- synonyms(String) -
Method in interface org.apache.nutch.ontology.Ontology
-
- synonyms(String) -
Method in class org.apache.nutch.ontology.OntologyImpl
- retrieves synonyms from wordnet via sweet's web interface
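The shortestMatch(String) entries above describe returning the shortest suffix of the input that is matched, or null if no match exists. A minimal sketch of that contract, assuming a plain list scan instead of a trie. The class SuffixMatchSketch and its method signature are hypothetical and are not taken from the Nutch SuffixStringMatcher API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not the org.apache.nutch.util implementation. */
public class SuffixMatchSketch {

    /** Returns the shortest suffix of input that appears in suffixes, or null. */
    public static String shortestMatch(List<String> suffixes, String input) {
        // Try candidates from the shortest (1 character) to the longest (whole input),
        // so the first hit is the shortest matching suffix.
        for (int len = 1; len <= input.length(); len++) {
            String candidate = input.substring(input.length() - len);
            if (suffixes.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> suffixes = Arrays.asList(".org", "apache.org");
        System.out.println(shortestMatch(suffixes, "lucene.apache.org")); // prints ".org"
        System.out.println(shortestMatch(suffixes, "example.com"));       // prints "null"
    }
}
```

The scan is quadratic in the worst case. A trie built over reversed strings, as the index entries for reverse decoding suggest, would avoid re-checking each candidate against the whole list.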
InputFormat for plain text files.
Mapper that maps text values into
Hits.getTotal() gives the exact number of hits, or false if it is only an estimate of the total number of hits.
URLFilter implementing plugins.
sizeLimit bytes, if necessary.
VersionedWritable.readFields(DataInput) when the version of an object being read does not match the current implementation version as returned by VersionedWritable.getVersion().
DataInput and DataOutput.
Writable and Comparable.
WritableComparables.
WritableComparable implementation.
out.