Package | Description |
---|---|
org.apache.any23.extractor |
This package contains classes and interfaces modeling the
Extractor API. |
org.apache.any23.extractor.html |
All the various
Extractors needed to distill RDF
from Microformats in HTML pages are contained in this package. |
Class and Description |
---|
MicroformatExtractor
The abstract base class for any
Microformat specification extractor.
|
Class and Description |
---|
AdrExtractor
Extractor for the adr
microformat.
|
DocumentReport
Represents the validation report generated by
the
TagSoupParser when a document
is retrieved and validated. |
EntityBasedMicroformatExtractor
Base class for microformat extractors based on entities.
|
GeoExtractor
Extractor for the Geo
microformat.
|
HCalendarExtractor
Extractor for the hCalendar
microformat.
|
HCardExtractor
Extractor for the hCard
microformat.
|
HeadLinkExtractor
This
Extractor.TagSoupDOMExtractor implementation
retrieves the LINKs declared within the HTML/HEAD page header. |
HListingExtractor
Extractor for the hListing
microformat.
|
HRecipeExtractor
Extractor for the hRecipe
microformat.
|
HResumeExtractor
Extractor for the hResume
microformat.
|
HReviewAggregateExtractor
Extractor for the hReview-aggregate
microformat.
|
HReviewExtractor
Extractor for the hReview
microformat.
|
HTMLDocument
A wrapper around the DOM representation of an HTML document.
|
HTMLDocument.TextField
This class represents a text extracted from the HTML DOM, related
to the node from which such text has been retrieved.
|
HTMLMetaExtractor
This extractor represents the HTML META tag values
according to the HTML4 specification.
|
ICBMExtractor
Extractor for "ICBM coordinates" provided as META headers in the head
of an HTML page.
|
LicenseExtractor
Extractor for the rel-license
microformat.
|
MicroformatExtractor
The abstract base class for any
Microformat specification extractor.
|
SpeciesExtractor
Extractor able to extract the Species Microformat.
|
TitleExtractor
Extracts the value of the <title> element of an
HTML or XHTML page.
|
TurtleHTMLExtractor
Extractor for Turtle/N3 format embedded within HTML
script tags.
|
XFNExtractor
Extractor for the XFN
microformat.
|
Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.