public class MicrodataParser extends Object
Modifier and Type | Field and Description |
---|---|
static Set<String> | HREF_TAGS: List of tags providing the href property. |
static String | ITEMPROP_ATTRIBUTE |
static String | ITEMSCOPE_ATTRIBUTE |
static Set<String> | SRC_TAGS: List of tags providing the src property. |
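
To illustrate how tag sets such as HREF_TAGS and SRC_TAGS could be used when reading an itemprop value, here is a minimal sketch built only on the JDK DOM API. It is not the Any23 implementation: the class `PropertyValueSketch`, the method `readValue`, and the contents of both sets are assumptions (the tag names follow the HTML Microdata rules for URL-valued properties), and other cases such as `meta` or `time` are left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class PropertyValueSketch {

    // Assumed contents, based on the HTML Microdata rules; not read from the Any23 source.
    static final Set<String> HREF_TAGS =
            new HashSet<>(Arrays.asList("a", "area", "link"));
    static final Set<String> SRC_TAGS =
            new HashSet<>(Arrays.asList("audio", "embed", "iframe", "img", "source", "track", "video"));

    // Hypothetical helper: creates an empty DOM document without checked exceptions.
    static Document emptyDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Uses href or src for URL-bearing tags, and the text content for everything else.
    static String readValue(Element element) {
        String tag = element.getTagName().toLowerCase(Locale.ROOT);
        if (HREF_TAGS.contains(tag)) {
            return element.getAttribute("href");
        }
        if (SRC_TAGS.contains(tag)) {
            return element.getAttribute("src");
        }
        return element.getTextContent();
    }

    public static void main(String[] args) {
        Document doc = emptyDocument();
        Element link = doc.createElement("a");
        link.setAttribute("href", "http://example.org/");
        link.setTextContent("Example");
        System.out.println(readValue(link));
    }
}
```

Keeping the tag names in sets makes the lookup a constant-time membership test, which is presumably why the fields are declared as Set<String> rather than as lists.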
Constructor and Description |
---|
MicrodataParser(Document document) |
Modifier and Type | Method and Description |
---|---|
ItemProp[] | deferProperties(String... refs): Given a document and a list of itemprop names, this method returns those itemprops. |
org.apache.any23.extractor.microdata.MicrodataParser.ErrorMode | getErrorMode() |
MicrodataParserException[] | getErrors() |
static List<Node> | getItemPropNodes(Node node): Returns all the itemProps detected within the given root node. |
List<ItemProp> | getItemProps(Node scopeNode, boolean skipRoot): Returns all the itemprops for the given itemscope node. |
ItemScope | getItemScope(Node node): Returns the ItemScope instance described within the specified node. |
static List<Node> | getItemScopeNodes(Node node): Returns all the itemScopes detected within the given root node. |
static MicrodataParserReport | getMicrodata(Document document): Returns all the Microdata items detected within the given document, works in full report mode. |
static MicrodataParserReport | getMicrodata(Document document, org.apache.any23.extractor.microdata.MicrodataParser.ErrorMode errorMode): Returns all the Microdata items detected within the given document. |
static void | getMicrodataAsJSON(Document document, PrintStream ps): Writes to the given stream a JSON document containing the list of all extracted Microdata, as described in the Microdata JSON Specification. |
ItemPropValue | getPropertyValue(Node node): Reads the value of an itemprop node. |
static List<Node> | getTopLevelItemScopeNodes(Node node): Returns only the itemScopes that are top level items. |
static boolean | isItemProp(Node node): Check whether a node is an itemProp. |
static boolean | isItemScope(Node node): Check whether a node is an itemScope. |
void | setErrorMode(org.apache.any23.extractor.microdata.MicrodataParser.ErrorMode errorMode) |
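
The node-detection helpers in the table above (isItemScope, isItemProp, getItemScopeNodes, getTopLevelItemScopeNodes) can be pictured with the following sketch, which uses only the JDK DOM parser. It is an illustration under stated assumptions, not the Any23 code: the class `ItemScopeSketch` and all of its methods are hypothetical, detection is assumed to be a plain attribute check, and "top level" is taken in the HTML Microdata sense of an itemscope element that carries no itemprop attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ItemScopeSketch {

    // Parses a well-formed XHTML fragment into a DOM document.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed: " + xhtml, e);
        }
    }

    // True if the node has an attribute map (i.e. is an element) containing the attribute.
    static boolean hasAttribute(Node node, String name) {
        return node.getAttributes() != null
                && node.getAttributes().getNamedItem(name) != null;
    }

    static boolean isItemScope(Node node) {
        return hasAttribute(node, "itemscope");
    }

    static boolean isItemProp(Node node) {
        return hasAttribute(node, "itemprop");
    }

    // Depth-first collection of every itemscope node at or below the root.
    static List<Node> itemScopeNodes(Node root) {
        List<Node> found = new ArrayList<>();
        collect(root, found);
        return found;
    }

    private static void collect(Node node, List<Node> found) {
        if (isItemScope(node)) {
            found.add(node);
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), found);
        }
    }

    // Assumption: a top-level item is an itemscope node that is not itself an itemprop.
    static List<Node> topLevelItemScopeNodes(Node root) {
        List<Node> top = new ArrayList<>();
        for (Node scope : itemScopeNodes(root)) {
            if (!isItemProp(scope)) {
                top.add(scope);
            }
        }
        return top;
    }

    public static void main(String[] args) {
        Document doc = parse("<div itemscope=\"\"><span itemprop=\"name\">Any23</span></div>");
        System.out.println(topLevelItemScopeNodes(doc).size());
    }
}
```

A nested item such as an address inside a person is both an itemscope and an itemprop, so it shows up in the full list but is filtered out of the top-level list.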
public static final String ITEMSCOPE_ATTRIBUTE
public static final String ITEMPROP_ATTRIBUTE
public MicrodataParser(Document document)
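public static List<Node> getItemScopeNodes(Node node)
Returns all the itemScopes detected within the given root node.
Parameters:
node - root node to search in.

public static boolean isItemScope(Node node)
Check whether a node is an itemScope.
Parameters:
node - node to check.
Returns:
true if the node is an itemScope, false otherwise.

public static List<Node> getItemPropNodes(Node node)
Returns all the itemProps detected within the given root node.
Parameters:
node - root node to search in.

public static boolean isItemProp(Node node)
Check whether a node is an itemProp.
Parameters:
node - node to check.
Returns:
true if the node is an itemProp, false otherwise.

public static List<Node> getTopLevelItemScopeNodes(Node node)
Returns only the itemScopes that are top level items.
Parameters:
node - root node to search in.

public static MicrodataParserReport getMicrodata(Document document, org.apache.any23.extractor.microdata.MicrodataParser.ErrorMode errorMode) throws MicrodataParserException
Returns all the Microdata items detected within the given document.
Parameters:
document - document to be processed.
errorMode - error management policy.
Throws:
MicrodataParserException - if errorMode == MicrodataParser.ErrorMode.StopAtFirstError and an error occurs.

public static MicrodataParserReport getMicrodata(Document document)
Returns all the Microdata items detected within the given document, works in full report mode.
Parameters:
document - document to be processed.

public static void getMicrodataAsJSON(Document document, PrintStream ps)
Writes to the given stream a JSON document containing the list of all extracted Microdata, as described in the Microdata JSON Specification.
Parameters:
document - document to be processed.
ps - print stream the JSON is written to.

public void setErrorMode(org.apache.any23.extractor.microdata.MicrodataParser.ErrorMode errorMode)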
public org.apache.any23.extractor.microdata.MicrodataParser.ErrorMode getErrorMode()
public MicrodataParserException[] getErrors()
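
The three error-related members above suggest a policy in which the parser either stops at the first problem or records problems and continues. The sketch below shows one way such a policy could be structured. Everything in it is hypothetical: the class `ErrorModeSketch`, the exception `ParseProblem` (a stand-in for MicrodataParserException), the method `manageError`, and the constant name `FullReport` are assumptions; only `StopAtFirstError` and the "full report mode" wording come from this page.

```java
import java.util.ArrayList;
import java.util.List;

class ErrorModeSketch {

    // Hypothetical stand-in for the parser's error policy; FullReport is an assumed name.
    enum ErrorMode { StopAtFirstError, FullReport }

    // Hypothetical stand-in for MicrodataParserException.
    static class ParseProblem extends Exception {
        ParseProblem(String message) {
            super(message);
        }
    }

    private final List<ParseProblem> errors = new ArrayList<>();
    private ErrorMode errorMode = ErrorMode.FullReport;

    void setErrorMode(ErrorMode errorMode) {
        this.errorMode = errorMode;
    }

    ErrorMode getErrorMode() {
        return errorMode;
    }

    // Returns a snapshot of the problems recorded so far.
    ParseProblem[] getErrors() {
        return errors.toArray(new ParseProblem[0]);
    }

    // Either aborts immediately or records the problem so that parsing can continue.
    void manageError(ParseProblem problem) throws ParseProblem {
        if (errorMode == ErrorMode.StopAtFirstError) {
            throw problem;
        }
        errors.add(problem);
    }

    public static void main(String[] args) throws ParseProblem {
        ErrorModeSketch parser = new ErrorModeSketch();
        parser.manageError(new ParseProblem("property name is missing"));
        System.out.println(parser.getErrors().length);
    }
}
```

In this arrangement a caller that wants a complete diagnosis keeps the report mode and inspects the recorded errors afterwards, while a caller that needs fail-fast behaviour switches the mode and handles the exception.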
public ItemPropValue getPropertyValue(Node node) throws MicrodataParserException
Reads the value of an itemprop node.
Parameters:
node - itemprop node.
Throws:
MicrodataParserException - if an error occurs while extracting a nested item scope.

public List<ItemProp> getItemProps(Node scopeNode, boolean skipRoot) throws MicrodataParserException
Returns all the itemprops for the given itemscope node.
Parameters:
scopeNode - node representing the itemscope.
skipRoot - if true, the given root node will not be read as a property, even if it contains the itemprop attribute.
Throws:
MicrodataParserException - if an error occurs while retrieving a property value.

public ItemProp[] deferProperties(String... refs) throws MicrodataParserException
Given a document and a list of itemprop names, this method returns those itemprops.
Parameters:
refs - list of references.
Throws:
MicrodataParserException - if a loop is detected or a property name is missing.

public ItemScope getItemScope(Node node) throws MicrodataParserException
Returns the ItemScope instance described within the specified node.
Parameters:
node - node describing an itemscope.
Throws:
MicrodataParserException - if an error occurs while dereferencing properties.

Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.