public class RDFaExtractor extends Object implements Extractor.TagSoupDOMExtractor
Nested classes/interfaces inherited from interface Extractor: Extractor.BlindExtractor, Extractor.ContentExtractor, Extractor.TagSoupDOMExtractor
Modifier and Type | Field and Description |
---|---|
static String | NAME |
static String | xsltFilename |
Constructor and Description |
---|
RDFaExtractor() - Default constructor: data types are not verified and parsing does not stop at the first error. |
RDFaExtractor(boolean verifyDataType, boolean stopAtFirstError) - Constructor that allows the validation and error-handling policies to be specified. |
Modifier and Type | Method and Description |
---|---|
ExtractorDescription | getDescription() |
static XSLTStylesheet | getXSLT() - Returns an XSLTStylesheet able to distill RDFa from HTML pages. |
boolean | isStopAtFirstError() |
boolean | isVerifyDataType() |
void | run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out) |
void | setStopAtFirstError(boolean stopAtFirstError) |
void | setVerifyDataType(boolean verifyDataType) |
public static final String NAME
public static final String xsltFilename
public RDFaExtractor(boolean verifyDataType, boolean stopAtFirstError)
Parameters:
verifyDataType - if true the data types will be verified, if false they will be ignored.
stopAtFirstError - if true the parser will stop at the first parsing error, if false it will ignore non-blocking errors.
public RDFaExtractor()
public static XSLTStylesheet getXSLT()
Returns an XSLTStylesheet able to distill RDFa from HTML pages.
Returns: a non-null XSLT instance.
public boolean isVerifyDataType()
public void setVerifyDataType(boolean verifyDataType)
public boolean isStopAtFirstError()
public void setStopAtFirstError(boolean stopAtFirstError)
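The two policies set through the constructors and accessors above can be illustrated with a minimal sketch. The class `PolicySketch` below is a hypothetical stand-in, not the Any23 class: it only mirrors the documented constructor and accessor signatures, and its `process` method and the `"BAD"` marker are assumptions introduced purely to show what stopping at the first error versus ignoring non-blocking errors might look like.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the Any23 RDFaExtractor: it mirrors only the
// documented constructor and accessor signatures for illustration.
class PolicySketch {
    private boolean verifyDataType;
    private boolean stopAtFirstError;

    // Mirrors RDFaExtractor(): no data type verification, no stop at first error.
    PolicySketch() {
        this(false, false);
    }

    // Mirrors RDFaExtractor(boolean verifyDataType, boolean stopAtFirstError).
    PolicySketch(boolean verifyDataType, boolean stopAtFirstError) {
        this.verifyDataType = verifyDataType;
        this.stopAtFirstError = stopAtFirstError;
    }

    boolean isVerifyDataType() { return verifyDataType; }
    void setVerifyDataType(boolean verifyDataType) { this.verifyDataType = verifyDataType; }
    boolean isStopAtFirstError() { return stopAtFirstError; }
    void setStopAtFirstError(boolean stopAtFirstError) { this.stopAtFirstError = stopAtFirstError; }

    // Toy model (an assumption): the item "BAD" stands for a non-blocking error.
    // Strict mode throws on the first one; lenient mode skips it and continues.
    List<String> process(List<String> items) {
        List<String> accepted = new ArrayList<>();
        for (String item : items) {
            if ("BAD".equals(item)) {
                if (stopAtFirstError) {
                    throw new IllegalStateException("stopped at first error");
                }
                continue; // lenient: ignore the faulty item
            }
            accepted.add(item);
        }
        return accepted;
    }
}
```

With the default constructor the sketch skips faulty items and keeps the rest; constructed with `stopAtFirstError` set to true, it fails on the first faulty item instead.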
public void run(ExtractionParameters extractionParameters, ExtractionContext extractionContext, Document in, ExtractionResult out) throws IOException, ExtractionException
Specified by: run in interface Extractor<Document>
Throws: IOException, ExtractionException
public ExtractorDescription getDescription()
Specified by: getDescription in interface Extractor<Document>
Returns: the ExtractorDescription of this extractor
Copyright © 2010-2013 The Apache Software Foundation. All Rights Reserved.