Package org.apache.any23.extractor.xpath
Class XPathExtractor
java.lang.Object
  org.apache.any23.extractor.xpath.XPathExtractor
- All Implemented Interfaces:
org.apache.any23.extractor.Extractor<Document>, org.apache.any23.extractor.Extractor.TagSoupDOMExtractor
public class XPathExtractor extends Object implements org.apache.any23.extractor.Extractor.TagSoupDOMExtractor
Implementation of an Extractor.TagSoupDOMExtractor able to apply XPathExtractionRules and generate quads.
- Author:
- Michele Mostarda (mostarda@fbk.eu)
- See Also:
XPathExtractionRule
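The idea in the class description (apply XPath rules to a DOM document and emit quads) can be sketched with the JDK alone. The block below is a minimal illustration, not Any23 code: `XPathQuadSketch`, `Rule`, `Quad`, the example IRIs, and the choice to use the document IRI as both subject and graph are all assumptions made for the example. Only `javax.xml.xpath` and `org.w3c.dom` from the standard library are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration: none of these types are part of Any23.
class XPathQuadSketch {

    /** Hypothetical quad: subject, predicate, object, graph. */
    public static final class Quad {
        public final String subject;
        public final String predicate;
        public final String object;
        public final String graph;

        public Quad(String subject, String predicate, String object, String graph) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.graph = graph;
        }

        @Override
        public String toString() {
            return "<" + subject + "> <" + predicate + "> \"" + object + "\" <" + graph + ">";
        }
    }

    /** Hypothetical rule: an XPath expression and the predicate emitted per match. */
    public static final class Rule {
        public final String xpath;
        public final String predicate;

        public Rule(String xpath, String predicate) {
            this.xpath = xpath;
            this.predicate = predicate;
        }
    }

    /** Parses well-formed XML into a DOM Document. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse input", e);
        }
    }

    /** Evaluates every rule against the document; one quad per matched node. */
    public static List<Quad> apply(List<Rule> rules, Document doc, String documentIri) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        List<Quad> quads = new ArrayList<>();
        try {
            for (Rule rule : rules) {
                NodeList nodes = (NodeList) xpath.evaluate(rule.xpath, doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    // Assumption: the document IRI serves as both subject and graph.
                    quads.add(new Quad(documentIri, rule.predicate,
                            nodes.item(i).getTextContent(), documentIri));
                }
            }
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath expression", e);
        }
        return quads;
    }

    public static void main(String[] args) {
        Document doc = parse("<html><body><h1>Hello</h1><p>a</p><p>b</p></body></html>");
        List<Rule> rules = Arrays.asList(
                new Rule("//h1", "http://example.org/vocab#heading"),
                new Rule("//p", "http://example.org/vocab#paragraph"));
        for (Quad quad : apply(rules, doc, "http://example.org/doc")) {
            System.out.println(quad);
        }
    }
}
```

How Any23 itself maps a matched node to a quad is defined by XPathExtractionRule and is not described on this page, so the mapping shown here should be read as a placeholder.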
-
-
Constructor Summary
Constructors
XPathExtractor()
XPathExtractor(List<XPathExtractionRule> rules)
-
Method Summary
Modifier and Type, Method
void add(XPathExtractionRule rule)
boolean contains(XPathExtractionRule rule)
org.apache.any23.extractor.ExtractorDescription getDescription()
void remove(XPathExtractionRule rule)
void run(org.apache.any23.extractor.ExtractionParameters extractionParameters, org.apache.any23.extractor.ExtractionContext extractionContext, Document in, org.apache.any23.extractor.ExtractionResult out)
-
Constructor Detail
-
XPathExtractor
public XPathExtractor()
-
XPathExtractor
public XPathExtractor(List<XPathExtractionRule> rules)
-
-
Method Detail
-
add
public void add(XPathExtractionRule rule)
-
remove
public void remove(XPathExtractionRule rule)
-
contains
public boolean contains(XPathExtractionRule rule)
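The add, remove and contains signatures above suggest a mutable collection of rules. A minimal sketch of that shape follows, using hypothetical stand-in types (`RuleSetSketch` and its nested `Rule`) rather than the Any23 classes. Null handling, duplicate handling, and value-based equality are assumptions of this sketch; the page does not say how XPathExtractor behaves in those cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a rule container; not the Any23 implementation.
class RuleSetSketch {

    /** Hypothetical rule with value-based equality, so contains() matches equal rules. */
    public static final class Rule {
        public final String name;
        public final String xpath;

        public Rule(String name, String xpath) {
            this.name = Objects.requireNonNull(name);
            this.xpath = Objects.requireNonNull(xpath);
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Rule)) return false;
            Rule that = (Rule) other;
            return name.equals(that.name) && xpath.equals(that.xpath);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, xpath);
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    /** Mirrors the no-argument constructor: starts with no rules. */
    public RuleSetSketch() {
    }

    /** Mirrors the list constructor: copies the initial rules. */
    public RuleSetSketch(List<Rule> initial) {
        rules.addAll(initial);
    }

    /** Assumption: null rules are rejected and duplicates are allowed. */
    public void add(Rule rule) {
        rules.add(Objects.requireNonNull(rule, "rule"));
    }

    /** Removes the first rule equal to the argument, if any. */
    public void remove(Rule rule) {
        rules.remove(rule);
    }

    public boolean contains(Rule rule) {
        return rules.contains(rule);
    }

    public static void main(String[] args) {
        RuleSetSketch set = new RuleSetSketch();
        Rule title = new Rule("title", "//h1");
        set.add(title);
        System.out.println(set.contains(title));
        set.remove(title);
        System.out.println(set.contains(title));
    }
}
```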
-
run
public void run(org.apache.any23.extractor.ExtractionParameters extractionParameters, org.apache.any23.extractor.ExtractionContext extractionContext, Document in, org.apache.any23.extractor.ExtractionResult out) throws IOException, org.apache.any23.extractor.ExtractionException
- Specified by:
run
in interface org.apache.any23.extractor.Extractor<Document>
- Throws:
IOException
org.apache.any23.extractor.ExtractionException
-
getDescription
public org.apache.any23.extractor.ExtractorDescription getDescription()
- Specified by:
getDescription
in interface org.apache.any23.extractor.Extractor<Document>