Class BaseRDFExtractor

    • Nested Class Summary

      • Nested classes/interfaces inherited from interface org.apache.any23.extractor.Extractor

        org.apache.any23.extractor.Extractor.BlindExtractor, org.apache.any23.extractor.Extractor.ContentExtractor, org.apache.any23.extractor.Extractor.TagSoupDOMExtractor
    • Constructor Summary

      Constructors 
      Constructor Description
      BaseRDFExtractor()  
      BaseRDFExtractor(boolean verifyDataType, boolean stopAtFirstError)
      Creates an extractor with the given validation and error-handling policies.
    • Constructor Detail

      • BaseRDFExtractor

        public BaseRDFExtractor()
      • BaseRDFExtractor

        public BaseRDFExtractor(boolean verifyDataType,
                                boolean stopAtFirstError)
        Creates an extractor with the given validation and error-handling policies.
        Parameters:
        verifyDataType - if true, data types are verified; if false, they are ignored.
        stopAtFirstError - if true, the parser stops at the first parsing error; if false, it ignores non-blocking errors.
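
        The interaction of the two flags can be illustrated with a small, self-contained sketch. It does not use the Any23 or RDF4J classes: PolicySketch, its parse method, and the toy value^^type literal syntax are hypothetical stand-ins, and treating an invalid literal as a skippable, non-blocking error is an assumption made only for illustration.

        ```java
        import java.util.ArrayList;
        import java.util.List;

        // Hypothetical stand-in, NOT the Any23 class: it only mirrors the two policy flags.
        class PolicySketch {
            private final boolean verifyDataType;
            private final boolean stopAtFirstError;

            PolicySketch(boolean verifyDataType, boolean stopAtFirstError) {
                this.verifyDataType = verifyDataType;
                this.stopAtFirstError = stopAtFirstError;
            }

            // Each input is a toy typed literal of the form value^^type, e.g. "42^^int".
            List<String> parse(List<String> literals) {
                List<String> accepted = new ArrayList<>();
                for (String literal : literals) {
                    String[] parts = literal.split("\\^\\^", 2);
                    boolean malformed = parts.length != 2;
                    // The value is checked against its declared type only when verification is on.
                    boolean badType = !malformed && verifyDataType
                            && parts[1].equals("int") && !parts[0].matches("-?\\d+");
                    if (malformed || badType) {
                        if (stopAtFirstError) {
                            throw new IllegalArgumentException("Invalid literal: " + literal);
                        }
                        continue; // lenient mode (assumed): skip the offending literal and carry on
                    }
                    accepted.add(literal);
                }
                return accepted;
            }
        }
        ```

        In this sketch, new PolicySketch(true, false) drops "abc^^int" and keeps the valid literals, new PolicySketch(true, true) throws on it, and new PolicySketch(false, true) accepts it because the value is never checked against its type.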
    • Method Detail

      • getParser

        protected abstract org.eclipse.rdf4j.rio.RDFParser getParser(org.apache.any23.extractor.ExtractionContext extractionContext,
                                                                     org.apache.any23.extractor.ExtractionResult extractionResult)
      • isVerifyDataType

        public boolean isVerifyDataType()
      • setVerifyDataType

        public void setVerifyDataType(boolean verifyDataType)
      • isStopAtFirstError

        public boolean isStopAtFirstError()
      • setStopAtFirstError

        public void setStopAtFirstError(boolean b)
        Specified by:
        setStopAtFirstError in interface org.apache.any23.extractor.Extractor.ContentExtractor
      • run

        public void run(org.apache.any23.extractor.ExtractionParameters extractionParameters,
                        org.apache.any23.extractor.ExtractionContext extractionContext,
                        InputStream in,
                        org.apache.any23.extractor.ExtractionResult extractionResult)
                 throws IOException,
                        org.apache.any23.extractor.ExtractionException
        Specified by:
        run in interface org.apache.any23.extractor.Extractor<InputStream>
        Throws:
        IOException
        org.apache.any23.extractor.ExtractionException
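
        Because run declares two checked exceptions, a caller can tell a failure to read the input apart from a failure to extract from it. The sketch below shows that calling pattern with stand-ins only: SketchExtractionException and RunSketch are hypothetical names, and the line-counting body is a placeholder, not the behavior of the Any23 method.

        ```java
        import java.io.ByteArrayInputStream;
        import java.io.IOException;
        import java.io.InputStream;
        import java.nio.charset.StandardCharsets;

        // Hypothetical stand-in for org.apache.any23.extractor.ExtractionException.
        class SketchExtractionException extends Exception {
            SketchExtractionException(String message) {
                super(message);
            }
        }

        // Hypothetical stand-in that only mirrors the shape of run's throws clause.
        class RunSketch {
            static int run(InputStream in) throws IOException, SketchExtractionException {
                String text = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                if (text.isBlank()) {
                    throw new SketchExtractionException("No content to extract");
                }
                return text.split("\\R").length; // line count as a placeholder for extracted output
            }

            // Caller side: handle the two failure kinds separately.
            static String describe(String document) {
                byte[] bytes = document.getBytes(StandardCharsets.UTF_8);
                try (InputStream in = new ByteArrayInputStream(bytes)) {
                    return "extracted " + run(in) + " line(s)";
                } catch (SketchExtractionException e) {
                    return "extraction failed: " + e.getMessage();
                } catch (IOException e) {
                    return "read failed: " + e.getMessage();
                }
            }
        }
        ```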