Interface TagSoupExtractionResult

  • All Superinterfaces:
    org.apache.any23.extractor.ExtractionResult, org.apache.any23.extractor.IssueReport
    All Known Implementing Classes:
    ExtractionResultImpl

    public interface TagSoupExtractionResult
    extends org.apache.any23.extractor.ExtractionResult
    A specialized ExtractionResult that can also collect the property roots and property paths generated by HTML Microformat extractions.
    Author:
    Michele Mostarda (mostarda@fbk.eu)
    • Nested Class Summary

      Nested Classes 
      Modifier and Type   Interface                              Description
      static class        TagSoupExtractionResult.PropertyPath   Defines a property path object.
      static class        TagSoupExtractionResult.ResourceRoot   Defines a property root object.
      • Nested classes/interfaces inherited from interface org.apache.any23.extractor.IssueReport

        org.apache.any23.extractor.IssueReport.Issue, org.apache.any23.extractor.IssueReport.IssueLevel
    • Method Detail

      • addResourceRoot

        void addResourceRoot(String[] path,
                             org.eclipse.rdf4j.model.Resource root,
                             Class<? extends MicroformatExtractor> extractor)
        Adds a property root to the extraction result, together with the path to the root of the data that generated the property and the extractor responsible for adding it.
        Parameters:
        path - the path from the document root to the local root of the data generating the property.
        root - the property root node.
        extractor - the extractor responsible for this extraction.
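
        The sketch below illustrates only the shape of such a call: a path, a root node, and the extractor class are recorded together. It is not the Any23 implementation. ResourceRootSketch and RecordedRoot are hypothetical names, String stands in for org.eclipse.rdf4j.model.Resource, Class<?> stands in for Class<? extends MicroformatExtractor>, and the path elements and blank-node label are made-up values.

```java
import java.util.ArrayList;
import java.util.List;

public class ResourceRootSketch {

    /** Hypothetical holder for one recorded root; not the Any23 ResourceRoot class. */
    public static final class RecordedRoot {
        public final String[] path;       // path from the document root to the local data root
        public final String root;         // stand-in for org.eclipse.rdf4j.model.Resource
        public final Class<?> extractor;  // stand-in for Class<? extends MicroformatExtractor>

        RecordedRoot(String[] path, String root, Class<?> extractor) {
            this.path = path.clone();     // defensive copy so later changes by the caller are not seen
            this.root = root;
            this.extractor = extractor;
        }
    }

    private final List<RecordedRoot> roots = new ArrayList<>();

    /** Same parameter order as addResourceRoot(path, root, extractor). */
    public void addResourceRoot(String[] path, String root, Class<?> extractor) {
        roots.add(new RecordedRoot(path, root, extractor));
    }

    public List<RecordedRoot> getRoots() {
        return roots;
    }

    public static void main(String[] args) {
        ResourceRootSketch result = new ResourceRootSketch();
        // Object.class is a placeholder for a real extractor class.
        result.addResourceRoot(new String[] {"HTML", "BODY", "DIV"}, "_:card1", Object.class);
        RecordedRoot first = result.getRoots().get(0);
        System.out.println(String.join("/", first.path) + " -> " + first.root);
    }
}
```

        Keeping the path next to the root lets a later step relate a root to the part of the document that produced it. How Any23 actually uses the recorded roots is not stated in this page.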
      • addPropertyPath

        void addPropertyPath(Class<? extends MicroformatExtractor> extractor,
                             org.eclipse.rdf4j.model.Resource propertySubject,
                             org.eclipse.rdf4j.model.Resource property,
                             org.eclipse.rdf4j.model.BNode object,
                             String[] path)
        Adds a property path to the list of the extracted data.
        Parameters:
        extractor - the class of the extractor responsible for retrieving this property.
        propertySubject - the subject of the property.
        property - the property IRI.
        object - the property object if any, null otherwise.
        path - the path of the HTML node from which the property literal has been extracted.
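
        As a rough illustration of the parameters above, the following sketch records property paths in memory and shows the documented case in which object is null. It is not the Any23 implementation. PropertyPathSketch, RecordedPath and hasObject are hypothetical names, String stands in for the RDF4J Resource and BNode types, Class<?> stands in for Class<? extends MicroformatExtractor>, and all argument values are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class PropertyPathSketch {

    /** Hypothetical holder for one recorded property path; not the Any23 PropertyPath class. */
    public static final class RecordedPath {
        public final Class<?> extractor;      // stand-in for Class<? extends MicroformatExtractor>
        public final String propertySubject;  // stand-in for Resource
        public final String property;         // stand-in for Resource (the property IRI)
        public final String object;           // stand-in for BNode; null when there is no object
        public final String[] path;           // path of the HTML node the literal came from

        RecordedPath(Class<?> extractor, String propertySubject, String property,
                     String object, String[] path) {
            this.extractor = extractor;
            this.propertySubject = propertySubject;
            this.property = property;
            this.object = object;
            this.path = path.clone();
        }

        /** True when a property object was supplied. */
        public boolean hasObject() {
            return object != null;
        }
    }

    private final List<RecordedPath> paths = new ArrayList<>();

    /** Same parameter order as addPropertyPath(extractor, propertySubject, property, object, path). */
    public void addPropertyPath(Class<?> extractor, String propertySubject, String property,
                                String object, String[] path) {
        paths.add(new RecordedPath(extractor, propertySubject, property, object, path));
    }

    public List<RecordedPath> getPaths() {
        return paths;
    }

    public static void main(String[] args) {
        PropertyPathSketch result = new PropertyPathSketch();
        // No property object in this call, so null is passed as documented.
        result.addPropertyPath(Object.class, "_:card1", "http://www.w3.org/2006/vcard/ns#fn",
                null, new String[] {"HTML", "BODY", "DIV", "SPAN"});
        for (RecordedPath p : result.getPaths()) {
            System.out.println(p.property + " at " + String.join("/", p.path)
                    + " hasObject=" + p.hasObject());
        }
    }
}
```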