Class HEventExtractor

  • All Implemented Interfaces:
    org.apache.any23.extractor.Extractor<Document>, org.apache.any23.extractor.Extractor.TagSoupDOMExtractor

    public class HEventExtractor
    extends EntityBasedMicroformatExtractor
    Extractor for the h-event microformat.
    Author:
    Nisala Nirmana
    • Constructor Detail

      • HEventExtractor

        public HEventExtractor()
    • Method Detail

      • getDescription

        public org.apache.any23.extractor.ExtractorDescription getDescription()
        Description copied from class: MicroformatExtractor
        Returns the description of this extractor.
        Specified by:
        getDescription in interface org.apache.any23.extractor.Extractor<Document>
        Specified by:
        getDescription in class MicroformatExtractor
        Returns:
        a human-readable description.
      • extractEntity

        protected boolean extractEntity(Node node,
                                        org.apache.any23.extractor.ExtractionResult out)
                                 throws org.apache.any23.extractor.ExtractionException
        Description copied from class: EntityBasedMicroformatExtractor
        Extracts an entity from a DOM node.
        Specified by:
        extractEntity in class EntityBasedMicroformatExtractor
        Parameters:
        node - the DOM node.
        out - the extraction result collector.
        Returns:
        true if the extraction produced something, false otherwise.
        Throws:
        org.apache.any23.extractor.ExtractionException - if there is an error during extraction
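        Example (illustrative only): the return-value contract described above can be sketched with JDK classes alone. This is not the Any23 implementation. The class HEventContractSketch, its Collector type (a stand-in for ExtractionResult), and the helper methods are hypothetical names introduced for this sketch. It assumes only the microformats2 convention that an event root carries the class h-event and its name carries the class p-name.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class HEventContractSketch {

    /** Hypothetical stand-in for ExtractionResult: collects "property=value" strings. */
    public static final class Collector {
        public final List<String> statements = new ArrayList<>();

        public void write(String property, String value) {
            statements.add(property + "=" + value);
        }
    }

    /** Returns true if the element's class attribute contains the given token. */
    public static boolean hasClass(Element element, String token) {
        for (String candidate : element.getAttribute("class").trim().split("\\s+")) {
            if (candidate.equals(token)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Mirrors the documented contract: returns true only if at least one
     * statement was written to the collector, false otherwise.
     */
    public static boolean extractEntity(Node node, Collector out) {
        if (!(node instanceof Element) || !hasClass((Element) node, "h-event")) {
            return false; // not an h-event root, nothing to extract
        }
        boolean produced = false;
        NodeList descendants = ((Element) node).getElementsByTagName("*");
        for (int i = 0; i < descendants.getLength(); i++) {
            Element child = (Element) descendants.item(i);
            if (hasClass(child, "p-name")) {
                out.write("name", child.getTextContent().trim());
                produced = true;
            }
        }
        return produced;
    }

    /** Parses a well-formed markup fragment and returns its root element. */
    public static Element parse(String markup) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(markup.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse fragment", e);
        }
    }

    public static void main(String[] args) {
        Collector out = new Collector();
        Element event = parse(
                "<div class=\"h-event\"><span class=\"p-name\">Launch party</span></div>");
        System.out.println(extractEntity(event, out) + " " + out.statements);
    }
}
```

        In this sketch, a node that is not an h-event root, or an h-event root with no recognised properties, yields false and leaves the collector untouched, which matches the "true if the extraction produced something" wording of the Returns clause.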
      • extractEntityAsEmbeddedProperty

        public org.eclipse.rdf4j.model.Resource extractEntityAsEmbeddedProperty(HTMLDocument fragment,
                                                                                org.eclipse.rdf4j.model.BNode event,
                                                                                org.apache.any23.extractor.ExtractionResult out)
                                                                         throws org.apache.any23.extractor.ExtractionException
        Throws:
        org.apache.any23.extractor.ExtractionException