Class TikaMIMETypeDetector

  • All Implemented Interfaces:
    org.apache.any23.mime.MIMETypeDetector

    public class TikaMIMETypeDetector
    extends Object
    implements org.apache.any23.mime.MIMETypeDetector
    Implementation of MIMETypeDetector based on Apache Tika.
    Author:
    Michele Mostarda (michele.mostarda@gmail.com), Davide Palmisano (dpalmisano@gmail.com)
    • Constructor Detail

      • TikaMIMETypeDetector

        public TikaMIMETypeDetector(org.apache.any23.mime.purifier.Purifier purifier)
      • TikaMIMETypeDetector

        public TikaMIMETypeDetector()
    • Method Detail

      • checkN3Format

        public static boolean checkN3Format(InputStream is)
                                     throws IOException
        Checks whether the stream contains N3 triple patterns.
        Parameters:
        is - input stream to be verified.
        Returns:
        true if N3 patterns are detected, false otherwise.
        Throws:
        IOException - if there is an error checking the InputStream
      • checkNQuadsFormat

        public static boolean checkNQuadsFormat(InputStream is)
                                         throws IOException
        Checks whether the stream contains NQuads patterns.
        Parameters:
        is - input stream to be verified.
        Returns:
        true if NQuads patterns are detected, false otherwise.
        Throws:
        IOException - if there is an error checking the InputStream
      • checkTurtleFormat

        public static boolean checkTurtleFormat(InputStream is)
                                         throws IOException
        Checks if the stream contains Turtle triple patterns.
        Parameters:
        is - input stream to be verified.
        Returns:
        true if Turtle patterns are detected, false otherwise.
        Throws:
        IOException - if there is an error checking the InputStream
      • checkCSVFormat

        public static boolean checkCSVFormat(InputStream is)
                                      throws IOException
        Checks if the stream contains a valid CSV pattern.
        Parameters:
        is - input stream to be verified.
        Returns:
        true if CSV patterns are detected, false otherwise.
        Throws:
        IOException - if there is an error checking the InputStream
      • guessMIMEType

        public org.apache.any23.mime.MIMEType guessMIMEType(String fileName,
                                                            InputStream input,
                                                            org.apache.any23.mime.MIMEType mimeTypeFromMetadata)
        Estimates the MIME type of the input content. If an input stream is provided, it must be resettable.
        Specified by:
        guessMIMEType in interface org.apache.any23.mime.MIMETypeDetector
        Parameters:
        fileName - name of the data source.
        input - null or a resettable input stream containing data.
        mimeTypeFromMetadata - MIME type declared in metadata.
        Returns:
        the estimated MIME type, or null if no appropriate type is found.
        Throws:
        IllegalArgumentException - if input is not null and is not resettable.
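        The resettable-stream requirement of guessMIMEType can be illustrated with a minimal sketch that uses only the JDK. It assumes that "resettable" means InputStream.markSupported() returns true, which this page does not state explicitly. The class ResettableStreams and its methods ensureResettable and peek are hypothetical helpers written for this illustration; they are not part of Any23 or Tika.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Any23: prepares streams that support mark/reset.
public class ResettableStreams {

    // Returns the stream unchanged if it already supports mark/reset,
    // otherwise wraps it in a BufferedInputStream, which does.
    public static InputStream ensureResettable(InputStream in) {
        if (in == null) {
            return null; // the Javadoc above allows a null input
        }
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    // Reads up to n leading bytes, then rewinds so a later reader sees them again.
    public static byte[] peek(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(n);
        byte[] buf = new byte[n];
        int total = 0;
        while (total < n) {
            int read = in.read(buf, total, n - total);
            if (read < 0) {
                break; // end of stream reached before n bytes
            }
            total += read;
        }
        in.reset();
        return Arrays.copyOf(buf, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "<http://a> <http://b> <http://c> .".getBytes(StandardCharsets.UTF_8);
        // PushbackInputStream does not support mark/reset, so it stands in
        // for a non-resettable source here.
        InputStream raw = new PushbackInputStream(new ByteArrayInputStream(data));
        InputStream resettable = ensureResettable(raw);
        System.out.println("markSupported: " + resettable.markSupported());
        System.out.println("head: " + new String(peek(resettable, 10), StandardCharsets.UTF_8));
        // A stream prepared this way could then be passed as the `input`
        // argument of guessMIMEType (call not shown; it needs Any23 on the classpath).
    }
}
```

        Wrapping in BufferedInputStream is one common way to obtain mark/reset support; whether it suits a given data source depends on how many bytes the detector reads before resetting, which this page does not specify.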