Package | Description |
---|---|
org.apache.nutch.collection | Subcollection is a subset of an index. |
org.apache.nutch.net | Web-related interfaces: URL filters and normalizers. |
org.apache.nutch.net.urlnormalizer.basic | URL normalizer performing basic normalizations: removes default ports and dot segments in the path. |
org.apache.nutch.net.urlnormalizer.pass | Dummy URL normalizer which does not change URLs. |
org.apache.nutch.net.urlnormalizer.regex | URL normalizer with configurable rules based on regular expressions (Pattern). |
org.apache.nutch.urlfilter.api | Generic URL filter library, abstracting away from regular expression implementations. |
org.apache.nutch.urlfilter.automaton | URL filter plugin based on dk.brics.automaton Finite-State Automata for Java™. |
org.apache.nutch.urlfilter.domain | URL filter plugin to include only URLs which match an element in a given list of domain suffixes, domain names, and/or host names. |
org.apache.nutch.urlfilter.prefix | URL filter plugin to include only URLs which match one of a given list of URL prefixes. |
org.apache.nutch.urlfilter.regex | URL filter plugin to include and/or exclude URLs matching Java regular expressions. |
org.apache.nutch.urlfilter.suffix | URL filter plugin to either exclude or include only URLs which match one of the given (path) suffixes. |
org.apache.nutch.urlfilter.validator | URL filter plugin that validates given URLs. |
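
The include-only behaviour described for org.apache.nutch.urlfilter.prefix can be illustrated with a minimal sketch. It assumes the convention that a filter returns the URL to accept it and null to reject it. The names UrlFilterSketch, PrefixFilterSketch and PrefixFilterDemo are hypothetical stand-ins and are not part of Nutch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a URL filter contract; not the Nutch interface itself. */
interface UrlFilterSketch {
    /** Returns the URL if it is accepted, or null if it is rejected (assumed convention). */
    String filter(String urlString);
}

/** Accepts only URLs that start with one of the configured prefixes. */
final class PrefixFilterSketch implements UrlFilterSketch {
    private final List<String> prefixes;

    PrefixFilterSketch(List<String> prefixes) {
        // Defensive copy so later changes to the caller's list have no effect.
        this.prefixes = new ArrayList<>(prefixes);
    }

    @Override
    public String filter(String urlString) {
        if (urlString == null) {
            return null;
        }
        for (String prefix : prefixes) {
            if (urlString.startsWith(prefix)) {
                return urlString; // accepted unchanged
            }
        }
        return null; // no prefix matched, so the URL is rejected
    }
}

class PrefixFilterDemo {
    public static void main(String[] args) {
        List<String> allowed = new ArrayList<>();
        allowed.add("http://example.org/");
        UrlFilterSketch filter = new PrefixFilterSketch(allowed);
        System.out.println(filter.filter("http://example.org/docs/index.html"));
        System.out.println(filter.filter("http://other.example.com/"));
    }
}
```

A real plugin would read its prefix list from configuration; the list is passed to the constructor here only to keep the sketch self-contained.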

Classes in org.apache.nutch.net used by org.apache.nutch.collection

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.net

Class and Description |
---|
URLFilterException |

Classes in org.apache.nutch.net used by org.apache.nutch.net.urlnormalizer.basic

Class and Description |
---|
URLNormalizer: Interface used to convert URLs to normal form and optionally perform substitutions. |

Classes in org.apache.nutch.net used by org.apache.nutch.net.urlnormalizer.pass

Class and Description |
---|
URLNormalizer: Interface used to convert URLs to normal form and optionally perform substitutions. |

Classes in org.apache.nutch.net used by org.apache.nutch.net.urlnormalizer.regex

Class and Description |
---|
URLNormalizer: Interface used to convert URLs to normal form and optionally perform substitutions. |
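
The two basic normalizations named in the package table (removing default ports and dot segments) can be sketched with the standard java.net.URI class. This is an illustration under stated assumptions, not the implementation used by the urlnormalizer.basic plugin: the class name BasicNormalizerSketch and its static normalize method are hypothetical, and only the http and https default ports are handled.

```java
import java.net.URI;

/** Hypothetical sketch; not the Nutch basic normalizer. */
final class BasicNormalizerSketch {

    /** Removes "." and ".." path segments and drops the default port for http/https. */
    static String normalize(String urlString) {
        // URI.normalize() resolves "." and ".." segments in the path.
        URI uri = URI.create(urlString).normalize();
        String scheme = uri.getScheme();
        int port = uri.getPort(); // -1 when no port is given
        boolean defaultPort = ("http".equalsIgnoreCase(scheme) && port == 80)
                || ("https".equalsIgnoreCase(scheme) && port == 443);
        if (!defaultPort) {
            return uri.toString();
        }
        // Rebuild the URL from its raw (still percent-encoded) parts, omitting the port.
        StringBuilder sb = new StringBuilder();
        sb.append(scheme).append("://");
        if (uri.getRawUserInfo() != null) {
            sb.append(uri.getRawUserInfo()).append('@');
        }
        sb.append(uri.getHost());
        if (uri.getRawPath() != null) {
            sb.append(uri.getRawPath());
        }
        if (uri.getRawQuery() != null) {
            sb.append('?').append(uri.getRawQuery());
        }
        if (uri.getRawFragment() != null) {
            sb.append('#').append(uri.getRawFragment());
        }
        return sb.toString();
    }
}

class BasicNormalizerDemo {
    public static void main(String[] args) {
        System.out.println(BasicNormalizerSketch.normalize("http://example.org:80/a/./b/../c"));
        System.out.println(BasicNormalizerSketch.normalize("https://example.org:8443/x"));
    }
}
```

URI.create throws IllegalArgumentException for strings that are not valid URIs, so a caller would need to decide how to treat malformed input.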

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.api

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.automaton

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.domain

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.prefix

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.regex

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.suffix

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Classes in org.apache.nutch.net used by org.apache.nutch.urlfilter.validator

Class and Description |
---|
URLFilter: Interface used to limit which URLs enter Nutch. |

Copyright © 2019 The Apache Software Foundation