public abstract class RegexURLFilterBase extends java.lang.Object implements URLFilter
URL filter based on regular expressions.
The rules file contains one rule per line, each of the form:
[+-]<regex>
where a plus (+) means go ahead and index the URL and a minus (-) means do not.
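As a rough illustration of the rule format, the sketch below splits one rule line into its sign and its regular expression. The class `RuleLine` and its `parse` method are hypothetical names used only for this example and are not part of the documented API. Rejecting any other leading character with an `IllegalArgumentException` is an assumption of this sketch, as is compiling the expression with `java.util.regex`.

```java
import java.util.regex.Pattern;

/** Hypothetical holder for one parsed rule line; not part of the documented API. */
final class RuleLine {
    final boolean sign;     // true for '+', false for '-'
    final Pattern pattern;  // compiled <regex> part of the line

    private RuleLine(boolean sign, Pattern pattern) {
        this.sign = sign;
        this.pattern = pattern;
    }

    /** Parses a line of the form [+-]<regex>. */
    static RuleLine parse(String line) {
        if (line == null || line.isEmpty()) {
            throw new IllegalArgumentException("Empty rule line");
        }
        char first = line.charAt(0);
        if (first != '+' && first != '-') {
            // Assumption: any other leading character is treated as malformed here.
            throw new IllegalArgumentException("Rule must start with + or -: " + line);
        }
        return new RuleLine(first == '+', Pattern.compile(line.substring(1)));
    }
}
```

With this sketch, parsing the line `-\.(gif|jpg)$` gives a rule whose `sign` is `false` and whose pattern matches URLs ending in `.gif` or `.jpg`.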
Modifier | Constructor and Description |
---|---|
public | RegexURLFilterBase() Constructs a new empty RegexURLFilterBase. |
public | RegexURLFilterBase(java.io.File filename) Constructs a new RegexURLFilter and initializes it with a file of rules. |
protected | RegexURLFilterBase(java.io.Reader reader) Constructs a new RegexURLFilter and initializes it with a Reader of rules. |
public | RegexURLFilterBase(java.lang.String rules) Constructs a new RegexURLFilter and initializes it with a list of rules. |
Modifier and Type | Method and Description |
---|---|
protected abstract RegexRule | createRule(boolean sign, java.lang.String regex) Creates a new RegexRule. |
java.lang.String | filter(java.lang.String url) |
Configuration | getConf() |
protected abstract java.io.Reader | getRulesReader(Configuration conf) Returns a reader for the rules to use for a particular implementation. |
static void | main(RegexURLFilterBase filter, java.lang.String[] args) Filters the standard input using a RegexURLFilterBase. |
void | setConf(Configuration conf) |
public RegexURLFilterBase()
public RegexURLFilterBase(java.io.File filename) throws java.io.IOException, java.lang.IllegalArgumentException
Parameters:
filename - the name of the rules file.
Throws:
java.io.IOException
java.lang.IllegalArgumentException
public RegexURLFilterBase(java.lang.String rules) throws java.io.IOException, java.lang.IllegalArgumentException
Parameters:
rules - string with a list of rules, one rule per line.
Throws:
java.io.IOException
java.lang.IllegalArgumentException
protected RegexURLFilterBase(java.io.Reader reader) throws java.io.IOException, java.lang.IllegalArgumentException
Parameters:
reader - a reader of rules.
Throws:
java.io.IOException
java.lang.IllegalArgumentException
protected abstract RegexRule createRule(boolean sign, java.lang.String regex)
Creates a new RegexRule.
Parameters:
sign - the sign of the regular expression. A true value means that any URL matching this rule must be included, whereas a false value means that any URL matching this rule must be excluded.
regex - the regular expression associated with this rule.
protected abstract java.io.Reader getRulesReader(Configuration conf) throws java.io.IOException
Parameters:
conf - the current configuration.
Throws:
java.io.IOException
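A concrete subclass has to supply the two abstract methods above, createRule and getRulesReader. The sketch below shows the idea using only the JDK, so it deliberately does not extend RegexURLFilterBase or implement the real RegexRule. `SimpleRule`, `RuleSource`, and `rulesReader` are hypothetical stand-ins, and matching with `java.util.regex` and `find()` is an assumption of this example rather than something the page states.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a rule: a sign plus a compiled expression. */
final class SimpleRule {
    private final boolean sign;
    private final Pattern pattern;

    SimpleRule(boolean sign, String regex) {
        this.sign = sign;
        this.pattern = Pattern.compile(regex);
    }

    /** true means matching URLs are included, false means they are excluded. */
    boolean accept() {
        return sign;
    }

    /** Assumption: a rule matches when the expression is found anywhere in the URL. */
    boolean match(String url) {
        return pattern.matcher(url).find();
    }
}

/** Hypothetical counterparts of createRule and getRulesReader. */
final class RuleSource {
    /** Builds a rule from its sign and regular expression. */
    static SimpleRule createRule(boolean sign, String regex) {
        return new SimpleRule(sign, regex);
    }

    /** Supplies the rules as a Reader; here they come from an in-memory string. */
    static Reader rulesReader(String rules) {
        return new StringReader(rules);
    }
}
```

In a real subclass the reader would more likely be opened from a resource named in the configuration, but the page does not say which configuration key an implementation should use, so that part is left out.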
public java.lang.String filter(java.lang.String url)
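This page gives no description for filter. One plausible reading, assumed here rather than taken from the page, is that rules are tried in order, the first matching rule decides, and the method returns the URL when it is accepted and null otherwise. The class `FirstMatchFilter` below is a hypothetical, JDK-only illustration of that reading, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical first-match filter; an illustration, not the library's implementation. */
final class FirstMatchFilter {
    private final List<Boolean> signs = new ArrayList<>();
    private final List<Pattern> patterns = new ArrayList<>();

    /** Reads rules of the form [+-]<regex>, one per line. */
    FirstMatchFilter(String rules) {
        for (String line : rules.split("\n")) {
            if (line.isEmpty()) {
                continue; // assumption: blank lines are skipped
            }
            char first = line.charAt(0);
            if (first != '+' && first != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            signs.add(first == '+');
            patterns.add(Pattern.compile(line.substring(1)));
        }
    }

    /** Returns the URL if the first matching rule is a '+' rule, otherwise null. */
    String filter(String url) {
        for (int i = 0; i < patterns.size(); i++) {
            if (patterns.get(i).matcher(url).find()) {
                return signs.get(i) ? url : null;
            }
        }
        return null; // assumption: a URL matched by no rule is rejected
    }
}
```

Under these assumptions rule order matters: placing an exclusion such as `-\.(gif|jpg)$` before a broad inclusion rule keeps image URLs out even when the inclusion rule would also match them.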
public void setConf(Configuration conf)
Specified by:
setConf in interface Configurable
public Configuration getConf()
Specified by:
getConf in interface Configurable
public static void main(RegexURLFilterBase filter, java.lang.String[] args) throws java.io.IOException, java.lang.IllegalArgumentException
Parameters:
filter - is the RegexURLFilterBase to use for filtering the standard input.
args - some optional parameters (not used).
Throws:
java.io.IOException
java.lang.IllegalArgumentException
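A hedged sketch of what filtering the standard input could look like is shown below. It uses a plain `UnaryOperator<String>` in place of a RegexURLFilterBase so that it runs with the JDK alone. `StdinFilterDemo` and `run` are hypothetical names, and both the convention that the filter returns null for a rejected URL and the `+`/`-` output prefix are assumptions that this page does not specify.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.util.function.UnaryOperator;

/** Hypothetical line-by-line driver; a stand-in for filtering standard input. */
final class StdinFilterDemo {
    /**
     * Applies the filter to every input line. The filter is assumed to return
     * the URL to keep it, or null to drop it.
     */
    static void run(UnaryOperator<String> filter, Reader in, PrintWriter out)
            throws IOException {
        BufferedReader lines = new BufferedReader(in);
        String line;
        while ((line = lines.readLine()) != null) {
            String kept = filter.apply(line);
            // Assumed output format: '+' before accepted URLs, '-' before rejected ones.
            out.println(kept != null ? "+" + kept : "-" + line);
        }
        out.flush();
    }
}
```

To try it against real standard input, one could call `StdinFilterDemo.run(filter, new java.io.InputStreamReader(System.in), new PrintWriter(System.out))` from a `main` method, passing any function that returns the URL or null.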
Copyright © 2019 The Apache Software Foundation