public class SubcollectionIndexingFilter extends Configured implements IndexingFilter
Modifier and Type | Field and Description |
---|---|
static java.lang.String | FIELD_NAME - Doc field name |
Fields inherited from interface IndexingFilter: X_POINT_ID
Constructor and Description |
---|
SubcollectionIndexingFilter() |
SubcollectionIndexingFilter(Configuration conf) |
Modifier and Type | Method and Description |
---|---|
NutchDocument | filter(NutchDocument doc, java.lang.String url, WebPage page) - Adds fields or otherwise modifies the document that will be indexed for a parse. |
java.util.Collection<WebPage.Field> | getFields() |
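Methods inherited from class Configured: getConf, setConf
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Configurable: getConf, setConf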
public static final java.lang.String FIELD_NAME
public SubcollectionIndexingFilter()
public SubcollectionIndexingFilter(Configuration conf)
public java.util.Collection<WebPage.Field> getFields()
Specified by: getFields in interface FieldPluggable
public NutchDocument filter(NutchDocument doc, java.lang.String url, WebPage page) throws IndexingException
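Description copied from interface: IndexingFilter
Adds fields or otherwise modifies the document that will be indexed for a parse.
Specified by: filter in interface IndexingFilter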
Parameters:
doc - document instance for collecting fields
url - page url
Throws:
IndexingException
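The filter contract described above (take a document and its URL, return the document with extra fields) can be illustrated with a small standalone sketch. It does not use the Nutch classes: `SketchDocument` and `SketchSubcollectionFilter` are hypothetical stand-ins for `NutchDocument` and this filter, the `"subcollection"` value for `FIELD_NAME` is an assumption, and the URL-prefix rule is only an illustrative matching strategy, not the plugin's actual one. The `WebPage` argument is left out for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for NutchDocument: a map of field name -> values.
class SketchDocument {
    private final Map<String, List<String>> fields = new LinkedHashMap<>();

    void add(String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    List<String> get(String name) {
        return fields.getOrDefault(name, new ArrayList<>());
    }
}

// Hypothetical stand-in for the filter: same shape as filter(doc, url, page),
// minus the WebPage argument.
class SketchSubcollectionFilter {
    // Assumed value; the real constant is SubcollectionIndexingFilter.FIELD_NAME.
    static final String FIELD_NAME = "subcollection";

    // Subcollection name -> URL prefix that selects it (illustrative rule only).
    private final Map<String, String> prefixes = new LinkedHashMap<>();

    SketchSubcollectionFilter register(String name, String urlPrefix) {
        prefixes.put(name, urlPrefix);
        return this;
    }

    // Adds one FIELD_NAME value per matching subcollection and returns the same document.
    SketchDocument filter(SketchDocument doc, String url) {
        for (Map.Entry<String, String> entry : prefixes.entrySet()) {
            if (url.startsWith(entry.getValue())) {
                doc.add(FIELD_NAME, entry.getKey());
            }
        }
        return doc;
    }
}

class SubcollectionSketch {
    public static void main(String[] args) {
        SketchSubcollectionFilter filter = new SketchSubcollectionFilter()
                .register("docs", "https://example.org/docs/");

        SketchDocument doc = filter.filter(
                new SketchDocument(), "https://example.org/docs/intro.html");

        System.out.println(doc.get(SketchSubcollectionFilter.FIELD_NAME)); // prints [docs]
    }
}
```

A URL that matches no registered prefix leaves the document unchanged, so downstream indexing sees no `subcollection` field for it.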
Copyright © 2019 The Apache Software Foundation