Class Summary
Class | Description
HtmlIndexingFilter | Add raw HTML content of a document to the index.
Package org.apache.nutch.indexer.html Description
Index raw HTML content.
The index-html plugin adds the field "rawcontent" to the index.
This field contains the raw (HTML) content of a document converted to a String.
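
The conversion described above can be illustrated with a minimal, self-contained sketch. It does not use the Nutch API and is not the plugin's actual implementation: the class name RawContentSketch, the helper addRawContent, the use of a plain Map as a stand-in for an index document, and the choice of UTF-8 as the decoding charset are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class RawContentSketch {

    // Hypothetical helper (not part of Nutch): decodes the fetched bytes
    // into a String and stores it under the field name "rawcontent".
    // UTF-8 is an assumption here; the real plugin may decode differently.
    static Map<String, String> addRawContent(Map<String, String> doc, byte[] content) {
        if (content != null) {
            doc.put("rawcontent", new String(content, StandardCharsets.UTF_8));
        }
        return doc;
    }

    public static void main(String[] args) {
        byte[] fetched = "<html><body>Hello</body></html>".getBytes(StandardCharsets.UTF_8);
        Map<String, String> doc = addRawContent(new HashMap<>(), fetched);
        // Prints the original markup, unchanged
        System.out.println(doc.get("rawcontent"));
    }
}
```

The point of the sketch is that the field holds the markup itself, tags included, not the text extracted from it.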