Package org.apache.nutch.indexer.html

Index raw HTML content.


Package org.apache.nutch.indexer.html Description

Index raw HTML content. The index-html plugin adds a field named "rawcontent" to the index. This field holds the raw (HTML) content of each document, converted to a String.
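The behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the Nutch API: the class RawContentSketch, the helper addRawContent, the Map standing in for an index document, and the choice of UTF-8 are all assumptions made for illustration. Only the field name "rawcontent" comes from the description.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the plugin's actual source.
public class RawContentSketch {

    // Field name taken from the package description.
    static final String FIELD = "rawcontent";

    // Decodes the fetched bytes with the given charset and stores the result
    // under "rawcontent". A plain Map stands in for the index document
    // (assumption). Null content leaves the document unchanged.
    static Map<String, String> addRawContent(Map<String, String> doc,
                                             byte[] content,
                                             Charset charset) {
        if (content != null) {
            doc.put(FIELD, new String(content, charset));
        }
        return doc;
    }

    public static void main(String[] args) {
        byte[] fetched = "<html><body><p>Hello</p></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        Map<String, String> doc =
                addRawContent(new HashMap<>(), fetched, StandardCharsets.UTF_8);
        // Prints the unmodified markup, tags included.
        System.out.println(doc.get(FIELD));
    }
}
```

The sketch stores the markup as-is, without stripping tags, which is what distinguishes a raw-content field from a parsed-text field. How the real plugin picks the character encoding is not stated in this description, so the explicit charset parameter here is a design choice of the sketch.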

Copyright © 2015 The Apache Software Foundation