Package org.apache.nutch.analysis.lang

Text document language identifier.


Class Summary
HTMLLanguageParser Adds metadata identifying the language of the document, if found. Statistical analysis could also be run here, but it would miss all other formats.
LanguageIndexingFilter An IndexingFilter that adds a lang (language) field to the document.
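
As an illustration of what adding a lang field to a document can look like, here is a minimal, self-contained sketch. It does not use the Nutch API: the SimpleIndexingFilter interface, the map-based document, the "language" metadata key, and the "unknown" fallback are all assumptions made for this example, not the actual LanguageIndexingFilter implementation.

```java
import java.util.HashMap;
import java.util.Map;

public class LangFieldSketch {

    /** Hypothetical stand-in for an indexing filter; NOT the Nutch IndexingFilter interface. */
    interface SimpleIndexingFilter {
        Map<String, String> filter(Map<String, String> doc, Map<String, String> parseMetadata);
    }

    /** Copies a language code found at parse time into a "lang" field of the document. */
    static class LangFilter implements SimpleIndexingFilter {
        static final String LANG_FIELD = "lang";       // field name taken from the class summary
        static final String METADATA_KEY = "language"; // assumed metadata key, for illustration only
        static final String UNKNOWN = "unknown";       // assumed fallback value, for illustration only

        @Override
        public Map<String, String> filter(Map<String, String> doc, Map<String, String> parseMetadata) {
            String lang = parseMetadata.get(METADATA_KEY);
            // Fall back to a placeholder when the parser found no language.
            if (lang == null || lang.trim().isEmpty()) {
                lang = UNKNOWN;
            }
            doc.put(LANG_FIELD, lang.trim());
            return doc;
        }
    }

    public static void main(String[] args) {
        Map<String, String> parseMetadata = new HashMap<>();
        parseMetadata.put("language", "en"); // as if set by an HTML language parser

        Map<String, String> doc = new HashMap<>();
        new LangFilter().filter(doc, parseMetadata);

        // Prints the value stored in the lang field.
        System.out.println(doc.get("lang"));
    }
}
```

The sketch keeps detection and indexing separate: one component records the language as metadata at parse time, and the filter only copies that value into the indexed document.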

Package org.apache.nutch.analysis.lang Description

Text document language identifier.

Language profiles are based on material from

Copyright © 2013 The Apache Software Foundation