Package org.apache.nutch.analysis.lang

Text document language identifier.


Class Summary
HTMLLanguageParser Adds metadata identifying the language of the document, if found. Statistical analysis could also be run here, but it would miss all other (non-HTML) formats.
LanguageIndexingFilter An IndexingFilter that adds a lang (language) field to the document.
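
The two descriptions above can be illustrated with a minimal, self-contained sketch. It does not use the Nutch API: the class LangFieldSketch, its methods, and the plain Map standing in for an indexed document are all hypothetical, and reading only the lang attribute of the html element is an assumption about one way a declared language might be found.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of org.apache.nutch.analysis.lang.
final class LangFieldSketch {

    // Matches a lang attribute on the <html> element, e.g. <html lang="en-US">.
    private static final Pattern HTML_LANG = Pattern.compile(
        "<html\\b[^>]*\\blang\\s*=\\s*[\"']([A-Za-z]+(?:-[A-Za-z0-9]+)*)[\"']",
        Pattern.CASE_INSENSITIVE);

    // Returns the primary language subtag declared in the markup, if any.
    static Optional<String> declaredLanguage(String html) {
        Matcher m = HTML_LANG.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        String tag = m.group(1).toLowerCase(Locale.ROOT);
        int dash = tag.indexOf('-');
        return Optional.of(dash < 0 ? tag : tag.substring(0, dash));
    }

    // Builds a stand-in "document" and adds a lang field only when a
    // language was found (assumed behavior for this sketch).
    static Map<String, String> index(String html) {
        Map<String, String> doc = new HashMap<>();
        declaredLanguage(html).ifPresent(lang -> doc.put("lang", lang));
        return doc;
    }
}
```

In this sketch a page with no declared language simply gets no lang field; a statistical identifier would be the natural fallback at that point, and it would also cover formats other than HTML.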

Language profiles are based on material from

