Package org.apache.nutch.parse

Interface Summary
HtmlParseFilter Extension point for DOM-based HTML parsers.
Parse The result of parsing a page's raw content.
Parser A parser for content generated by a Protocol implementation.

Class Summary
HTMLMetaTags This class holds the information about HTML "meta" tags extracted from a page.
HtmlParseFilters Creates and caches plugins that implement HtmlParseFilter.
OutlinkExtractor Extracts Outlinks (URLs) from plain text using regular expressions.
ParseData Data extracted from a page's content.
ParseImpl The result of parsing a page's raw content.
ParserChecker Parser checker, useful for testing a parser.
ParseResult A utility class that stores the result of a parse.
ParserFactory Creates and caches Parser plugins.
ParseUtil A utility class with methods for common parsing tasks, such as iterating through a preferred list of Parsers to obtain Parse objects.
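
The OutlinkExtractor entry above describes pulling URLs out of plain text with regular expressions. The following is a minimal, standalone sketch of that general technique, not the Nutch implementation: the class name OutlinkSketch, the method extractUrls, and the pattern itself are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; this is NOT org.apache.nutch.parse.OutlinkExtractor.
final class OutlinkSketch {

    // Assumed pattern: an http, https or ftp scheme, "://", then a run of
    // characters that are not whitespace, quotes or angle brackets.
    private static final Pattern URL =
            Pattern.compile("\\b(?:https?|ftp)://[^\\s<>\"']+");

    // Returns the distinct URLs found in the text, in order of first appearance.
    static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            String url = stripTrailingPunctuation(m.group());
            if (!urls.contains(url)) {
                urls.add(url);
            }
        }
        return urls;
    }

    // Sentence punctuation directly after a URL is usually not part of it.
    private static String stripTrailingPunctuation(String s) {
        int end = s.length();
        while (end > 0 && ".,;:!?)".indexOf(s.charAt(end - 1)) >= 0) {
            end--;
        }
        return s.substring(0, end);
    }

    public static void main(String[] args) {
        String text = "See http://example.com/a, then https://nutch.apache.org/.";
        for (String url : extractUrls(text)) {
            System.out.println(url);
        }
    }
}
```

A pattern this simple will miss some valid URLs and accept some malformed ones; a production extractor would likely validate each candidate (for example with java.net.URI) before treating it as an outlink.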

Exception Summary

Copyright © 2006 The Apache Software Foundation