public interface Parser extends Pluggable, Configurable
A parser for content generated by a Protocol implementation. This interface is implemented by extensions. Nutch's core contains no page parsing code.
|Modifier and Type|Field and Description|
|static final String|X_POINT_ID: The name of the extension point.|
ParseResult getParse(Content c)
This method parses the given content and returns a map of <key, parse> pairs. Parse instances will be persisted under the given key.
Note: Meta-redirects should be followed only when they are coming from the
original URL. That is:
Assume the fetcher is in parsing mode and is currently processing foo.bar.com/redirect.html. If this URL contains a meta redirect to another URL, the fetcher should only follow the redirect if the map contains an entry of the form <"foo.bar.com/redirect.html", Parse with a ParseStatus indicating the redirect>.
c - Content to be parsed
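The meta-redirect rule in the note can be sketched as a lookup keyed by the URL being fetched. This is a minimal illustration, not Nutch code: `MetaRedirectSketch`, `SimpleParse`, and `redirectToFollow` are hypothetical stand-ins for the real Parse, ParseStatus, and ParseResult types, and the target URLs are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT Nutch's
// Parse, ParseStatus, or ParseResult classes.
class MetaRedirectSketch {

    /** Simplified parse outcome: a redirect flag and an optional target. */
    static final class SimpleParse {
        final boolean redirect;
        final String redirectTarget; // null when the parse is not a redirect

        SimpleParse(boolean redirect, String redirectTarget) {
            this.redirect = redirect;
            this.redirectTarget = redirectTarget;
        }
    }

    /**
     * Returns the redirect target only when the entry stored under the URL
     * being fetched signals a redirect; returns null otherwise. Redirects
     * recorded under any other key are ignored.
     */
    static String redirectToFollow(String fetchedUrl, Map<String, SimpleParse> parses) {
        SimpleParse own = parses.get(fetchedUrl);
        if (own != null && own.redirect) {
            return own.redirectTarget;
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, SimpleParse> parses = new LinkedHashMap<>();
        // Entry keyed by the original URL: this redirect may be followed.
        parses.put("foo.bar.com/redirect.html",
                new SimpleParse(true, "foo.bar.com/target.html"));
        // Entry keyed by some other URL: this redirect must not be followed.
        parses.put("foo.bar.com/sub-document",
                new SimpleParse(true, "foo.bar.com/elsewhere.html"));

        System.out.println(redirectToFollow("foo.bar.com/redirect.html", parses));
        System.out.println(redirectToFollow("foo.bar.com/page.html", parses));
    }
}
```

The lookup uses the fetched URL as the key, so a redirect reported for any other entry in the map is never followed, which is the behavior the note asks of the fetcher.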
Copyright © 2015 The Apache Software Foundation