Package org.apache.nutch.parse
Interface Parse
All Known Implementing Classes:
ParseImpl
public interface Parse
The result of parsing a page's raw content.
See Also:
Parser.getParse(Content)
-
Method Summary
Modifier and Type    Method           Description
ParseData            getData()        Other data extracted from the page.
String               getText()        The textual content of the page.
boolean              isCanonical()    Indicates whether the parse comes from a URL or a sub-URL.
-
Method Detail
-
getText
String getText()
The textual content of the page. This is indexed, searched, and used when generating snippets.
Returns:
the entire text as a String
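
To make the snippet use case concrete, here is a minimal sketch of how a caller might cut a snippet out of the string returned by getText(). It does not use Nutch itself: TextSource is a hypothetical stand-in that mirrors only the documented getText() signature, and the snippet helper, its radius parameter, and the sample sentence are assumptions made for illustration. This is not how Nutch generates its own snippets.

```java
import java.util.Locale;

public class SnippetSketch {

    /** Hypothetical stand-in mirroring only the documented getText() method. */
    public interface TextSource {
        String getText();
    }

    /**
     * Returns the first case-insensitive match of term plus up to radius
     * characters of context on each side, or an empty string if term is absent.
     */
    public static String snippet(TextSource parse, String term, int radius) {
        String text = parse.getText();
        int hit = text.toLowerCase(Locale.ROOT).indexOf(term.toLowerCase(Locale.ROOT));
        if (hit < 0) {
            return "";
        }
        // Clamp the window so it never runs past either end of the text.
        int start = Math.max(0, hit - radius);
        int end = Math.min(text.length(), hit + term.length() + radius);
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        // Assumed sample text standing in for a parsed page.
        TextSource parse = () -> "Apache Nutch is an extensible web crawler.";
        System.out.println(snippet(parse, "extensible", 6));
    }
}
```

With a real Parse instance, the same helper could presumably be driven by passing parse::getText, since the stand-in has the same single method.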
-
getData
ParseData getData()
Other data extracted from the page.
Returns:
a populated ParseData object
-
isCanonical
boolean isCanonical()
Indicates whether the parse comes from a URL or a sub-URL.
Returns:
true if canonical, false otherwise
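
As an illustration of how the flag might be used, the sketch below keeps only the canonical entries from a collection of parses keyed by URL. It is self-contained and does not depend on Nutch: ParseView is a hypothetical stand-in exposing two of the documented methods, and the of factory, the canonicalOnly helper, and the example URLs are all assumptions, not part of the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CanonicalSketch {

    /** Hypothetical stand-in exposing two of the documented methods. */
    public interface ParseView {
        String getText();
        boolean isCanonical();
    }

    /** Builds a simple immutable ParseView for illustration only. */
    public static ParseView of(String text, boolean canonical) {
        return new ParseView() {
            @Override public String getText() { return text; }
            @Override public boolean isCanonical() { return canonical; }
        };
    }

    /** Returns a new map holding only entries whose parse reports isCanonical() == true. */
    public static Map<String, ParseView> canonicalOnly(Map<String, ParseView> parses) {
        Map<String, ParseView> kept = new LinkedHashMap<>();
        for (Map.Entry<String, ParseView> entry : parses.entrySet()) {
            if (entry.getValue().isCanonical()) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed example: one page-level parse and one parse for a sub-URL.
        Map<String, ParseView> parses = new LinkedHashMap<>();
        parses.put("http://example.com/feed", of("Feed overview", true));
        parses.put("http://example.com/feed#item1", of("First item", false));
        System.out.println(canonicalOnly(parses).keySet());
    }
}
```

Copying into a new LinkedHashMap leaves the caller's map untouched and preserves insertion order, which keeps the result predictable when the input order matters.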