Uses of Class
org.apache.nutch.protocol.Content

Packages that use Content
org.apache.nutch.analysis.lang Text document language identifier. 
org.apache.nutch.crawl Crawl control code. 
org.apache.nutch.fetcher The Nutch robot. 
org.apache.nutch.microformats.reltag A microformats Rel-Tag Parser/Indexer/Querier plugin. 
org.apache.nutch.parse   
org.apache.nutch.parse.ext   
org.apache.nutch.parse.html An HTML document parsing plugin. 
org.apache.nutch.parse.js   
org.apache.nutch.parse.ms Common API for Microsoft © documents parsing. 
org.apache.nutch.parse.msexcel A Microsoft © Excel document parsing plugin. 
org.apache.nutch.parse.mspowerpoint A Microsoft © PowerPoint document parsing plugin. 
org.apache.nutch.parse.msword A Microsoft © Word document parsing plugin. 
org.apache.nutch.parse.oo   
org.apache.nutch.parse.pdf A PDF parsing plugin. 
org.apache.nutch.parse.rss   
org.apache.nutch.parse.swf   
org.apache.nutch.parse.text A plain text parsing plugin. 
org.apache.nutch.parse.zip   
org.apache.nutch.protocol   
org.apache.nutch.protocol.file Protocol plugin which supports retrieving local file resources. 
org.apache.nutch.protocol.ftp Protocol plugin which supports retrieving documents via FTP. 
org.apache.nutch.scoring   
org.apache.nutch.scoring.opic   
org.apache.nutch.util   
org.creativecommons.nutch Sample plugins that parse and index Creative Commons metadata. 
 

Uses of Content in org.apache.nutch.analysis.lang
 

Methods in org.apache.nutch.analysis.lang with parameters of type Content
 ParseResult HTMLLanguageParser.filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
          Scan the HTML document looking at possible indications of content language.
 

Uses of Content in org.apache.nutch.crawl
 

Methods in org.apache.nutch.crawl with parameters of type Content
 byte[] MD5Signature.calculate(Content content, Parse parse)
           
 byte[] TextProfileSignature.calculate(Content content, Parse parse)
           
abstract  byte[] Signature.calculate(Content content, Parse parse)
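
The listing above gives only the signatures of the calculate methods. As a rough illustration of what an MD5-based page signature could look like, the following sketch hashes the raw bytes of a page. The class Md5SignatureSketch and its helpers are hypothetical stand-ins that take a plain byte array instead of Content and Parse; this is not the Nutch implementation.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in, not part of Nutch: shows one way a
// content signature could be derived from raw page bytes.
public class Md5SignatureSketch {

    // Returns the 16-byte MD5 digest of the given bytes.
    static byte[] calculate(byte[] rawContent) {
        try {
            return MessageDigest.getInstance("MD5").digest(rawContent);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Renders a digest as lowercase hex for display or comparison.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] page = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(calculate(page)));
    }
}
```

Two pages with identical bytes yield the same signature, which is what makes such a value usable for duplicate detection.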
           
 

Uses of Content in org.apache.nutch.fetcher
 

Methods in org.apache.nutch.fetcher that return Content
 Content FetcherOutput.getContent()
           
 

Constructors in org.apache.nutch.fetcher with parameters of type Content
FetcherOutput(CrawlDatum crawlDatum, Content content, ParseImpl parse)
           
 

Uses of Content in org.apache.nutch.microformats.reltag
 

Methods in org.apache.nutch.microformats.reltag with parameters of type Content
 ParseResult RelTagParser.filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
          Scan the HTML document looking at possible rel-tags
 

Uses of Content in org.apache.nutch.parse
 

Methods in org.apache.nutch.parse with parameters of type Content
 ParseResult HtmlParseFilter.filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
          Adds metadata or otherwise modifies a parse of HTML content, given the DOM tree of a page.
 ParseResult HtmlParseFilters.filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
          Run all defined filters.
 ParseResult Parser.getParse(Content c)
           This method parses the given content and returns a map of <key, parse> pairs.
 void ParseSegment.map(WritableComparable key, Content content, OutputCollector<Text,ParseImpl> output, Reporter reporter)
           
 ParseResult ParseUtil.parse(Content content)
          Performs a parse by iterating through a List of preferred Parsers until a successful parse is performed and a Parse object is returned.
 ParseResult ParseUtil.parseByExtensionId(String extId, Content content)
          Method parses a Content object using the Parser specified by the parameter extId, i.e., the Parser's extension ID.
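
ParseUtil.parse is described above as iterating through a list of preferred parsers until one succeeds. A minimal sketch of that fallback pattern follows. SimpleParser and parseWithFallback are hypothetical names, and plain byte arrays and strings stand in for Content and ParseResult; the real method's error handling may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, not part of Nutch: tries parsers in
// preference order and returns the first successful result.
public class ParserFallbackSketch {

    // A parser returns an empty Optional when it cannot handle the input.
    public interface SimpleParser {
        Optional<String> parse(byte[] raw);
    }

    static Optional<String> parseWithFallback(List<SimpleParser> parsers, byte[] raw) {
        for (SimpleParser parser : parsers) {
            try {
                Optional<String> result = parser.parse(raw);
                if (result.isPresent()) {
                    return result; // first success wins
                }
            } catch (RuntimeException e) {
                // Assumption: a failing parser is skipped and the next one is tried.
            }
        }
        return Optional.empty(); // no parser succeeded
    }

    public static void main(String[] args) {
        List<SimpleParser> parsers = new ArrayList<>();
        parsers.add(raw -> Optional.empty());                 // cannot parse
        parsers.add(raw -> Optional.of("len=" + raw.length)); // succeeds
        System.out.println(parseWithFallback(parsers, new byte[3]).orElse("none"));
    }
}
```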
 

Uses of Content in org.apache.nutch.parse.ext
 

Methods in org.apache.nutch.parse.ext with parameters of type Content
 ParseResult ExtParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.html
 

Methods in org.apache.nutch.parse.html with parameters of type Content
 ParseResult HtmlParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.js
 

Methods in org.apache.nutch.parse.js with parameters of type Content
 ParseResult JSParseFilter.filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
           
 ParseResult JSParseFilter.getParse(Content c)
           
 

Uses of Content in org.apache.nutch.parse.ms
 

Methods in org.apache.nutch.parse.ms with parameters of type Content
protected  ParseResult MSBaseParser.getParse(MSExtractor extractor, Content content)
          Parses a Content with a specific Microsoft document extractor.
 

Uses of Content in org.apache.nutch.parse.msexcel
 

Methods in org.apache.nutch.parse.msexcel with parameters of type Content
 ParseResult MSExcelParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.mspowerpoint
 

Methods in org.apache.nutch.parse.mspowerpoint with parameters of type Content
 ParseResult MSPowerPointParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.msword
 

Methods in org.apache.nutch.parse.msword with parameters of type Content
 ParseResult MSWordParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.oo
 

Methods in org.apache.nutch.parse.oo with parameters of type Content
 ParseResult OOParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.pdf
 

Methods in org.apache.nutch.parse.pdf with parameters of type Content
 ParseResult PdfParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.rss
 

Methods in org.apache.nutch.parse.rss with parameters of type Content
 ParseResult RSSParser.getParse(Content content)
           Implementation method, parses the RSS content, and then returns a ParseImpl.
 

Uses of Content in org.apache.nutch.parse.swf
 

Methods in org.apache.nutch.parse.swf with parameters of type Content
 ParseResult SWFParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.parse.text
 

Methods in org.apache.nutch.parse.text with parameters of type Content
 ParseResult TextParser.getParse(Content content)
          Parses a plain text document.
 

Uses of Content in org.apache.nutch.parse.zip
 

Methods in org.apache.nutch.parse.zip with parameters of type Content
 ParseResult ZipParser.getParse(Content content)
           
 

Uses of Content in org.apache.nutch.protocol
 

Methods in org.apache.nutch.protocol that return Content
 Content ProtocolOutput.getContent()
           
static Content Content.read(DataInput in)
           
 

Methods in org.apache.nutch.protocol with parameters of type Content
 void ProtocolOutput.setContent(Content content)
           
 

Constructors in org.apache.nutch.protocol with parameters of type Content
ProtocolOutput(Content content)
           
ProtocolOutput(Content content, ProtocolStatus status)
           
 

Uses of Content in org.apache.nutch.protocol.file
 

Methods in org.apache.nutch.protocol.file that return Content
 Content FileResponse.toContent()
           
 

Uses of Content in org.apache.nutch.protocol.ftp
 

Methods in org.apache.nutch.protocol.ftp that return Content
 Content FtpResponse.toContent()
           
 

Uses of Content in org.apache.nutch.scoring
 

Methods in org.apache.nutch.scoring with parameters of type Content
 void ScoringFilter.passScoreAfterParsing(Text url, Content content, Parse parse)
          Currently a part of score distribution is performed using only data coming from the parsing process.
 void ScoringFilters.passScoreAfterParsing(Text url, Content content, Parse parse)
           
 void ScoringFilter.passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)
          This method takes all relevant score information from the current datum (coming from a generated fetchlist) and stores it into Content metadata.
 void ScoringFilters.passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)
           
 

Uses of Content in org.apache.nutch.scoring.opic
 

Methods in org.apache.nutch.scoring.opic with parameters of type Content
 void OPICScoringFilter.passScoreAfterParsing(Text url, Content content, Parse parse)
          Copy the value from Content metadata under Fetcher.SCORE_KEY to parseData.
 void OPICScoringFilter.passScoreBeforeParsing(Text url, CrawlDatum datum, Content content)
          Store a float value of CrawlDatum.getScore() under Fetcher.SCORE_KEY.
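
The two OPICScoringFilter methods above describe a handoff: the score is stored in Content metadata before parsing and copied into the parse data afterwards. The sketch below mimics that flow with plain maps. The class name, the map-based metadata, and the key value "sketch.score" are assumptions made for illustration; the real key is Fetcher.SCORE_KEY, whose value is not given here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of Nutch: carries a score from the
// fetch datum through content metadata into parse metadata.
public class OpicScoreHandoffSketch {

    // Assumed key name for this sketch only.
    static final String SCORE_KEY = "sketch.score";

    // Before parsing: store the datum's score as a string in content metadata.
    static void passScoreBeforeParsing(float datumScore, Map<String, String> contentMeta) {
        contentMeta.put(SCORE_KEY, Float.toString(datumScore));
    }

    // After parsing: copy the stored score, if any, into parse metadata.
    static void passScoreAfterParsing(Map<String, String> contentMeta,
                                      Map<String, String> parseMeta) {
        String score = contentMeta.get(SCORE_KEY);
        if (score != null) {
            parseMeta.put(SCORE_KEY, score);
        }
    }

    public static void main(String[] args) {
        Map<String, String> contentMeta = new HashMap<>();
        Map<String, String> parseMeta = new HashMap<>();
        passScoreBeforeParsing(1.5f, contentMeta);
        passScoreAfterParsing(contentMeta, parseMeta);
        System.out.println(parseMeta.get(SCORE_KEY));
    }
}
```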
 

Uses of Content in org.apache.nutch.util
 

Methods in org.apache.nutch.util with parameters of type Content
 void EncodingDetector.autoDetectClues(Content content, boolean filter)
           
 String EncodingDetector.guessEncoding(Content content, String defaultValue)
          Guess the encoding with the previously specified list of clues.
 

Uses of Content in org.creativecommons.nutch
 

Methods in org.creativecommons.nutch with parameters of type Content
 ParseResult CCParseFilter.filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
          Adds metadata or otherwise modifies a parse of an HTML document, given the DOM tree of a page.
 



Copyright © 2006 The Apache Software Foundation