Package org.apache.nutch.hostdb

Interface Summary

Interface                         Description
CrawlDatumProcessor               These are instantiated once for each host.

Class Summary

Class                             Description
FetchOverdueCrawlDatumProcessor   Simple custom crawl datum processor that counts the number of records that are overdue for fetching, e.g.
HostDatum
ReadHostDb
ResolverThread                    Simple runnable that performs DNS lookup for a single host.
UpdateHostDb                      Tool to create a HostDB from the CrawlDB.
UpdateHostDbMapper                Mapper ingesting HostDB and CrawlDB entries.
UpdateHostDbReducer
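
The ResolverThread entry describes a runnable that performs a DNS lookup for a single host. The following is a minimal sketch of that general pattern using only the JDK; it is not the Nutch implementation. The names HostLookupSketch and LookupTask and the shared result map are hypothetical, introduced here for illustration.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class HostLookupSketch {

    /**
     * Hypothetical stand-in for a per-host resolver task.
     * It records whether the host name could be resolved.
     */
    public static final class LookupTask implements Runnable {
        private final String host;
        private final ConcurrentMap<String, Boolean> results;

        public LookupTask(String host, ConcurrentMap<String, Boolean> results) {
            this.host = host;
            this.results = results;
        }

        @Override
        public void run() {
            try {
                // Resolves the name; an IP literal is parsed without a DNS query.
                InetAddress.getByName(host);
                results.put(host, Boolean.TRUE);
            } catch (UnknownHostException e) {
                // Assumption: an unresolvable host is simply marked as failed.
                results.put(host, Boolean.FALSE);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ConcurrentMap<String, Boolean> results = new ConcurrentHashMap<>();

        // One task per host, each run on its own thread.
        Thread worker = new Thread(new LookupTask("127.0.0.1", results));
        worker.start();
        worker.join();

        System.out.println(results);
    }
}
```

Keeping each lookup in its own Runnable lets a caller hand many hosts to a thread pool, so that one slow or failing resolution does not block the others.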