|
Class Summary |
| LinkDatum |
A class for holding link information including the url, anchor text, a score,
the timestamp of the link and a link type. |
| LinkDumper |
The LinkDumper tool creates a database of node-to-inlink information that can
be read using the nested Reader class. |
| LinkDumper.Inverter |
Inverts outlinks from the WebGraph to inlinks and attaches node
information. |
| LinkDumper.LinkNode |
Bean class which holds url to node information. |
| LinkDumper.LinkNodes |
Writable class which holds an array of LinkNode objects. |
| LinkDumper.Merger |
Merges LinkNode objects into a single array value per url. |
| LinkDumper.Reader |
Reader class which prints the url and all of its inlinks to standard
output. |
| LinkRank |
|
| LoopReader |
The LoopReader tool prints the loopset information for a single url. |
| Loops |
The Loops job identifies link cycles (loops) inside the web graph. |
| Loops.Finalizer |
Finishes the Loops job by aggregating and collecting any found routes. |
| Loops.Initializer |
Initializes the Loop routes. |
| Loops.Looper |
Follows a route path looking for the start url of the route. |
| Loops.LoopSet |
A set of loops. |
| Loops.Route |
A link path or route looking to identify a link cycle. |
| Node |
A class which holds the number of inlinks and outlinks for a given url along
with an inlink score from a link analysis program and any metadata. |
| NodeDumper |
A tool that dumps out the top urls by number of inlinks, number of outlinks,
or by score, to a text file. |
| NodeDumper.Dumper |
Outputs the hosts or domains with an associated value. |
| NodeDumper.Sorter |
Outputs the top urls sorted in descending order. |
| NodeReader |
Reads information for a single node from the NodeDb in the WebGraph and
prints it to standard output. |
| ScoreUpdater |
Updates the score from the WebGraph node database into the crawl database. |
| WebGraph |
Creates three databases: one for inlinks, one for outlinks, and a node
database that holds the number of inlinks and outlinks for a url and the
current score for the url. |
| WebGraph.OutlinkDb |
The OutlinkDb creates a database of all outlinks. |
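
The relationship between the outlink, inlink, and node data described in the summary above can be illustrated with a small in-memory sketch. This is not the Nutch implementation or its API: the class name WebGraphSketch and the methods invert and nodeCounts are hypothetical, plain Java maps stand in for the on-disk databases, and scores are omitted. It only shows how inlinks can be derived by inverting outlinks and how per-url inlink and outlink counts follow from that.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names come from Nutch.
public class WebGraphSketch {

    // Inverts a url -> outlinks map into a url -> inlinks map.
    static Map<String, List<String>> invert(Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : outlinks.entrySet()) {
            for (String target : e.getValue()) {
                // Each outlink source -> target becomes an inlink recorded under target.
                inlinks.computeIfAbsent(target, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        return inlinks;
    }

    // Builds url -> {inlink count, outlink count} for every url seen as a source or target.
    static Map<String, int[]> nodeCounts(Map<String, List<String>> outlinks) {
        Map<String, List<String>> inlinks = invert(outlinks);
        Map<String, int[]> nodes = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : outlinks.entrySet()) {
            nodes.computeIfAbsent(e.getKey(), k -> new int[2])[1] = e.getValue().size();
        }
        for (Map.Entry<String, List<String>> e : inlinks.entrySet()) {
            nodes.computeIfAbsent(e.getKey(), k -> new int[2])[0] = e.getValue().size();
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Made-up example urls, not taken from any real crawl.
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/", List.of("http://b.example/", "http://c.example/"));
        outlinks.put("http://b.example/", List.of("http://c.example/"));
        for (Map.Entry<String, int[]> e : nodeCounts(outlinks).entrySet()) {
            System.out.println(e.getKey() + " in=" + e.getValue()[0] + " out=" + e.getValue()[1]);
        }
    }
}
```

A real web graph job would do the same inversion as a distributed shuffle keyed on the target url rather than in a single map; the sketch keeps everything in memory purely for readability.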