Class LinkDumper

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.scoring.webgraph.LinkDumper
All Implemented Interfaces:
Configurable, Tool

public class LinkDumper
extends Configured
implements Tool

The LinkDumper tool creates a database that maps each node (url) to its inlink information. The database can be read using the nested Reader class. This allows the inlink and scoring state of a single url to be reviewed quickly, to determine why that url ranks the way it does. The tool is intended for use with the LinkRank analysis.

Nested Class Summary
static class LinkDumper.Inverter
          Inverts outlinks from the WebGraph to inlinks and attaches node information.
static class LinkDumper.LinkNode
          Bean class which holds url to node information.
static class LinkDumper.LinkNodes
          Writable class which holds an array of LinkNode objects.
static class LinkDumper.Merger
          Merges LinkNode objects into a single array value per url.
static class LinkDumper.Reader
          Reader class which will print out the url and all of its inlinks to system out.
Field Summary
static String DUMP_DIR
static org.apache.commons.logging.Log LOG
Constructor Summary
LinkDumper()
Method Summary
 void dumpLinks(Path webGraphDb)
          Runs the inverter and merger jobs of the LinkDumper tool to create the url to inlink node database.
static void main(String[] args)
 int run(String[] args)
          Runs the LinkDumper tool.
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf

Field Detail


public static final org.apache.commons.logging.Log LOG


public static final String DUMP_DIR
Constructor Detail


public LinkDumper()
Method Detail


public void dumpLinks(Path webGraphDb)
               throws IOException
Runs the inverter and merger jobs of the LinkDumper tool to create the url to inlink node database.
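
The two steps named above (inverting outlinks to inlinks, then merging the inlinks for each url into one value) can be illustrated with a small in-memory sketch. This is not the Nutch implementation, which runs as Hadoop jobs over the WebGraph; the class InlinkSketch, the type SourceNode, the method invertAndMerge, and the example urls and scores are all hypothetical and exist only to show the shape of the data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InlinkSketch {

    /** Hypothetical stand-in for a link source plus its node data (here, only a score). */
    static final class SourceNode {
        final String url;
        final float score;

        SourceNode(String url, float score) {
            this.url = url;
            this.score = score;
        }

        @Override
        public String toString() {
            return url + " (score " + score + ")";
        }
    }

    /**
     * Inverts source -> targets edges into target -> sources entries and
     * groups all sources of one target into a single list.
     */
    static Map<String, List<SourceNode>> invertAndMerge(
            Map<String, List<String>> outlinks, Map<String, Float> scores) {
        Map<String, List<SourceNode>> inlinks = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : outlinks.entrySet()) {
            String source = entry.getKey();
            // Assumption: a source with no known score defaults to 0.
            float score = scores.getOrDefault(source, 0.0f);
            for (String target : entry.getValue()) {
                // Invert step: the edge is re-keyed by its target url.
                // Merge step: all sources of one target accumulate in one list.
                inlinks.computeIfAbsent(target, k -> new ArrayList<>())
                       .add(new SourceNode(source, score));
            }
        }
        return inlinks;
    }

    public static void main(String[] args) {
        Map<String, List<String>> outlinks = new TreeMap<>();
        outlinks.put("http://a.example/", List.of("http://c.example/"));
        outlinks.put("http://b.example/", List.of("http://c.example/", "http://a.example/"));

        Map<String, Float> scores = Map.of(
                "http://a.example/", 0.5f,
                "http://b.example/", 0.25f);

        Map<String, List<SourceNode>> inlinks = invertAndMerge(outlinks, scores);

        // Print each url followed by its inlinks, one per line.
        for (Map.Entry<String, List<SourceNode>> entry : inlinks.entrySet()) {
            System.out.println(entry.getKey());
            for (SourceNode node : entry.getValue()) {
                System.out.println("  <- " + node);
            }
        }
    }
}
```

Keying the result by target url is what makes a single-url lookup cheap: every inlink and its source score sits under one key, which is the kind of per-url view the nested Reader class is described as printing.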



public static void main(String[] args)
                 throws Exception


public int run(String[] args)
        throws Exception
Runs the LinkDumper tool. This only creates the database; to read the values, use the nested Reader tool.

Specified by:
run in interface Tool

Copyright © 2006 The Apache Software Foundation