java.lang.Object
  org.apache.nutch.scoring.link.LinkAnalysisScoringFilter
public class LinkAnalysisScoringFilter
| Field Summary |
|---|
| Fields inherited from interface org.apache.nutch.scoring.ScoringFilter |
|---|
| X_POINT_ID |
| Constructor Summary | |
|---|---|
| LinkAnalysisScoringFilter() | |
| Method Summary | |
|---|---|
| void | distributeScoreToOutlinks(String fromUrl, WebPage page, Collection<ScoreDatum> scoreData, int allCount): Distribute score value from the current page to all its outlinked pages. |
| float | generatorSortValue(String url, WebPage page, float initSort): This method prepares a sort value for the purpose of sorting and selecting top N scoring pages during fetchlist generation. |
| Configuration | getConf() |
| Collection<WebPage.Field> | getFields() |
| float | indexerScore(String url, NutchDocument doc, WebPage page, float initScore): This method calculates a Lucene document boost. |
| void | initialScore(String url, WebPage page): Set an initial score for newly discovered pages. |
| void | injectedScore(String url, WebPage page): Set an initial score for newly injected pages. |
| void | setConf(Configuration conf) |
| void | updateScore(String url, WebPage page, List<ScoreDatum> inlinkedScoreData): This method calculates a new score during table update, based on the values contributed by inlinked pages. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
|---|
public LinkAnalysisScoringFilter()
| Method Detail |
|---|
public Configuration getConf()
Specified by: getConf in interface Configurable
public void setConf(Configuration conf)
Specified by: setConf in interface Configurable
public Collection<WebPage.Field> getFields()
Specified by: getFields in interface FieldPluggable
public void injectedScore(String url,
WebPage page)
throws ScoringFilterException
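Description copied from interface: ScoringFilter
Set an initial score for newly injected pages.
Specified by: injectedScore in interface ScoringFilter
Parameters:
url - url of the page
page - new page. Filters will modify it in-place.
Throws:
ScoringFilterException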
public void initialScore(String url,
WebPage page)
throws ScoringFilterException
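Description copied from interface: ScoringFilter
Set an initial score for newly discovered pages.
Specified by: initialScore in interface ScoringFilter
Parameters:
url - url of the page
Throws:
ScoringFilterException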
public float generatorSortValue(String url,
WebPage page,
float initSort)
throws ScoringFilterException
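Description copied from interface: ScoringFilter
This method prepares a sort value for the purpose of sorting and selecting top N scoring pages during fetchlist generation.
Specified by: generatorSortValue in interface ScoringFilter
Parameters:
url - url of the page
initSort - initial sort value, or a value from previous filters in chain
Throws:
ScoringFilterException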
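To make the "sort and select top N" idea concrete, here is a minimal sketch that uses plain Java collections instead of the Nutch types. The class `TopNSketch`, its methods, and the rule of multiplying the page score by `initSort` are all hypothetical illustrations; this page does not state how LinkAnalysisScoringFilter computes its sort value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch: no Nutch types are used here.
final class TopNSketch {

    // Hypothetical sort value (an assumption, not the Nutch rule):
    // scale the incoming value by the page's current score.
    static float sortValue(float pageScore, float initSort) {
        return pageScore * initSort;
    }

    // Return up to n URLs, ordered from highest to lowest sort value.
    static List<String> topN(Map<String, Float> pageScores, int n) {
        List<String> urls = new ArrayList<>(pageScores.keySet());
        urls.sort((a, b) -> Float.compare(
                sortValue(pageScores.get(b), 1.0f),
                sortValue(pageScores.get(a), 1.0f)));
        int limit = Math.max(0, Math.min(n, urls.size()));
        return new ArrayList<>(urls.subList(0, limit));
    }

    public static void main(String[] args) {
        Map<String, Float> scores = new LinkedHashMap<>();
        scores.put("http://a.example/", 0.5f);
        scores.put("http://b.example/", 2.0f);
        scores.put("http://c.example/", 1.0f);
        // prints [http://b.example/, http://c.example/]
        System.out.println(topN(scores, 2));
    }
}
```

Passing `1.0f` as the initial sort value mimics being the first filter in a chain; a later filter would receive the value returned by the previous one.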
public void distributeScoreToOutlinks(String fromUrl,
WebPage page,
Collection<ScoreDatum> scoreData,
int allCount)
throws ScoringFilterException
Description copied from interface: ScoringFilter
Distribute score value from the current page to all its outlinked pages.
Specified by: distributeScoreToOutlinks in interface ScoringFilter
Parameters:
fromUrl - url of the source page
scoreData - a collection of ScoreDatum objects, one for every outlink. These ScoreDatum objects will be passed to updateScore(String, WebPage, List) for every outlinked URL.
allCount - number of all collected outlinks from the source page
Throws:
ScoringFilterException
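As an illustration of the contract above, the following sketch gives every outlink an equal share of the source page's score. It is a sketch under assumptions, not the behaviour of LinkAnalysisScoringFilter: `DistributeSketch` and its nested `Datum` class are hypothetical stand-ins for the Nutch types, and the even split is only one possible policy.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in sketch: no Nutch types are used here.
final class DistributeSketch {

    // Hypothetical per-outlink record; not the Nutch ScoreDatum class.
    static final class Datum {
        final String url;
        float score;

        Datum(String url) {
            this.url = url;
        }
    }

    // Assign each entry an equal share of pageScore, dividing by allCount
    // (the number of all collected outlinks) rather than by the list size.
    // Assumption: allCount may be larger than scoreData.size().
    static void distribute(float pageScore, List<Datum> scoreData, int allCount) {
        if (allCount <= 0) {
            return; // nothing to distribute to
        }
        float share = pageScore / allCount;
        for (Datum datum : scoreData) {
            datum.score = share;
        }
    }

    public static void main(String[] args) {
        List<Datum> data = new ArrayList<>();
        data.add(new Datum("http://x.example/"));
        data.add(new Datum("http://y.example/"));
        distribute(1.0f, data, 4);
        for (Datum datum : data) {
            System.out.println(datum.url + " " + datum.score);
        }
    }
}
```

Dividing by `allCount` instead of the list size keeps each share the same even when only some of the collected outlinks are passed in.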
public void updateScore(String url,
WebPage page,
List<ScoreDatum> inlinkedScoreData)
throws ScoringFilterException
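Description copied from interface: ScoringFilter
This method calculates a new score during table update, based on the values contributed by inlinked pages.
Specified by: updateScore in interface ScoringFilter
Parameters:
url - url of the page
Throws:
ScoringFilterException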
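One simple way to picture a score update driven by inlink contributions is to add them to the page's current score. The sketch below does exactly that with plain floats; `UpdateSketch` and the additive rule are hypothetical, and nothing on this page confirms that LinkAnalysisScoringFilter updates scores this way.

```java
import java.util.Arrays;
import java.util.List;

// Stand-in sketch: no Nutch types are used here.
final class UpdateSketch {

    // Hypothetical update rule (an assumption): new score is the current
    // score plus the sum of the values contributed by inlinked pages.
    static float updatedScore(float currentScore, List<Float> inlinkedScores) {
        float sum = 0.0f;
        for (float contribution : inlinkedScores) {
            sum += contribution;
        }
        return currentScore + sum;
    }

    public static void main(String[] args) {
        List<Float> contributions = Arrays.asList(0.25f, 0.5f);
        System.out.println(updatedScore(1.0f, contributions));
    }
}
```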
public float indexerScore(String url,
NutchDocument doc,
WebPage page,
float initScore)
throws ScoringFilterException
Description copied from interface: ScoringFilter
This method calculates a Lucene document boost.
Specified by: indexerScore in interface ScoringFilter
Parameters:
url - url of the page
doc - document. NOTE: this already contains all information collected by indexing filters. Implementations may modify this instance, in order to store/remove some information.
initScore - initial boost value for the Lucene document.
Throws:
ScoringFilterException
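For readers unfamiliar with how a boost can be derived from a page score, here is a small sketch that dampens the score with an exponent and multiplies it into the incoming boost. `IndexerBoostSketch`, the `scorePower` parameter, and the formula itself are assumptions made for illustration; this page does not describe the formula LinkAnalysisScoringFilter uses.

```java
// Stand-in sketch: no Nutch or Lucene types are used here.
final class IndexerBoostSketch {

    // Hypothetical boost (an assumption): raise the page score to
    // scorePower, then scale the incoming boost value by the result.
    static float boost(float pageScore, float scorePower, float initScore) {
        return (float) Math.pow(pageScore, scorePower) * initScore;
    }

    public static void main(String[] args) {
        // A score of 4.0 with exponent 0.5 doubles an initial boost of 1.5.
        System.out.println(boost(4.0f, 0.5f, 1.5f));
    }
}
```

An exponent below 1 compresses large score differences, so a heavily linked page cannot dominate the ranking through its boost alone.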