Hi all,

I have a question concerning updating a site's score in Nutch 1.2.

In the reduce method of org.apache.nutch.crawl.CrawlDbReducer I found a call to
scfilters.updateDbScore((Text)key, oldSet ? old : null, result, linkList);

During debugging, I discovered that this method is executed in the org.apache.nutch.scoring.opic.OPICScoringFilter class. The code for this method is the following:
    /** Increase the score by a sum of inlinked scores. */
    public void updateDbScore(Text url, CrawlDatum old, CrawlDatum datum, List inlinked) throws ScoringFilterException {
        float adjust = 0.0f;
        for (int i = 0; i < inlinked.size(); i++) {
            CrawlDatum linked = (CrawlDatum)inlinked.get(i);
            adjust += linked.getScore();
        }
        if (old == null) old = datum;
        datum.setScore(old.getScore() + adjust);
    }

To my understanding, this code increases a site's score based on its inlinks every time the site is crawled. So even if the site has not been modified and no new inlink has been discovered, the site's score will still increase.
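To make the concern concrete, here is a minimal sketch of the arithmetic only. It assumes that the same inlink list is passed in on every update, which is exactly the part I am unsure about. The Datum class is a hypothetical stand-in for CrawlDatum that models nothing but the score, and ScoreDriftSketch and simulate are names made up for this illustration; none of this is Nutch code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Nutch's CrawlDatum; only the score is modeled.
class Datum {
    private float score;
    Datum(float score) { this.score = score; }
    float getScore() { return score; }
    void setScore(float score) { this.score = score; }
}

class ScoreDriftSketch {
    // Same arithmetic as the quoted updateDbScore, without any Nutch types.
    static void updateDbScore(Datum old, Datum datum, List<Datum> inlinked) {
        float adjust = 0.0f;
        for (Datum linked : inlinked) {
            adjust += linked.getScore();
        }
        if (old == null) old = datum;
        datum.setScore(old.getScore() + adjust);
    }

    // Applies the update `rounds` times with an unchanged inlink list
    // (an assumption of this sketch) and returns the final score.
    static float simulate(float initial, float[] inlinkScores, int rounds) {
        List<Datum> inlinked = new ArrayList<>();
        for (float s : inlinkScores) {
            inlinked.add(new Datum(s));
        }
        Datum stored = new Datum(initial);
        for (int i = 0; i < rounds; i++) {
            Datum result = new Datum(stored.getScore());
            updateDbScore(stored, result, inlinked);
            stored = result;
        }
        return stored.getScore();
    }

    public static void main(String[] args) {
        float[] inlinks = {0.25f, 0.5f};
        for (int rounds = 0; rounds <= 3; rounds++) {
            System.out.println(rounds + " update(s): " + simulate(1.0f, inlinks, rounds));
        }
    }
}
```

Under that assumption the score grows by the sum of the inlink scores (0.75 here) on every update, even though neither the page nor its inlinks changed. If the inlink list is empty on updates where no linking page was processed, the score would stay put instead.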

Is my understanding of this mechanism correct?
If so, could anyone explain to me why a site's score is increased in any case? I would expect it to change only if either its content has changed or a new inlink has been discovered.

Cheers
David

Discussion Overview
group: dev @
categories: nutch, lucene
posted: Feb 1, '11 at 3:16p
active: Feb 1, '11 at 3:16p
posts: 1
users: 1
website: nutch.apache.org

David Saile: 1 post
