| Plugins API | |
|---|---|
| org.apache.nutch.protocol.http.api | Common API used by HTTP plugins (http, httpclient) |
| org.apache.nutch.urlfilter.api | |
| Protocol Plugins | |
|---|---|
| org.apache.nutch.protocol.file | Protocol plugin which supports retrieving local file resources. |
| org.apache.nutch.protocol.ftp | Protocol plugin which supports retrieving documents via the ftp protocol. |
| org.apache.nutch.protocol.http | Protocol plugin which supports retrieving documents via the http protocol. |
| org.apache.nutch.protocol.httpclient | Protocol plugin which supports retrieving documents via the HTTP and HTTPS protocols, optionally with Basic, Digest, and NTLM authentication schemes for web servers as well as proxy servers. |
| URL Filter Plugins | |
|---|---|
| org.apache.nutch.net.urlnormalizer.basic | |
| org.apache.nutch.net.urlnormalizer.pass | |
| org.apache.nutch.net.urlnormalizer.regex | |
| Scoring Plugins | |
|---|---|
| org.apache.nutch.scoring.link | |
| org.apache.nutch.scoring.opic | |
| org.apache.nutch.scoring.tld | Top Level Domain Scoring plugin. |
| org.apache.nutch.scoring.urlmeta | URL Meta Tag Scoring plugin. |
| Parse Plugins | |
|---|---|
| org.apache.nutch.parse.headings | |
| Indexing Filter Plugins | |
|---|---|
| org.apache.nutch.indexer.anchor | An indexing plugin for inbound anchor text. |
| org.apache.nutch.indexer.basic | A basic indexing plugin. |
| org.apache.nutch.indexer.feed | |
| org.apache.nutch.indexer.metadata | |
| org.apache.nutch.indexer.staticfield | A simple plugin called at indexing that adds fields with static data. |
| org.apache.nutch.indexer.subcollection | |
| org.apache.nutch.indexer.tld | Top Level Domain Indexing plugin. |
| org.apache.nutch.indexer.urlmeta | URL Meta Tag Indexing plugin. |
| Misc. Plugins | |
|---|---|
| org.apache.nutch.analysis.lang | Text document language identifier. |
| org.apache.nutch.collection | A subcollection is a subset of an index. |
| org.creativecommons.nutch | Sample plugins that parse and index Creative Commons metadata. |
Apache Nutch is an open source web-search software project.
Nutch is a project of the Apache Software Foundation and is part of the larger Apache community of developers and users.