Overview

The official Nutch source repository is hosted in the Subversion repository of the Apache Software Foundation. Apache also provides read-only mirrors for Git users. This page describes how to access the Nutch source code with either Subversion or Git.

Subversion Repository

Subversion Clients

The Nutch source code resides in the Apache Subversion (SVN) repository. The command-line SVN client can be obtained here. The TortoiseSVN GUI client for Windows can be obtained here. There are also SVN plugins available for both Eclipse and IntelliJ IDEA.

Web Access (read-only)

The source code can be browsed via the Web at http://svn.apache.org/viewvc/nutch/. No SVN client software is required.

Anonymous Access (read-only)

Instructions for anonymous SVN access are here.
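
As a rough illustration of anonymous access, a checkout could look like the sketch below. The repository URL assumes the standard ASF layout with a trunk directory, and the function name checkout_nutch and the default directory nutch-trunk are hypothetical. The linked instructions remain the authoritative source for the URL.

```shell
# Assumption: Nutch follows the standard ASF repository layout with a trunk directory.
NUTCH_SVN_URL="https://svn.apache.org/repos/asf/nutch/trunk"

# Hypothetical helper: check out a read-only working copy.
# $1 is the target directory (defaults to nutch-trunk).
checkout_nutch() {
    svn checkout "$NUTCH_SVN_URL" "${1:-nutch-trunk}"
}
```

Running checkout_nutch creates the working copy. Running svn update inside that directory later fetches newer revisions.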

Committer Access (read-write)

Instructions for committer SVN access are here.

Git Repository

Anonymous Access (read-only)

Apache hosts a read-only Git mirror of the Nutch repository. The URL for anonymous read-only access is http://git.apache.org/nutch.git/. Alternatively, the GitHub mirror at http://github.com/apache/nutch can be used. To clone the repository:

$ git clone http://git.apache.org/nutch.git/

More instructions for setting up Git access can be found here.
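
The GitHub mirror can be cloned in the same way. The following is a minimal sketch: the function name clone_nutch_mirror and the default directory nutch are hypothetical, and the --depth 1 option is an optional addition that limits the download to the most recent commit.

```shell
# GitHub mirror URL as given in this section.
NUTCH_GITHUB_URL="http://github.com/apache/nutch"

# Hypothetical helper: shallow-clone the read-only mirror.
# $1 is the target directory (defaults to nutch).
clone_nutch_mirror() {
    git clone --depth 1 "$NUTCH_GITHUB_URL" "${1:-nutch}"
}
```

Dropping --depth 1 clones the full history, which is needed for inspecting older revisions.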

Committer Access (read-write)

Currently, the Subversion repository is the only official Nutch repository that accepts writes. However, committers can use the git-svn package to work entirely in a local Git repository and push their commits to Subversion. Setting up the initial repository and pushing Git commits to Subversion is detailed here.
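
A git-svn workflow of this kind might be sketched as follows. This is not the procedure from the linked instructions. The repository root URL assumes the standard ASF layout (trunk, branches, tags), and the function names and default directory are hypothetical.

```shell
# Assumption: repository root with the standard trunk/branches/tags layout.
NUTCH_SVN_ROOT="https://svn.apache.org/repos/asf/nutch"

# Hypothetical helper: create a local Git repository that tracks Subversion.
# --stdlayout tells git-svn to expect trunk, branches, and tags directories.
nutch_git_svn_clone() {
    git svn clone --stdlayout "$NUTCH_SVN_ROOT" "${1:-nutch-git-svn}"
}

# Hypothetical helper: replay local commits on top of the latest Subversion
# revisions, then send each local commit to Subversion as its own revision.
nutch_git_svn_push() {
    git svn rebase && git svn dcommit
}
```

Running git svn dcommit --dry-run first lists the commits that would be sent without changing the Subversion repository.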