A system for scalable, fault-tolerant, distributed computation over large data collections.
Applications implement the {@link org.apache.nutch.mapReduce.Mapper} and {@link org.apache.nutch.mapReduce.Reducer} interfaces. The implementations are submitted as a MapReduceJob and applied to data stored in a {@link org.apache.nutch.fs.NutchFileSystem}.
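To illustrate the programming model, here is a minimal, single-process word-count sketch. The nested <code>Mapper</code> and <code>Reducer</code> interfaces, the <code>WordCountSketch</code> class, and its <code>run</code> method are hypothetical stand-ins invented for this example; they are not the interfaces of this package, whose actual method signatures may differ, and nothing here is distributed or fault-tolerant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

public class WordCountSketch {

    /** Hypothetical stand-in, NOT org.apache.nutch.mapReduce.Mapper. */
    interface Mapper<I, K, V> {
        // Emits zero or more (key, value) pairs for one input record.
        void map(I input, BiConsumer<K, V> output);
    }

    /** Hypothetical stand-in, NOT org.apache.nutch.mapReduce.Reducer. */
    interface Reducer<K, V, R> {
        // Combines all values emitted for one key into a single result.
        R reduce(K key, List<V> values);
    }

    // Map step: split a line on whitespace and emit (word, 1) per word.
    static final Mapper<String, String, Integer> TOKENIZE = (line, out) -> {
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.accept(word, 1);
            }
        }
    };

    // Reduce step: sum the counts collected for one word.
    static final Reducer<String, Integer, Integer> SUM = (key, values) -> {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    };

    /** Runs map, group-by-key, and reduce in memory over the given lines. */
    public static Map<String, Integer> run(List<String> lines) {
        // Group every emitted value under its key (the "shuffle" phase).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            TOKENIZE.map(line,
                (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        // Reduce each key's values to one result, in sorted key order.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), SUM.reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {a=2, b=2, c=1}
        System.out.println(run(Arrays.asList("a b a", "b c")));
    }
}
```

In a real job the map and reduce calls would run on many machines and the grouped data would come from and return to the file system; the sketch only shows how the two user-supplied functions fit together.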
See Google's original MapReduce paper, "MapReduce: Simplified Data Processing on Large Clusters" (Dean and Ghemawat, OSDI 2004), for background information.