org.apache.nutch.crawl
Class CrawlDb

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.nutch.crawl.CrawlDb
All Implemented Interfaces:
Configurable

public class CrawlDb
extends Configured

This class takes the output of the fetcher and updates the crawldb accordingly.


Field Summary
static org.apache.commons.logging.Log LOG
           
 
Constructor Summary
CrawlDb(Configuration conf)
          Constructs a CrawlDb.
 
Method Summary
static JobConf createJob(Configuration config, Path crawlDb)
           
static void install(JobConf job, Path crawlDb)
           
static void main(String[] args)
           
 void update(Path crawlDb, Path segment)
           
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final org.apache.commons.logging.Log LOG
Constructor Detail

CrawlDb

public CrawlDb(Configuration conf)
Constructs a CrawlDb.

Method Detail

update

public void update(Path crawlDb,
                   Path segment)
            throws IOException
Throws:
IOException
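
A minimal usage sketch for update follows. It is based only on the signatures listed on this page and is not taken from the Nutch sources. It assumes the Nutch and Hadoop jars are on the classpath and that NutchConfiguration.create() from org.apache.nutch.util is available to build the Configuration. The class name UpdateCrawlDbExample and the two command-line arguments are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.nutch.crawl.CrawlDb;
import org.apache.nutch.util.NutchConfiguration;

// Hypothetical driver class; not part of Nutch.
public class UpdateCrawlDbExample {
  public static void main(String[] args) throws Exception {
    if (args.length < 2) {
      System.err.println("Expected two arguments: <crawldb> <segment>");
      return;
    }

    // Assumption: NutchConfiguration.create() loads the Nutch configuration resources.
    Configuration conf = NutchConfiguration.create();

    // Paths to an existing crawldb and to a segment produced by the fetcher.
    Path crawlDb = new Path(args[0]);
    Path segment = new Path(args[1]);

    // Update the crawldb from the segment; update is declared to throw IOException.
    new CrawlDb(conf).update(crawlDb, segment);
  }
}
```

Callers embedding this in a larger program would typically catch IOException around the update call instead of letting it propagate from main.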

createJob

public static JobConf createJob(Configuration config,
                                Path crawlDb)
                         throws IOException
Throws:
IOException

install

public static void install(JobConf job,
                           Path crawlDb)
                    throws IOException
Throws:
IOException

main

public static void main(String[] args)
                 throws Exception
Throws:
Exception


Copyright © 2006 The Apache Software Foundation