public class Injector extends NutchTool implements Tool
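Note that some metadata keys are reserved; their names are held in the fields nutchScoreMDName, nutchFetchIntervalMDName and nutchFixedFetchIntervalMDName.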
Example:
http://www.nutch.org/ \t nutch.score=10 \t nutch.fetchInterval=2592000 \t userType=open_source
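The example line can be read as a URL followed by tab-separated key=value pairs. The following is a minimal sketch of such a reader, assuming the `\t` in the example stands for a literal tab character. The class `SeedLineParser`, its nested `SeedEntry` type and the `parse` method are hypothetical names used for illustration only; they are not part of Nutch and do not claim to reproduce what Injector does internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch: reads one URL line with optional metadata.
public class SeedLineParser {

    /** Holds the URL and the metadata found on one line. */
    public static final class SeedEntry {
        public final String url;
        public final Map<String, String> metadata;

        SeedEntry(String url, Map<String, String> metadata) {
            this.url = url;
            this.metadata = metadata;
        }
    }

    /** Splits a line on tabs: the first field is the URL, the rest are key=value pairs. */
    public static SeedEntry parse(String line) {
        String[] fields = line.split("\t");
        String url = fields[0].trim();
        Map<String, String> metadata = new LinkedHashMap<>();
        for (int i = 1; i < fields.length; i++) {
            String field = fields[i].trim();
            int eq = field.indexOf('=');
            if (eq <= 0) {
                continue; // skip fields that have no '=' or an empty key
            }
            String key = field.substring(0, eq).trim();
            String value = field.substring(eq + 1).trim();
            metadata.put(key, value);
        }
        return new SeedEntry(url, metadata);
    }

    public static void main(String[] args) {
        SeedEntry entry = parse(
            "http://www.nutch.org/\tnutch.score=10\tnutch.fetchInterval=2592000\tuserType=open_source");
        System.out.println(entry.url);
        System.out.println(entry.metadata);
    }
}
```

Splitting on the first `=` only means a value may itself contain `=`; fields without a key are skipped rather than treated as errors, which is a design choice of this sketch.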
Modifier and Type | Class and Description |
---|---|
static class | Injector.InjectMapper |
static class | Injector.InjectReducer: Combine multiple new entries for a URL. |
Modifier and Type | Field and Description |
---|---|
static String | nutchFetchIntervalMDName: metadata key reserved for setting a custom fetchInterval for a specific URL |
static String | nutchFixedFetchIntervalMDName: metadata key reserved for setting a fixed custom fetchInterval for a specific URL |
static String | nutchScoreMDName: metadata key reserved for setting a custom score for a specific URL |
Fields inherited from class NutchTool: currentJob, currentJobNum, numJobs, results, status
Constructor and Description |
---|
Injector() |
Injector(Configuration conf) |
Modifier and Type | Method and Description |
---|---|
void | inject(Path crawlDb, Path urlDir) |
void | inject(Path crawlDb, Path urlDir, boolean overwrite, boolean update) |
static void | main(String[] args) |
Map<String,Object> | run(Map<String,Object> args, String crawlId): Used by the Nutch REST service |
int | run(String[] args) |
void | usage() |
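Methods inherited from class NutchTool: getProgress, getStatus, killJob, stopJob
Methods inherited from class Configured: getConf, setConf
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Configurable: getConf, setConf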
public static String nutchScoreMDName
public static String nutchFetchIntervalMDName
public static String nutchFixedFetchIntervalMDName
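To illustrate how reserved keys differ from ordinary metadata, here is a rough sketch that separates them once a line's metadata has been read into a map. It makes several assumptions: the key strings `nutch.score` and `nutch.fetchInterval` are taken from the example line, while `nutch.fetchInterval.fixed` is an assumed value for the third field; the class `ReservedKeyDemo`, the `Settings` type and its defaults are hypothetical and are not Nutch code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not part of Nutch.
public class ReservedKeyDemo {
    // The first two key strings appear in the example line; the third is an assumption.
    static final String SCORE_KEY = "nutch.score";
    static final String FETCH_INTERVAL_KEY = "nutch.fetchInterval";
    static final String FIXED_FETCH_INTERVAL_KEY = "nutch.fetchInterval.fixed";

    /** Per-URL settings; the default values here are placeholders. */
    public static final class Settings {
        public float score = 1.0f;
        public int fetchInterval = -1;        // -1 means "no custom interval given"
        public boolean fixedInterval = false; // true when the fixed key was used
        public final Map<String, String> userMetadata = new LinkedHashMap<>();
    }

    /** Consumes reserved keys and keeps every other pair as user metadata. */
    public static Settings apply(Map<String, String> metadata) {
        Settings settings = new Settings();
        for (Map.Entry<String, String> pair : metadata.entrySet()) {
            String key = pair.getKey();
            String value = pair.getValue();
            try {
                if (SCORE_KEY.equals(key)) {
                    settings.score = Float.parseFloat(value);
                } else if (FETCH_INTERVAL_KEY.equals(key)) {
                    settings.fetchInterval = Integer.parseInt(value);
                } else if (FIXED_FETCH_INTERVAL_KEY.equals(key)) {
                    settings.fetchInterval = Integer.parseInt(value);
                    settings.fixedInterval = true;
                } else {
                    settings.userMetadata.put(key, value);
                }
            } catch (NumberFormatException ex) {
                // A malformed reserved value is ignored in this sketch; defaults stay in place.
            }
        }
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("nutch.score", "10");
        metadata.put("nutch.fetchInterval", "2592000");
        metadata.put("userType", "open_source");
        Settings settings = apply(metadata);
        System.out.println(settings.score + " " + settings.fetchInterval + " " + settings.userMetadata);
    }
}
```

In this sketch only non-reserved pairs such as `userType` end up in the user metadata map; whether Nutch itself stores or discards the reserved pairs is not stated on this page.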
public Injector()
public Injector(Configuration conf)
public void inject(Path crawlDb, Path urlDir) throws IOException, ClassNotFoundException, InterruptedException
public void inject(Path crawlDb, Path urlDir, boolean overwrite, boolean update) throws IOException, ClassNotFoundException, InterruptedException
public void usage()
Copyright © 2017 The Apache Software Foundation