org.apache.hadoop.util
Class CopyFiles

java.lang.Object
  extended by org.apache.hadoop.mapred.MapReduceBase
      extended by org.apache.hadoop.util.CopyFiles
All Implemented Interfaces:
Closeable, JobConfigurable, Reducer

public class CopyFiles
extends MapReduceBase
implements Reducer

A Map-Reduce program to recursively copy directories between different file systems.

Author:
Milind Bhandarkar

Nested Class Summary
static class CopyFiles.CopyFilesMapper
          Mapper class for copying files.
 
Constructor Summary
CopyFiles()
           
 
Method Summary
static void main(String[] args)
          This is the main driver for recursively copying directories across file systems.
 void reduce(WritableComparable key, Iterator values, OutputCollector output, Reporter reporter)
          Combines values for a given key.
 
Methods inherited from class org.apache.hadoop.mapred.MapReduceBase
close, configure
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.hadoop.mapred.JobConfigurable
configure
 
Methods inherited from interface org.apache.hadoop.io.Closeable
close
 

Constructor Detail

CopyFiles

public CopyFiles()
Method Detail

reduce

public void reduce(WritableComparable key,
                   Iterator values,
                   OutputCollector output,
                   Reporter reporter)
            throws IOException
Description copied from interface: Reducer
Combines values for a given key. Output values must be of the same type as input values. Input keys must not be altered. Typically all values are combined into zero or one value. Output pairs are collected with calls to OutputCollector.collect(WritableComparable,Writable).

Specified by:
reduce in interface Reducer
Parameters:
key - the key
values - the values to combine
output - to collect combined values
Throws:
IOException

main

public static void main(String[] args)
                 throws IOException
This is the main driver for recursively copying directories across file systems. It takes at least two command-line parameters: a source URL and a destination URL. It then essentially does an "ls -lR" on the source URL and writes the output in a round-robin manner to all the map input files. Each mapper copies the files allotted to it; the reduce is empty.

Throws:
IOException
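
The round-robin distribution that main performs on the source listing can be illustrated with a minimal sketch. This is not the CopyFiles implementation: the class RoundRobinSketch, the method distribute, and the sample paths are hypothetical names chosen for illustration. In-memory lists stand in for the map input files, and assigning entry i to bucket i mod n is an assumed reading of "round-robin".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of org.apache.hadoop.util.CopyFiles.
public class RoundRobinSketch {

    // Assigns the path at index i to bucket (i mod numBuckets).
    // Each bucket stands in for one map input file.
    static List<List<String>> distribute(List<String> paths, int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            buckets.add(new ArrayList<>());
        }
        for (int i = 0; i < paths.size(); i++) {
            buckets.get(i % numBuckets).add(paths.get(i));
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Sample listing; a real run would obtain this from the source URL.
        List<String> listing =
            Arrays.asList("/src/a", "/src/b", "/src/c", "/src/d", "/src/e");
        List<List<String>> buckets = distribute(listing, 2);
        for (int i = 0; i < buckets.size(); i++) {
            System.out.println("map input " + i + ": " + buckets.get(i));
        }
        // prints:
        // map input 0: [/src/a, /src/c, /src/e]
        // map input 1: [/src/b, /src/d]
    }
}
```

Dealing entries out in turn keeps the number of files per mapper within one of each other, although it does not balance by file size.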


Copyright © 2006 The Apache Software Foundation