org.apache.hadoop.fs
Class FileSystem

java.lang.Object
  extended by org.apache.hadoop.conf.Configured
      extended by org.apache.hadoop.fs.FileSystem
All Implemented Interfaces:
Configurable
Direct Known Subclasses:
DistributedFileSystem, LocalFileSystem

public abstract class FileSystem
extends Configured

An abstract base class for a fairly generic filesystem. It may be implemented as a distributed filesystem, or as a "local" one that reflects the locally-connected disk. The local version exists for small Hadoop instances and for testing.

All user code that may potentially use the Hadoop Distributed File System should be written to use a FileSystem object. The Hadoop DFS is a multi-machine system that appears as a single disk. It's useful because of its fault tolerance and potentially very large capacity.

The local implementation is LocalFileSystem and the distributed implementation is DistributedFileSystem.

Author:
Mike Cafarella

Field Summary
static Logger LOG
           
 
Constructor Summary
protected FileSystem(Configuration conf)
           
 
Method Summary
abstract  void close()
          No more filesystem operations are needed.
 void completeLocalOutput(File src, File dst)
          Deprecated. Call completeLocalOutput(Path, Path) instead.
abstract  void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile)
          Called when we're all done writing to the target.
abstract  void copyFromLocalFile(Path src, Path dst)
          The src file is on the local disk.
abstract  void copyToLocalFile(Path src, Path dst)
          The src file is under FS, and the dst is on the local disk.
 FSDataOutputStream create(File f)
          Deprecated. Call create(Path) instead.
 FSDataOutputStream create(Path f)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, boolean overwrite, int bufferSize, short replication)
          Opens an FSDataOutputStream at the indicated Path.
 FSDataOutputStream create(Path f, short replication)
          Opens an FSDataOutputStream at the indicated Path.
 boolean createNewFile(File f)
          Deprecated. Call createNewFile(Path) instead.
 boolean createNewFile(Path f)
          Creates the given Path as a brand-new zero-length file.
abstract  FSOutputStream createRaw(Path f, boolean overwrite, short replication)
          Opens an OutputStream at the indicated Path.
 boolean delete(File f)
          Deprecated. Call delete(Path) instead.
 boolean delete(Path f)
          Delete a file.
abstract  boolean deleteRaw(Path f)
          Deletes the given Path.
 boolean exists(File f)
          Deprecated. Call exists(Path) instead.
abstract  boolean exists(Path f)
          Checks whether the given Path exists.
static FileSystem get(Configuration conf)
          Returns the configured filesystem implementation.
abstract  long getBlockSize()
          Return the number of bytes that large input files should optimally be split into to minimize i/o time.
static Path getChecksumFile(Path file)
          Return the name of the checksum file associated with a file.
abstract  String[][] getFileCacheHints(Path f, long start, long len)
          Return a 2D array of size 1x1 or greater, containing hostnames where portions of the given file can be found.
 long getLength(File f)
          Deprecated. Call getLength(Path) instead.
abstract  long getLength(Path f)
          The number of bytes in a file.
abstract  String getName()
          Returns a name for this filesystem, suitable to pass to getNamed(String,Configuration).
static FileSystem getNamed(String name, Configuration conf)
          Returns a named filesystem.
abstract  short getReplication(Path src)
          Get replication.
abstract  Path getWorkingDirectory()
          Get the current working directory for the given file system.
static boolean isChecksumFile(Path file)
          Return true iff file is a checksum file name.
 boolean isDirectory(File f)
          Deprecated. Call isDirectory(Path) instead.
abstract  boolean isDirectory(Path f)
          True iff the named path is a directory.
 boolean isFile(File f)
          Deprecated. Call isFile(Path) instead.
 boolean isFile(Path f)
          True iff the named path is a regular file.
 File[] listFiles(File f)
          Deprecated. Call listPaths(Path) instead.
 File[] listFiles(File f, FileFilter filt)
          Deprecated. Call listPaths(Path, PathFilter) instead.
 Path[] listPaths(Path f)
          List files in a directory.
 Path[] listPaths(Path f, PathFilter filter)
          Filter files in a directory.
abstract  Path[] listPathsRaw(Path f)
          List files in a directory.
 void lock(File f, boolean shared)
          Deprecated. Call lock(Path,boolean) instead.
abstract  void lock(Path f, boolean shared)
          Obtain a lock on the given Path.
 boolean mkdirs(File f)
          Deprecated. Call mkdirs(Path) instead.
abstract  boolean mkdirs(Path f)
          Make the given file and all non-existent parents into directories.
abstract  void moveFromLocalFile(Path src, Path dst)
          The src file is on the local disk.
 FSDataInputStream open(File f)
          Deprecated. Call open(Path) instead.
 FSDataInputStream open(Path f)
          Opens an FSDataInputStream at the indicated Path.
 FSDataInputStream open(Path f, int bufferSize)
          Opens an FSDataInputStream at the indicated Path.
abstract  FSInputStream openRaw(Path f)
          Opens an InputStream for the indicated Path, whether local or via DFS.
static FileSystem parseArgs(String[] argv, int i, Configuration conf)
          Parse the cmd-line args, starting at i.
 void release(File f)
          Deprecated. Call release(Path) instead.
abstract  void release(Path f)
          Release the lock.
 boolean rename(File src, File dst)
          Deprecated. Call rename(Path, Path) instead.
 boolean rename(Path src, Path dst)
          Renames Path src to Path dst.
abstract  boolean renameRaw(Path src, Path dst)
          Renames Path src to Path dst.
abstract  void reportChecksumFailure(Path f, FSInputStream in, long start, long length, int crc)
          Report a checksum error to the file system.
 boolean setReplication(Path src, short replication)
          Set replication for an existing file.
abstract  boolean setReplicationRaw(Path src, short replication)
          Set replication for an existing file.
abstract  void setWorkingDirectory(Path new_dir)
          Set the current working directory for the given file system.
 File startLocalOutput(File src, File dst)
          Deprecated. Call startLocalOutput(Path, Path) instead.
abstract  Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile)
          Returns a local file that the user can write output to.
 
Methods inherited from class org.apache.hadoop.conf.Configured
getConf, setConf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LOG

public static final Logger LOG
Constructor Detail

FileSystem

protected FileSystem(Configuration conf)
Method Detail

parseArgs

public static FileSystem parseArgs(String[] argv,
                                   int i,
                                   Configuration conf)
                            throws IOException
Parse the cmd-line args, starting at i. Remove consumed args from array. We expect param in the form: '-local | -dfs '

Throws:
IOException
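
The consume-and-remove convention described above can be modelled without any Hadoop classes. The following is a minimal stand-alone sketch, not the actual implementation: the class FsArgsSketch, the method parseFsName, the choice to mark consumed slots with null, and the assumption that -dfs is followed by exactly one name-server argument are all illustrative.

```java
import java.util.Arrays;

/** Stand-alone model of the "-local | -dfs" option convention (hypothetical, not Hadoop code). */
final class FsArgsSketch {

    /**
     * Inspects argv[i] and returns a filesystem name, or null when no
     * filesystem option is present at that position.
     * Consumed entries are overwritten with null; this is an assumption
     * about how "removed from the array" might be represented.
     */
    static String parseFsName(String[] argv, int i) {
        if (argv == null || i < 0 || i >= argv.length || argv[i] == null) {
            return null;
        }
        if ("-local".equals(argv[i])) {
            argv[i] = null;                 // consume the option
            return "local";
        }
        if ("-dfs".equals(argv[i])) {
            // Assumed: the option takes one argument naming the server.
            if (i + 1 >= argv.length || argv[i + 1] == null) {
                throw new IllegalArgumentException("-dfs requires an argument");
            }
            String name = argv[i + 1];
            argv[i] = null;                 // consume the option
            argv[i + 1] = null;             // consume its argument
            return name;
        }
        return null;                        // nothing consumed
    }

    public static void main(String[] args) {
        String[] argv = {"-dfs", "namenode.example.com:9000", "input.txt"};
        String name = parseFsName(argv, 0);
        System.out.println(name + " " + Arrays.toString(argv));
    }
}
```

In this model the caller skips null slots when it processes the remaining arguments.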

get

public static FileSystem get(Configuration conf)
                      throws IOException
Returns the configured filesystem implementation.

Throws:
IOException

getName

public abstract String getName()
Returns a name for this filesystem, suitable to pass to getNamed(String,Configuration).


getNamed

public static FileSystem getNamed(String name,
                                  Configuration conf)
                           throws IOException
Returns a named filesystem. Names are either the string "local" or a host:port pair, naming a DFS name server.

Throws:
IOException
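
The two name forms can be told apart with plain string handling. The sketch below is a stand-alone illustration under stated assumptions: the class FsNameSketch and its methods are hypothetical helpers, and how getNamed itself validates or parses names is not specified here.

```java
/** Stand-alone model of the two filesystem name forms (hypothetical, not Hadoop code). */
final class FsNameSketch {

    /** True when the name selects the local filesystem. */
    static boolean isLocal(String name) {
        return "local".equals(name);
    }

    /** Host part of a "host:port" name. */
    static String host(String name) {
        return name.substring(0, splitIndex(name));
    }

    /** Port part of a "host:port" name. */
    static int port(String name) {
        return Integer.parseInt(name.substring(splitIndex(name) + 1));
    }

    /** Index of the separating colon; rejects names without both parts. */
    private static int splitIndex(String name) {
        int colon = name.lastIndexOf(':');
        if (colon <= 0 || colon == name.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + name);
        }
        return colon;
    }

    public static void main(String[] args) {
        String name = "namenode.example.com:9000";
        System.out.println(isLocal(name) + " " + host(name) + " " + port(name));
    }
}
```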

getChecksumFile

public static Path getChecksumFile(Path file)
Return the name of the checksum file associated with a file.


isChecksumFile

public static boolean isChecksumFile(Path file)
Return true iff file is a checksum file name.


getFileCacheHints

public abstract String[][] getFileCacheHints(Path f,
                                             long start,
                                             long len)
                                      throws IOException
Return a 2D array of size 1x1 or greater, containing hostnames where portions of the given file can be found. For a nonexistent file or region, null will be returned. This call is most helpful with DFS, where it returns hostnames of machines that contain the given file. The FileSystem will simply return an element containing 'localhost'.

Throws:
IOException
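
A caller has to handle both the null result and the nested array shape. The following stand-alone sketch shows one way to consume such a result. It assumes that each row lists the hosts holding one portion of the file, which the description above does not state explicitly; the class CacheHintsSketch and the method mostCommonHost are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-alone consumer of a hostname hint array (hypothetical, not Hadoop code). */
final class CacheHintsSketch {

    /**
     * Returns the host that appears most often in the hints, or null when
     * the hints are null or empty. Ties go to the host seen first.
     */
    static String mostCommonHost(String[][] hints) {
        if (hints == null) {
            return null;                    // nonexistent file or region
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] row : hints) {
            if (row == null) {
                continue;
            }
            for (String host : row) {
                if (host != null) {
                    counts.merge(host, 1, Integer::sum);
                }
            }
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[][] hints = {{"a", "b"}, {"b", "c"}};
        System.out.println(mostCommonHost(hints));
    }
}
```

A scheduler could use a result like this to place work near the data, though whether that is how callers use the hints is an assumption.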

open

public FSDataInputStream open(File f)
                       throws IOException
Deprecated. Call open(Path) instead.

Throws:
IOException

open

public FSDataInputStream open(Path f,
                              int bufferSize)
                       throws IOException
Opens an FSDataInputStream at the indicated Path.

Parameters:
f - the file name to open
bufferSize - the size of the buffer to be used.
Throws:
IOException

open

public FSDataInputStream open(Path f)
                       throws IOException
Opens an FSDataInputStream at the indicated Path.

Parameters:
f - the file to open
Throws:
IOException

openRaw

public abstract FSInputStream openRaw(Path f)
                               throws IOException
Opens an InputStream for the indicated Path, whether local or via DFS.

Throws:
IOException

create

public FSDataOutputStream create(File f)
                          throws IOException
Deprecated. Call create(Path) instead.

Throws:
IOException

create

public FSDataOutputStream create(Path f)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path. Files are overwritten by default.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 short replication)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path. Files are overwritten by default.

Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
Throws:
IOException

create

public FSDataOutputStream create(Path f,
                                 boolean overwrite,
                                 int bufferSize,
                                 short replication)
                          throws IOException
Opens an FSDataOutputStream at the indicated Path.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
bufferSize - the size of the buffer to be used.
replication - required block replication for the file.
Throws:
IOException

createRaw

public abstract FSOutputStream createRaw(Path f,
                                         boolean overwrite,
                                         short replication)
                                  throws IOException
Opens an OutputStream at the indicated Path.

Parameters:
f - the file name to open
overwrite - if a file with this name already exists, then if true, the file will be overwritten, and if false an error will be thrown.
replication - required block replication for the file.
Throws:
IOException

createNewFile

public boolean createNewFile(File f)
                      throws IOException
Deprecated. Call createNewFile(Path) instead.

Throws:
IOException

createNewFile

public boolean createNewFile(Path f)
                      throws IOException
Creates the given Path as a brand-new zero-length file. If create fails, or if it already existed, return false.

Throws:
IOException

setReplication

public boolean setReplication(Path src,
                              short replication)
                       throws IOException
Set replication for an existing file.

Parameters:
src - file name
replication - new replication
Returns:
true if successful; false if file does not exist or is a directory
Throws:
IOException

getReplication

public abstract short getReplication(Path src)
                              throws IOException
Get replication.

Parameters:
src - file name
Returns:
file replication
Throws:
IOException

setReplicationRaw

public abstract boolean setReplicationRaw(Path src,
                                          short replication)
                                   throws IOException
Set replication for an existing file.

Parameters:
src - file name
replication - new replication
Returns:
true if successful; false if file does not exist or is a directory
Throws:
IOException

rename

public boolean rename(File src,
                      File dst)
               throws IOException
Deprecated. Call rename(Path, Path) instead.

Throws:
IOException

rename

public boolean rename(Path src,
                      Path dst)
               throws IOException
Renames Path src to Path dst. Can take place on local fs or remote DFS.

Throws:
IOException

renameRaw

public abstract boolean renameRaw(Path src,
                                  Path dst)
                           throws IOException
Renames Path src to Path dst. Can take place on local fs or remote DFS.

Throws:
IOException

delete

public boolean delete(File f)
               throws IOException
Deprecated. Call delete(Path) instead.

Throws:
IOException

delete

public boolean delete(Path f)
               throws IOException
Delete a file.

Throws:
IOException

deleteRaw

public abstract boolean deleteRaw(Path f)
                           throws IOException
Deletes the given Path.

Throws:
IOException

exists

public boolean exists(File f)
               throws IOException
Deprecated. Call exists(Path) instead.

Throws:
IOException

exists

public abstract boolean exists(Path f)
                        throws IOException
Checks whether the given Path exists.

Throws:
IOException

isDirectory

public boolean isDirectory(File f)
                    throws IOException
Deprecated. Call isDirectory(Path) instead.

Throws:
IOException

isDirectory

public abstract boolean isDirectory(Path f)
                             throws IOException
True iff the named path is a directory.

Throws:
IOException

isFile

public boolean isFile(File f)
               throws IOException
Deprecated. Call isFile(Path) instead.

Throws:
IOException

isFile

public boolean isFile(Path f)
               throws IOException
True iff the named path is a regular file.

Throws:
IOException

getLength

public long getLength(File f)
               throws IOException
Deprecated. Call getLength(Path) instead.

Throws:
IOException

getLength

public abstract long getLength(Path f)
                        throws IOException
The number of bytes in a file.

Throws:
IOException

listFiles

public File[] listFiles(File f)
                 throws IOException
Deprecated. Call listPaths(Path) instead.

Throws:
IOException

listPaths

public Path[] listPaths(Path f)
                 throws IOException
List files in a directory.

Throws:
IOException

listPathsRaw

public abstract Path[] listPathsRaw(Path f)
                             throws IOException
List files in a directory.

Throws:
IOException

listFiles

public File[] listFiles(File f,
                        FileFilter filt)
                 throws IOException
Deprecated. Call listPaths(Path, PathFilter) instead.

Throws:
IOException

listPaths

public Path[] listPaths(Path f,
                        PathFilter filter)
                 throws IOException
Filter files in a directory.

Throws:
IOException

setWorkingDirectory

public abstract void setWorkingDirectory(Path new_dir)
Set the current working directory for the given file system. All relative paths will be resolved relative to it.

Parameters:
new_dir - the new working directory
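
Resolution of relative paths against a working directory can be illustrated with plain strings. This is a stand-alone sketch, not the Hadoop logic: the class WorkingDirSketch, its resolve method, the "/" separator, and the requirement that the working directory be absolute are all assumptions made for the example.

```java
/** Stand-alone model of working-directory resolution (hypothetical, not Hadoop code). */
final class WorkingDirSketch {

    private String workingDir = "/";

    /** Sets the working directory; this model only accepts absolute paths. */
    void setWorkingDirectory(String newDir) {
        if (newDir == null || !newDir.startsWith("/")) {
            throw new IllegalArgumentException("absolute path required: " + newDir);
        }
        workingDir = newDir;
    }

    String getWorkingDirectory() {
        return workingDir;
    }

    /** Absolute paths are returned unchanged; relative ones are appended to the working directory. */
    String resolve(String path) {
        if (path.startsWith("/")) {
            return path;
        }
        return workingDir.endsWith("/") ? workingDir + path : workingDir + "/" + path;
    }

    public static void main(String[] args) {
        WorkingDirSketch fs = new WorkingDirSketch();
        fs.setWorkingDirectory("/user/alice");
        System.out.println(fs.resolve("data/part-0"));
    }
}
```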

getWorkingDirectory

public abstract Path getWorkingDirectory()
Get the current working directory for the given file system.

Returns:
the directory pathname

mkdirs

public boolean mkdirs(File f)
               throws IOException
Deprecated. Call mkdirs(Path) instead.

Throws:
IOException

mkdirs

public abstract boolean mkdirs(Path f)
                        throws IOException
Make the given file and all non-existent parents into directories.

Throws:
IOException

lock

public void lock(File f,
                 boolean shared)
          throws IOException
Deprecated. Call lock(Path,boolean) instead.

Throws:
IOException

lock

public abstract void lock(Path f,
                          boolean shared)
                   throws IOException
Obtain a lock on the given Path.

Throws:
IOException

release

public void release(File f)
             throws IOException
Deprecated. Call release(Path) instead.

Throws:
IOException

release

public abstract void release(Path f)
                      throws IOException
Release the lock.

Throws:
IOException

copyFromLocalFile

public abstract void copyFromLocalFile(Path src,
                                       Path dst)
                                throws IOException
The src file is on the local disk. Add it to FS at the given dst name; the source is kept intact afterwards.

Throws:
IOException

moveFromLocalFile

public abstract void moveFromLocalFile(Path src,
                                       Path dst)
                                throws IOException
The src file is on the local disk. Add it to FS at the given dst name, removing the source afterwards.

Throws:
IOException

copyToLocalFile

public abstract void copyToLocalFile(Path src,
                                     Path dst)
                              throws IOException
The src file is under FS, and the dst is on the local disk. Copy it from FS control to the local dst name.

Throws:
IOException

startLocalOutput

public File startLocalOutput(File src,
                             File dst)
                      throws IOException
Deprecated. Call startLocalOutput(Path, Path) instead.

Throws:
IOException

startLocalOutput

public abstract Path startLocalOutput(Path fsOutputFile,
                                      Path tmpLocalFile)
                               throws IOException
Returns a local file that the user can write output to. The caller provides both the eventual FS target name and the local working file. If the FS is local, we write directly into the target. If the FS is remote, we write into the tmp local area.

Throws:
IOException

completeLocalOutput

public void completeLocalOutput(File src,
                                File dst)
                         throws IOException
Deprecated. Call completeLocalOutput(Path, Path) instead.

Throws:
IOException

completeLocalOutput

public abstract void completeLocalOutput(Path fsOutputFile,
                                         Path tmpLocalFile)
                                  throws IOException
Called when we're all done writing to the target. A local FS will do nothing, because we've written to exactly the right place. A remote FS will copy the contents of tmpLocalFile to the correct target at fsOutputFile.

Throws:
IOException
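
The startLocalOutput / completeLocalOutput pairing can be modelled with the JDK alone. The sketch below is a stand-alone illustration under stated assumptions: the class LocalOutputSketch is hypothetical, Path here is java.nio.file.Path rather than the Hadoop Path, and the "remote" target is simply another local file standing in for remote storage.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Stand-alone model of the two-step local output protocol (hypothetical, not Hadoop code). */
final class LocalOutputSketch {

    private final boolean local;

    LocalOutputSketch(boolean local) {
        this.local = local;
    }

    /** A local FS hands back the target itself; a remote FS hands back the tmp file. */
    Path startLocalOutput(Path fsOutputFile, Path tmpLocalFile) {
        return local ? fsOutputFile : tmpLocalFile;
    }

    /** A local FS does nothing; a remote FS copies the tmp file to the target. */
    void completeLocalOutput(Path fsOutputFile, Path tmpLocalFile) throws IOException {
        if (!local) {
            Files.copy(tmpLocalFile, fsOutputFile, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    /** Writes text through the protocol and reads it back from the target. */
    static String roundTrip(boolean local, String text) {
        try {
            Path dir = Files.createTempDirectory("fs-sketch");
            Path target = dir.resolve("target.txt");
            Path tmp = dir.resolve("tmp.txt");
            LocalOutputSketch fs = new LocalOutputSketch(local);
            Path out = fs.startLocalOutput(target, tmp);
            Files.write(out, text.getBytes(StandardCharsets.UTF_8));
            fs.completeLocalOutput(target, tmp);
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(true, "hello"));
        System.out.println(roundTrip(false, "hello"));
    }
}
```

The caller's code is the same in both cases: it writes to whatever startLocalOutput returns and always calls completeLocalOutput afterwards.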

close

public abstract void close()
                    throws IOException
No more filesystem operations are needed. Will release any held locks.

Throws:
IOException

reportChecksumFailure

public abstract void reportChecksumFailure(Path f,
                                           FSInputStream in,
                                           long start,
                                           long length,
                                           int crc)
Report a checksum error to the file system.

Parameters:
f - the file name containing the error
in - the stream open on the file
start - the position of the beginning of the bad data in the file
length - the length of the bad data in the file
crc - the expected CRC32 of the data
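
Because crc is an int while the JDK's java.util.zip.CRC32 reports its value as a long, a narrowing cast is needed to compare the two. The sketch below shows that conversion over a byte range. It is a stand-alone illustration: the class Crc32Sketch is hypothetical, and whether the filesystem computes its checksums with java.util.zip.CRC32 is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Stand-alone CRC32 helper (hypothetical, not Hadoop code). */
final class Crc32Sketch {

    /** CRC32 of data[offset .. offset+length), narrowed to the low 32 bits. */
    static int crc32(byte[] data, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, offset, length);
        return (int) crc.getValue();        // getValue() is an unsigned 32-bit value in a long
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Standard CRC-32 check input; prints cbf43926
        System.out.println(Integer.toHexString(crc32(data, 0, data.length)));
    }
}
```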

getBlockSize

public abstract long getBlockSize()
Return the number of bytes that large input files should optimally be split into to minimize i/o time.
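
The arithmetic of dividing a file into block-sized pieces can be sketched as follows. This is a stand-alone illustration, not how Hadoop computes its splits: the class SplitSketch and the method splits are hypothetical, and the block size would come from getBlockSize() and the file length from getLength(Path) in real use.

```java
/** Stand-alone block-split arithmetic (hypothetical, not Hadoop code). */
final class SplitSketch {

    /**
     * Divides a file of fileLength bytes into pieces of at most blockSize
     * bytes. Each row is {start, length}; the last piece may be shorter.
     */
    static long[][] splits(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("invalid length or block size");
        }
        long full = fileLength / blockSize;
        int count = (int) (full + (fileLength % blockSize == 0 ? 0 : 1));
        long[][] result = new long[count][2];
        for (int i = 0; i < count; i++) {
            long start = i * blockSize;
            result[i][0] = start;
            result[i][1] = Math.min(blockSize, fileLength - start);
        }
        return result;
    }

    public static void main(String[] args) {
        for (long[] s : splits(250, 100)) {
            System.out.println(s[0] + " +" + s[1]);
        }
    }
}
```

Each {start, length} pair has the same shape as the start and len arguments of getFileCacheHints, so a caller could ask for hints one piece at a time.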



Copyright © 2006 The Apache Software Foundation