org.apache.hadoop.hbase.util
Class FSUtils

java.lang.Object
  extended by org.apache.hadoop.hbase.util.FSUtils
Direct Known Subclasses:
FSHDFSUtils, FSMapRUtils

@InterfaceAudience.Private
public abstract class FSUtils
extends Object

Utility methods for interacting with the underlying file system.
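
For illustration, a minimal sketch of typical usage; the configuration is whatever HBaseConfiguration resolves on the classpath:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class FSUtilsExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // hbase.rootdir from the configuration, as a qualified Path.
        Path rootDir = FSUtils.getRootDir(conf);
        // The filesystem backing the HBase root directory.
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        System.out.println("HBase root " + rootDir + " on " + fs.getUri());
      }
    }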


Nested Class Summary
static class FSUtils.BlackListDirFilter
          Directory filter that doesn't include any of the directories in the specified blacklist
static class FSUtils.DirFilter
          A PathFilter that only allows directories.
static class FSUtils.FamilyDirFilter
          Filter for all dirs that are legal column family names.
static class FSUtils.HFileFilter
          Filter for HFiles that excludes reference files.
static class FSUtils.ReferenceFileFilter
           
static class FSUtils.RegionDirFilter
          Filter for all dirs that don't start with '.'
static class FSUtils.UserTableDirFilter
          A PathFilter that returns usertable directories.
 
Field Summary
static String FULL_RWX_PERMISSIONS
          Full access permissions (starting point for a umask)
static boolean WINDOWS
          Set to true on Windows platforms
 
Constructor Summary
protected FSUtils()
           
 
Method Summary
static void checkAccess(org.apache.hadoop.security.UserGroupInformation ugi, org.apache.hadoop.fs.FileStatus file, org.apache.hadoop.fs.permission.FsAction action)
          Throw an exception if an action is not permitted by a user on a file.
static boolean checkClusterIdExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, int wait)
          Checks that a cluster ID file exists in the HBase root directory
static void checkDfsSafeMode(org.apache.hadoop.conf.Configuration conf)
          Check whether dfs is in safemode.
static void checkFileSystemAvailable(org.apache.hadoop.fs.FileSystem fs)
          Checks to see if the specified file system is available
static void checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
          Check if short circuit read buffer size is set and if not, set it to hbase value.
static void checkVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, boolean message)
          Verifies current version of file system
static void checkVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, boolean message, int wait, int retries)
          Verifies current version of file system
static HDFSBlocksDistribution computeHDFSBlocksDistribution(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.FileStatus status, long start, long length)
          Compute HDFS blocks distribution of a given file, or a portion of the file
static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, InetSocketAddress[] favoredNodes)
          Create the specified file on the filesystem.
static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, org.apache.hadoop.fs.permission.FsPermission perm, boolean overwrite)
          Create the specified file on the filesystem.
static boolean delete(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path, boolean recursive)
          Calls fs.delete() and returns the value returned by fs.delete()
static boolean deleteDirectory(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
          Delete if exists.
static boolean deleteRegionDir(org.apache.hadoop.conf.Configuration conf, HRegionInfo hri)
          Delete the region directory if exists.
static ClusterId getClusterId(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
          Returns the value of the unique cluster ID stored for this HBase instance.
static org.apache.hadoop.fs.FileSystem getCurrentFileSystem(org.apache.hadoop.conf.Configuration conf)
           
static long getDefaultBlockSize(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
          Return the number of bytes that large input files should optimally be split into to minimize I/O time.
static int getDefaultBufferSize(org.apache.hadoop.fs.FileSystem fs)
          Returns the default buffer size to use during writes.
static short getDefaultReplication(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
           
static List<org.apache.hadoop.fs.Path> getFamilyDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path regionDir)
          Given a particular region dir, return all the familydirs inside it
static org.apache.hadoop.fs.permission.FsPermission getFileDefault()
          Get the default permission for file.
static org.apache.hadoop.fs.permission.FsPermission getFilePermissions(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, String permssionConfKey)
          Get the file permissions specified in the configuration, if they are enabled.
static FSUtils getInstance(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf)
           
static List<org.apache.hadoop.fs.Path> getLocalTableDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
           
static org.apache.hadoop.fs.Path getNamespaceDir(org.apache.hadoop.fs.Path rootdir, String namespace)
          Returns the Path object representing the namespace directory under path rootdir
static String getPath(org.apache.hadoop.fs.Path p)
          Return the 'path' component of a Path.
static List<org.apache.hadoop.fs.Path> getReferenceFilePaths(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path familyDir)
           
static Map<String,Map<String,Float>> getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf)
          Scans the root path of the file system to determine the degree of locality for each region on each of the servers holding at least one block of that region.
static Map<String,Map<String,Float>> getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf, String desiredTable, int threadPoolSize)
          Scans the root path of the file system to determine the degree of locality for each region on each of the servers holding at least one block of that region.
static List<org.apache.hadoop.fs.Path> getRegionDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path tableDir)
          Given a particular table dir, return all the regiondirs inside it, excluding files such as .tableinfo
static int getRegionReferenceFileCount(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p)
           
static org.apache.hadoop.fs.Path getRootDir(org.apache.hadoop.conf.Configuration c)
           
static org.apache.hadoop.fs.Path getTableDir(org.apache.hadoop.fs.Path rootdir, TableName tableName)
          Returns the Path object representing the table directory under path rootdir
static List<org.apache.hadoop.fs.Path> getTableDirs(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
           
static Map<String,Integer> getTableFragmentation(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir)
          Runs through the HBase rootdir and checks how many stores for each table have more than one file in them.
static Map<String,Integer> getTableFragmentation(HMaster master)
          Runs through the HBase rootdir and checks how many stores for each table have more than one file in them.
static TableName getTableName(org.apache.hadoop.fs.Path tablePath)
          Returns the TableName object representing the table directory under path rootdir
static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir)
          Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, HBaseFsck.ErrorReporter errors)
          Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> map, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName)
          Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> map, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir, TableName tableName, HBaseFsck.ErrorReporter errors)
          Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
static int getTotalTableFragmentation(HMaster master)
          Returns the total overall fragmentation percentage.
static String getVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
          Verifies current version of file system
static boolean isAppendSupported(org.apache.hadoop.conf.Configuration conf)
          Heuristic to determine whether it is safe to open a file for append: looks for dfs.support.append and uses reflection to search for SequenceFile.Writer.syncFs() or FSDataOutputStream.hflush()
static boolean isExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path path)
          Calls fs.exists().
static boolean isHDFS(org.apache.hadoop.conf.Configuration conf)
           
static boolean isMajorCompacted(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path hbaseRootDir)
          Runs through the hbase rootdir and checks all stores have only one file in them -- that is, they've been major compacted.
static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, org.apache.hadoop.fs.Path pathTail)
          Compare path component of the Path URI; e.g.
static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch, String pathTail)
          Compare path component of the Path URI; e.g.
static boolean isRecoveredEdits(org.apache.hadoop.fs.Path path)
          Checks if the given path is the one with 'recovered.edits' dir.
static boolean isStartingWithPath(org.apache.hadoop.fs.Path rootPath, String path)
          Compares the path component only.
static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir)
          Calls fs.listStatus() and treats FileNotFoundException as non-fatal; this accommodates differences between Hadoop versions.
static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path dir, org.apache.hadoop.fs.PathFilter filter)
          Calls fs.listStatus() and treats FileNotFoundException as non-fatal; this accommodates differences between Hadoop versions, where Hadoop 1 does not throw a FileNotFoundException and instead returns an empty FileStatus[], while Hadoop 2 throws FileNotFoundException.
static void logFileSystemState(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path root, org.apache.commons.logging.Log LOG)
          Log the current state of the filesystem from a certain root directory
static boolean metaRegionExists(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
          Checks if meta region exists
abstract  void recoverFileLease(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path p, org.apache.hadoop.conf.Configuration conf, CancelableProgressable reporter)
          Recover file lease.
static String removeRootPath(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration conf)
          Checks for the presence of the root path (using the provided conf object) in the given path.
static boolean renameAndSetModifyTime(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path src, org.apache.hadoop.fs.Path dest)
           
static void setClusterId(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, ClusterId clusterId, int wait)
          Writes a new unique identifier for this cluster to the "hbase.id" file in the HBase root directory
static void setFsDefault(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
           
static void setRootDir(org.apache.hadoop.conf.Configuration c, org.apache.hadoop.fs.Path root)
           
static void setupShortCircuitRead(org.apache.hadoop.conf.Configuration conf)
          Do our short circuit read setup.
static void setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir)
          Sets version of file system
static void setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, int wait, int retries)
          Sets version of file system
static void setVersion(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path rootdir, String version, int wait, int retries)
          Sets version of file system
static org.apache.hadoop.fs.Path validateRootPath(org.apache.hadoop.fs.Path root)
          Verifies root directory path is a valid URI with a scheme
static void waitOnSafeMode(org.apache.hadoop.conf.Configuration conf, long wait)
          If DFS, check safe mode and if so, wait until we clear it.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

FULL_RWX_PERMISSIONS

public static final String FULL_RWX_PERMISSIONS
Full access permissions (starting point for a umask)

See Also:
Constant Field Values

WINDOWS

public static final boolean WINDOWS
Set to true on Windows platforms

Constructor Detail

FSUtils

protected FSUtils()
Method Detail

isStartingWithPath

public static boolean isStartingWithPath(org.apache.hadoop.fs.Path rootPath,
                                         String path)
Compares the path component only. Does not consider the scheme; i.e. if the schemes differ but the path starts with rootPath, the function returns true

Parameters:
rootPath -
path -
Returns:
True if path starts with rootPath

isMatchingTail

public static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch,
                                     String pathTail)
Compare the path component of the Path URI; e.g. given hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part. Does not consider the scheme; i.e. if the schemes differ but the path or a subpath matches, the two will equate.

Parameters:
pathToSearch - Path we will be trying to match.
pathTail -
Returns:
True if pathTail is a tail of the path of pathToSearch

isMatchingTail

public static boolean isMatchingTail(org.apache.hadoop.fs.Path pathToSearch,
                                     org.apache.hadoop.fs.Path pathTail)
Compare the path component of the Path URI; e.g. given hdfs://a/b/c and /a/b/c, it will compare the '/a/b/c' part. If you passed in hdfs://a/b/c and b/c, it would return true. Does not consider the scheme; i.e. if the schemes differ but the path or a subpath matches, the two will equate.

Parameters:
pathToSearch - Path we will be trying to match.
pathTail -
Returns:
True if pathTail is a tail of the path of pathToSearch
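
A sketch of the matching semantics described above; the paths are illustrative and mirror the examples in the text:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class PathMatchExample {
      public static void main(String[] args) {
        Path search = new Path("hdfs://a/b/c");
        // Tail comparison ignores the scheme:
        System.out.println(FSUtils.isMatchingTail(search, new Path("/a/b/c"))); // true
        System.out.println(FSUtils.isMatchingTail(search, "b/c"));              // true
        // Prefix comparison is likewise scheme-insensitive; a path
        // trivially starts with itself:
        System.out.println(FSUtils.isStartingWithPath(search, search.toString())); // true
      }
    }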

getInstance

public static FSUtils getInstance(org.apache.hadoop.fs.FileSystem fs,
                                  org.apache.hadoop.conf.Configuration conf)

deleteDirectory

public static boolean deleteDirectory(org.apache.hadoop.fs.FileSystem fs,
                                      org.apache.hadoop.fs.Path dir)
                               throws IOException
Delete if exists.

Parameters:
fs - filesystem object
dir - directory to delete
Returns:
True if deleted dir
Throws:
IOException - e
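
A minimal sketch; the directory path is hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class DeleteDirExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        Path tmp = new Path("/tmp/fsutils-demo"); // hypothetical directory
        // True only if the directory existed and was removed.
        boolean deleted = FSUtils.deleteDirectory(fs, tmp);
        System.out.println("deleted: " + deleted);
      }
    }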

deleteRegionDir

public static boolean deleteRegionDir(org.apache.hadoop.conf.Configuration conf,
                                      HRegionInfo hri)
                               throws IOException
Delete the region directory if exists.

Parameters:
conf -
hri -
Returns:
True if deleted the region directory.
Throws:
IOException

getDefaultBlockSize

public static long getDefaultBlockSize(org.apache.hadoop.fs.FileSystem fs,
                                       org.apache.hadoop.fs.Path path)
                                throws IOException
Return the number of bytes that large input files should optimally be split into to minimize I/O time. Uses reflection to search for getDefaultBlockSize(Path f); if that method doesn't exist, falls back to getDefaultBlockSize()

Parameters:
fs - filesystem object
Returns:
the default block size for the path's filesystem
Throws:
IOException - e

getDefaultReplication

public static short getDefaultReplication(org.apache.hadoop.fs.FileSystem fs,
                                          org.apache.hadoop.fs.Path path)
                                   throws IOException
Throws:
IOException

getDefaultBufferSize

public static int getDefaultBufferSize(org.apache.hadoop.fs.FileSystem fs)
Returns the default buffer size to use during writes. The size of the buffer should probably be a multiple of hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations.

Parameters:
fs - filesystem object
Returns:
default buffer size to use during writes

create

public static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.conf.Configuration conf,
                                                             org.apache.hadoop.fs.FileSystem fs,
                                                             org.apache.hadoop.fs.Path path,
                                                             org.apache.hadoop.fs.permission.FsPermission perm,
                                                             InetSocketAddress[] favoredNodes)
                                                      throws IOException
Create the specified file on the filesystem. By default, this will:
  1. overwrite the file if it exists
  2. apply the umask in the configuration (if it is enabled)
  3. use the fs configured buffer size (or 4096 if not set)
  4. use the configured column family replication or default replication if HColumnDescriptor.DEFAULT_DFS_REPLICATION
  5. use the default block size
  6. not track progress

Parameters:
conf - configurations
fs - FileSystem on which to write the file
path - Path to the file to write
perm - permissions
favoredNodes -
Returns:
output stream to the created file
Throws:
IOException - if the file cannot be created

create

public static org.apache.hadoop.fs.FSDataOutputStream create(org.apache.hadoop.fs.FileSystem fs,
                                                             org.apache.hadoop.fs.Path path,
                                                             org.apache.hadoop.fs.permission.FsPermission perm,
                                                             boolean overwrite)
                                                      throws IOException
Create the specified file on the filesystem. By default, this will:
  1. apply the umask in the configuration (if it is enabled)
  2. use the fs configured buffer size (or 4096 if not set)
  3. use the default replication
  4. use the default block size
  5. not track progress

Parameters:
fs - FileSystem on which to write the file
path - Path to the file to write
perm -
overwrite - Whether to overwrite the file if it already exists.
Returns:
output stream to the created file
Throws:
IOException - if the file cannot be created

getFilePermissions

public static org.apache.hadoop.fs.permission.FsPermission getFilePermissions(org.apache.hadoop.fs.FileSystem fs,
                                                                              org.apache.hadoop.conf.Configuration conf,
                                                                              String permssionConfKey)
Get the file permissions specified in the configuration, if they are enabled.

Parameters:
fs - filesystem that the file will be created on.
conf - configuration to read for determining if permissions are enabled and which to use
permssionConfKey - property key in the configuration to use when finding the permission
Returns:
the permission to use when creating a new file on the fs. If special permissions are not specified in the configuration, then the default permissions on the fs will be returned.
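
A sketch combining getFilePermissions with the create(FileSystem, Path, FsPermission, boolean) overload above; the configuration key and target path are illustrative assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class CreateExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        // "hbase.data.umask" is an assumed key; use whichever key your config defines.
        FsPermission perm = FSUtils.getFilePermissions(fs, conf, "hbase.data.umask");
        Path p = new Path("/tmp/fsutils-demo/file"); // hypothetical path
        FSDataOutputStream out = FSUtils.create(fs, p, perm, true /* overwrite */);
        try {
          out.writeUTF("hello");
        } finally {
          out.close();
        }
      }
    }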

getFileDefault

public static org.apache.hadoop.fs.permission.FsPermission getFileDefault()
Get the default permission for a file. This is the same as FsPermission.getFileDefault() in Hadoop 2; we provide the method here for compatibility with Hadoop 1. See HBASE-11061. This would be better done as an interface in hadoop-compat with hadoop1 and hadoop2 implementations, but we are punting on that since there is only a small risk this will change in the 0.96/0.98 timeframe (it is only committed to those branches).


checkFileSystemAvailable

public static void checkFileSystemAvailable(org.apache.hadoop.fs.FileSystem fs)
                                     throws IOException
Checks to see if the specified file system is available

Parameters:
fs - filesystem
Throws:
IOException - e

checkDfsSafeMode

public static void checkDfsSafeMode(org.apache.hadoop.conf.Configuration conf)
                             throws IOException
Check whether dfs is in safemode.

Parameters:
conf -
Throws:
IOException

getVersion

public static String getVersion(org.apache.hadoop.fs.FileSystem fs,
                                org.apache.hadoop.fs.Path rootdir)
                         throws IOException,
                                DeserializationException
Verifies current version of file system

Parameters:
fs - filesystem object
rootdir - root hbase directory
Returns:
null if no version file exists, version string otherwise.
Throws:
IOException - e
DeserializationException

checkVersion

public static void checkVersion(org.apache.hadoop.fs.FileSystem fs,
                                org.apache.hadoop.fs.Path rootdir,
                                boolean message)
                         throws IOException,
                                DeserializationException
Verifies current version of file system

Parameters:
fs - file system
rootdir - root directory of HBase installation
message - if true, issues a message on System.out
Throws:
IOException - e
DeserializationException

checkVersion

public static void checkVersion(org.apache.hadoop.fs.FileSystem fs,
                                org.apache.hadoop.fs.Path rootdir,
                                boolean message,
                                int wait,
                                int retries)
                         throws IOException,
                                DeserializationException
Verifies current version of file system

Parameters:
fs - file system
rootdir - root directory of HBase installation
message - if true, issues a message on System.out
wait - wait interval
retries - number of times to retry
Throws:
IOException - e
DeserializationException

setVersion

public static void setVersion(org.apache.hadoop.fs.FileSystem fs,
                              org.apache.hadoop.fs.Path rootdir)
                       throws IOException
Sets version of file system

Parameters:
fs - filesystem object
rootdir - hbase root
Throws:
IOException - e

setVersion

public static void setVersion(org.apache.hadoop.fs.FileSystem fs,
                              org.apache.hadoop.fs.Path rootdir,
                              int wait,
                              int retries)
                       throws IOException
Sets version of file system

Parameters:
fs - filesystem object
rootdir - hbase root
wait - time to wait for retry
retries - number of times to retry before failing
Throws:
IOException - e

setVersion

public static void setVersion(org.apache.hadoop.fs.FileSystem fs,
                              org.apache.hadoop.fs.Path rootdir,
                              String version,
                              int wait,
                              int retries)
                       throws IOException
Sets version of file system

Parameters:
fs - filesystem object
rootdir - hbase root directory
version - version to set
wait - time to wait for retry
retries - number of times to retry before throwing an IOException
Throws:
IOException - e
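
A sketch of a typical bootstrap flow using the version methods above; the wait and retry values are arbitrary:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class VersionExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        Path rootDir = FSUtils.getRootDir(conf);
        // getVersion() returns null when no version file exists yet.
        if (FSUtils.getVersion(fs, rootDir) == null) {
          FSUtils.setVersion(fs, rootDir);
        }
        // Verify, waiting 10s between up to 3 retries; message on System.out.
        FSUtils.checkVersion(fs, rootDir, true, 10 * 1000, 3);
      }
    }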

checkClusterIdExists

public static boolean checkClusterIdExists(org.apache.hadoop.fs.FileSystem fs,
                                           org.apache.hadoop.fs.Path rootdir,
                                           int wait)
                                    throws IOException
Checks that a cluster ID file exists in the HBase root directory

Parameters:
fs - the root directory FileSystem
rootdir - the HBase root directory in HDFS
wait - how long to wait between retries
Returns:
true if the file exists, otherwise false
Throws:
IOException - if checking the FileSystem fails

getClusterId

public static ClusterId getClusterId(org.apache.hadoop.fs.FileSystem fs,
                                     org.apache.hadoop.fs.Path rootdir)
                              throws IOException
Returns the value of the unique cluster ID stored for this HBase instance.

Parameters:
fs - the root directory FileSystem
rootdir - the path to the HBase root directory
Returns:
the unique cluster identifier
Throws:
IOException - if reading the cluster ID file fails

setClusterId

public static void setClusterId(org.apache.hadoop.fs.FileSystem fs,
                                org.apache.hadoop.fs.Path rootdir,
                                ClusterId clusterId,
                                int wait)
                         throws IOException
Writes a new unique identifier for this cluster to the "hbase.id" file in the HBase root directory

Parameters:
fs - the root directory FileSystem
rootdir - the path to the HBase root directory
clusterId - the unique identifier to store
wait - how long (in milliseconds) to wait between retries
Throws:
IOException - if writing to the FileSystem fails and no wait value is set for retrying
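
A sketch of the cluster-ID round trip; it assumes ClusterId's no-argument constructor generates a fresh identifier, and the 10s wait is arbitrary:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.ClusterId;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class ClusterIdExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        Path rootDir = FSUtils.getRootDir(conf);
        // Publish an identifier only if the hbase.id file is not present yet.
        if (!FSUtils.checkClusterIdExists(fs, rootDir, 10 * 1000)) {
          FSUtils.setClusterId(fs, rootDir, new ClusterId(), 10 * 1000);
        }
        System.out.println("cluster id: " + FSUtils.getClusterId(fs, rootDir));
      }
    }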

validateRootPath

public static org.apache.hadoop.fs.Path validateRootPath(org.apache.hadoop.fs.Path root)
                                                  throws IOException
Verifies root directory path is a valid URI with a scheme

Parameters:
root - root directory path
Returns:
Passed root argument.
Throws:
IOException - if not a valid URI with a scheme

removeRootPath

public static String removeRootPath(org.apache.hadoop.fs.Path path,
                                    org.apache.hadoop.conf.Configuration conf)
                             throws IOException
Checks for the presence of the root path (using the provided conf object) in the given path. If it exists, this method removes it and returns the String representation of the remaining relative path.

Parameters:
path -
conf -
Returns:
String representation of the remaining relative path
Throws:
IOException

waitOnSafeMode

public static void waitOnSafeMode(org.apache.hadoop.conf.Configuration conf,
                                  long wait)
                           throws IOException
If DFS, check safe mode and if so, wait until we clear it.

Parameters:
conf - configuration
wait - Sleep between retries
Throws:
IOException - e

getPath

public static String getPath(org.apache.hadoop.fs.Path p)
Return the 'path' component of a Path. In Hadoop, a Path is a URI. This method returns the 'path' component of a Path's URI: e.g. if a Path is hdfs://example.org:9000/hbase_trunk/TestTable/compaction.dir, this method returns /hbase_trunk/TestTable/compaction.dir. This is useful if you want to print out a Path without the qualifying FileSystem instance.

Parameters:
p - Filesystem Path whose 'path' component we are to return.
Returns:
The 'path' portion of the passed-in Path's URI
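
For example, mirroring the path from the description:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class GetPathExample {
      public static void main(String[] args) {
        Path p = new Path("hdfs://example.org:9000/hbase_trunk/TestTable/compaction.dir");
        // Prints /hbase_trunk/TestTable/compaction.dir -- scheme and authority dropped.
        System.out.println(FSUtils.getPath(p));
      }
    }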

getRootDir

public static org.apache.hadoop.fs.Path getRootDir(org.apache.hadoop.conf.Configuration c)
                                            throws IOException
Parameters:
c - configuration
Returns:
Path to hbase root directory: i.e. hbase.rootdir from configuration as a qualified Path.
Throws:
IOException - e

setRootDir

public static void setRootDir(org.apache.hadoop.conf.Configuration c,
                              org.apache.hadoop.fs.Path root)
                       throws IOException
Throws:
IOException

setFsDefault

public static void setFsDefault(org.apache.hadoop.conf.Configuration c,
                                org.apache.hadoop.fs.Path root)
                         throws IOException
Throws:
IOException

metaRegionExists

public static boolean metaRegionExists(org.apache.hadoop.fs.FileSystem fs,
                                       org.apache.hadoop.fs.Path rootdir)
                                throws IOException
Checks if meta region exists

Parameters:
fs - file system
rootdir - root directory of HBase installation
Returns:
true if exists
Throws:
IOException - e

computeHDFSBlocksDistribution

public static HDFSBlocksDistribution computeHDFSBlocksDistribution(org.apache.hadoop.fs.FileSystem fs,
                                                                   org.apache.hadoop.fs.FileStatus status,
                                                                   long start,
                                                                   long length)
                                                            throws IOException
Compute HDFS blocks distribution of a given file, or a portion of the file

Parameters:
fs - file system
status - file status of the file
start - start position of the portion
length - length of the portion
Returns:
The HDFS blocks distribution
Throws:
IOException
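
A sketch computing the distribution over a whole file; the file path is hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HDFSBlocksDistribution;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class BlockDistributionExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        FileStatus status = fs.getFileStatus(new Path("/tmp/fsutils-demo/file"));
        // Start at offset 0 and span the full length to cover the whole file.
        HDFSBlocksDistribution dist =
            FSUtils.computeHDFSBlocksDistribution(fs, status, 0, status.getLen());
        System.out.println(dist);
      }
    }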

isMajorCompacted

public static boolean isMajorCompacted(org.apache.hadoop.fs.FileSystem fs,
                                       org.apache.hadoop.fs.Path hbaseRootDir)
                                throws IOException
Runs through the hbase rootdir and checks all stores have only one file in them -- that is, they've been major compacted. Looks at root and meta tables too.

Parameters:
fs - filesystem
hbaseRootDir - hbase root directory
Returns:
True if this hbase install is major compacted.
Throws:
IOException - e

getTotalTableFragmentation

public static int getTotalTableFragmentation(HMaster master)
                                      throws IOException
Returns the total overall fragmentation percentage. Includes hbase:meta and -ROOT- as well.

Parameters:
master - The master defining the HBase root and file system.
Returns:
The overall fragmentation percentage.
Throws:
IOException - When scanning the directory fails.

getTableFragmentation

public static Map<String,Integer> getTableFragmentation(HMaster master)
                                                 throws IOException
Runs through the HBase rootdir and checks how many stores for each table have more than one file in them. Checks -ROOT- and hbase:meta too. The total percentage across all tables is stored under the special key "-TOTAL-".

Parameters:
master - The master defining the HBase root and file system.
Returns:
A map for each table and its percentage.
Throws:
IOException - When scanning the directory fails.

getTableFragmentation

public static Map<String,Integer> getTableFragmentation(org.apache.hadoop.fs.FileSystem fs,
                                                        org.apache.hadoop.fs.Path hbaseRootDir)
                                                 throws IOException
Runs through the HBase rootdir and checks how many stores for each table have more than one file in them. Checks -ROOT- and hbase:meta too. The total percentage across all tables is stored under the special key "-TOTAL-".

Parameters:
fs - The file system to use.
hbaseRootDir - The root directory to scan.
Returns:
A map for each table and its percentage.
Throws:
IOException - When scanning the directory fails.

getTableDir

public static org.apache.hadoop.fs.Path getTableDir(org.apache.hadoop.fs.Path rootdir,
                                                    TableName tableName)
Returns the Path object representing the table directory under path rootdir

Parameters:
rootdir - qualified path of HBase root directory
tableName - name of table
Returns:
Path for table

getTableName

public static TableName getTableName(org.apache.hadoop.fs.Path tablePath)
Returns the TableName object representing the table directory under path rootdir

Parameters:
tablePath - path of table
Returns:
TableName for the table

getNamespaceDir

public static org.apache.hadoop.fs.Path getNamespaceDir(org.apache.hadoop.fs.Path rootdir,
                                                        String namespace)
Returns the Path object representing the namespace directory under path rootdir

Parameters:
rootdir - qualified path of HBase root directory
namespace - namespace name
Returns:
Path for the namespace directory
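
These path helpers are pure path arithmetic, so they can be exercised without a running cluster; the root directory and names below are illustrative:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class LayoutExample {
      public static void main(String[] args) {
        Path rootDir = new Path("hdfs://nn:8020/hbase"); // hypothetical root
        Path nsDir = FSUtils.getNamespaceDir(rootDir, "ns1");
        Path tableDir = FSUtils.getTableDir(rootDir, TableName.valueOf("ns1", "demo"));
        // Recover the TableName back from a table directory path.
        TableName tn = FSUtils.getTableName(tableDir);
        System.out.println(nsDir + " | " + tableDir + " | " + tn);
      }
    }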

isAppendSupported

public static boolean isAppendSupported(org.apache.hadoop.conf.Configuration conf)
Heuristic to determine whether it is safe to open a file for append: looks for dfs.support.append and uses reflection to search for SequenceFile.Writer.syncFs() or FSDataOutputStream.hflush()

Parameters:
conf -
Returns:
True if append support

isHDFS

public static boolean isHDFS(org.apache.hadoop.conf.Configuration conf)
                      throws IOException
Parameters:
conf -
Returns:
True if the filesystem's scheme is 'hdfs'.
Throws:
IOException

recoverFileLease

public abstract void recoverFileLease(org.apache.hadoop.fs.FileSystem fs,
                                      org.apache.hadoop.fs.Path p,
                                      org.apache.hadoop.conf.Configuration conf,
                                      CancelableProgressable reporter)
                               throws IOException
Recover file lease. Used when a file is suspected to have been left open by another process.

Parameters:
fs - FileSystem handle
p - Path of file to recover lease
conf - Configuration handle
Throws:
IOException
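
A sketch of recovering a lease via the filesystem-specific instance; the WAL path is hypothetical, and passing null for the reporter is assumed to be acceptable when no progress reporting is needed:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class LeaseRecoveryExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        Path suspect = new Path("/hbase/WALs/server,16020,1/wal.1"); // hypothetical
        // getInstance() returns the subclass matching fs (e.g. FSHDFSUtils for HDFS).
        FSUtils.getInstance(fs, conf).recoverFileLease(fs, suspect, conf, null);
      }
    }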

getTableDirs

public static List<org.apache.hadoop.fs.Path> getTableDirs(org.apache.hadoop.fs.FileSystem fs,
                                                           org.apache.hadoop.fs.Path rootdir)
                                                    throws IOException
Throws:
IOException

getLocalTableDirs

public static List<org.apache.hadoop.fs.Path> getLocalTableDirs(org.apache.hadoop.fs.FileSystem fs,
                                                                org.apache.hadoop.fs.Path rootdir)
                                                         throws IOException
Parameters:
fs -
rootdir -
Returns:
All the table directories under rootdir. Ignores non-table HBase folders such as .logs, .oldlogs, and .corrupt.
Throws:
IOException

isRecoveredEdits

public static boolean isRecoveredEdits(org.apache.hadoop.fs.Path path)
Checks if the given path is the one with 'recovered.edits' dir.

Parameters:
path -
Returns:
True if the given path is within a 'recovered.edits' directory

getRegionDirs

public static List<org.apache.hadoop.fs.Path> getRegionDirs(org.apache.hadoop.fs.FileSystem fs,
                                                            org.apache.hadoop.fs.Path tableDir)
                                                     throws IOException
Given a particular table dir, return all the regiondirs inside it, excluding files such as .tableinfo

Parameters:
fs - A file system for the Path
tableDir - Path to a specific table directory
Returns:
List of paths to valid region directories in table dir.
Throws:
IOException

getFamilyDirs

public static List<org.apache.hadoop.fs.Path> getFamilyDirs(org.apache.hadoop.fs.FileSystem fs,
                                                            org.apache.hadoop.fs.Path regionDir)
                                                     throws IOException
Given a particular region dir, return all the familydirs inside it

Parameters:
fs - A file system for the Path
regionDir - Path to a specific region directory
Returns:
List of paths to valid family directories in region dir.
Throws:
IOException

getReferenceFilePaths

public static List<org.apache.hadoop.fs.Path> getReferenceFilePaths(org.apache.hadoop.fs.FileSystem fs,
                                                                    org.apache.hadoop.fs.Path familyDir)
                                                             throws IOException
Throws:
IOException

getCurrentFileSystem

public static org.apache.hadoop.fs.FileSystem getCurrentFileSystem(org.apache.hadoop.conf.Configuration conf)
                                                            throws IOException
Parameters:
conf -
Returns:
Returns the filesystem of the hbase rootdir.
Throws:
IOException

getTableStoreFilePathMap

public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> map,
                                                                             org.apache.hadoop.fs.FileSystem fs,
                                                                             org.apache.hadoop.fs.Path hbaseRootDir,
                                                                             TableName tableName)
                                                                      throws IOException
Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744

Parameters:
map - map to add values. If null, this method will create and populate one to return
fs - The file system to use.
hbaseRootDir - The root directory to scan.
tableName - name of the table to scan.
Returns:
Map keyed by StoreFile name with a value of the full Path.
Throws:
IOException - When scanning the directory fails.
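
A sketch of building and walking the map for one table; the table name is hypothetical:

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class StoreFileMapExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        Path rootDir = FSUtils.getRootDir(conf);
        // Passing null for the map asks the method to allocate one.
        Map<String, Path> storeFiles = FSUtils.getTableStoreFilePathMap(
            null, fs, rootDir, TableName.valueOf("demo"));
        for (Map.Entry<String, Path> e : storeFiles.entrySet()) {
          System.out.println(e.getKey() + " -> " + e.getValue());
        }
      }
    }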

getTableStoreFilePathMap

public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(Map<String,org.apache.hadoop.fs.Path> map,
                                                                             org.apache.hadoop.fs.FileSystem fs,
                                                                             org.apache.hadoop.fs.Path hbaseRootDir,
                                                                             TableName tableName,
                                                                             HBaseFsck.ErrorReporter errors)
                                                                      throws IOException
Runs through the HBase rootdir/tablename and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744

Parameters:
map - map to add values. If null, this method will create and populate one to return
fs - The file system to use.
hbaseRootDir - The root directory to scan.
tableName - name of the table to scan.
errors - ErrorReporter instance or null
Returns:
Map keyed by StoreFile name with a value of the full Path.
Throws:
IOException - When scanning the directory fails.

getRegionReferenceFileCount

public static int getRegionReferenceFileCount(org.apache.hadoop.fs.FileSystem fs,
                                              org.apache.hadoop.fs.Path p)

getTableStoreFilePathMap

public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs,
                                                                             org.apache.hadoop.fs.Path hbaseRootDir)
                                                                      throws IOException
Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744

Parameters:
fs - The file system to use.
hbaseRootDir - The root directory to scan.
Returns:
Map keyed by StoreFile name with a value of the full Path.
Throws:
IOException - When scanning the directory fails.

getTableStoreFilePathMap

public static Map<String,org.apache.hadoop.fs.Path> getTableStoreFilePathMap(org.apache.hadoop.fs.FileSystem fs,
                                                                             org.apache.hadoop.fs.Path hbaseRootDir,
                                                                             HBaseFsck.ErrorReporter errors)
                                                                      throws IOException
Runs through the HBase rootdir and creates a reverse lookup map for table StoreFile names to the full Path.
Example...
Key = 3944417774205889744
Value = hdfs://localhost:51169/user/userid/-ROOT-/70236052/info/3944417774205889744

Parameters:
fs - The file system to use.
hbaseRootDir - The root directory to scan.
errors - ErrorReporter instance or null
Returns:
Map keyed by StoreFile name with a value of the full Path.
Throws:
IOException - When scanning the directory fails.

listStatus

public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs,
                                                           org.apache.hadoop.fs.Path dir,
                                                           org.apache.hadoop.fs.PathFilter filter)
                                                    throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between Hadoop versions, where Hadoop 1 does not throw a FileNotFoundException and instead returns an empty FileStatus[], while Hadoop 2 throws FileNotFoundException.

Parameters:
fs - file system
dir - directory
filter - path filter
Returns:
null if dir is empty or doesn't exist, otherwise FileStatus array
Throws:
IOException
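
A sketch of the null-safe contract, using one of the nested filters; it assumes FSUtils.DirFilter exposes a FileSystem-argument constructor:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class ListStatusExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        Path dir = FSUtils.getRootDir(conf);
        // Unlike raw fs.listStatus(), a missing or empty dir yields null, not an exception.
        FileStatus[] statuses = FSUtils.listStatus(fs, dir, new FSUtils.DirFilter(fs));
        if (statuses == null) {
          System.out.println("empty or missing: " + dir);
        } else {
          for (FileStatus s : statuses) {
            System.out.println(s.getPath());
          }
        }
      }
    }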

listStatus

public static org.apache.hadoop.fs.FileStatus[] listStatus(org.apache.hadoop.fs.FileSystem fs,
                                                           org.apache.hadoop.fs.Path dir)
                                                    throws IOException
Calls fs.listStatus() and treats FileNotFoundException as non-fatal. This accommodates differences between Hadoop versions.

Parameters:
fs - file system
dir - directory
Returns:
null if dir is empty or doesn't exist, otherwise FileStatus array
Throws:
IOException

delete

public static boolean delete(org.apache.hadoop.fs.FileSystem fs,
                             org.apache.hadoop.fs.Path path,
                             boolean recursive)
                      throws IOException
Calls fs.delete() and returns the value returned by fs.delete()

Parameters:
fs -
path -
recursive -
Returns:
the value returned by fs.delete()
Throws:
IOException

isExists

public static boolean isExists(org.apache.hadoop.fs.FileSystem fs,
                               org.apache.hadoop.fs.Path path)
                        throws IOException
Calls fs.exists(). Checks if the specified path exists

Parameters:
fs -
path -
Returns:
the value returned by fs.exists()
Throws:
IOException

checkAccess

public static void checkAccess(org.apache.hadoop.security.UserGroupInformation ugi,
                               org.apache.hadoop.fs.FileStatus file,
                               org.apache.hadoop.fs.permission.FsAction action)
                        throws AccessDeniedException
Throw an exception if an action is not permitted by a user on a file.

Parameters:
ugi - the user
file - the file
action - the action
Throws:
AccessDeniedException

logFileSystemState

public static void logFileSystemState(org.apache.hadoop.fs.FileSystem fs,
                                      org.apache.hadoop.fs.Path root,
                                      org.apache.commons.logging.Log LOG)
                               throws IOException
Log the current state of the filesystem from a certain root directory

Parameters:
fs - filesystem to investigate
root - root file/directory to start logging from
LOG - log to output information
Throws:
IOException - if an unexpected exception occurs
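
A minimal sketch, handy in tests, using a commons-logging Log:

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class LogStateExample {
      private static final Log LOG = LogFactory.getLog(LogStateExample.class);

      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        FileSystem fs = FSUtils.getCurrentFileSystem(conf);
        // Writes a recursive listing rooted at hbase.rootdir to LOG.
        FSUtils.logFileSystemState(fs, FSUtils.getRootDir(conf), LOG);
      }
    }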

renameAndSetModifyTime

public static boolean renameAndSetModifyTime(org.apache.hadoop.fs.FileSystem fs,
                                             org.apache.hadoop.fs.Path src,
                                             org.apache.hadoop.fs.Path dest)
                                      throws IOException
Throws:
IOException

getRegionDegreeLocalityMappingFromFS

public static Map<String,Map<String,Float>> getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf)
                                                                          throws IOException
Scans the root path of the file system to determine the degree of locality for each region on each of the servers holding at least one block of that region. This is used by the tool RegionPlacementMaintainer

Parameters:
conf - the configuration to use
Returns:
the mapping from region encoded name to a map of server names to locality fraction
Throws:
IOException - in case of file system errors or interrupts

getRegionDegreeLocalityMappingFromFS

public static Map<String,Map<String,Float>> getRegionDegreeLocalityMappingFromFS(org.apache.hadoop.conf.Configuration conf,
                                                                                 String desiredTable,
                                                                                 int threadPoolSize)
                                                                          throws IOException
Scans the root path of the file system to determine the degree of locality for each region on each of the servers holding at least one block of that region.

Parameters:
conf - the configuration to use
desiredTable - the table you wish to scan locality for
threadPoolSize - the thread pool size to use
Returns:
the mapping from region encoded name to a map of server names to locality fraction
Throws:
IOException - in case of file system errors or interrupts
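
A sketch scanning a single table's locality; the table name and thread pool size are illustrative:

    import java.util.Map;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.util.FSUtils;

    public class LocalityExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        Map<String, Map<String, Float>> locality =
            FSUtils.getRegionDegreeLocalityMappingFromFS(conf, "demo", 8);
        // region encoded name -> (server name -> locality fraction)
        for (Map.Entry<String, Map<String, Float>> e : locality.entrySet()) {
          System.out.println(e.getKey() + " -> " + e.getValue());
        }
      }
    }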

setupShortCircuitRead

public static void setupShortCircuitRead(org.apache.hadoop.conf.Configuration conf)
Do our short-circuit read setup: checks which buffer size to use and whether to do checksumming in HBase or HDFS.

Parameters:
conf -

checkShortCircuitReadBufferSize

public static void checkShortCircuitReadBufferSize(org.apache.hadoop.conf.Configuration conf)
Check if short circuit read buffer size is set and if not, set it to hbase value.

Parameters:
conf -


Copyright © 2007–2016 The Apache Software Foundation. All rights reserved.