org.apache.hadoop.hbase.util.hbck
Class HFileCorruptionChecker

java.lang.Object
  extended by org.apache.hadoop.hbase.util.hbck.HFileCorruptionChecker

public class HFileCorruptionChecker
extends Object

This class walks through all of the hfiles in each region and verifies that each one is a valid file. To use it, instantiate the class, call checkTables(Collection), and then retrieve the corrupted hfiles (and the quarantined files, if quarantine mode is enabled). The implementation currently parallelizes at the regionDir level.
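
The walk described above (table dir, then region dir, then column family dir, then file, with one task per region dir) can be illustrated with a minimal, standard-library-only sketch. It is not the HBase implementation. The class name RegionParallelWalkSketch is hypothetical, it uses java.nio.file rather than the Hadoop FileSystem API, and its validity test (an empty file counts as corrupt) is an assumed stand-in for actually opening an hfile. Quarantining is omitted. The method names mirror the ones documented on this page only to make the mapping easy to follow.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in (not the HBase class): one executor task per region directory. */
final class RegionParallelWalkSketch {
    private final ExecutorService executor;
    // Thread-safe collections, because several region tasks record results concurrently.
    private final Collection<Path> corrupted = new ConcurrentLinkedQueue<>();
    private final Collection<Path> missing = new ConcurrentLinkedQueue<>();
    private final AtomicInteger filesChecked = new AtomicInteger();

    RegionParallelWalkSketch(ExecutorService executor) {
        this.executor = executor;
    }

    /** Assumed validity test: an empty file is treated as corrupt. */
    void checkHFile(Path file) throws IOException {
        try {
            if (Files.size(file) == 0) {
                corrupted.add(file);
            }
            filesChecked.incrementAndGet();
        } catch (NoSuchFileException e) {
            // The file vanished between listing and checking.
            missing.add(file);
        }
    }

    void checkColFamDir(Path cfDir) throws IOException {
        for (Path file : list(cfDir)) {
            checkHFile(file);
        }
    }

    void checkRegionDir(Path regionDir) throws IOException {
        for (Path cfDir : list(regionDir)) {
            if (Files.isDirectory(cfDir)) {
                checkColFamDir(cfDir);
            }
        }
    }

    /** Submits one task per region dir, then waits for all of them to finish. */
    void checkTables(Collection<Path> tableDirs) throws IOException, InterruptedException {
        List<Future<?>> pending = new ArrayList<>();
        for (Path tableDir : tableDirs) {
            for (Path regionDir : list(tableDir)) {
                if (Files.isDirectory(regionDir)) {
                    pending.add(executor.submit(() -> {
                        checkRegionDir(regionDir);
                        return null;
                    }));
                }
            }
        }
        for (Future<?> task : pending) {
            try {
                task.get();
            } catch (ExecutionException e) {
                throw new IOException(e.getCause());
            }
        }
    }

    Collection<Path> getCorrupted() { return corrupted; }
    Collection<Path> getMissing() { return missing; }
    int getHFilesChecked() { return filesChecked.get(); }

    private static List<Path> list(Path dir) throws IOException {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) throws Exception {
        // Build a tiny throwaway tree: table/region/cf/{good,bad}.
        Path table = Files.createTempDirectory("table");
        Path cf = Files.createDirectories(table.resolve("region").resolve("cf"));
        Files.write(cf.resolve("good"), new byte[] {1});
        Files.write(cf.resolve("bad"), new byte[0]);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        RegionParallelWalkSketch checker = new RegionParallelWalkSketch(pool);
        checker.checkTables(Collections.singletonList(table));
        pool.shutdown();
        System.out.println("checked=" + checker.getHFilesChecked()
            + " corrupted=" + checker.getCorrupted());
    }
}
```

With the real class the call sequence would presumably be the same: construct the checker with a Configuration, an ExecutorService, and the quarantine flag, call checkTables, and then read getCorrupted, getFailures, getMissing, and getQuarantined.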


Constructor Summary
HFileCorruptionChecker(org.apache.hadoop.conf.Configuration conf, ExecutorService executor, boolean quarantine)
           
 
Method Summary
protected  void checkColFamDir(org.apache.hadoop.fs.Path cfDir)
          Check all files in a column family dir.
protected  void checkHFile(org.apache.hadoop.fs.Path p)
          Checks a path to see if it is a valid hfile.
protected  void checkRegionDir(org.apache.hadoop.fs.Path regionDir)
          Check all column families in a region dir.
 void checkTables(Collection<org.apache.hadoop.fs.Path> tables)
          Check the specified table dirs for bad hfiles.
 Collection<org.apache.hadoop.fs.Path> getCorrupted()
           
 Collection<org.apache.hadoop.fs.Path> getFailures()
           
 int getHFilesChecked()
           
 Collection<org.apache.hadoop.fs.Path> getMissing()
           
 Collection<org.apache.hadoop.fs.Path> getQuarantined()
           
 void report(HBaseFsck.ErrorReporter out)
          Print a human readable summary of hfile quarantining operations.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HFileCorruptionChecker

public HFileCorruptionChecker(org.apache.hadoop.conf.Configuration conf,
                              ExecutorService executor,
                              boolean quarantine)
                       throws IOException
Throws:
IOException
Method Detail

checkHFile

protected void checkHFile(org.apache.hadoop.fs.Path p)
                   throws IOException
Checks a path to see if it is a valid hfile.

Parameters:
p - full Path to an HFile
Throws:
IOException - if a connectivity-related error occurs

checkColFamDir

protected void checkColFamDir(org.apache.hadoop.fs.Path cfDir)
                       throws IOException
Check all files in a column family dir.

Parameters:
cfDir - column family directory
Throws:
IOException

checkRegionDir

protected void checkRegionDir(org.apache.hadoop.fs.Path regionDir)
                       throws IOException
Check all column families in a region dir.

Parameters:
regionDir - region directory
Throws:
IOException

checkTables

public void checkTables(Collection<org.apache.hadoop.fs.Path> tables)
                 throws IOException
Check the specified table dirs for bad hfiles.

Parameters:
tables - the table directories to check
Throws:
IOException

getFailures

public Collection<org.apache.hadoop.fs.Path> getFailures()
Returns:
the set of check failure file paths after checkTables is called.

getCorrupted

public Collection<org.apache.hadoop.fs.Path> getCorrupted()
Returns:
the set of corrupted file paths after checkTables is called.

getHFilesChecked

public int getHFilesChecked()
Returns:
number of hfiles checked in the last HFileCorruptionChecker run

getQuarantined

public Collection<org.apache.hadoop.fs.Path> getQuarantined()
Returns:
the set of successfully quarantined paths after checkTables is called.

getMissing

public Collection<org.apache.hadoop.fs.Path> getMissing()
Returns:
the set of paths that were missing, most likely because compactions or flushes deleted or moved the files.

report

public void report(HBaseFsck.ErrorReporter out)
Print a human readable summary of hfile quarantining operations.

Parameters:
out - the error reporter that receives the summary


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.