org.apache.hadoop.hbase.regionserver
Interface KeyValueScanner

All Known Implementing Classes:
CollectionBackedScanner, KeyValueHeap, MemStore.MemStoreScanner, NonLazyKeyValueScanner, NonReversedNonLazyKeyValueScanner, ReversedKeyValueHeap, StoreFileScanner, StoreScanner

@InterfaceAudience.Private
public interface KeyValueScanner

Scanner that returns the next KeyValue.


Method Summary
 boolean backwardSeek(KeyValue key)
          Seek the scanner at or before the row of the specified KeyValue. It first tries to seek the scanner at or after the specified KeyValue and returns if the KeyValue at the head of the scanner is in the same row as the specified KeyValue; otherwise it seeks the scanner to the first KeyValue of the row preceding the row of the specified KeyValue.
 void close()
          Close the KeyValue scanner.
 void enforceSeek()
          Does the real seek operation in case it was skipped by seekToRowCol(KeyValue, boolean) (TODO: What's this?).
 byte[] getNextIndexedKey()
           
 long getSequenceID()
          Get the sequence id associated with this KeyValueScanner.
 boolean isFileScanner()
           
 KeyValue next()
          Return the next KeyValue in this scanner and advance the scanner.
 KeyValue peek()
          Look at the next KeyValue in this scanner without advancing the scanner.
 boolean realSeekDone()
          We optimize our store scanners by checking the most recent store file first, so we sometimes pretend we have done a seek but delay it until the store scanner bubbles up to the top of the key-value heap.
 boolean requestSeek(KeyValue kv, boolean forward, boolean useBloom)
          Similar to seek(org.apache.hadoop.hbase.KeyValue) (or reseek(org.apache.hadoop.hbase.KeyValue) if forward is true) but only does a seek operation after checking that it is really necessary for the row/column combination specified by the kv parameter.
 boolean reseek(KeyValue key)
          Reseek the scanner at or after the specified KeyValue.
 boolean seek(KeyValue key)
          Seek the scanner at or after the specified KeyValue.
 boolean seekToLastRow()
          Seek the scanner to the first KeyValue of the last row.
 boolean seekToPreviousRow(KeyValue key)
          Seek the scanner to the first KeyValue of the row preceding the row of the specified key.
 boolean shouldUseScanner(Scan scan, SortedSet<byte[]> columns, long oldestUnexpiredTS)
          Allows filtering out scanners (both StoreFile and memstore) that we don't want to use, based on criteria such as Bloom filters and timestamp ranges.
 

Method Detail

peek

KeyValue peek()
Look at the next KeyValue in this scanner without advancing the scanner.

Returns:
the next KeyValue

next

KeyValue next()
              throws IOException
Return the next KeyValue in this scanner and advance the scanner.

Returns:
the next KeyValue
Throws:
IOException
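
The difference between peek() and next() can be illustrated with a minimal sketch. ListScanner below is a hypothetical stand-in and not part of HBase: it uses String in place of KeyValue, and it assumes that both methods return null once the scanner is exhausted, which this page does not state.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a KeyValueScanner; String replaces KeyValue for brevity.
final class PeekNextSketch {
    static final class ListScanner {
        private final List<String> cells; // assumed to be sorted already
        private int pos = 0;

        ListScanner(List<String> sortedCells) {
            this.cells = sortedCells;
        }

        // Returns the cell at the current position without advancing.
        // Assumption: null signals an exhausted scanner.
        String peek() {
            return pos < cells.size() ? cells.get(pos) : null;
        }

        // Returns the cell at the current position, then advances by one.
        String next() {
            return pos < cells.size() ? cells.get(pos++) : null;
        }
    }

    public static void main(String[] args) {
        ListScanner scanner = new ListScanner(Arrays.asList("a", "b"));
        System.out.println(scanner.peek()); // a (position unchanged)
        System.out.println(scanner.peek()); // a again
        System.out.println(scanner.next()); // a (position advances)
        System.out.println(scanner.next()); // b
        System.out.println(scanner.next()); // null (exhausted, by assumption)
    }
}
```

Calling peek() repeatedly is idempotent, which is what lets a heap of scanners compare their heads without consuming anything.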

seek

boolean seek(KeyValue key)
             throws IOException
Seek the scanner at or after the specified KeyValue.

Parameters:
key - seek value
Returns:
true if scanner has values left, false if end of scanner
Throws:
IOException

reseek

boolean reseek(KeyValue key)
               throws IOException
Reseek the scanner at or after the specified KeyValue. This method is guaranteed to seek at or after the required key only if the key comes after the current position of the scanner. Should not be used to seek to a key which may come before the current position.

Parameters:
key - seek value (should be non-null)
Returns:
true if scanner has values left, false if end of scanner
Throws:
IOException
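
The contrast between seek and reseek can be sketched as follows. SortedScanner is a hypothetical in-memory stand-in (String keys instead of KeyValue) and says nothing about how the HBase implementations work internally; it only shows why a forward-only reseek gives no useful guarantee for a key that lies behind the current position.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; not an HBase class.
final class SeekReseekSketch {
    static final class SortedScanner {
        private final List<String> cells; // sorted ascending
        private int pos = 0;

        SortedScanner(List<String> sortedCells) {
            this.cells = sortedCells;
        }

        // Random-access seek: binary search over the whole list for the
        // first cell that is >= key, regardless of the current position.
        boolean seek(String key) {
            int lo = 0;
            int hi = cells.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (cells.get(mid).compareTo(key) < 0) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            pos = lo;
            return pos < cells.size();
        }

        // Forward-only reseek: walks ahead from the current position and
        // never moves backwards, so a key behind the scanner is not found.
        boolean reseek(String key) {
            while (pos < cells.size() && cells.get(pos).compareTo(key) < 0) {
                pos++;
            }
            return pos < cells.size();
        }

        String peek() {
            return pos < cells.size() ? cells.get(pos) : null;
        }
    }

    public static void main(String[] args) {
        SortedScanner s = new SortedScanner(Arrays.asList("a", "c", "e", "g"));
        s.seek("d");
        System.out.println(s.peek()); // e
        s.reseek("f");
        System.out.println(s.peek()); // g
        s.reseek("b");                // key is behind the scanner
        System.out.println(s.peek()); // still g, not c
    }
}
```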

getSequenceID

long getSequenceID()
Get the sequence id associated with this KeyValueScanner. This is required for comparing multiple files to find out which one has the latest data. A default implementation would return 0. A file with a lower sequence id is considered to be the older one.
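
One way a sequence id can be used when merging several sources is as a tie-breaker: when two sources expose the same key, the one with the higher sequence id (the newer one) is consulted first. The sketch below uses hypothetical names (Source, NEWEST_FIRST_ON_TIE, mergeOrder) and String keys; it is an illustration of the idea, not the comparator HBase uses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; none of these names exist in HBase.
final class SequenceIdSketch {
    static final class Source {
        final String name;
        final String topKey;    // what peek() would return
        final long sequenceId;  // what getSequenceID() would return

        Source(String name, String topKey, long sequenceId) {
            this.name = name;
            this.topKey = topKey;
            this.sequenceId = sequenceId;
        }
    }

    // Order by key first; on equal keys, the higher sequence id wins.
    static final Comparator<Source> NEWEST_FIRST_ON_TIE =
        Comparator.comparing((Source s) -> s.topKey)
            .thenComparing(
                Comparator.comparingLong((Source s) -> s.sequenceId).reversed());

    // Returns source names in the order a heap would surface them.
    static List<String> mergeOrder(Source... sources) {
        PriorityQueue<Source> heap = new PriorityQueue<>(NEWEST_FIRST_ON_TIE);
        for (Source s : sources) {
            heap.add(s);
        }
        List<String> order = new ArrayList<>();
        while (!heap.isEmpty()) {
            order.add(heap.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(mergeOrder(
            new Source("oldFile", "row1", 5L),
            new Source("newFile", "row1", 9L),
            new Source("otherFile", "row0", 1L)));
        // [otherFile, newFile, oldFile]
    }
}
```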


close

void close()
Close the KeyValue scanner.


shouldUseScanner

boolean shouldUseScanner(Scan scan,
                         SortedSet<byte[]> columns,
                         long oldestUnexpiredTS)
Allows filtering out scanners (both StoreFile and memstore) that we don't want to use, based on criteria such as Bloom filters and timestamp ranges.

Parameters:
scan - the scan that we are selecting scanners for
columns - the set of columns in the current column family, or null if not specified by the scan
oldestUnexpiredTS - the oldest timestamp we are interested in for this query, based on TTL
Returns:
true if the scanner should be included in the query
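
The timestamp part of such a check can be sketched as below. Everything here is hypothetical: FileInfo, its fields, and the shouldUse helper are illustration only, the scan's time range is assumed to be half-open ([min, max)), and the Bloom filter criterion is left out entirely.

```java
// Hypothetical illustration of timestamp-based scanner pruning; not HBase code.
final class ShouldUseScannerSketch {
    static final class FileInfo {
        final long minTs; // smallest cell timestamp in the source
        final long maxTs; // largest cell timestamp in the source

        FileInfo(long minTs, long maxTs) {
            this.minTs = minTs;
            this.maxTs = maxTs;
        }
    }

    // scanMinTs is inclusive, scanMaxTs is exclusive (an assumption).
    static boolean shouldUse(FileInfo f, long scanMinTs, long scanMaxTs,
                             long oldestUnexpiredTS) {
        // Every cell in the source has expired under the TTL.
        if (f.maxTs < oldestUnexpiredTS) {
            return false;
        }
        // Keep the source only if its timestamps overlap the scan's range.
        return f.minTs < scanMaxTs && f.maxTs >= scanMinTs;
    }

    public static void main(String[] args) {
        FileInfo f = new FileInfo(100L, 200L);
        System.out.println(shouldUse(f, 0L, Long.MAX_VALUE, 150L)); // true
        System.out.println(shouldUse(f, 0L, Long.MAX_VALUE, 201L)); // false
        System.out.println(shouldUse(f, 300L, 400L, 0L));           // false
    }
}
```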

requestSeek

boolean requestSeek(KeyValue kv,
                    boolean forward,
                    boolean useBloom)
                    throws IOException
Similar to seek(org.apache.hadoop.hbase.KeyValue) (or reseek(org.apache.hadoop.hbase.KeyValue) if forward is true) but only does a seek operation after checking that it is really necessary for the row/column combination specified by the kv parameter. This function was added to avoid unnecessary disk seeks by checking row-column Bloom filters before a seek on multi-column get/scan queries, and to optimize by looking up more recent files first.

Parameters:
kv - the KeyValue specifying the row/column combination to seek to
forward - do a forward-only "reseek" instead of a random-access seek
useBloom - whether to enable multi-column Bloom filter optimization
Throws:
IOException

realSeekDone

boolean realSeekDone()
We optimize our store scanners by checking the most recent store file first, so we sometimes pretend we have done a seek but delay it until the store scanner bubbles up to the top of the key-value heap. This method is then used to ensure the top store file scanner has done a seek operation.


enforceSeek

void enforceSeek()
                 throws IOException
Does the real seek operation in case it was skipped by seekToRowCol(KeyValue, boolean) (TODO: What's this?). Note that this function should never be called on scanners that always do real seek operations (i.e., most scanners). The easiest way to ensure this is to call realSeekDone() first and call enforceSeek() only if it returns false.

Throws:
IOException
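
The interplay of a deferred seek, realSeekDone() and enforceSeek() can be sketched as follows. LazyScanner and requestLazySeek are hypothetical names, String stands in for KeyValue, and reporting the requested key from peek() while the seek is pending is an assumption made for the illustration rather than a description of the HBase implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a lazy seek; not HBase code.
final class LazySeekSketch {
    static final class LazyScanner {
        private final List<String> cells; // sorted ascending
        private int pos = 0;
        private String pendingKey = null; // non-null while a seek is deferred
        int realSeeks = 0;                // counts the expensive operations

        LazyScanner(List<String> sortedCells) {
            this.cells = sortedCells;
        }

        // Records the target but does no work yet.
        void requestLazySeek(String key) {
            pendingKey = key;
        }

        boolean realSeekDone() {
            return pendingKey == null;
        }

        // Performs the deferred seek; must only be called when one is pending.
        void enforceSeek() {
            if (pendingKey == null) {
                throw new IllegalStateException("no seek is pending");
            }
            realSeeks++;
            pos = 0;
            while (pos < cells.size() && cells.get(pos).compareTo(pendingKey) < 0) {
                pos++;
            }
            pendingKey = null;
        }

        // While a seek is pending, the requested key acts as a placeholder.
        String peek() {
            if (pendingKey != null) {
                return pendingKey;
            }
            return pos < cells.size() ? cells.get(pos) : null;
        }
    }

    public static void main(String[] args) {
        LazyScanner s = new LazyScanner(Arrays.asList("a", "b", "d"));
        s.requestLazySeek("c");
        System.out.println(s.peek() + " " + s.realSeeks); // c 0
        if (!s.realSeekDone()) { // guard recommended above
            s.enforceSeek();
        }
        System.out.println(s.peek() + " " + s.realSeeks); // d 1
    }
}
```

The guard around enforceSeek() mirrors the advice above: scanners that always seek eagerly report realSeekDone() as true and are never asked to enforce anything.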

isFileScanner

boolean isFileScanner()
Returns:
true if this is a file scanner. Otherwise a memory scanner is assumed.

backwardSeek

boolean backwardSeek(KeyValue key)
                     throws IOException
Seek the scanner at or before the row of the specified KeyValue. It first tries to seek the scanner at or after the specified KeyValue and returns if the KeyValue at the head of the scanner is in the same row as the specified KeyValue; otherwise it seeks the scanner to the first KeyValue of the row preceding the row of the specified KeyValue.

Parameters:
key - seek KeyValue
Returns:
true if the scanner is at a valid KeyValue, false if no such KeyValue exists
Throws:
IOException
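
The two-step behaviour described for backwardSeek can be sketched over an in-memory list. RowScanner is a hypothetical stand-in: each cell is a "row:qualifier" String, plain String ordering replaces the KeyValue comparator (so row names are assumed to be of equal length), and the linear scans are for clarity only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of backwardSeek semantics; not HBase code.
final class BackwardSeekSketch {
    static final class RowScanner {
        private final List<String> cells; // sorted "row:qualifier" strings
        private int pos = 0;

        RowScanner(List<String> sortedCells) {
            this.cells = sortedCells;
        }

        static String rowOf(String cell) {
            return cell.substring(0, cell.indexOf(':'));
        }

        String peek() {
            return pos < cells.size() ? cells.get(pos) : null;
        }

        // Positions at the first cell that is >= key.
        boolean seek(String key) {
            pos = 0;
            while (pos < cells.size() && cells.get(pos).compareTo(key) < 0) {
                pos++;
            }
            return pos < cells.size();
        }

        // Positions at the first cell of the row preceding the row of key.
        boolean seekToPreviousRow(String key) {
            String row = rowOf(key);
            int i = 0;
            while (i < cells.size() && rowOf(cells.get(i)).compareTo(row) < 0) {
                i++;
            }
            if (i == 0) {           // no row sorts before the row of key
                pos = cells.size();
                return false;
            }
            String prevRow = rowOf(cells.get(i - 1));
            int first = i - 1;
            while (first > 0 && rowOf(cells.get(first - 1)).equals(prevRow)) {
                first--;
            }
            pos = first;
            return true;
        }

        // Step 1: forward seek; keep the result if it is in the same row.
        // Step 2: otherwise fall back to the previous row.
        boolean backwardSeek(String key) {
            if (seek(key) && rowOf(peek()).equals(rowOf(key))) {
                return true;
            }
            return seekToPreviousRow(key);
        }
    }

    public static void main(String[] args) {
        RowScanner s = new RowScanner(Arrays.asList("r1:a", "r1:b", "r3:a", "r3:c"));
        s.backwardSeek("r3:b");
        System.out.println(s.peek()); // r3:c (same row, at or after the key)
        s.backwardSeek("r2:a");
        System.out.println(s.peek()); // r1:a (first cell of the previous row)
        System.out.println(s.backwardSeek("r0:a")); // false
    }
}
```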

seekToPreviousRow

boolean seekToPreviousRow(KeyValue key)
                          throws IOException
Seek the scanner to the first KeyValue of the row preceding the row of the specified key.

Parameters:
key - seek value
Returns:
true if the scanner is at the first valid KeyValue of the previous row, false if no such KeyValue exists
Throws:
IOException

seekToLastRow

boolean seekToLastRow()
                      throws IOException
Seek the scanner to the first KeyValue of the last row.

Returns:
true if scanner has values left, false if the underlying data is empty
Throws:
IOException

getNextIndexedKey

byte[] getNextIndexedKey()
Returns:
the next key in the index (the key to seek to the next block) if known, or null otherwise


Copyright © 2007–2015 The Apache Software Foundation. All rights reserved.