For more information on the HBase client, see Section 8.2, “Client”.
This is thrown if the time between RPC calls from the client to the RegionServer exceeds the scan timeout. For example, if Scan.setCaching is set to 500, then there will be an RPC call to fetch the next batch of rows every 500 .next() calls on the ResultScanner, because data is being transferred in blocks of 500 rows to the client. Reducing the setCaching value may be an option, but setting it too low makes for inefficient processing when scanning large numbers of rows.
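To make the trade-off concrete, here is a minimal sketch against the client scan API; the table name "mytable", the column family "cf", and the caching value of 500 are illustrative placeholders, not recommendations.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanCachingExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");   // placeholder table name
    Scan scan = new Scan();
    scan.addFamily(Bytes.toBytes("cf"));          // placeholder column family
    scan.setCaching(500); // one RPC per 500 rows; lower this if per-row work is slow
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result result : scanner) {
        // Time spent here accumulates between RPCs: with caching=500, roughly
        // 500 rows' worth of processing happens between calls to the RegionServer.
        // If that exceeds the scan timeout, the exception above is thrown.
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}

Lowering the caching value shortens the gap between RPCs, at the cost of more round trips to the RegionServer.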
Since 0.20.0 the default log level for org.apache.hadoop.hbase.* is DEBUG.

On your clients, edit $HBASE_HOME/conf/log4j.properties and change this:

log4j.logger.org.apache.hadoop.hbase=DEBUG

to this:

log4j.logger.org.apache.hadoop.hbase=INFO

or even this:

log4j.logger.org.apache.hadoop.hbase=WARN
This is a fairly frequent question on the HBase dist-list. The scenario is that a client is typically inserting a lot of data into a relatively un-optimized HBase cluster. Compression can exacerbate the pauses, although it is not the source of the problem.
See Section 10.6.2, “Table Creation: Pre-Creating Regions” for the pre-creating regions pattern, and confirm that the table isn't starting with a single region.
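As a sketch of that pattern using the Java admin API (the table name, column family, and split points below are placeholders, and assume row keys beginning with two-digit prefixes; they are not recommendations), the table can be created with its regions pre-split:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("mytable"); // placeholder name
    desc.addFamily(new HColumnDescriptor("cf"));             // placeholder family
    // Nine split keys give ten regions up front, so the initial write load is
    // spread across RegionServers instead of all landing on a single region.
    byte[][] splits = new byte[9][];
    for (int i = 0; i < splits.length; i++) {
      splits[i] = Bytes.toBytes(String.format("%02d", (i + 1) * 10)); // "10" .. "90"
    }
    admin.createTable(desc, splits);
  }
}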
See Section 10.4, “HBase Configurations” for cluster configuration, particularly hbase.hstore.blockingStoreFiles, hbase.hregion.memstore.block.multiplier, MAX_FILESIZE (region size), and MEMSTORE_FLUSHSIZE.
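The first two are cluster-level settings in hbase-site.xml; MAX_FILESIZE and MEMSTORE_FLUSHSIZE are per-table attributes that can be set on the table descriptor, as in this sketch (the values shown are placeholders, not tuning advice):

import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;

public class TableAttributesExample {
  public static void main(String[] args) {
    HTableDescriptor desc = new HTableDescriptor("mytable"); // placeholder name
    desc.addFamily(new HColumnDescriptor("cf"));             // placeholder family
    desc.setMaxFileSize(1024L * 1024L * 1024L);      // MAX_FILESIZE: 1 GB regions
    desc.setMemStoreFlushSize(128L * 1024L * 1024L); // MEMSTORE_FLUSHSIZE: 128 MB
    // Pass desc to HBaseAdmin.createTable(...) as in the pre-splitting example above.
  }
}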
A slightly longer explanation of why pauses can happen is as follows: Puts are sometimes blocked on the MemStores, which are blocked by the flusher thread, which is blocked because there are too many files to compact, because the compactor is given too many small files to compact and has to compact the same data repeatedly. This situation can occur even with minor compactions. Compounding the problem, HBase doesn't compress data in memory. Thus, the 64MB that lives in the MemStore could become a 6MB file after compression, resulting in a smaller StoreFile. The upside is that more data is packed into the same region, but performance is achieved by being able to write larger files, which is why HBase waits until the flush size is reached before writing a new StoreFile; and smaller StoreFiles become targets for compaction. Without compression the files are much bigger and don't need as much compaction, but this comes at the expense of I/O.
For additional information, see this thread on Long client pauses with compression.
Errors like this...
11/07/05 11:26:41 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
11/07/05 11:26:43 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
11/07/05 11:26:44 WARN zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
11/07/05 11:26:45 INFO zookeeper.ClientCnxn: Opening socket connection to server localhost/127.0.0.1:2181
... are due either to ZooKeeper being down, or to it being unreachable because of network issues.
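A quick client-side check, sketched below, is to print the ZooKeeper quorum the client configuration actually resolves to and then ask HBase whether the cluster is reachable; a quorum of localhost in the output (as in the log above) may mean the client did not pick up hbase-site.xml from its classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class ZkConnectivityCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Show which quorum and znode parent the client will actually use.
    System.out.println("hbase.zookeeper.quorum = " + conf.get("hbase.zookeeper.quorum"));
    System.out.println("zookeeper.znode.parent = " + conf.get("zookeeper.znode.parent"));
    // Throws an exception if ZooKeeper or the Master cannot be reached.
    HBaseAdmin.checkHBaseAvailable(conf);
    System.out.println("HBase is available");
  }
}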