HDFS

Scheme: hdfs
Name Kind Type Required Deprecated Default Value Enum Values Description
hostName path java.lang.String true false HDFS host to use
port path int false 8020 HDFS port to use
path path java.lang.String true false The directory path to use
overwrite parameter boolean false true Whether to overwrite existing files with the same name
append parameter boolean false Append to an existing file. Note that not all HDFS file systems support the append option.
splitStrategy parameter java.lang.String false In the current version of Hadoop, opening a file in append mode is disabled because it is not very reliable, so for the moment it is only possible to create new files. The Camel HDFS endpoint works around this limitation as follows:
  • If the split strategy option has been defined, the HDFS path is used as a directory, and files are created in it using the configured UuidGenerator.
  • Every time a splitting condition is met, a new file is created.
The splitStrategy option is defined as a string with the following syntax:
splitStrategy=ST:value,ST:value,...
where ST can be:
  • BYTES: a new file is created, and the old one is closed, when the number of written bytes exceeds value
  • MESSAGES: a new file is created, and the old one is closed, when the number of written messages exceeds value
  • IDLE: a new file is created, and the old one is closed, when nothing has been written in the last value milliseconds
bufferSize parameter int false 4096 The buffer size used by HDFS
replication parameter short false 3 The HDFS replication factor
blockSize parameter long false 67108864 The size of the HDFS blocks
compressionType parameter org.apache.hadoop.io.SequenceFile.CompressionType false NONE The compression type to use (by default, compression is not in use)
compressionCodec parameter org.apache.camel.component.hdfs.HdfsCompressionCodec false DEFAULT DEFAULT, GZIP, BZIP2 The compression codec to use
fileType parameter org.apache.camel.component.hdfs.HdfsFileType false NORMAL_FILE NORMAL_FILE, SEQUENCE_FILE, MAP_FILE, BLOOMMAP_FILE, ARRAY_FILE The file type to use. For more details, see the Hadoop HDFS documentation about the various file types.
fileSystemType parameter org.apache.camel.component.hdfs.HdfsFileSystemType false HDFS LOCAL, HDFS Set to LOCAL to use the local file system (java.io.File) instead of HDFS.
keyType parameter org.apache.camel.component.hdfs.HdfsWritableFactories.WritableType false NULL The type for the key in case of sequence or map files.
valueType parameter org.apache.camel.component.hdfs.HdfsWritableFactories.WritableType false BYTES The type for the value in case of sequence or map files.
openedSuffix parameter java.lang.String false opened When a file is opened for reading/writing, it is renamed with this suffix so that it is not read during the writing phase.
readSuffix parameter java.lang.String false read Once the file has been read, it is renamed with this suffix so that it is not read again.
initialDelay parameter long false For the consumer, how long to wait (in milliseconds) before starting to scan the directory.
delay parameter long false 1000 The interval (milliseconds) between the directory scans.
pattern parameter java.lang.String false * The pattern used for scanning the directory
chunkSize parameter int false 4096 When reading a normal file, the file is split into chunks of this size, producing one message per chunk.
checkIdleInterval parameter int false 500 How often (in milliseconds) to run the idle checker background task. This option is only used if the split strategy is IDLE.
connectOnStartup parameter boolean false true Whether to connect to the HDFS file system when the producer/consumer starts. If false, the connection is created on demand. Note that HDFS may take up to 15 minutes to establish a connection, as it has a hardcoded redelivery of 45 x 20 seconds. Setting this option to false allows your application to start up without blocking for up to 15 minutes.
owner parameter java.lang.String false The file owner must match this owner for the consumer to pick up the file. Otherwise the file is skipped.
exchangePattern parameter org.apache.camel.ExchangePattern false InOnly InOnly, RobustInOnly, InOut, InOptionalOut, OutOnly, RobustOutOnly, OutIn, OutOptionalIn Sets the default exchange pattern when creating an exchange
synchronous parameter boolean false false Sets whether synchronous processing should be strictly used, or Camel is allowed to use asynchronous processing (if supported).
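
The splitStrategy syntax described in the table can be illustrated with a small, self-contained Java sketch. It is not the component's actual implementation: the class SplitStrategySketch, its methods parse and shouldSplit, and the host and directory in the sample URI are all hypothetical names chosen for illustration. The sketch also assumes the usual hdfs://hostName:port/path?options endpoint layout, and it treats the BYTES and MESSAGES thresholds as "more than value" and IDLE as "at least value milliseconds without a write", following the wording of the option description.

```java
import java.util.EnumMap;
import java.util.Map;

public class SplitStrategySketch {

    // The three strategy names listed in the splitStrategy description.
    enum SplitType { BYTES, MESSAGES, IDLE }

    // Parses "ST:value,ST:value,..." into a map from strategy to threshold.
    static Map<SplitType, Long> parse(String splitStrategy) {
        Map<SplitType, Long> result = new EnumMap<>(SplitType.class);
        for (String entry : splitStrategy.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected ST:value but got: " + entry);
            }
            result.put(SplitType.valueOf(parts[0].trim()), Long.parseLong(parts[1].trim()));
        }
        return result;
    }

    // Returns true when any configured condition says a new file should be started.
    static boolean shouldSplit(Map<SplitType, Long> strategy,
                               long bytesWritten, long messagesWritten, long idleMillis) {
        Long maxBytes = strategy.get(SplitType.BYTES);
        Long maxMessages = strategy.get(SplitType.MESSAGES);
        Long maxIdle = strategy.get(SplitType.IDLE);
        return (maxBytes != null && bytesWritten > maxBytes)
            || (maxMessages != null && messagesWritten > maxMessages)
            || (maxIdle != null && idleMillis >= maxIdle);
    }

    public static void main(String[] args) {
        // Hypothetical host and directory; only the option names come from the table.
        String uri = "hdfs://localhost:8020/tmp/camel-out?splitStrategy=BYTES:1048576,IDLE:1000";
        String key = "splitStrategy=";
        String option = uri.substring(uri.indexOf(key) + key.length());
        Map<SplitType, Long> strategy = parse(option);
        System.out.println(strategy);
        // 2,000,000 bytes written exceeds the BYTES threshold, so a new file would be started.
        System.out.println(shouldSplit(strategy, 2_000_000L, 10, 0));
        // Neither threshold is reached here, so writing would continue in the same file.
        System.out.println(shouldSplit(strategy, 100, 10, 0));
    }
}
```

Several strategies can be combined in one option value. In this sketch, any single condition being met is enough to start a new file, which mirrors the statement that a new file is created every time a splitting condition is met.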

hdfs consumer