Description:
This Processor polls Apache Kafka
for data. When a message is received from Kafka, this Processor emits a FlowFile
whose content is the value of the Kafka message. If the
message has a key associated with it, an attribute named kafka.key
is added to the FlowFile, with the value being the UTF-8 encoded value
of the message's key.
Kafka supports the notion of a Consumer Group when pulling messages in order to
provide scalability while still offering a publish-subscribe interface. Each
Consumer Group must have a unique identifier. The Consumer Group identifier that
is used by NiFi is the UUID of the Processor. This means that all of the nodes
within a cluster use the same Consumer Group identifier, so they do
not receive duplicate data. Multiple GetKafka Processors can still be used to pull
from multiple Topics, because each Processor has its own Processor UUID
and therefore its own Consumer Group identifier.
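The Consumer Group behaviour described above can be illustrated with a toy model. The sketch below is not Kafka's or NiFi's actual implementation: the `deliver` function, the consumer names, and the round-robin assignment are assumptions made for illustration, with round-robin standing in for Kafka's partition assignment. It shows that consumers sharing a group identifier split the messages between them, while a consumer with a different group identifier receives every message.

```python
from collections import defaultdict
from itertools import cycle


def deliver(messages, consumers):
    """Toy model of Consumer Group delivery (hypothetical helper).

    `consumers` is a list of (group_id, consumer_name) pairs with unique names.
    Returns a dict mapping each consumer name to the messages it received.
    """
    members_by_group = defaultdict(list)
    for group_id, name in consumers:
        members_by_group[group_id].append(name)

    received = {name: [] for _, name in consumers}
    for members in members_by_group.values():
        # Every group sees the full stream once; within a group, each
        # message goes to exactly one member (round-robin is an assumption).
        for message, name in zip(messages, cycle(members)):
            received[name].append(message)
    return received
```

With two cluster nodes sharing one Processor UUID as their group identifier and a second Processor using a different UUID, each node receives half of the messages and the second Processor receives all of them.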
Modifies Attributes:
Attribute Name | Description |
kafka.topic | The name of the Kafka Topic from which the message was received |
kafka.key | The key of the Kafka message, if it exists and batch size is 1. If the message does not have a key, or if the batch size is greater than 1, this attribute will not be added. |
kafka.partition | The partition of the Kafka Topic from which the message was received. This attribute is added only if the batch size is 1. |
kafka.offset | The offset of the message within the Kafka partition. This attribute is added only if the batch size is 1. |
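The rules in the table above can be summarised in a short sketch. This is an illustration under stated assumptions, not the Processor's source code: the function name `kafka_attributes` and its parameters are hypothetical, and attribute values are rendered as strings on the assumption that FlowFile attributes are textual.

```python
def kafka_attributes(topic, batch_size, key=None, partition=None, offset=None):
    """Build the attribute map described in the table (hypothetical helper).

    `key` is the raw message key as bytes, or None if the message has no key.
    """
    # kafka.topic has no batch-size condition in the table.
    attributes = {"kafka.topic": topic}
    if batch_size == 1:
        # Per-message attributes are only added for single-message FlowFiles.
        if key is not None:
            attributes["kafka.key"] = key.decode("utf-8")
        attributes["kafka.partition"] = str(partition)
        attributes["kafka.offset"] = str(offset)
    return attributes
```

For example, a call with `batch_size=1` and a key yields all four attributes, while a call with `batch_size=10` yields only `kafka.topic`.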
Properties:
In the list below, the names of required properties appear
in bold. Any other properties (not in bold) are considered optional.
If a property has a default value, it is indicated. If a property
supports the use of the NiFi Expression Language (or simply,
"expression language"), that is also indicated.
- ZooKeeper Connection String
- The Connection String to use in order to connect to ZooKeeper. This is often a
comma-separated list of <host>:<port> combinations. For example,
host1:2181,host2:2181,host3:2188
- Default value: no default
- Supports expression language: false
- Topic Name
- The Kafka Topic to pull messages from
- Default value: no default
- Supports expression language: false
- Zookeeper Commit Frequency
- Specifies how often to communicate with ZooKeeper to indicate which messages have been pulled.
A longer time period will result in better overall performance but can result in more data
duplication if a NiFi node is lost.
- Default value: 60 secs
- Supports expression language: false
- ZooKeeper Communications Timeout
- The amount of time to wait for a response from ZooKeeper before determining that there is a communications error
- Default value: 30 secs
- Supports expression language: false
- Kafka Communications Timeout
- The amount of time to wait for a response from Kafka before determining that there is a communications error
- Default value: 30 secs
- Supports expression language: false
- Batch Size
- Specifies the maximum number of messages to combine into a single FlowFile.
These messages will be concatenated together with the <Message Demarcator>
string placed between the content of each message. If the messages from Kafka
should not be concatenated together, leave this value at 1.
- Default value: 1
- Supports expression language: false
- Message Demarcator
- Specifies the characters to use in order to demarcate multiple messages from Kafka.
If the <Batch Size> property is set to 1, this value is ignored. Otherwise, this value is
placed between each pair of consecutive messages in the batch. This property
treats "\n" as a new-line, "\r" as a carriage return and "\t" as a tab character. All other
characters are treated as literal characters.
- Default value: \n
- Supports expression language: false
- Client Name
- Client Name to use when communicating with Kafka
- Default value: "NiFi-" followed by the UUID of the Processor
- Supports expression language: false
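The interaction between the Batch Size and Message Demarcator properties can be sketched as follows. This is a minimal illustration, not the Processor's implementation: the function names `resolve_demarcator` and `batch_content` are hypothetical, messages are assumed to be available as a list of byte strings, and the demarcator is assumed to be encoded as UTF-8.

```python
def resolve_demarcator(raw):
    """Translate the escape sequences named in the property description.

    Only "\\n", "\\r" and "\\t" are interpreted; everything else is literal.
    """
    return raw.replace("\\n", "\n").replace("\\r", "\r").replace("\\t", "\t")


def batch_content(messages, batch_size, demarcator="\\n"):
    """Group message values into FlowFile contents (hypothetical helper).

    At most `batch_size` messages are concatenated per FlowFile, with the
    demarcator placed between consecutive messages. With a batch size of 1
    each message stands alone, so the demarcator is never used.
    """
    separator = resolve_demarcator(demarcator).encode("utf-8")
    return [
        separator.join(messages[start:start + batch_size])
        for start in range(0, len(messages), batch_size)
    ]
```

Three messages with a batch size of 2 and the default demarcator produce two contents: the first holds two messages separated by a new-line, and the second holds the remaining message on its own.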
Relationships:
- success
- All messages that are received from Kafka are routed to the 'success' relationship