New leader = replica in isr that's not being shutdown; New isr = current isr - shutdown replica; Replicas to receive LeaderAndIsr request = live assigned replicas
Called when leader intimates of isr change
Essentially does nothing.
Select the new leader, new isr and receiving replicas (for the LeaderAndIsrRequest): 1.
Select the new leader, new isr and receiving replicas (for the LeaderAndIsrRequest): 1. If at least one broker from the isr is alive, it picks a broker from the live isr as the new leader and the live isr as the new isr. 2. Else, if unclean leader election for the topic is disabled, it throws a NoReplicaOnlineException. 3. Else, it picks some alive broker from the assigned replica list as the new leader and the new isr. 4. If no broker in the assigned replica list is alive, it throws a NoReplicaOnlineException Replicas to receive LeaderAndIsr request = live assigned replicas Once the leader is successfully registered in zookeeper, it updates the allLeaders cache
This class represents the state machine for partitions.
This class represents the state machine for partitions. It defines the states that a partition can be in, and transitions to move the partition to another legal state. The different states that a partition can be in are - 1. NonExistentPartition: This state indicates that the partition was either never created or was created and then deleted. Valid previous state, if one exists, is OfflinePartition 2. NewPartition : After creation, the partition is in the NewPartition state. In this state, the partition should have replicas assigned to it, but no leader/isr yet. Valid previous states are NonExistentPartition 3. OnlinePartition : Once a leader is elected for a partition, it is in the OnlinePartition state. Valid previous states are NewPartition/OfflinePartition 4. OfflinePartition : If, after successful leader election, the leader for partition dies, then the partition moves to the OfflinePartition state. Valid previous states are NewPartition/OnlinePartition
Starts the partition reassignment process unless - 1.
Starts the partition reassignment process unless - 1. Partition previously existed 2. New replicas are the same as existing replicas 3. Any replica in the new set of replicas are dead If any of the above conditions are satisfied, it logs an error and removes the partition from list of reassigned partitions.
Starts the preferred replica leader election for the list of partitions specified under /admin/preferred_replica_election -
New leader = preferred (first assigned) replica (if in isr and alive); New isr = current isr; Replicas to receive LeaderAndIsr request = assigned replicas
New leader = a live in-sync reassigned replica New isr = current isr Replicas to receive LeaderAndIsr request = reassigned replicas
This class represents the state machine for replicas.
This class represents the state machine for replicas. It defines the states that a replica can be in, and transitions to move the replica to another legal state. The different states that a replica can be in are - 1. NewReplica : The controller can create new replicas during partition reassignment. In this state, a replica can only get become follower state change request. Valid previous state is NonExistentReplica 2. OnlineReplica : Once a replica is started and part of the assigned replicas for its partition, it is in this state. In this state, it can get either become leader or become follower state change requests. Valid previous state are NewReplica, OnlineReplica or OfflineReplica 3. OfflineReplica : If a replica dies, it moves to this state. This happens when the broker hosting the replica is down. Valid previous state are NewReplica, OnlineReplica 4. ReplicaDeletionStarted: If replica deletion starts, it is moved to this state. Valid previous state is OfflineReplica 5. ReplicaDeletionSuccessful: If replica responds with no error code in response to a delete replica request, it is moved to this state. Valid previous state is ReplicaDeletionStarted 6. ReplicaDeletionIneligible: If replica deletion fails, it is moved to this state. Valid previous state is ReplicaDeletionStarted 7. NonExistentReplica: If a replica is deleted successfully, it is moved to this state. Valid previous state is ReplicaDeletionSuccessful
This manages the state machine for topic deletion.
This manages the state machine for topic deletion. 1. TopicCommand issues topic deletion by creating a new admin path /admin/delete_topics/<topic> 2. The controller listens for child changes on /admin/delete_topic and starts topic deletion for the respective topics 3. The controller has a background thread that handles topic deletion. The purpose of having this background thread is to accommodate the TTL feature, when we have it. This thread is signaled whenever deletion for a topic needs to be started or resumed. Currently, a topic's deletion can be started only by the onPartitionDeletion callback on the controller. In the future, it can be triggered based on the configured TTL for the topic. A topic will be ineligible for deletion in the following scenarios - 3.1 broker hosting one of the replicas for that topic goes down 3.2 partition reassignment for partitions of that topic is in progress 3.3 preferred replica election for partitions of that topic is in progress (though this is not strictly required since it holds the controller lock for the entire duration from start to end) 4. Topic deletion is resumed when - 4.1 broker hosting one of the replicas for that topic is started 4.2 preferred replica election for partitions of that topic completes 4.3 partition reassignment for partitions of that topic completes 5. Every replica for a topic being deleted is in either of the 3 states - 5.1 TopicDeletionStarted (Replica enters TopicDeletionStarted phase when the onPartitionDeletion callback is invoked. This happens when the child change watch for /admin/delete_topics fires on the controller. As part of this state change, the controller sends StopReplicaRequests to all replicas. It registers a callback for the StopReplicaResponse when deletePartition=true thereby invoking a callback when a response for delete replica is received from every replica) 5.2 TopicDeletionSuccessful (deleteTopicStopReplicaCallback() moves replicas from TopicDeletionStarted->TopicDeletionSuccessful depending on the error codes in StopReplicaResponse) 5.3 TopicDeletionFailed. (deleteTopicStopReplicaCallback() moves replicas from TopicDeletionStarted->TopicDeletionFailed depending on the error codes in StopReplicaResponse. In general, if a broker dies and if it hosted replicas for topics being deleted, the controller marks the respective replicas in TopicDeletionFailed state in the onBrokerFailure callback. The reason is that if a broker fails before the request is sent and after the replica is in TopicDeletionStarted state, it is possible that the replica will mistakenly remain in TopicDeletionStarted state and topic deletion will not be retried when the broker comes back up.) 6. The delete topic thread marks a topic successfully deleted only if all replicas are in TopicDeletionSuccessful state and it starts the topic deletion teardown mode where it deletes all topic state from the controllerContext as well as from zookeeper. This is the only time the /brokers/topics/<topic> path gets deleted. On the other hand, if no replica is in TopicDeletionStarted state and at least one replica is in TopicDeletionFailed state, then it marks the topic for deletion retry.
Essentially does nothing. Returns the current leader and ISR, and the current set of replicas assigned to a given topic/partition.