Skip to content

KAFKA-14552: Assume a baseline of 3.0 for server protocol versions #18497

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ijuma
Copy link
Member

@ijuma ijuma commented Jan 12, 2025

Kafka 4.0 will remove support for zk mode and will require conversion to kraft before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1).

This provides an opportunity to remove exclusively server side protocols versions that only exist to allow direct upgrades from versions older than 3.0 or that are used only by zk mode.

Since KRaft became production ready in 3.3, we should consider setting the baseline to 3.3. But that requires more discussion and it can be done via a separate change (KAFKA-18601).

Protocol changes:

  • Remove RequestHeader v0 (only used by ControlledShutdown v0)
  • Remove WriteTxnMarkers v0
  • Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata

In order to remove all versions safely, extend generator to support setting "versions" to "none".
In this case, we no longer generate the *Data classes, but we still reserve the id for
the relevant protocol api (so it doesn't get accidentally used for something else). The protocol
documentation is correct after these changes.

We kept a simplified version of LeaderAndIsr{Request|Response} because it's used by many
tests that are still relevant in kraft mode. Once KAFKA-18486 is done, it may be possible to remove
it (I left a comment on the ticket). Similarly, KAFKA-18487 may make it possible to remove
the introduced StopReplicaPartitionState (left a comment on that ticket too).

There are a number of places that were adjusted to include an ApiKeys.hasValidVersion
check.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@github-actions github-actions bot added core Kafka Broker generator RPC and Record code generator clients labels Jan 12, 2025
@ijuma ijuma force-pushed the kafka-14552-remove-server-protocol-versions-3.0 branch from a7b8a51 to 1cb6fce Compare January 19, 2025 16:06
@ijuma ijuma force-pushed the kafka-14552-remove-server-protocol-versions-3.0 branch 3 times, most recently from 0e36dba to 48ace0f Compare January 19, 2025 21:45
@ijuma ijuma force-pushed the kafka-14552-remove-server-protocol-versions-3.0 branch from 48ace0f to 2320dd9 Compare January 19, 2025 23:07
}

@Test
def testUpdateStrayLogs(): Unit = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReplicaManager.updateStrayLogs was deleted and hence this method is no longer needed. @cmccabe @jsancio Can you please confirm that we have tests for the kraft flow of stray log management?

@@ -4732,67 +4549,6 @@ class ReplicaManagerTest {
}
}

@Test
def testPartitionMetadataFileCreatedAfterPreviousRequestWithoutIds(): Unit = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan I believe this andtestPartitionFetchStateUpdatesWithTopicIdChanges are no longer necessary since KRaft always requires topic ids to be set and 4.0 only supports upgrades from clusters already running in kraft mode. Do you agree or are there reasons to try to make these tests work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your conclusion makes sense.

@ijuma ijuma force-pushed the kafka-14552-remove-server-protocol-versions-3.0 branch from a30378d to 4b200e5 Compare January 19, 2025 23:28
@@ -355,28 +362,9 @@ private void generateHeaderVersion(String type) {
buffer.decrementIndent();
continue;
}
if (type.equals("request") && apiKey == 7) {
buffer.printf("// Version 0 of ControlledShutdownRequest has a non-standard request header%n");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need the special case for ControlledShutdown v0 anymore (since it's been removed).

@@ -894,7 +894,7 @@ class ShareGroupHeartbeatRequestTest(cluster: ClusterInstance) {
val allPartitionsMetadata = waitForAllPartitionsMetadata(brokersToValidate, topic, totalPartitionCount)
(0 until totalPartitionCount - 1).foreach(i => {
allPartitionsMetadata.get(new TopicPartition(topic, i)).foreach { partitionMetadata =>
assertEquals(totalPartitionCount, partitionMetadata.replicas.size)
assertEquals(totalPartitionCount, partitionMetadata.isr.size)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not exactly the same behavior, but the updated behavior is fine for the test (since isr and replicas is the same). The reason for the change is that the replicas are no longer returned by the method being used here.

public class ProtocolTest {

@Test
public void testToHtml() {
Copy link
Member Author

@ijuma ijuma Jan 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprisingly, toHtml had zero tests even though it breaks the build if it throws an exception (as it's called from one of the docs build targets).

@@ -36,7 +38,7 @@
// ApiVersionsRequest, even if it is from a newer version.
// Since the client is sending the ApiVersionsRequest in order to discover what
// versions are supported, the client does not know the best version to use.
{ "name": "ClientId", "type": "string", "versions": "1+", "nullableVersions": "1+", "ignorable": true,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed ignorable since the ClientId is always set now.

@ijuma ijuma changed the title KAFKA-14552: Assume a baseline of 3.0 for server protocol versions (WIP) KAFKA-14552: Assume a baseline of 3.0 for server protocol versions Jan 20, 2025
@@ -419,42 +418,6 @@ class ReplicaManager(val config: KafkaConfig,
brokerTopicStats.removeMetrics(topic)
}

private[server] def updateStrayLogs(strayPartitions: Iterable[TopicPartition]): Unit = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only used by tests.

@ijuma ijuma force-pushed the kafka-14552-remove-server-protocol-versions-3.0 branch from 4b200e5 to 904554c Compare January 20, 2025 06:29
assertTrue(replicaManager.localLog(topicPartition).isDefined)
val log2 = replicaManager.localLog(topicPartition).get
assertFalse(log2.partitionMetadataFile.get.exists())
assertEquals(Errors.NONE, response2.partitionErrors(topicNames).get(topicPartition))

// There is no file if the request an older version
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These removed lines are no longer relevant.

@ijuma
Copy link
Member Author

ijuma commented Jan 20, 2025

@chia7712 @mumrah Can one of you please review this? It would be good to get this into 4.0 so that the protocol apis are correctly specified and documented.

@ijuma
Copy link
Member Author

ijuma commented Jan 20, 2025

Test failures are unrelated - kicked off the tests again to try for a green build.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ijuma thanks for this patch!

ijuma added 2 commits January 20, 2025 07:31
…emove-server-protocol-versions-3.0

* apache-github/trunk:
  HOTFIX: ListShareGroupOffsetResult javadoc (apache#18642)
  KAFKA-18594: Cleanup BrokerLifecycleManager (apache#18626)
  KAFKA-18232: Add share group state topic prune metrics. (apache#18174)
  KAFKA-18568: Fix flaky test ClientIdQuotaTest (apache#18612)
  KAFKA-18553: Update javadoc and comments of ConfigType (apache#18567)
  [KAFKA-16720] AdminClient Support for ListShareGroupOffsets (1/n) (apache#18571)
  KAFKA-18588 Remove TopicKey.scala (apache#18624)
@ijuma
Copy link
Member Author

ijuma commented Jan 20, 2025

@chia7712 Thanks for the review. Pushed a commit that addresses the comments. Also merged master to this branch for the flaky test fix.

Copy link
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after following comments are addressed

val version = if (usesTopicIds) LeaderAndIsrRequestData.HIGHEST_SUPPORTED_VERSION else 4.toShort
val topicId = if (usesTopicIds) this.topicId else Uuid.ZERO_UUID
val topicIdOpt = if (usesTopicIds) Some(topicId) else None
val topicId = this.topicId
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary

@@ -35,7 +35,6 @@ import scala.jdk.CollectionConverters._
import scala.collection._
import scala.collection.mutable.ArrayBuffer
import scala.util.{Failure, Success, Try}
import org.apache.kafka.common.requests.{AbstractControlRequest, LeaderAndIsrRequest}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we can remove AbstractControlRequest in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - this can go if we move Type to LeaderAndIsrRequest.

@ijuma
Copy link
Member Author

ijuma commented Jan 20, 2025

Assuming the tests look good, I'll merge this. If @jolshan, @cmccabe and/or @jsancio have any comments (for the questions I asked or otherwise), I'll address them in a follow-up.

@ijuma ijuma merged commit 87b37a4 into apache:trunk Jan 20, 2025
9 checks passed
@ijuma ijuma deleted the kafka-14552-remove-server-protocol-versions-3.0 branch January 21, 2025 01:30
ijuma added a commit that referenced this pull request Jan 21, 2025
…18497)

Kafka 4.0 will remove support for zk mode and will require conversion to kraft
before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1).

This provides an opportunity to remove exclusively server side protocols versions
that only exist to allow direct upgrades from versions older than 3.0 or that are
used only by zk mode.

Since KRaft became production ready in 3.3, we should consider setting the
baseline to 3.3. But that requires more discussion and it can be done via a
separate change (KAFKA-18601).

Protocol changes:
* Remove RequestHeader v0 (only used by ControlledShutdown v0)
* Remove WriteTxnMarkers v0
* Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata

In order to remove all versions safely, extend generator to support setting
"versions" to "none". In this case, we no longer generate the `*Data` classes,
but we still reserve the id for the relevant protocol api (so it doesn't get
accidentally used for something else). The protocol documentation is correct
after these changes.

We kept a simplified version of `LeaderAndIsr{Request|Response}` because
it's used by many tests that are still relevant in kraft mode. Once
KAFKA-18486 is done, it may be possible to remove it (I left a comment on
the ticket). Similarly, KAFKA-18487 may make it possible to remove
the introduced `StopReplicaPartitionState` (left a comment on that ticket too).

There are a number of places that were adjusted to include an
`ApiKeys.hasValidVersion` check.

Reviewers: Chia-Ping Tsai <[email protected]>
pranavt84 pushed a commit to pranavt84/kafka that referenced this pull request Jan 27, 2025
…pache#18497)

Kafka 4.0 will remove support for zk mode and will require conversion to kraft
before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1).

This provides an opportunity to remove exclusively server side protocols versions
that only exist to allow direct upgrades from versions older than 3.0 or that are
used only by zk mode.

Since KRaft became production ready in 3.3, we should consider setting the
baseline to 3.3. But that requires more discussion and it can be done via a
separate change (KAFKA-18601).

Protocol changes:
* Remove RequestHeader v0 (only used by ControlledShutdown v0)
* Remove WriteTxnMarkers v0
* Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata

In order to remove all versions safely, extend generator to support setting
"versions" to "none". In this case, we no longer generate the `*Data` classes,
but we still reserve the id for the relevant protocol api (so it doesn't get
accidentally used for something else). The protocol documentation is correct
after these changes.

We kept a simplified version of `LeaderAndIsr{Request|Response}` because
it's used by many tests that are still relevant in kraft mode. Once
KAFKA-18486 is done, it may be possible to remove it (I left a comment on
the ticket). Similarly, KAFKA-18487 may make it possible to remove
the introduced `StopReplicaPartitionState` (left a comment on that ticket too).

There are a number of places that were adjusted to include an
`ApiKeys.hasValidVersion` check.

Reviewers: Chia-Ping Tsai <[email protected]>
airlock-confluentinc bot pushed a commit to confluentinc/kafka that referenced this pull request Jan 27, 2025
…pache#18497)

Kafka 4.0 will remove support for zk mode and will require conversion to kraft
before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1).

This provides an opportunity to remove exclusively server side protocols versions
that only exist to allow direct upgrades from versions older than 3.0 or that are
used only by zk mode.

Since KRaft became production ready in 3.3, we should consider setting the
baseline to 3.3. But that requires more discussion and it can be done via a
separate change (KAFKA-18601).

Protocol changes:
* Remove RequestHeader v0 (only used by ControlledShutdown v0)
* Remove WriteTxnMarkers v0
* Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata

In order to remove all versions safely, extend generator to support setting
"versions" to "none". In this case, we no longer generate the `*Data` classes,
but we still reserve the id for the relevant protocol api (so it doesn't get
accidentally used for something else). The protocol documentation is correct
after these changes.

We kept a simplified version of `LeaderAndIsr{Request|Response}` because
it's used by many tests that are still relevant in kraft mode. Once
KAFKA-18486 is done, it may be possible to remove it (I left a comment on
the ticket). Similarly, KAFKA-18487 may make it possible to remove
the introduced `StopReplicaPartitionState` (left a comment on that ticket too).

There are a number of places that were adjusted to include an
`ApiKeys.hasValidVersion` check.

Reviewers: Chia-Ping Tsai <[email protected]>
manoj-mathivanan pushed a commit to manoj-mathivanan/kafka that referenced this pull request Feb 19, 2025
…pache#18497)

Kafka 4.0 will remove support for zk mode and will require conversion to kraft
before upgrading to 4.0. The minimum kraft version is 3.0 (aka 3.0-IV1).

This provides an opportunity to remove exclusively server side protocols versions
that only exist to allow direct upgrades from versions older than 3.0 or that are
used only by zk mode.

Since KRaft became production ready in 3.3, we should consider setting the
baseline to 3.3. But that requires more discussion and it can be done via a
separate change (KAFKA-18601).

Protocol changes:
* Remove RequestHeader v0 (only used by ControlledShutdown v0)
* Remove WriteTxnMarkers v0
* Remove all versions of ControlledShutdown, LeaderAndIsr, StopReplica, UpdateMetadata

In order to remove all versions safely, extend generator to support setting
"versions" to "none". In this case, we no longer generate the `*Data` classes,
but we still reserve the id for the relevant protocol api (so it doesn't get
accidentally used for something else). The protocol documentation is correct
after these changes.

We kept a simplified version of `LeaderAndIsr{Request|Response}` because
it's used by many tests that are still relevant in kraft mode. Once
KAFKA-18486 is done, it may be possible to remove it (I left a comment on
the ticket). Similarly, KAFKA-18487 may make it possible to remove
the introduced `StopReplicaPartitionState` (left a comment on that ticket too).

There are a number of places that were adjusted to include an
`ApiKeys.hasValidVersion` check.

Reviewers: Chia-Ping Tsai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clients core Kafka Broker generator RPC and Record code generator kraft performance tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants