Skip to content

Conversation

@ArangoGutierrez
Copy link
Collaborator

No description provided.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>

This comment was marked as outdated.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
@ArangoGutierrez ArangoGutierrez requested a review from Copilot July 30, 2025 14:01
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes containerd and CNI installation by unifying the containerd configuration to use version 2 for all versions, updating the default version to 1.7.28, and improving error handling during provisioning failures.

  • Unified containerd configuration to use version 2 for both 1.x and 2.x versions
  • Updated default containerd version from 1.7.26 to 1.7.28
  • Added interactive cleanup options for failed provisioning attempts

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pkg/provisioner/templates/kubernetes.go Added Ubuntu-specific resolv.conf configuration and cgroup driver settings
pkg/provisioner/templates/containerd.go Updated default version and unified configuration approach
pkg/provisioner/templates/containerd_test.go Updated tests to reflect new unified configuration
pkg/provisioner/templates/container-toolkit.go Added CNI configuration verification after nvidia-ctk
cmd/cli/create/create.go Added interactive cleanup handling for provisioning failures
cmd/cli/main.go Fixed delete command example syntax
Multiple test files Added copyright headers

PodSubnet: "192.168.0.0/16", // Default subnet, modify if needed
FeatureGates: featureGates, // Convert slice to string for kubeadm
RuntimeConfig: "resource.k8s.io/v1beta1=true", // Example runtime config
IsUbuntu: true, // Default to true for Ubuntu-based deployments
Copy link

Copilot AI Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding IsUbuntu to true by default may cause issues on non-Ubuntu systems. Consider determining the OS type dynamically or making this configurable through the environment specification.

Suggested change
IsUbuntu: true, // Default to true for Ubuntu-based deployments
IsUbuntu: isUbuntu(), // Dynamically determine if the system is Ubuntu-based

Copilot uses AI. Check for mistakes.
Comment on lines 227 to +234
address = "/run/containerd/containerd.sock"
uid = 0
gid = 0
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
sandbox_image = "registry.k8s.io/pause:3.9"
systemd_cgroup = true
[plugins."io.containerd.grpc.v1.cri".containerd]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runtime.v1.linux"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".registry]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://registry-1.docker.io"]
EOF
fi
# Ensure CNI directories exist
sudo mkdir -p /etc/cni/net.d
sudo mkdir -p /opt/cni/bin
# Ensure containerd directories exist
Copy link

Copilot AI Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unified configuration removes version-specific handling but still uses hardcoded paths. Consider adding validation that these paths are appropriate for the target system, especially since the comment mentions 'distro variance'.

Copilot uses AI. Check for mistakes.
if ! sudo grep -q 'bin_dir = "/opt/cni/bin:/usr/libexec/cni"' /etc/containerd/config.toml; then
echo "WARNING: CNI bin_dir configuration may have been modified by nvidia-ctk"
echo "Restoring CNI paths..."
# This is a safeguard, but nvidia-ctk should preserve existing CNI config
Copy link

Copilot AI Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sed command on line 51 is complex and error-prone. Consider using a more robust configuration management approach or validating that the sed pattern correctly targets only the intended configuration section.

Copilot uses AI. Check for mistakes.
@coveralls
Copy link

Pull Request Test Coverage Report for Build 16624739979

Details

  • 1 of 2 (50.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.008%) to 14.278%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/provisioner/templates/kubernetes.go 0 1 0.0%
Totals Coverage Status
Change from base Build 16622991246: -0.008%
Covered Lines: 261
Relevant Lines: 1828

💛 - Coveralls

@ArangoGutierrez ArangoGutierrez merged commit 14a7880 into NVIDIA:main Jul 30, 2025
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants