@yavinash007 yavinash007 commented Dec 23, 2025

Summary

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 💥 Breaking change
  • 📚 Documentation
  • 🔧 Refactoring
  • 🔨 Build/CI

Component(s) Affected

  • Core Services
  • Documentation/CI
  • Fault Management
  • Health Monitors
  • Janitor
  • Other: ____________

Testing

  • Tests pass locally
  • Manual testing completed
  • No breaking changes (or documented)

Checklist

  • Self-review completed
  • Documentation updated (if needed)
  • Ready for review

copy-pr-bot bot commented Dec 23, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

coderabbitai bot commented Dec 23, 2025

Important

Review skipped

Draft detected.

include ../make/go.mk

# Version of controller-gen to use for generating CRD deepcopy, client, etc.
CONTROLLER_GEN_VERSION := v0.17.2

v0.20.0 is the latest; if we're going to pin, we should pick up the newest one.

Also, janitor has a controller-gen as well, but it's using latest rather than a pinned version. Check out how the main Makefile in the root dir is handling versions:

ADDLICENSE_VERSION := $(shell $(YQ) '.linting.addlicense' .versions.yaml)

We should probably put controller-gen there with that same pattern, since we're using it in multiple modules now.


var (
// GroupVersion is group version used to register these objects
GroupVersion = schema.GroupVersion{Group: "data-models.dgxc.nvidia.com", Version: "v1alpha1"}

I was thinking we could use healthevents.dgxc.nvidia.com, since health-events is a little more descriptive than data-models, but I'm open to other ideas here too.

)

// HealthEventSpec defines the desired state of HealthStatus
type HealthEventSpec struct {

We don't want to manage two copies of the same struct just to make it a kubernetes type.

Can we point directly to the objects that already exist? It would look something like this:

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
type HealthEvent struct {
  Spec   model.HealthEvent       // this is the existing health event object
  Status model.HealthEventStatus // this is the existing status object
}

yavinash007 force-pushed the healtheventwithstatus branch from aa400bd to 14aa429 on December 23, 2025 21:43
EventID string `json:"eventID"`

// Node associated with this health event
// +kubebuilder:validation:Required

This still looks like we're redefining the types. Ideally, we don't want to maintain two copies of the same types and keep them in sync.

Message string `bson:"message,omitempty" json:"message,omitempty"`
}

type HealthEventStatus struct {

I was thinking we could just use this exact object as the status, for example.


// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
type HealthStatus struct {

So my thought was that this struct would look like this:

// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
type HealthStatus struct {
  metav1.TypeMeta   `json:",inline"`
  metav1.ObjectMeta `json:"metadata,omitempty"`

  Spec   model.HealthEvent   `json:"spec,omitempty"`
  Status model.HealthEventStatus `json:"status,omitempty"`
}

So you would not define your own spec and status objects; we would just use the existing ones.

There are some implications for API versioning going forward, if we want to adhere to best practices, that we should discuss with the NVIDIA folks if we do it this way. Still, it seems like a clean way to share the object that the other data sources use.
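
To make the wrapping idea concrete, here is a minimal, self-contained sketch. It is not code from this PR and makes several assumptions: TypeMeta and ObjectMeta are simplified local stand-ins for the metav1 types so that it runs with only the standard library, and the fields on HealthEvent and HealthEventStatus are hypothetical placeholders for whatever the shared model package actually defines. Render and Decode are illustrative helpers, not existing functions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TypeMeta and ObjectMeta are simplified stand-ins (assumption) for
// metav1.TypeMeta and metav1.ObjectMeta, so the sketch needs no k8s modules.
type TypeMeta struct {
	Kind       string `json:"kind,omitempty"`
	APIVersion string `json:"apiVersion,omitempty"`
}

type ObjectMeta struct {
	Name string `json:"name,omitempty"`
}

// HealthEvent and HealthEventStatus stand in for the existing shared model
// types. The fields below are hypothetical placeholders.
type HealthEvent struct {
	NodeName string `json:"nodeName,omitempty"`
	Message  string `json:"message,omitempty"`
}

type HealthEventStatus struct {
	Phase string `json:"phase,omitempty"`
}

// HealthStatus wraps the existing types as spec and status instead of
// redefining them, so there is only one definition to maintain.
type HealthStatus struct {
	TypeMeta   `json:",inline"`
	ObjectMeta `json:"metadata,omitempty"`

	Spec   HealthEvent       `json:"spec,omitempty"`
	Status HealthEventStatus `json:"status,omitempty"`
}

// Render marshals the wrapper to its JSON form.
func Render(h HealthStatus) (string, error) {
	b, err := json.Marshal(h)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// Decode parses JSON produced by Render back into the wrapper.
func Decode(s string) (HealthStatus, error) {
	var h HealthStatus
	err := json.Unmarshal([]byte(s), &h)
	return h, err
}

func main() {
	h := HealthStatus{
		TypeMeta:   TypeMeta{Kind: "HealthStatus", APIVersion: "data-models.dgxc.nvidia.com/v1alpha1"},
		ObjectMeta: ObjectMeta{Name: "example-event"},
		Spec:       HealthEvent{NodeName: "node-a", Message: "example message"},
		Status:     HealthEventStatus{Phase: "Pending"},
	}
	out, err := Render(h)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(out)
}
```

The only point of the sketch is that spec and status are the shared types held by value, so a change to the model shows up in the Kubernetes object without a second struct to keep in sync.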
