Skip to content

Fixes #1509: Notebook CRD allows misconfigured notebooks that make the notebook-controller crash#1510

Merged
k8s-ci-robot merged 3 commits into
kubeflow:masterfrom
agoblet:1509-notebook-crd-resources-pattern
Sep 9, 2020
Merged

Fixes #1509: Notebook CRD allows misconfigured notebooks that make the notebook-controller crash#1510
k8s-ci-robot merged 3 commits into
kubeflow:masterfrom
agoblet:1509-notebook-crd-resources-pattern

Conversation

@agoblet

@agoblet agoblet commented Aug 25, 2020

Copy link
Copy Markdown
Contributor

Which issue is resolved by this Pull Request:
Resolves #1509

Description of your changes:
Added Regex patterns to cpu and memory requests and limits in notebook CRD.

Checklist:

  • Unit tests have been rebuilt:
    1. cd manifests/tests
    2. make generate-changed-only
    3. make test

@kubeflow-bot

Copy link
Copy Markdown
Contributor

This change is Reviewable

@k8s-ci-robot

Copy link
Copy Markdown
Contributor

Hi @axelgobletbdr. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@agoblet

agoblet commented Aug 25, 2020

Copy link
Copy Markdown
Contributor Author

/assign @kimwnasptd

@agoblet

agoblet commented Aug 26, 2020

Copy link
Copy Markdown
Contributor Author

When running the unit tests as described in checklist above, many of the tests fail. This happens for many components that I did not touch in my PR. Is the test suite outdated?

@PatrickXYS

Copy link
Copy Markdown
Member

Hi @alexlatchford

Have you ever tried to run below commands before sending PR?

cd manifests/tests
make generate-changed-only
make test

@agoblet

agoblet commented Aug 26, 2020

Copy link
Copy Markdown
Contributor Author

Hi @Patrickxyz, I did run the exact commands you decribe. make generate-changed-only runs succesfully and changes many of the test files. Then, make test fails, with failures in all sorts of tests unrelated to the notebook CRD (the only file I touched). This happens both on my own branch as on master.

Does running these commands yield a positive outcome for you on master?

Do you want me to check in all the newly generated test files?

@PatrickXYS

Copy link
Copy Markdown
Member

If make test failed, which means some of the test cases failed, you might need to dig into the error messages, and try to solve them.

I'm not sure whether failed tests related to Notebook CRD. But I would encourage you taking a deep look and fix these failures.

@agoblet

agoblet commented Aug 26, 2020

Copy link
Copy Markdown
Contributor Author

I just checked in the files generated by make generate-changed-only. Over 300 files changed. I doubt that this should be part of the PR should it? Shouldn't this be built into the build pipeline? It clutters the PR a lot.

Here is an example of one of the dozens if not hundreds of failed test cases:

=== RUN   TestKustomize
===== ACTUAL BEGIN ========================================
apiVersion: app.k8s.io/v1beta1
kind: Application
metadata:
  labels:
    app.kubernetes.io/component: tfjob
    app.kubernetes.io/instance: tf-job-operator-v1.0.0
    app.kubernetes.io/managed-by: kfctl
    app.kubernetes.io/name: tf-job-operator
    app.kubernetes.io/part-of: kubeflow
    app.kubernetes.io/version: v1.0.0
  name: tf-job-operator
  namespace: kubeflow
spec:
  addOwnerRef: true
  componentKinds:
  - group: core
    kind: Service
  - group: apps
    kind: Deployment
  - group: core
    kind: ServiceAccount
  - group: kubeflow.org
    kind: TFJob
  descriptor:
    description: Tf-operator allows users to create and manage the "TFJob" custom
      resource.
    keywords:
    - tfjob
    - tf-operator
    - tf-training
    links:
    - description: About
      url: https://github.com/kubeflow/tf-operator
    - description: Docs
      url: https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/
    maintainers:
    - email: ricliu@google.com
      name: Richard Liu
    owners:
    - email: ricliu@google.com
      name: Richard Liu
    type: tf-job-operator
    version: v1
  selector:
    matchLabels:
      app.kubernetes.io/component: tfjob
      app.kubernetes.io/instance: tf-job-operator-v0.7.0
      app.kubernetes.io/managed-by: kfctl
      app.kubernetes.io/name: tf-job-operator
      app.kubernetes.io/part-of: kubeflow
      app.kubernetes.io/version: v0.7.0
===== ACTUAL END ==========================================
   EXPECTED                                                                                        ACTUAL
   --------                                                                                        ------
   apiVersion: app.k8s.io/v1beta1                                                                  apiVersion: app.k8s.io/v1beta1
   kind: Application                                                                               kind: Application
   metadata:                                                                                       metadata:
     labels:                                                                                         labels:
       app.kubernetes.io/component: tfjob                                                              app.kubernetes.io/component: tfjob
       app.kubernetes.io/instance: tf-job-operator-v1.0.0                                              app.kubernetes.io/instance: tf-job-operator-v1.0.0
       app.kubernetes.io/managed-by: kfctl                                                             app.kubernetes.io/managed-by: kfctl
       app.kubernetes.io/name: tf-job-operator                                                         app.kubernetes.io/name: tf-job-operator
       app.kubernetes.io/part-of: kubeflow                                                             app.kubernetes.io/part-of: kubeflow
       app.kubernetes.io/version: v1.0.0                                                               app.kubernetes.io/version: v1.0.0
     name: tf-job-operator                                                                           name: tf-job-operator
     namespace: kubeflow                                                                             namespace: kubeflow
   spec:                                                                                           spec:
     addOwnerRef: true                                                                               addOwnerRef: true
     componentKinds:                                                                                 componentKinds:
     - group: core                                                                                   - group: core
       kind: Service                                                                                   kind: Service
     - group: apps                                                                                   - group: apps
       kind: Deployment                                                                                kind: Deployment
     - group: core                                                                                   - group: core
       kind: ServiceAccount                                                                            kind: ServiceAccount
     - group: kubeflow.org                                                                           - group: kubeflow.org
       kind: TFJob                                                                                     kind: TFJob
     descriptor:                                                                                     descriptor:
X      description: Tf-operator allows users to create and manage the "TFJob" custom resource.         description: Tf-operator allows users to create and manage the "TFJob" custom
X      keywords:                                                                                         resource.
X      - tfjob                                                                                         keywords:
X      - tf-operator                                                                                   - tfjob
X      - tf-training                                                                                   - tf-operator
X      links:                                                                                          - tf-training
X      - description: About                                                                            links:
X        url: https://github.com/kubeflow/tf-operator                                                  - description: About
X      - description: Docs                                                                               url: https://github.com/kubeflow/tf-operator
X        url: https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/                             - description: Docs
X      maintainers:                                                                                      url: https://www.kubeflow.org/docs/reference/tfjob/v1/tensorflow/
X      - email: ricliu@google.com                                                                      maintainers:
X        name: Richard Liu                                                                             - email: ricliu@google.com
X      owners:                                                                                           name: Richard Liu
X      - email: ricliu@google.com                                                                      owners:
X        name: Richard Liu                                                                             - email: ricliu@google.com
X      type: tf-job-operator                                                                             name: Richard Liu
X      version: v1                                                                                     type: tf-job-operator
X    selector:                                                                                         version: v1
X      matchLabels:                                                                                  selector:
X        app.kubernetes.io/component: tfjob                                                            matchLabels:
X        app.kubernetes.io/instance: tf-job-operator-v0.7.0                                              app.kubernetes.io/component: tfjob
X        app.kubernetes.io/managed-by: kfctl                                                             app.kubernetes.io/instance: tf-job-operator-v0.7.0
X        app.kubernetes.io/name: tf-job-operator                                                         app.kubernetes.io/managed-by: kfctl
X        app.kubernetes.io/part-of: kubeflow                                                             app.kubernetes.io/name: tf-job-operator
X        app.kubernetes.io/version: v0.7.0                                                               app.kubernetes.io/part-of: kubeflow
X                                                                                                        app.kubernetes.io/version: v0.7.0
    TestKustomize: test_util.go:176: Expected not equal to actual
--- FAIL: TestKustomize (0.02s)
FAIL
FAIL    github.com/kubeflow/manifests/tests/tests/legacy_kustomizations/tf-job-operator 0.044s

This can't be right. Either there is something going wrong in the test suite, or the tests are outdated and haven't been looked into for a long time.

I am exactly following the provided steps and not getting any errors while generating the tests. I do not know what is going wrong.

And as I mentioned before, this does happen on the master branch as well. Do you know what is going on?

@PatrickXYS

Copy link
Copy Markdown
Member

Okay, can you confirm you're using kustomize v3.2.1?

And retry above commands?

We mentioned it here https://github.com/kubeflow/manifests#manifests

@agoblet

agoblet commented Aug 27, 2020

Copy link
Copy Markdown
Contributor Author

Hi @PatrickXYS, I was using a kustomize version >3.2.1. With version 3.2.1, all newly generated tests passed. I assume you want the generated tests to be checked into the repository as well? I added them to this PR.

@PatrickXYS

Copy link
Copy Markdown
Member

/ok-to-test

@agoblet

agoblet commented Aug 31, 2020

Copy link
Copy Markdown
Contributor Author

@PatrickXYS can this PR be merged?

@PatrickXYS

Copy link
Copy Markdown
Member

/cc @kimwnasptd Can you take a look at Notebook Controller change? I notice we're about to bring up Notebook WG, and you'll be part of the WG.

@agoblet

agoblet commented Sep 7, 2020

Copy link
Copy Markdown
Contributor Author

@kimwnasptd can you have a look at this PR? It is a minor change of the notebook CRD to prevent crashes in the notebook controller.

@kimwnasptd

Copy link
Copy Markdown
Member

/lgtm
/approve

@agoblet

agoblet commented Sep 9, 2020

Copy link
Copy Markdown
Contributor Author

@andreyvelich @Jeffwan @Bobgy @gaocegege @johnugeorge @krishnadurai you are the test owners. Could one of you approve this PR please?

@krishnadurai krishnadurai left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kimwnasptd, krishnadurai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 3fd1c71 into kubeflow:master Sep 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Notebook CRD allows misconfigured notebooks that make the notebook-controller crash

6 participants