Adding Openshift stack #1567

Merged: k8s-ci-robot merged 24 commits into kubeflow:master from nakfour:openshift_stack on Nov 6, 2020

nakfour (Member) commented Sep 22, 2020

Which issue is resolved by this Pull Request:
This upgrades all of the changes needed to run Kubeflow on OCP 4.x to kustomize v3.
Resolves #

Description of your changes:

Checklist:

  • Unit tests have been rebuilt:
    1. cd manifests/tests
    2. make generate-changed-only
    3. make test

kubeflow-bot (Contributor) commented

This change is Reviewable

crobby (Member) left a comment

Looks like it's on the right path. Great work!

vpavlin (Member) commented Oct 23, 2020

Metadata fails to start because the metadata-db liveness probe is failing. The liveness probe needs to use MYSQL_USER to check MySQL availability. You can use my commit here:

vpavlin@d9bce16
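
For readers who cannot open the commit, a probe of the kind described might look like the fragment below. This is a minimal sketch under stated assumptions, not the contents of the linked commit: only MYSQL_USER comes from the comment above, while MYSQL_PASSWORD, the `SELECT 1` check, and the timing values are illustrative guesses.

```yaml
# Fragment of a container spec for the metadata-db pod (illustrative only).
livenessProbe:
  exec:
    command:
    - /bin/bash
    - "-c"
    # Connect as the unprivileged MYSQL_USER instead of root.
    # MYSQL_PASSWORD is assumed to be set in the container environment.
    - mysql -u"$MYSQL_USER" -p"$MYSQL_PASSWORD" -e 'SELECT 1'
  # Timing values are placeholders, not taken from the PR.
  initialDelaySeconds: 5
  periodSeconds: 2
  timeoutSeconds: 1
```

Because the command runs through `bash -c`, the variables are expanded inside the container from its own environment.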

nakfour (Member Author) commented Oct 23, 2020

@vpavlin now I get the error below in one of the metadata-db pods, which shows up as "not ready" and then keeps crashing:

[ERROR] InnoDB: Unable to lock ./ibdata1 error: 11

Never mind, I did a clean reinstall and this issue is gone.

nakfour changed the title from "WIP Adding Openshift stack" to "Adding Openshift stack" on Oct 29, 2020

nakfour (Member Author) commented Oct 29, 2020

/retest

nakfour (Member Author) commented Oct 29, 2020

@jlewi @animeshsingh is there a way to get detailed logs from the kubeflow-manifests-presubmit job? It's failing and we have no indication why.

Jeffwan (Member) commented Oct 29, 2020

If you plan to have this in the v1.2 release, please let us know: kubeflow/kubeflow#5224

nakfour (Member Author) commented Oct 29, 2020

@Jeffwan yes, I ran the tests manually and ran into the issues documented here: #1596
I do not see the Argo issue you are referring to. Could you point it out?

Jeffwan mentioned this pull request on Oct 29, 2020

nakfour (Member Author) commented Nov 5, 2020

@PatrickXYS rebase done
/test all

k8s-ci-robot (Contributor) commented

@nakfour: No presubmit jobs available for kubeflow/manifests@master

In response to this:

@PatrickXYS rebase done
/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

crobby (Member) commented Nov 5, 2020

I'm in the process of trying this out on OpenShift 4.6 and it looks like Seldon will not work due to a known issue: #1543

nakfour (Member Author) commented Nov 5, 2020

@crobby the Seldon issue should be fixed in the upgrade: #1600

crobby (Member) commented Nov 5, 2020

I think notebook spin-up is also failing because it's trying to use user/group IDs that are not permitted in OpenShift.

nakfour (Member Author) commented Nov 6, 2020

@crobby Istio sidecar injection requires the NET_ADMIN capability in order to change the iptables rules, as described in this blog: https://www.openshift.com/blog/increasing-security-of-istio-deployments-by-removing-the-need-for-privileged-containers
So in OCP we have to give each pod in that namespace privileged access, which is not recommended. I will see if I can disable sidecar injection. I added our custom profile image and now notebooks work. We need to rebuild the profile controller to get the latest code.
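
One common way to opt a single workload out of sidecar injection is a pod-template annotation. The fragment below is a minimal sketch and is not taken from this PR: the Deployment name, labels, and image are hypothetical placeholders, and whether this approach suits the notebook pods discussed here is an open assumption. Only the `sidecar.istio.io/inject` annotation itself is Istio's documented opt-out.

```yaml
# Hypothetical workload that opts out of Istio sidecar injection.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-workload            # placeholder name
spec:
  selector:
    matchLabels:
      app: example-workload
  template:
    metadata:
      labels:
        app: example-workload
      annotations:
        # Tells the Istio injector to skip this pod, so no init container
        # needing NET_ADMIN is added to it.
        sidecar.istio.io/inject: "false"
    spec:
      containers:
      - name: main
        image: example.invalid/image:tag   # placeholder image
```

The value must be the quoted string "false", since annotation values are strings rather than booleans.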

Comment thread on stacks/openshift/application/pipeline/kustomization.yaml (outdated)

nakfour (Member Author) commented Nov 6, 2020

@PatrickXYS @Tomcli how can we get logs on what is failing in the e2e test? The tests pass when run locally, and this stack is specific to OpenShift clusters.

Tomcli (Member) commented Nov 6, 2020

thanks @nakfour
/lgtm

animeshsingh (Contributor) commented

Thanks @nakfour
/lgtm
/approve

k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: animeshsingh, crobby, nakfour

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

animeshsingh (Contributor) commented

/unhold
