Skip to content

Conversation

dharmit
Copy link
Member

@dharmit dharmit commented Jan 5, 2022

What type of PR is this?
/kind bug

What does this PR do / why we need it:
Refer issue being fixed

Which issue(s) this PR fixes:

Fixes #5041

PR acceptance criteria:

How to test changes / Special notes to the reviewer:
Refer issue description of original issue for steps to test changes.

@dharmit dharmit requested review from feloy and valaparthvi January 5, 2022 06:22
@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 5, 2022
@netlify
Copy link

netlify bot commented Jan 5, 2022

✔️ Deploy Preview for odo-docusaurus-preview ready!

🔨 Explore the source changes: 0ebe3b5

🔍 Inspect the deploy log: https://app.netlify.com/sites/odo-docusaurus-preview/deploys/61e27e0ce9a14200073be266

😎 Browse the preview: https://deploy-preview-5315--odo-docusaurus-preview.netlify.app

@openshift-ci openshift-ci bot requested a review from mohammedzee1000 January 5, 2022 06:22
@odo-robot
Copy link

odo-robot bot commented Jan 5, 2022

OpenShift Tests on commit 5f432de finished successfully.
View logs: TXT HTML

@odo-robot
Copy link

odo-robot bot commented Jan 5, 2022

Kubernetes Tests on commit 5f432de finished successfully.
View logs: TXT HTML

@odo-robot
Copy link

odo-robot bot commented Jan 5, 2022

Unit Tests on commit 5f432de finished successfully.
View logs: TXT HTML

@@ -529,7 +529,7 @@ func (a *Adapter) createOrUpdateComponent(componentExists bool, ei envinfo.EnvSp
if err != nil {
return err
}

deployment.Spec.Replicas = pointer.Int32Ptr(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to have this field set when the Deployment is first created, by the devfile library. Could we propose a path to the library to add the Replicas to the generator.DeploymentParams structure?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By setting it at library level, it becomes applicable to Che folks as well. I'm not sure if that's what they would want. Otherwise, I think it could be patched at library level.

IMO, we could track this separately (issue on devfile repo or a discussion in the Devfile cabal). WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Through the generator.DeploymentParams structure, it could be an optional parameter, to not impact other library's users.


When("the Deployment's Replica count (pods) is scaled to 0", func() {
BeforeEach(func() {
commonVar.CliRunner.ScalePodToZero(cmpName, "app")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you will need to wait for the pod to be terminated before to continue, or the scale command won't have the time to have any effect

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -127,7 +126,7 @@ func (a Adapter) Push(parameters common.PushParameters) (err error) {
var podName string

// If the component already exists, retrieve the pod's name before it's potentially updated
if componentExists {
if componentExists && *a.deployment.Spec.Replicas == 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very specific to this issue, and is not generic enough to cover all the cases when a pod is not present.

I made some more reading of this Push() function, and understand now that this call to getPod() is necessary to get the name of the previous pod (the one we synced files to previously), and to compare it later (https://github.com/redhat-developer/odo/pull/5315/files#diff-674dc676ed2f8aa2588ede2db4112fe412bfa0365f7b0f7303d5ae407582d2aaL271) with the pod we will sync the files to.

If the pod is restarted, all the files need to be synced; in the other case, only changed files need to be synced.

Instead of making the specific test on Deployment Replicas, you could react differently if getPod in line 131 fails. Instead of returning an error, you could continue and make sure that podChangedwill be set to true (as it is sure we will need to sync all the files again), for example by keeping podName to the empty string.

Also, could you please change the variable name componentExists to deploymentExists, to make the flow easier to understand?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some more reading of this Push() function, and understand now that this call to getPod() is necessary to get the name of the previous pod (the one we synced files to previously), and to compare it later (https://github.com/redhat-developer/odo/pull/5315/files#diff-674dc676ed2f8aa2588ede2db4112fe412bfa0365f7b0f7303d5ae407582d2aaL271) with the pod we will sync the files to.

Yes, that's what I was saying while we were discussing this yesterday. But I was struggling to explain. :)

Instead of making the specific test on Deployment Replicas, you could react differently if getPod in line 131 fails. Instead of returning an error, you could continue and make sure that podChangedwill be set to true (as it is sure we will need to sync all the files again), for example by keeping podName to the empty string.

But for that to happen, the user will have to wait for whatever the value of PushTimeout is in odo preferences. By default, it is 4 minutes. Also, getPod() function makes a call to WaitAndGetPodWithEvents in pkg/kclient/pods.go which mandates the desired phase of the pod to be Running:

odo/pkg/kclient/pods.go

Lines 25 to 27 in 724f16e

// WaitAndGetPod block and waits until pod matching selector is in in Running state
// desiredPhase cannot be PodFailed or PodUnknown
func (c *Client) WaitAndGetPodWithEvents(selector string, desiredPhase corev1.PodPhase, waitMessage string, pushTimeout time.Duration) (*corev1.Pod, error) {

Now, I know that we can make change to the code to not mandate it. But I'm honestly having a hard time and would appreciate some help. Using the debugger, I can't seem to be able to get inside below go routine:

odo/pkg/kclient/pods.go

Lines 46 to 95 in 724f16e

// Here we are going to start a loop watching for the pod status
podChannel := make(chan *corev1.Pod)
watchErrorChannel := make(chan error)
failedEvents := make(map[string]corev1.Event)
go func() {
loop:
for {
val, ok := <-w.ResultChan()
if !ok {
watchErrorChannel <- errors.New("watch channel was closed")
break loop
}
if e, ok := val.Object.(*corev1.Pod); ok {
klog.V(3).Infof("Status of %s pod is %s", e.Name, e.Status.Phase)
for _, cond := range e.Status.Conditions {
// using this just for debugging message, so ignoring error on purpose
jsonCond, _ := json.Marshal(cond)
klog.V(3).Infof("Pod Conditions: %s", string(jsonCond))
}
for _, status := range e.Status.ContainerStatuses {
// using this just for debugging message, so ignoring error on purpose
jsonStatus, _ := json.Marshal(status)
klog.V(3).Infof("Container Status: %s", string(jsonStatus))
}
switch e.Status.Phase {
case desiredPhase:
klog.V(3).Infof("Pod %s is %v", e.Name, desiredPhase)
podChannel <- e
break loop
case corev1.PodFailed, corev1.PodUnknown:
watchErrorChannel <- errors.Errorf("pod %s status %s", e.Name, e.Status.Phase)
break loop
default:
// we start in a phase different from the desired one, let's wait
if spinner == nil {
spinner = log.Spinner(waitMessage)
// Collect all the events in a separate go routine
quit := make(chan int)
go c.CollectEvents(selector, failedEvents, spinner, quit)
defer close(quit)
}
}
} else {
watchErrorChannel <- errors.New("unable to convert event object to Pod")
break loop
}
}
close(podChannel)
close(watchErrorChannel)
}()

Even though the default value for "Replicas" is 1, we explicitly set it
so that the field for replicas is tracked by SSA and odo is made its
manager.
`podName` was being fetched long before its use. This commit fetches it
just before it is needed.
@openshift-ci
Copy link

openshift-ci bot commented Jan 15, 2022

@dharmit: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/psi-unit-test-windows 0ebe3b5 link false /test psi-unit-test-windows
ci/prow/psi-unit-test-mac 0ebe3b5 link false /test psi-unit-test-mac
ci/prow/psi-kubernetes-integration-e2e 0ebe3b5 link false /test psi-kubernetes-integration-e2e

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@feloy
Copy link
Contributor

feloy commented Jan 17, 2022

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. Required by Prow. label Jan 17, 2022
@openshift-ci
Copy link

openshift-ci bot commented Jan 17, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: feloy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. Required by Prow. label Jan 17, 2022
@openshift-merge-robot openshift-merge-robot merged commit 43444fd into redhat-developer:main Jan 17, 2022
@dharmit
Copy link
Member Author

dharmit commented Jan 18, 2022

/cherrypick v2

1 similar comment
@dharmit
Copy link
Member Author

dharmit commented Jan 19, 2022

/cherrypick v2

@openshift-cherrypick-robot

@dharmit: new pull request created: #5360

In response to this:

/cherrypick v2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dharmit dharmit mentioned this pull request Jan 20, 2022
3 tasks
rnapoles-rh pushed a commit to rnapoles-rh/odo that referenced this pull request Jan 24, 2022
* Use SSA for replicas

Even though the default value for "Replicas" is 1, we explicitly set it
so that the field for replicas is tracked by SSA and odo is made its
manager.

* Fetches pod name just before it is needed

`podName` was being fetched long before its use. This commit fetches it
just before it is needed.

* Fix shadow issue

* Tests should wait for pod termination

* Get pod name only if it exists

* Check if only one pod exists for the component

* Fix shadow error

* Expect error in oc get pod output for oc
@dharmit dharmit deleted the fix-5041 branch February 4, 2022 07:21
cdrage pushed a commit to cdrage/odo that referenced this pull request Aug 31, 2022
* Use SSA for replicas

Even though the default value for "Replicas" is 1, we explicitly set it
so that the field for replicas is tracked by SSA and odo is made its
manager.

* Fetches pod name just before it is needed

`podName` was being fetched long before its use. This commit fetches it
just before it is needed.

* Fix shadow issue

* Tests should wait for pod termination

* Get pod name only if it exists

* Check if only one pod exists for the component

* Fix shadow error

* Expect error in oc get pod output for oc
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. Required by Prow. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. Required by Prow.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

odo push fails with idled components
4 participants