Fix update container resources #319
Conversation
Signed-off-by: Lantao Liu <[email protected]>
mikebrow left a comment:
See comments
```go
// Client holds the containerd container client.
// containerd.Container is a pointer underlying. New assignment won't affect
// the previous pointer, so simply lock around is enough.
type Client struct {
```
I like it.
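To make the pointer comment concrete, here is a minimal, self-contained sketch of the lock-around pattern, with illustrative stand-in types (not the actual cri-containerd ones): a reader that grabbed the old pointer is unaffected when the field is reassigned under the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// container stands in for containerd.Container, which is a pointer
// underneath: reassigning a field of this type never mutates a pointer
// value that another goroutine already holds.
type container struct{ id string }

// client mirrors the locking pattern from the diff above.
type client struct {
	mu sync.RWMutex
	c  *container
}

// get returns the current container pointer under a read lock.
func (cl *client) get() *container {
	cl.mu.RLock()
	defer cl.mu.RUnlock()
	return cl.c
}

// set swaps in a new container pointer under the write lock.
func (cl *client) set(c *container) {
	cl.mu.Lock()
	defer cl.mu.Unlock()
	cl.c = c
}

func main() {
	cl := &client{c: &container{id: "old"}}
	old := cl.get()
	cl.set(&container{id: "new"})
	fmt.Println(old.id, cl.get().id) // prints "old new": the old pointer is untouched
}
```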
```go
id := cntr.ID
// Do not update the container when there is a removal in progress.
if status.Removing {
	return fmt.Errorf("container %q is in removing state", id)
```
Should this be ignored, or should it return an error?
We could either ignore it or return an error here. I let it return an error because:
- The container is being removed; updating its resources at the same time only adds more race conditions.
- Docker has similar behavior: https://github.com/moby/moby/blob/master/daemon/update.go#L52
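A minimal sketch of that fail-fast choice, using hypothetical names; wrapping a sentinel error (my addition, not in the PR) would additionally let callers distinguish this case:

```go
package main

import (
	"errors"
	"fmt"
)

// errRemoving is a hypothetical sentinel for the fail-fast case above.
var errRemoving = errors.New("removal in progress")

// containerStatus is an illustrative stand-in for the cri-containerd
// status type; only the field used here is shown.
type containerStatus struct {
	Removing bool
}

// checkUpdatable refuses the update instead of silently ignoring it,
// mirroring the diff above and Docker's behavior.
func checkUpdatable(id string, s containerStatus) error {
	if s.Removing {
		return fmt.Errorf("container %q is in removing state: %w", id, errRemoving)
	}
	return nil
}

func main() {
	err := checkUpdatable("c1", containerStatus{Removing: true})
	fmt.Println(err, errors.Is(err, errRemoving)) // ... true
}
```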
```go
// Update container spec. If the container is not started yet, updating
// spec makes sure that the resource limits are correct when start;
// if the container is already started, updating spec is still required,
// the spec will become our source of truth for resource limits.
```
cool!
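To illustrate "spec as source of truth", here is a hedged sketch (not the actual cri-containerd helper) of folding new limits into the OCI runtime spec, using the real opencontainers/runtime-spec types:

```go
package main

import (
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// updateOCIResources folds new limits into the container's OCI spec so
// the spec stays the source of truth for resource limits. Only two
// fields are shown; nil inputs leave the existing values in place.
func updateOCIResources(spec *specs.Spec, cpuQuota *int64, memLimit *int64) {
	if spec.Linux == nil {
		spec.Linux = &specs.Linux{}
	}
	if spec.Linux.Resources == nil {
		spec.Linux.Resources = &specs.LinuxResources{}
	}
	r := spec.Linux.Resources
	if cpuQuota != nil {
		if r.CPU == nil {
			r.CPU = &specs.LinuxCPU{}
		}
		r.CPU.Quota = cpuQuota
	}
	if memLimit != nil {
		if r.Memory == nil {
			r.Memory = &specs.LinuxMemory{}
		}
		r.Memory.Limit = memLimit
	}
}

func main() {
	spec := &specs.Spec{}
	quota := int64(50000)
	limit := int64(256 << 20)
	updateOCIResources(spec, &quota, &limit)
	fmt.Println(*spec.Linux.Resources.CPU.Quota, *spec.Linux.Resources.Memory.Limit)
}
```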
```go
	return fmt.Errorf("failed to marshal spec %+v: %v", newSpec, err)
}
info.Spec = any
// TODO(random-liu): Add helper function in containerd to do the update.
```
good
I'll add this today.
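Until that helper lands, here is a sketch of what the update looks like against containerd's containers store; it assumes the typeurl and ContainerService APIs as they existed at the time and trims the surrounding wiring:

```go
package update

import (
	"context"
	"fmt"

	containerd "github.com/containerd/containerd"
	"github.com/containerd/typeurl"
	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// updateContainerSpec sketches the marshal-and-store step from the diff:
// serialize the OCI spec into a protobuf Any and update only the "spec"
// fieldpath of the stored container record.
func updateContainerSpec(ctx context.Context, client *containerd.Client, id string, newSpec *specs.Spec) error {
	any, err := typeurl.MarshalAny(newSpec)
	if err != nil {
		return fmt.Errorf("failed to marshal spec %+v: %v", newSpec, err)
	}
	store := client.ContainerService()
	info, err := store.Get(ctx, id)
	if err != nil {
		return fmt.Errorf("failed to get container %q: %v", id, err)
	}
	info.Spec = any
	// This is the kind of one-call helper the TODO proposes moving
	// into containerd itself.
	if _, err := store.Update(ctx, info, "spec"); err != nil {
		return fmt.Errorf("failed to update spec: %v", err)
	}
	return nil
}
```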
```go
}
defer func() {
	if retErr != nil {
		// Reset spec on error.
```
Curious: what is the reason/rule here for resetting to the spec on error? E.g., failed to load the container, so reset to the spec?
The main reason: if the container is running and we updated the spec but failed to update the real resource limits of the running task, the spec and the running task will be inconsistent unless we recover the spec.
This is best effort anyway. If cri-containerd dies before we recover the spec or before we update the task, there will still be inconsistency. We could reconcile it during restart, but I don't think that's really necessary.
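A sketch of that best-effort rollback, with hypothetical names: the defer restores the previous spec only when the overall update is returning an error.

```go
package update

import (
	"context"

	"github.com/containerd/containerd/containers"
	"github.com/sirupsen/logrus"
)

// withSpecRollback wraps an update so that, if it fails, the old spec is
// written back and the stored spec does not diverge from the running
// task. Best effort only: a crash between the two writes still leaves
// them inconsistent, as noted above.
func withSpecRollback(ctx context.Context, store containers.Store, info containers.Container, update func() error) (retErr error) {
	oldSpec := info.Spec
	defer func() {
		if retErr != nil {
			// Reset spec on error.
			info.Spec = oldSpec
			if _, err := store.Update(ctx, info, "spec"); err != nil {
				logrus.WithError(err).Errorf("failed to recover spec for container %q", info.ID)
			}
		}
	}()
	return update()
}
```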
```go
if err != nil {
	if errdefs.IsNotFound(err) {
		// Task exited already.
		return nil
```
good!
```go
		// Task exited already.
		return nil
	}
	return fmt.Errorf("failed to get task: %v", err)
```
Random thought: the term "task" may be very confusing to Kubernetes users. Let's consider using "container" or "container state" instead of "task" in the error messages we generate.
We have many error messages that include "task" now. :p We may want to address them all once we have a better name for it. :)
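Putting the task half together, a hedged sketch of the tail of the update path: look up the live task, treat "not found" as the container having already exited, and otherwise apply the new limits via containerd's WithResources option.

```go
package update

import (
	"context"
	"fmt"

	containerd "github.com/containerd/containerd"
	"github.com/containerd/containerd/errdefs"
	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// updateTaskResources applies new resource limits to the running task.
// A missing task means the container exited already, so the earlier
// spec update alone is sufficient and we return success.
func updateTaskResources(ctx context.Context, cntr containerd.Container, res *specs.LinuxResources) error {
	task, err := cntr.Task(ctx, nil)
	if err != nil {
		if errdefs.IsNotFound(err) {
			// Task exited already.
			return nil
		}
		return fmt.Errorf("failed to get task: %v", err)
	}
	if err := task.Update(ctx, containerd.WithResources(res)); err != nil {
		return fmt.Errorf("failed to update resources: %v", err)
	}
	return nil
}
```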
@mikebrow Replied to the comments. :)
mikebrow left a comment:
/LGTM. Your call whether to push this through now and do the TODO (bringing in Michael's helper) tomorrow, or just wait for it.
Cheers.
No need to wait on it.
Fixes #316.
@abhi @mikebrow @kelseyhightower