-
Notifications
You must be signed in to change notification settings - Fork 347
Conversation
mikebrow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question on if returning early is the right thing when deleting?
| // Delete containerd container. | ||
| if err := container.Container.Delete(ctx, containerd.WithSnapshotCleanup); err != nil { | ||
| if !errdefs.IsNotFound(err) { | ||
| return nil, fmt.Errorf("failed to delete containerd container %q: %v", id, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if we just pop out warnings and continue instead of returning early then return the error after we attempt to delete all the parts, best can do model?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we fail to delete container, we want to keep its status and root directory so that we could still retrieve information about this container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok per discussion, let's add a todo / issue regarding forcibly removing / killing on the remove request. Or getting guidance from the CRI sig-node team on remove expectations.
|
|
||
| // Delete container checkpoint. | ||
| if err := container.Delete(); err != nil { | ||
| return nil, fmt.Errorf("failed to delete container checkpoint for %q: %v", id, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containerd.Delete is atomic, we should only use it to delete container status.
If we fail to do that, we may not want to remove container root directory, because that removal is not atomic.
| glog.V(5).Infof("Remove called for containerd container %q that does not exist", id, err) | ||
| containerRootDir := getContainerRootDir(c.config.RootDir, id) | ||
| if err := system.EnsureRemoveAll(containerRootDir); err != nil { | ||
| return nil, fmt.Errorf("failed to remove container root directory %q: %v", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
|
In short, we do need to enforce order of those deletion. :) |
Signed-off-by: Lantao Liu <[email protected]>
6795875 to
4eaaee3
Compare
|
@mikebrow Added TODO for forcibly stop. |
mikebrow
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/LGTM
In a test, I saw that we could never remove a container any more once containerd container removal returns an error.
The reason is that
removingstate could not be reset, because the on-disk container status has been removed, thus the reset function fails.This PR:
UpdateSyncandUpdate. Actually in most cases, we just need the code to run in aUpdatetransaction to avoid race, and don't actually update the on-disk status. We could useUpdatefor this case in the future. This also fixed the issue I mentioned above. With this PR, we'll only useUpdateto set/reset removing state, which doesn't include any disk operation, and won't be affected by container status/root directory removal.Signed-off-by: Lantao Liu [email protected]