This repository was archived by the owner on Mar 9, 2022. It is now read-only.

@Random-Liu (Member) commented Sep 4, 2017

Fixes #120.

Add recovery logic during restart. I've manually tested this, and it works even better than I expected. :)

Based on #206 and #179.

@kubernetes-incubator/maintainers-cri-containerd Please review this carefully, especially pkg/server/restart.go. This is really important for production quality.

I'll send another PR to add our own integration test framework, and will add a restart test.

@abhi (Member) left a comment

Initial comments.

}

// Load network namespace.
netNS, err := sandboxstore.LoadNetNS(meta.NetNSPath)
Member

NetNSPath is not set if the sandbox is in the host network namespace. Should we check for that before loading the network namespace?

Member Author

Good catch. We should.

Member Author

Done.
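
A minimal sketch of the guard agreed on in this thread, not the code that was merged. The `sandboxMetadata` type, the `netNS` handle, and the `loadNetNS` placeholder are hypothetical stand-ins; the only thing taken from the discussion is that `NetNSPath` is empty for host-network sandboxes and should not be loaded in that case.

```go
package main

import (
	"errors"
	"fmt"
)

// sandboxMetadata is a hypothetical stand-in for the checkpointed sandbox
// metadata; only the field discussed in this thread is modeled.
type sandboxMetadata struct {
	ID        string
	NetNSPath string // empty when the sandbox uses the host network namespace
}

// netNS is a hypothetical handle to a loaded network namespace.
type netNS struct{ path string }

// loadNetNS is a placeholder for the real loader; it only rejects empty paths.
func loadNetNS(path string) (*netNS, error) {
	if path == "" {
		return nil, errors.New("empty netns path")
	}
	return &netNS{path: path}, nil
}

// recoverNetNS skips loading for host-network sandboxes, which record no path.
func recoverNetNS(meta sandboxMetadata) (*netNS, error) {
	if meta.NetNSPath == "" {
		// Host network: nothing to load, and not an error.
		return nil, nil
	}
	ns, err := loadNetNS(meta.NetNSPath)
	if err != nil {
		return nil, fmt.Errorf("failed to load netns for sandbox %q: %v", meta.ID, err)
	}
	return ns, nil
}

func main() {
	host := sandboxMetadata{ID: "host-sandbox"}
	pod := sandboxMetadata{ID: "pod-sandbox", NetNSPath: "/var/run/netns/example"}
	for _, m := range []sandboxMetadata{host, pod} {
		ns, err := recoverNetNS(m)
		fmt.Println(m.ID, ns != nil, err)
	}
}
```

Without the guard, the placeholder loader would return an error for every host-network sandbox, which is the failure mode the review comment points at.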

}
id := d.Name()
found := false
for _, c := range cntrs {
Member

Can we use a map to track the loaded containers/sandboxes from the caller? Just thinking: looping over all containers for every directory may not be the fastest method.

Member

Or maybe keep track of containerDirs in a map once loadContainer succeeds. Then we can just loop over the dirs here and check whether each one is in the containerDirs map; if not, clean it up.

Member Author

Will do.

Member Author

Done.
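
The map-based lookup suggested above can be sketched as follows. This is an illustration under assumptions, not the merged code: `orphanDirs` and its parameters are hypothetical names, and the sketch only computes which directory names have no loaded counterpart, leaving the actual removal to the caller.

```go
package main

import "fmt"

// orphanDirs returns the directory names that have no matching loaded ID.
// Building the set once makes each lookup constant time, instead of scanning
// the full list of containers for every directory.
func orphanDirs(dirNames []string, loadedIDs []string) []string {
	loaded := make(map[string]struct{}, len(loadedIDs))
	for _, id := range loadedIDs {
		loaded[id] = struct{}{}
	}
	var orphans []string
	for _, name := range dirNames {
		if _, ok := loaded[name]; !ok {
			orphans = append(orphans, name)
		}
	}
	return orphans
}

func main() {
	dirs := []string{"c1", "c2", "stale"}
	ids := []string{"c2", "c1"}
	// The caller would remove each returned directory.
	fmt.Println(orphanDirs(dirs, ids))
}
```

With n directories and m loaded containers, the nested loop in the reviewed diff does on the order of n*m comparisons, while the set-based version does roughly n+m map operations.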

}
id := d.Name()
found := false
for _, c := range cntrs {
Member

same as mentioned for sandboxes

Member Author

Done.

@Random-Liu (Member Author)

@abhinandanpb Addressed comments.

@abhi (Member) left a comment

Did a first pass now. Looks good overall. Minor comments.

}

// LoadNetNS loads existing network namespace. It returns ErrClosedNetNS
// is the network namespace has already been closed.
Member

nit: if the network namespace

Member Author

Done

// LoadNetNS loads existing network namespace. It returns ErrClosedNetNS
// is the network namespace has already been closed.
func LoadNetNS(path string) (*NetNS, error) {
if err := cnins.IsNSorErr(path); err != nil {
Member

This check is already done by GetNS(path) and would return an error if it is closed.

@Random-Liu (Member Author), Sep 20, 2017

We need to distinguish whether it's ErrClosedNetNS or not in restart.go.

Member

GetNS would return cnins.NSPathNotExistErr, so you can always check for that error if the nspath doesn't exist, unless I am missing something. This is fine. If we can avoid an if condition, even better.

return fmt.Errorf("failed to stat netns: %v", err)
}
// Follow possible /var/run -> /run symlink.
path, err := symlink.FollowSymlinkInScope(path, "/")
@abhi (Member), Sep 20, 2017

We don't create a symlink currently, right? Should we explicitly follow the symlink? Doesn't the kernel take care of resolving the implicit ones?

Member Author

Copied from cri-o. Yeah, it seems that we don't need this even if /var/run -> /run is a symlink.

return container, fmt.Errorf("failed to load task: %v", err)
}
var s containerd.Status
notFound := false
Member

var notFound bool

Member Author

Done.

if err != nil {
return fmt.Errorf("failed to list sandbox containers: %v", err)
}
for _, sandbox := range sandboxes {
Member

I think you can create a sandboxmap[string]bool here and pass it to cleanuporphansandboxdir instead of creating one there.

Member Author

It doesn't make much difference to me, because we won't use the map here. Given that, I prefer to move the code into cleanuporphansandboxdir, which is where the map will be used. :)

Member

I meant: instead of passing the sandbox list and container list, pass a map for comparison. Not a blocker, though.

if err != nil {
return fmt.Errorf("failed to list containers: %v", err)
}
for _, container := range containers {
Member

same as sandbox

@abhi (Member) left a comment

LGTM otherwise

@Random-Liu (Member Author) commented Sep 21, 2017

> GetNS would return cnins.NSPathNotExistErr so you can always check for that error if the nspath doesnt exist. Unless I am missing something. This is fine. If we can avoid an if condition even better.

@abhinandanpb I read their code. You are right. We should use cnins.NSPathNotExistErr instead. We may not want to ignore the error if it's some other transient stat error.

Good catch!

@Random-Liu (Member Author)

@abhinandanpb Addressed comments. I still keep ErrClosedNetNS, so that callers of LoadNetNS don't need to make assumptions about the underlying implementation. However, I did change LoadNetNS to look at cnins.NSPathNotExistErr.

@abhi added the lgtm label Sep 21, 2017

@Random-Liu (Member Author)

@abhinandanpb Thanks for reviewing! Will merge after test passes.

@Random-Liu merged commit 9015b6e into containerd:master Sep 21, 2017
@Random-Liu deleted the checkpoint-recovery branch September 21, 2017 18:32

@mikebrow (Member)

/LGTM
