Fix lock file regression by appending UID to the lock directory #22623

Merged
medyagh merged 10 commits into kubernetes:master from medyagh:lock_regression on Feb 17, 2026

Conversation

@medyagh

@medyagh medyagh commented Feb 3, 2026

Member

Fixes a regression introduced in v1.38.0 where the shared lock directory /tmp/minikube-locks was created with 0755 permissions. With that mode only the user who created the directory can write to it, so other users (or CI jobs running as different users) on the same machine could not acquire locks, resulting in HOST_HOME_PERMISSION errors.
Changes:

  • Append the current user's UID to the lock directory name (e.g., /tmp/minikube-locks-1000).
  • This ensures each user has a dedicated, writable directory for their lock files.
  • Added TestLockDirectoryStructure to verify the directory naming convention.
    Fixes HOST_HOME_PERMISSION "unable to acquire lock" in 1.38.0 #22619

fix #22619
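
A minimal sketch of the per-user naming described above, for readers who want to try the idea in isolation. The helper names `lockDirName` and `ensureLockDir` are hypothetical and are not taken from pkg/util/lock/lock.go; the sketch only assumes the `/tmp/minikube-locks-<UID>` convention stated in the description.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// lockDirName returns the per-user lock directory path under base.
// Hypothetical helper: the PR's real code lives in pkg/util/lock/lock.go.
func lockDirName(base string, uid int) string {
	return filepath.Join(base, fmt.Sprintf("minikube-locks-%d", uid))
}

// ensureLockDir creates the per-user directory with owner-only permissions
// if it does not exist yet, and returns its path.
func ensureLockDir(base string, uid int) (string, error) {
	dir := lockDirName(base, uid)
	if err := os.MkdirAll(dir, 0o700); err != nil {
		return "", fmt.Errorf("creating lock dir %s: %w", dir, err)
	}
	return dir, nil
}

func main() {
	// os.Getuid returns -1 on Windows, so a UID suffix is only meaningful on Unix.
	dir, err := ensureLockDir(os.TempDir(), os.Getuid())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("lock directory:", dir)
}
```

Because every user gets a directory only they write to, the mode of any other user's directory no longer matters.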

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 3, 2026
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 3, 2026
@k8s-ci-robot

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2026
@medyagh

medyagh commented Feb 3, 2026

Member Author

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Feb 3, 2026
@medyagh

medyagh commented Feb 3, 2026

Member Author

Here is a link to the binary built from this PR, which I think might fix the issue. Do you mind trying it out?

https://storage.googleapis.com/minikube-builds/22623/minikube-linux-amd64

@medyagh medyagh changed the title from "Lock regression" to "fix Lock regression" Feb 3, 2026
@medyagh medyagh requested a review from Copilot February 3, 2026 22:32

Copilot AI left a comment

Contributor

Pull request overview

This pull request fixes a regression introduced in v1.38.0 where multiple users on the same machine couldn't acquire locks due to permission issues with the shared /tmp/minikube-locks directory.

Changes:

  • Modified the lock directory naming to include the user's UID (/tmp/minikube-locks-<UID>) for per-user isolation
  • Added a test to verify the lock directory structure follows the new naming convention

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| pkg/util/lock/lock.go | Updated Acquire() function to append user UID to lock directory name, ensuring each user has their own lock directory |
| pkg/util/lock/lock_test.go | Added TestLockDirectoryStructure() to verify the directory naming convention with UID suffix |


Comment thread pkg/util/lock/lock.go Outdated
Comment thread pkg/util/lock/lock_test.go Outdated
@k8s-ci-robot k8s-ci-robot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 3, 2026
@medyagh medyagh requested a review from Copilot February 3, 2026 23:04
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 3, 2026

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.



Comment thread pkg/util/lock/lock_test.go
Comment thread pkg/util/lock/lock.go Outdated
Comment thread pkg/util/lock/lock.go
@nirs nirs left a comment

Collaborator

This works, but it would be better to keep the locks in the recommended location for each platform:

| Operating System | Primary Location (User-level) | Specification / Documentation |
|---|---|---|
| Linux | $XDG_RUNTIME_DIR (e.g., /run/user/1000/) | XDG Base Directory Spec |
| macOS | ~/Library/Application Support/<app>/ | Apple File System Guide |
| Windows | %LOCALAPPDATA%\<app>\ | Windows App Data Storage |

@medyagh

medyagh commented Feb 4, 2026

Member Author

> This works, but it would be better to keep the locks in the recommended location for each platform:
>
> | Operating System | Primary Location (User-level) | Specification / Documentation |
> |---|---|---|
> | Linux | $XDG_RUNTIME_DIR (e.g., /run/user/1000/) | XDG Base Directory Spec |
> | macOS | ~/Library/Application Support/<app>/ | Apple File System Guide |
> | Windows | %LOCALAPPDATA%\<app>\ | Windows App Data Storage |

That sounds like a good idea! However, I think that for a 1.38.1 patch we should stick to something low risk and closer to what we already have.

Summary: While the suggestion follows OS standards (which is generally good), switching to these paths for lock files introduces significant risks for Minikube's specific use cases (CI, containers, enterprise setups).

Analysis:

Reliability (the main risk):

  • CI & Containers: XDG_RUNTIME_DIR (Linux) is often not set in Docker containers or minimal CI environments (root/headless). We would need a fallback to /tmp anyway.
  • Networked Home Dirs (NFS): On some enterprise Linux/macOS setups, the home directory (~/...) is mounted via NFS. File locking (flock) on NFS is historically unreliable and can lead to deadlocks or ignored locks. /tmp is almost always a local disk or RAM disk (tmpfs), making it much safer for locking coordination.

Complexity:

  • We would need to implement robust path detection for all 3 OSes, plus the fallback logic for when those paths (like XDG_RUNTIME_DIR) are missing.
  • The current solution (UID/Hash in /tmp) effectively implements a "custom runtime dir" that is guaranteed to be on a local, writable filesystem.

Recommendation: Stick with the current approach (/tmp with unique names).

  • It solves the original "permission denied" and collision issues.
  • It works reliably in CI, containers, and weird machine setups where standard paths might be missing or network-mounted.
  • It avoids the "flock on NFS" trap.
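
To make the "we would need a fallback anyway" point concrete, here is a rough sketch of what the Linux side of that detection could look like. `pickLockBase` is a hypothetical name used only for illustration; nothing like it is part of this PR, and macOS and Windows would each need their own branch.

```go
package main

import (
	"fmt"
	"os"
)

// pickLockBase returns xdgRuntimeDir when it is set and refers to an existing
// directory, and falls back to tmpDir otherwise. Both values are passed in so
// the decision is easy to test; a caller would pass
// os.Getenv("XDG_RUNTIME_DIR") and os.TempDir().
func pickLockBase(xdgRuntimeDir, tmpDir string) string {
	if xdgRuntimeDir != "" {
		if fi, err := os.Stat(xdgRuntimeDir); err == nil && fi.IsDir() {
			return xdgRuntimeDir
		}
	}
	// Unset or unusable (common in containers and minimal CI images).
	return tmpDir
}

func main() {
	base := pickLockBase(os.Getenv("XDG_RUNTIME_DIR"), os.TempDir())
	fmt.Println("lock base:", base)
}
```

Even this small version has two code paths that need testing, which is the complexity argument made above for keeping the patch release on /tmp.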

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.



Comment thread pkg/util/lock/lock.go
// Instead, we use the SHA1 hash of the user's home directory as a unique identifier.
// This ensures per-user isolation in shared temporary directories.
hash := sha1.Sum([]byte(homeDir))
return fmt.Sprintf("minikube-locks-windows-%x", hash)

Copilot AI Feb 4, 2026

The full SHA1 hash (40 hex characters) creates an excessively long directory name that could approach or exceed path length limits on some systems. Consider truncating the hash to a shorter length (e.g., first 16 characters) while still maintaining reasonable collision resistance for typical multi-user scenarios on a single machine. This would make the directory name more manageable while preserving the per-user isolation benefit.

Suggested change
return fmt.Sprintf("minikube-locks-windows-%x", hash)
hashStr := fmt.Sprintf("%x", hash)
if len(hashStr) > 16 {
hashStr = hashStr[:16]
}
return fmt.Sprintf("minikube-locks-windows-%s", hashStr)

Copilot uses AI. Check for mistakes.
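
For reference, a self-contained version of the truncated-hash idea from the suggestion above. The function name `windowsLockDirName` is hypothetical; the prefix is copied from the snippet under review, and the 16-character cut is the reviewer's proposal rather than a confirmed part of the merged change.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// windowsLockDirName derives a per-user directory name from the home
// directory path. SHA-1 yields 20 bytes (40 hex characters); keeping the
// first 16 hex characters leaves 64 bits of the digest.
func windowsLockDirName(homeDir string) string {
	sum := sha1.Sum([]byte(homeDir))
	hashStr := hex.EncodeToString(sum[:])
	return "minikube-locks-windows-" + hashStr[:16]
}

func main() {
	// The home directory below is an example value, not one from the PR.
	fmt.Println(windowsLockDirName(`C:\Users\alice`))
}
```

The hash is used here only to produce a stable, filesystem-safe name, so truncating it trades a very small collision risk for a shorter path.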
if err := os.WriteFile(testFile, []byte("test"), 0600); err != nil {
t.Errorf("Expected to be able to write to %s: %v", lockDir, err)
}
os.Remove(testFile)

Copilot AI Feb 4, 2026

If WriteFile succeeds but Remove fails, the error from Remove is silently ignored. Consider using defer to ensure cleanup happens, or at minimum log the error if cleanup fails. This prevents test file leakage in the temporary directory.

Suggested change
os.Remove(testFile)
if err := os.Remove(testFile); err != nil {
t.Logf("failed to remove test file %s: %v", testFile, err)
}

Copilot uses AI. Check for mistakes.
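
The same write-then-clean-up check can be sketched outside a test as a small function that reports both failure modes. `checkWritable` and the probe file name are hypothetical and only illustrate the point about not dropping the error from os.Remove.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// checkWritable writes a probe file into dir and removes it again.
// A failed removal is returned to the caller instead of being ignored.
func checkWritable(dir string) error {
	probe := filepath.Join(dir, "write-probe")
	if err := os.WriteFile(probe, []byte("test"), 0o600); err != nil {
		return fmt.Errorf("writing %s: %w", probe, err)
	}
	if err := os.Remove(probe); err != nil {
		return fmt.Errorf("removing %s: %w", probe, err)
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "probe-demo-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	fmt.Println("writable check error:", checkWritable(dir))
}
```

Inside a Go test, registering the removal with t.Cleanup is another common way to get the same guarantee.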
@minikube-pr-bot


kvm2 driver with docker runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 22623 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 37.7s    │ 37.1s                  │
│ enable ingress │ 16.3s    │ 16.0s                  │
└────────────────┴──────────┴────────────────────────┘
Details

Times for minikube ingress: 16.3s 16.8s 15.7s 16.3s 16.3s
Times for minikube (PR 22623) ingress: 15.8s 15.8s 16.3s 16.3s 15.8s

Times for minikube start: 36.7s 36.6s 38.0s 36.8s 40.4s
Times for minikube (PR 22623) start: 35.9s 36.7s 39.7s 36.3s 36.7s

docker driver with docker runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 22623 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 20.0s    │ 19.7s                  │
│ enable ingress │ 10.8s    │ 11.4s                  │
└────────────────┴──────────┴────────────────────────┘
Details

Times for minikube start: 20.9s 21.7s 21.0s 17.9s 18.4s
Times for minikube (PR 22623) start: 18.5s 21.7s 21.3s 18.2s 18.6s

Times for minikube ingress: 10.6s 11.6s 10.6s 10.6s 10.6s
Times for minikube (PR 22623) ingress: 11.6s 10.7s 13.6s 10.6s 10.6s

docker driver with containerd runtime

┌────────────────┬──────────┬────────────────────────┐
│    COMMAND     │ MINIKUBE │ MINIKUBE  ( PR 22623 ) │
├────────────────┼──────────┼────────────────────────┤
│ minikube start │ 17.8s    │ 17.5s                  │
│ enable ingress │ 24.5s    │ 23.9s                  │
└────────────────┴──────────┴────────────────────────┘
Details

Times for minikube start: 19.9s 15.9s 16.8s 17.0s 19.6s
Times for minikube (PR 22623) start: 15.8s 19.3s 16.9s 16.1s 19.2s

Times for minikube ingress: 25.1s 23.1s 25.1s 25.1s 24.1s
Times for minikube (PR 22623) ingress: 23.1s 23.1s 24.1s 25.1s 24.1s

@k8s-ci-robot

Contributor

@medyagh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-minikube-docker-containerd-linux-x86 | b4a3bd4 | link | true | /test pull-minikube-docker-containerd-linux-x86 |
| pull-minikube-docker-crio-linux-x86 | b4a3bd4 | link | false | /test pull-minikube-docker-crio-linux-x86 |
| pull-minikube-docker-containerd-linux-arm | b4a3bd4 | link | false | /test pull-minikube-docker-containerd-linux-arm |

@wt

wt commented Feb 5, 2026


This does not seem to comply with the recommendations for generating temp files/dirs here. Both the old name and the suggested one are unfortunately guessable. Would it be better to have a directory that is created with this Go API? It doesn't even need to have the username in it; something like '/tmp/minikube-${random_val}' would do.

Is there any reason that the directory needs to be the same for each run of the tool? Would it be reasonable to have minikube {stop|delete} clean them up?

@wt

wt commented Feb 5, 2026


And I don't know what the multiple lock files in that directory are for, but it would be possible to name them something more understandable if the directory itself was unique.

@wt

wt commented Feb 5, 2026


Also, on a Fedora machine the change seems to do what is described. I created a /tmp/minikube-locks dir and set the perms to 600. I then started a minikube cluster and saw that the /tmp/minikube-locks-$my_uid directory was created and populated with lock files. Start and stop seem to work fine.

@nirs

nirs commented Feb 5, 2026

Collaborator

> This does not seem to comply with the recommendations for generating temp files/dirs here. Both the old name and the suggested one are unfortunately guessable. Would it be better to have a directory that is created with this Go API? It doesn't even need to have the username in it; something like '/tmp/minikube-${random_val}' would do.
>
> Is there any reason that the directory needs to be the same for each run of the tool? Would it be reasonable to have minikube {stop|delete} clean them up?

The lock files in the directory are accessed by multiple minikube instances, so we cannot use random names. This is not a directory for temporary data but a directory for synchronizing multiple processes.

Creating the directory is safe since creating a directory is an atomic operation. If the directory already exists, the call will fail. If two minikube instances try to create the directory at the same time, one will succeed and the other will fail with ErrExist.

The directory must be created with permissions 0o700 so only the user can access it.

When creating a lock file we can use O_CREATE|O_NOFOLLOW to protect from symlink attacks.

See #22624 for discussion on location of the runtime data.
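
The points in this comment can be sketched roughly as follows, assuming a Unix system (syscall.O_NOFOLLOW is not available on Windows). `ensurePrivateDir` and `openLockFile` are hypothetical names, not minikube functions, and the sketch does not verify who owns a pre-existing directory.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// ensurePrivateDir creates dir with mode 0700. os.Mkdir (unlike os.MkdirAll)
// fails with an error matching os.ErrExist when the path already exists, so
// exactly one of several racing processes reports created == true.
func ensurePrivateDir(dir string) (created bool, err error) {
	err = os.Mkdir(dir, 0o700)
	if err == nil {
		return true, nil
	}
	if !errors.Is(err, os.ErrExist) {
		return false, err
	}
	// The path already exists: accept it only if it is a real directory.
	// Lstat does not follow symlinks, so a symlink is rejected here.
	fi, err := os.Lstat(dir)
	if err != nil {
		return false, err
	}
	if !fi.IsDir() {
		return false, fmt.Errorf("%s exists but is not a directory", dir)
	}
	return false, nil
}

// openLockFile opens or creates a lock file and refuses to follow a symlink
// in the final path component.
func openLockFile(dir, name string) (*os.File, error) {
	flags := os.O_CREATE | os.O_RDWR | syscall.O_NOFOLLOW
	return os.OpenFile(filepath.Join(dir, name), flags, 0o600)
}

func main() {
	base, err := os.MkdirTemp("", "lockdemo-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(base)

	dir := filepath.Join(base, "minikube-locks-demo")
	for i := 0; i < 2; i++ {
		created, err := ensurePrivateDir(dir)
		fmt.Println("created:", created, "err:", err)
	}
	f, err := openLockFile(dir, "demo.lock")
	if err != nil {
		panic(err)
	}
	f.Close()
}
```

In the demo, the first call creates the directory and the second finds it already present, which mirrors two minikube instances starting at the same time.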

@wt

wt commented Feb 5, 2026


>> This does not seem to comply with the recommendations for generating temp files/dirs here. Both the old name and the suggested one are unfortunately guessable. Would it be better to have a directory that is created with this Go API? It doesn't even need to have the username in it; something like '/tmp/minikube-${random_val}' would do.
>> Is there any reason that the directory needs to be the same for each run of the tool? Would it be reasonable to have minikube {stop|delete} clean them up?
>
> The lock files in the directory are accessed by multiple minikube instances, so we cannot use random names. This is not a directory for temporary data but a directory for synchronizing multiple processes.

I assume that the lock files directory needs to be used for multiple invocations of the minikube binary. If that is true, couldn't minikube store that in the minikube profile config? Is there some problem with having a different dir for each profile?

> Creating the directory is safe since creating a directory is an atomic operation. If the directory already exists, the call will fail. If two minikube instances try to create the directory at the same time, one will succeed and the other will fail with ErrExist.

The problem is that another user could create the directory and prevent my user from running minikube successfully. There are other failure modes that I can think of.

> The directory must be created with permissions 0o700 so only the user can access it.

For the test, I created the old name (/tmp/minikube-locks) with 0o600 to make sure that minikube wasn't writing to it with the change. I didn't create the new directory manually. The directory was created with 0o700, as expected.

> When creating a lock file we can use O_CREATE|O_NOFOLLOW to protect from symlink attacks.

I am less worried about symlink attacks. I am more worried about just making sure that best practices around temp dir creation/usage are followed. Also, using os.MkdirTemp will make sure that the proper temp directory is used on each platform, following its conventions: for example, the TMPDIR env var on Linux and the equivalent env vars on Windows.

> See #22624 for discussion on location of the runtime data.

I can also post a summary of this there as well, if that is useful.
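
A small sketch of the trade-off being discussed in this exchange, using Go's os.MkdirTemp. The helper name `newRandomDir` and the "minikube-" prefix are illustrative only. Every call yields a different directory, so independent processes would need some other place (such as the profile config mentioned above) to agree on the path.

```go
package main

import (
	"fmt"
	"os"
)

// newRandomDir creates a uniquely named directory under os.TempDir(), which
// honors $TMPDIR on Unix and %TMP% / %TEMP% on Windows. The random suffix is
// chosen by os.MkdirTemp.
func newRandomDir() (string, error) {
	return os.MkdirTemp("", "minikube-")
}

func main() {
	a, err := newRandomDir()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(a)

	b, err := newRandomDir()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(b)

	// Two calls, even from the same process, never share a directory.
	fmt.Println("first: ", a)
	fmt.Println("second:", b)
	fmt.Println("same directory:", a == b)
}
```

A fixed, predictable name is what lets separate minikube processes find the same lock files, at the cost of being guessable.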

@wt

wt commented Feb 5, 2026


I read #22624. That's a really good idea. I made a small comment, but it really sounds like the right direction.

@medyagh

medyagh commented Feb 17, 2026

Member Author

> I read #22624. That's a really good idea. I made a small comment, but it really sounds like the right direction.
>
> I can also post a summary of this there as well, if that is useful

Thanks @wt :) Yes, a summary would be nice. Please feel free to add it to the issue I created so we can tackle it in the next minor version (not a patch version).

@medyagh medyagh changed the title from "fix Lock regression" to "Fix lock file regression by appending UID to the lock directory" Feb 17, 2026
@medyagh medyagh merged commit cf962d5 into kubernetes:master Feb 17, 2026
43 of 59 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

HOST_HOME_PERMISSION "unable to acquire lock" in 1.38.0

6 participants