Stale dnsmasq.pid can prevent IP allocation due to PID clash #1741

@jocado

Describe the bug
If a stale dnsmasq.pid exists from a previous execution and another running process now has that PID, instances fail to boot properly with the error:

launch failed: The following errors occurred:                                   
failed to determine IP address

On systems that don't reboot often, the way PIDs are allocated makes this unlikely, but if you reboot daily (for instance), it becomes much more likely. I have seen it happen, so it is at least possible :)

To Reproduce
Reboot a system
If you are unlucky and one of the running PIDs matches the contents of dnsmasq.pid, then dnsmasq will not start and instances fail to launch.

To reliably reproduce:
Stop multipassd
echo {PID_OF_A_RUNNING_PROCESS} > /var/snap/multipass/common/data/multipassd/network/dnsmasq.pid
Start multipassd
Try and launch an instance

Expected behavior
dnsmasq.pid should either be purged on multipassd stop, or additional logic should be added to check for a process signature that matches dnsmasq before assuming dnsmasq is running.
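
As a rough illustration of the second option, here is a minimal Python sketch of a signature check. It is not how multipassd is implemented; the function names (`pid_is_alive`, `dnsmasq_is_running`, `purge_if_stale`) are hypothetical, and it assumes a Linux host where `/proc/<pid>/comm` holds the process name. The first function shows why a liveness-only check would be fooled by a PID clash; the others only treat the PID file as valid when the process behind it is actually named dnsmasq.

```python
import os
from pathlib import Path
from typing import Optional


def pid_is_alive(pid: int) -> bool:
    """Liveness-only check: true for ANY process with this PID (the clash case)."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def read_pid(pid_file: Path) -> Optional[int]:
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        return int(pid_file.read_text().strip())
    except (OSError, ValueError):
        return None


def process_name(pid: int, proc_root: Path = Path("/proc")) -> Optional[str]:
    """Return the command name from <proc_root>/<pid>/comm, or None if unavailable."""
    try:
        return (proc_root / str(pid) / "comm").read_text().strip()
    except OSError:
        return None


def dnsmasq_is_running(pid_file: Path, proc_root: Path = Path("/proc")) -> bool:
    """True only if the PID file points at a process whose name is dnsmasq."""
    pid = read_pid(pid_file)
    if pid is None:
        return False
    return process_name(pid, proc_root) == "dnsmasq"


def purge_if_stale(pid_file: Path, proc_root: Path = Path("/proc")) -> bool:
    """Delete the PID file if it does not refer to a dnsmasq process."""
    if pid_file.exists() and not dnsmasq_is_running(pid_file, proc_root):
        pid_file.unlink()
        return True
    return False
```

With a check like this, the reproduction case above (a PID file containing the PID of some unrelated running process) would be treated as stale rather than as "existing dnsmasq found". The `proc_root` parameter exists only so the logic can be exercised against a fake directory tree.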

Instances should still be able to launch. A non-technical user will find it hard to debug this issue, and it could be the cause of some other tickets I've seen:
#1653
#1584

Logs
I also see this in the service logs:

Sep 18 11:40:02 hostname multipassd[27492]: Looking for dnsmasq
Sep 18 11:40:02 hostname multipassd[27492]: Read pid "27405" from file "/var/snap/multipass/common/data/multipassd/network/dnsmasq.pid"
Sep 18 11:40:02 hostname multipassd[27492]: existing dnsmasq found with pid 27405

Additional info
Ubuntu 18.04.5

multipass 1.4.0
multipassd 1.4.0

Labels: bug, medium
