Restore fails after a while with connection error

### Report

We are testing backup and restore. When we ran a restore, it was initially able to connect to MongoDB and download files from S3. However, after some time it failed with this error:

```
Fatal assertion / 2025-06-30T20:39:40.150+00:00, connect err: ping: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: localhost:27872, Type: Unknown, Last error: dial tcp [::1]:27872: connect: connection refused }, ] }
```


### More about the problem

These are the final logs:
```
2025-06-30T20:44:39.000+0000 D [restore/2025-06-30T20:36:00.578371215Z] remove /data/db/index-449-11338991585176808542.wt
2025-06-30T20:44:39.000+0000 D [restore/2025-06-30T20:36:00.578371215Z] remove /data/db/index-477-11338991585176808542.wt
2025-06-30T20:44:39.000+0000 D [restore/2025-06-30T20:36:00.578371215Z] remove /data/db/collection-454-11338991585176808542.wt
2025-06-30T20:44:39.000+0000 D [restore/2025-06-30T20:36:00.578371215Z] remove /data/db/index-496-11338991585176808542.wt
2025-06-30T20:44:39.000+0000 E [restore/2025-06-30T20:36:00.578371215Z] restore: prepare data: connect to mongo: mongo failed with [F] Fatal assertion / 2025-06-30T20:39:40.150+00:00, connect err: ping: server selection error: server selection timeout, current topology: { Type: Single, Servers: [{ Addr: localhost:27872, Type: Unknown, Last error: dial tcp [::1]:27872: connect: connection refused }, ] }
2025-06-30T20:44:39.000+0000 I change stream was closed
2025-06-30T20:44:39.000+0000 D [restore/2025-06-30T20:36:00.578371215Z] hearbeats stopped
2025-06-30T20:44:39.000+0000 D [restore/2025-06-30T20:36:00.578371215Z] uploading ".pbm.restore/2025-06-30T20:36:00.578371215Z/rs.rs0/log/stagingdb-rs0-0.stagingdb-rs0.stagingdb.svc.cluster.local:27017.0.log" [size hint: -1 (unknown); part size: 10485760 (10.00MB)]
2025-06-30T20:44:40.000+0000 D [agentCheckup] deleting agent status
2025-06-30T20:44:40.000+0000 E [pitr] init: get conf: get: server selection error: context canceled, current topology: { Type: ReplicaSetNoPrimary, Servers: [{ Addr: stagingdb-rs0-0.stagingdb-rs0.stagingdb.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp 100.96.145.43:27017: connect: connection refused }, { Addr: stagingdb-rs0-1.stagingdb-rs0.stagingdb.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp 100.109.101.41:27017: connect: connection refused }, { Addr: stagingdb-rs0-2.stagingdb-rs0.stagingdb.svc.cluster.local:27017, Type: Unknown, Last error: dial tcp 100.109.142.172:27017: connect: connection refused }, ] }
2025-06-30T20:44:40.000+0000 I Exit: <nil>

```
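
As a triage aid, here is a minimal Python sketch that pulls the unreachable server addresses and their last errors out of a server selection error line like the ones above. It is not part of the operator or the backup agent. The function name `unreachable_servers` and the regular expression are assumptions based only on the log format shown in this report.

```python
import re

# Matches one server entry inside the "current topology" block of a
# server selection error, in the format seen in the logs above.
SERVER_RE = re.compile(
    r"\{ Addr: (?P<addr>[^,]+), Type: (?P<type>\w+), Last error: (?P<err>.*?) \}"
)


def unreachable_servers(line: str) -> list[tuple[str, str]]:
    """Return (address, last error) pairs found in a single log line."""
    return [(m["addr"], m["err"]) for m in SERVER_RE.finditer(line)]


if __name__ == "__main__":
    sample = (
        "connect err: ping: server selection error: server selection timeout, "
        "current topology: { Type: Single, Servers: [{ Addr: localhost:27872, "
        "Type: Unknown, Last error: dial tcp [::1]:27872: connect: "
        "connection refused }, ] }"
    )
    for addr, err in unreachable_servers(sample):
        print(addr, "->", err)
```

Run against the two error lines above, this would list `localhost:27872` for the restore failure and the three `stagingdb-rs0-*` members on port 27017 for the later `[pitr]` error, which makes it easier to see that two different endpoints are being refused.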

### Steps to reproduce

1. Create a replset cluster (no sharding or mongos)
2. Put some data in it
3. Create a physical manual backup
4. Delete the cluster
5. Create the cluster empty again
6. Try to run a restore of the backup
7. Check whether the restore completes successfully (in our case it fails with the connection error shown in the report)


### Versions

1. Kubernetes v1.30.12
2. Operator v1.20.1
3. Database MongoDB

### Anything else?

I found it strange that the restore was trying to connect to mongod on port 27872 rather than the usual 27017. Clearly there is something going on behind the scenes that I don't currently understand.
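
To check which of the two ports is actually accepting connections while a restore is running, a small probe like the following could be run from inside the database pod. This is a sketch only: it assumes a Python interpreter is available there, and the helper name `port_accepts_connections` is hypothetical. The ports 27017 and 27872 are taken from the logs in this report.

```python
import socket


def port_accepts_connections(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # 27017 is the port seen in the replica set member addresses above;
    # 27872 is the port from the failing restore connection.
    for port in (27017, 27872):
        state = "open" if port_accepts_connections("localhost", port) else "refused/unreachable"
        print(f"localhost:{port} is {state}")
```

A `False` result for 27872 at the moment of failure would match the `connection refused` in the error, though it would not by itself explain why nothing is listening on that port.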

CR YAML: https://gist.github.com/dobesv/c2727a9ee382ce80638d61bd0d64ca30

