Add filesystems docs #710

Open: wants to merge 1 commit into base: main

77 changes: 77 additions & 0 deletions docs/filesystems.md
@@ -0,0 +1,77 @@
# Overview

The Slurm appliance supports mounting shared filesystems using [CephFS](https://docs.ceph.com/en/latest/cephfs/) via [OpenStack Manila](https://wiki.openstack.org/wiki/Manila). These docs explain:
Member

I usually link to docs rather than the Wiki which tends to be unmaintained.

Upstream docs: https://docs.openstack.org/manila/latest/


- How to create the shares in OpenStack Manilla
Member

Suggested change
- How to create the shares in OpenStack Manilla
- How to create the shares in OpenStack Manila


- How to configure the Slurm Appliance to mount these Manila shares.

- How to disable use Manila shares for a shared home directory.
Member

"use of" or "using"?


## Creating shares in OpenStack

The Slurm appliance requires that the Manila shares already exist on the system. Follow the instructions below to do this.

If this is the first time Manila is being used on the system, a CephFS share type will need to be created. You will need admin credentials to do this.

```bash
openstack share type create cephfs-type false --extra-specs storage_protocol=CEPHFS, vendor_name=Ceph
Member

I guess you need quotes around the extra-specs argument, or to remove the space after the comma?

Contributor Author

I actually found it didn't work the same when wrapping these in quotes: it would include the second option as part of the first extra spec.
This does annoyingly mean it only worked when --extra-specs was the last argument.

openstack share type create cephfs-type-2 false --extra-specs 'storage_protocol=CEPHFS, vendor_name=Ceph'
openstack share type list
+--------------------------------------+---------------+------------+------------+--------------------------------------+---------------------------------------------+-------------+
| ID                                   | Name          | Visibility | Is Default | Required Extra Specs                 | Optional Extra Specs                        | Description |
+--------------------------------------+---------------+------------+------------+--------------------------------------+---------------------------------------------+-------------+
| c3937c6d-bbca-4d8c-b4b4-3fcf1afc89e6 | cephfs-type   | public     | False      | driver_handles_share_servers : False | vendor_name : Ceph                          | None        |
|                                      |               |            |            |                                      | storage_protocol : CEPHFS                   |             |
| 00f9c0b6-8bb4-4ce4-915c-590caee3b696 | cephfs-type-2 | public     | False      | driver_handles_share_servers : False | storage_protocol : CEPHFS, vendor_name=Ceph | None        |
+--------------------------------------+---------------+------------+------------+--------------------------------------+---------------------------------------------+-------------+

Contributor Author

Turns out that wasn't working; I actually needed to drop the comma.

Member

The help command says:

  --extra-specs [<key=value> ...]
                        Extra specs key and value of share type that will be used for share type creation. OPTIONAL: Default=None. example --extra-specs thin_provisioning='<is> True', replication_type=readable.

Is the help text wrong?

```
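
Going by the review thread above, the form that produced two separate extra specs drops the comma and keeps `--extra-specs` as the last argument. A minimal sketch of that form follows, wrapped in a function so the share type name is a parameter. The function name `create_cephfs_share_type` is made up for illustration, and admin credentials are assumed to be loaded in the environment.

```bash
# Illustrative helper (hypothetical name), not part of the appliance.
# Extra specs are passed as separate space-delimited key=value words, with no
# comma, and --extra-specs comes last so it does not swallow other arguments.
create_cephfs_share_type() {
    local type_name="${1:-cephfs-type}"
    openstack share type create "${type_name}" false \
        --extra-specs storage_protocol=CEPHFS vendor_name=Ceph
}
```

After calling `create_cephfs_share_type`, `openstack share type list` should show `storage_protocol` and `vendor_name` as two separate optional extra specs, as in the `cephfs-type` row of the table above.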

Once this exists, create a share using credentials for the Slurm project. An access rule also needs to be created, where the “access_to” argument (`openstack share access create <share> <access_type> <access_to>`) is a user that will be created in Ceph. This needs to be globally unique in Ceph, so needs to be different for each OpenStack project.
Member

Suggested change
Once this exists, create a share using credentials for the Slurm project. An access rule also needs to be created, where the access_to argument (`openstack share access create <share> <access_type> <access_to>`) is a user that will be created in Ceph. This needs to be globally unique in Ceph, so needs to be different for each OpenStack project.
Once this exists, create a share using credentials for the Slurm project. An access rule also needs to be created, where the `access_to` argument (`openstack share access create <share> <access_type> <access_to>`) is a user that will be created in Ceph. This needs to be globally unique in Ceph, so needs to be different for each OpenStack project.


```bash
openstack share create CephFS 300 --description 'Scratch dir for Slurm prod' --name slurm-production-scratch --share-type cephfs-type --wait
openstack share access create slurm-production-scratch cephx slurm-production
```
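
Since the share and its access rule are always created as a pair, the two commands above can be wrapped in a small helper. This is only a sketch: the function name `create_share_with_access` is hypothetical, it assumes credentials for the Slurm project are loaded, and it assumes the share type is called `cephfs-type` as in the earlier example.

```bash
# Illustrative helper (hypothetical name): create a CephFS share and grant a
# CephX user access to it.
#   $1 = share name
#   $2 = size in GiB
#   $3 = CephX user (access_to), which must be unique across the Ceph cluster
create_share_with_access() {
    local share_name="$1" size_gib="$2" cephx_user="$3"
    # Stop here if the share could not be created.
    openstack share create CephFS "${size_gib}" \
        --name "${share_name}" --share-type cephfs-type --wait || return 1
    openstack share access create "${share_name}" cephx "${cephx_user}" || return 1
    # List the rules so the state of the new rule can be checked.
    openstack share access list "${share_name}"
}
```

For the scratch share above this would be called as `create_share_with_access slurm-production-scratch 300 slurm-production`.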

## Configuring the Slurm Appliance for Manila

To mount shares onto hosts in a group, add the to the `manila` group.
Member

Suggested change
To mount shares onto hosts in a group, add the to the `manila` group.
To mount shares onto hosts in a group, add them to the `manila` group.


```ini
[manila:children]
login
compute
```

Set the version of Ceph which is running on the system.

```yaml
os_manila_mount_ceph_version: "18.2.4"
```

Define the list of shares to be mounted, and the paths to mount them to. See the [stackhpc.os-manila-mount role](https://github.com/stackhpc/ansible-role-os-manila-mount) for further configuration options.

```yaml
os_manila_mount_shares:
  - share_name: slurm-production-scratch
    mount_path: /scratch
```

### Shared home directory

By default, the Slurm appliance will spin up a local NFS server and mount the home directories from it. When using Manila + CephFS for the home directory instead, this NFS configuration will need to be disabled.

```yaml
nfs_configurations: []
```

The `basic_users` home directory settings will need to be updated to point to this new shared directory.

```yaml
basic_users_homedir_server: "{{ groups['login'] | first }}" # if not mounting /home on control node
basic_users_homedir_server_path: /home
```

Set the Tofu variable `home_volume_size = 0` to stop Tofu from creating a new home volume. NB: If the control node has already been deployed, re-running Tofu will delete the home volume and delete/recreate the control node.

Finally, add the home directory to the list of shares (the share should be created already in OpenStack).
Member

Suggested change
Finally, add the home directory to the list of shares (the share should be created already in OpenStack).
Finally, add the home directory to the list of shares (the share should already be created in OpenStack).


```yaml
os_manila_mount_shares:
  - share_name: slurm-production-scratch
    mount_path: /scratch
  - share_name: slurm-production-home
    mount_path: /home
```
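
Once the appliance has been deployed with this configuration, a quick check on a node in the `manila` group can confirm that the paths are mount points. The sketch below assumes `mountpoint` from util-linux is available on the node; the function name `check_mounts` is made up for illustration. It only checks that something is mounted at each path, not that it is the intended share.

```bash
# Illustrative check (hypothetical helper): report whether each given path is
# a mount point. Returns non-zero if any path is not mounted.
check_mounts() {
    local path status=0
    for path in "$@"; do
        if mountpoint -q "${path}"; then
            echo "${path}: mounted"
        else
            echo "${path}: NOT mounted"
            status=1
        fi
    done
    return "${status}"
}
```

For the configuration above this would be run as `check_mounts /scratch /home`.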