# Storage with Rook Ceph on Packet cloud

## Contents

- [Introduction](#introduction)
- [Prerequisites](#prerequisites)
- [Deploy storage worker pool](#deploy-storage-worker-pool)
  - [Config](#config)
  - [Deploy the worker pool](#deploy-the-worker-pool)
- [Deploy Rook](#deploy-rook)
  - [Config](#config-1)
  - [Deploy the component](#deploy-the-component)
- [Deploy Rook Ceph](#deploy-rook-ceph)
  - [Config](#config-2)
  - [Deploy the component](#deploy-the-component-1)
- [Access Ceph dashboard](#access-ceph-dashboard)
- [Enable and access toolbox](#enable-and-access-toolbox)
- [Enable monitoring](#enable-monitoring)
- [Make default storage class](#make-default-storage-class)
- [Additional resources](#additional-resources)

## Introduction

This document explains how to set up storage using the Rook and Rook Ceph components of Lokomotive. Rook is a storage orchestrator that supports various storage backends, but Lokomotive only supports Ceph at the moment, and with Ceph it supports block storage.

## Prerequisites

- A Lokomotive cluster deployed on a supported provider and accessible via `kubectl`.

## Deploy storage worker pool

#### Config

Deploy a cluster with at least one worker pool dedicated to Rook Ceph. A dedicated worker pool configuration should look like the following:

```tf
cluster "packet" {
  ...

  worker_pool "storage" {
    count     = 3
    node_type = "c2.medium.x86"

    labels = "storage.lokomotive.io=ceph"
    taints = "storage.lokomotive.io=ceph:NoSchedule"
  }
}
```

- The number of machines, provided using `count`, should be an odd number equal to or greater than three.
- The machine type, provided using `node_type`, should be one with multiple disks, such as `c2.medium.x86` or `s1.large.x86`. Find more server types [here](https://www.packet.com/cloud/servers/).
- Provide `labels` so that the Rook Ceph workload can be steered onto these storage nodes.
- Provide `taints` so that other workloads are **steered away** from them by default. This setting is not mandatory, but isolating storage workloads from others is recommended.

#### Deploy the worker pool

```bash
lokoctl cluster apply -v
```

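Once the apply finishes, the storage nodes should carry the configured label and taint. A quick sanity check (a sketch; node names will differ in your cluster):

```shell
# List only the nodes that belong to the storage worker pool.
kubectl get nodes -l storage.lokomotive.io=ceph

# Confirm that the NoSchedule taint landed on each of those nodes.
kubectl get nodes -l storage.lokomotive.io=ceph \
  -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
```
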
## Deploy Rook

#### Config

```tf
component "rook" {
  node_selector = {
    "storage.lokomotive.io" = "ceph"
  }

  toleration {
    key      = "storage.lokomotive.io"
    operator = "Equal"
    value    = "ceph"
    effect   = "NoSchedule"
  }

  agent_toleration_key    = "storage.lokomotive.io"
  agent_toleration_effect = "NoSchedule"

  discover_toleration_key    = "storage.lokomotive.io"
  discover_toleration_effect = "NoSchedule"
}
```

- `node_selector` should match the `labels` provided in the `worker_pool`.
- `toleration` should match the `taints` provided in the `worker_pool`.
- `agent_toleration_key` and `discover_toleration_key` should match the key of the `taints` provided in the `worker_pool`.
- `agent_toleration_effect` and `discover_toleration_effect` should match the effect of the `taints` provided in the `worker_pool`.

#### Deploy the component

```bash
lokoctl component apply rook
```

Verify the operator pod in the `rook` namespace is in the `Running` state (this may take a few minutes):

```console
$ kubectl -n rook get pods -l app=rook-ceph-operator
NAME                                  READY   STATUS    RESTARTS   AGE
rook-ceph-operator-76d8687f95-6knf8   1/1     Running   0          81m
```

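The operator also runs a `rook-discover` pod on each storage node to scan its disks. A quick check that they came up (the `app=rook-discover` label is the one upstream Rook uses; a sketch):

```shell
# One discover pod should be running per storage node.
kubectl -n rook get pods -l app=rook-discover -o wide
```
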
## Deploy Rook Ceph

#### Config

```tf
component "rook-ceph" {
  monitor_count = 3

  node_affinity {
    key      = "storage.lokomotive.io"
    operator = "Exists"
  }

  toleration {
    key      = "storage.lokomotive.io"
    operator = "Equal"
    value    = "ceph"
    effect   = "NoSchedule"
  }

  storage_class {
    enable = true
  }
}
```

- `monitor_count` should be an odd number, at least three, and no higher than the `count` of workers in the `worker_pool`.
- `node_affinity` should match the `labels` provided in the `worker_pool`.
- `toleration` should match the `taints` provided in the `worker_pool`.
- `storage_class` should be enabled if workloads create PersistentVolumeClaims (PVCs).

#### Deploy the component

```bash
lokoctl component apply rook-ceph
```

Verify the OSD pods in the `rook` namespace are in the `Running` state (this may take a few minutes):

```console
$ kubectl -n rook get pods -l app=rook-ceph-osd
NAME                               READY   STATUS    RESTARTS   AGE
rook-ceph-osd-0-6d4f69dbf9-26kzl   1/1     Running   0          67m
rook-ceph-osd-1-86c9597b84-lmh94   1/1     Running   0          67m
rook-ceph-osd-2-6d97697897-7bprl   1/1     Running   0          67m
rook-ceph-osd-3-5bfb9d86b-rk6v4    1/1     Running   0          67m
rook-ceph-osd-4-5b76cb9675-cxkdw   1/1     Running   0          67m
rook-ceph-osd-5-8c86f5c6c-6qxtz    1/1     Running   0          67m
rook-ceph-osd-6-5b9cc479b7-vjc9v   1/1     Running   0          67m
rook-ceph-osd-7-7b84d6cc48-b46z9   1/1     Running   0          67m
rook-ceph-osd-8-5868969f97-2bn9r   1/1     Running   0          67m
```

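Each OSD pod corresponds to one disk that Rook discovered on the storage nodes, so machines with multiple disks yield several OSDs each. To see which node each OSD landed on (a sketch):

```shell
# Map each OSD pod to the node it is running on.
kubectl -n rook get pods -l app=rook-ceph-osd \
  -o custom-columns='POD:.metadata.name,NODE:.spec.nodeName'
```
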
## Access Ceph dashboard

The Ceph dashboard provides valuable visual information and is an essential tool for monitoring the Ceph cluster. To access it, follow these steps.

Find the admin password:

```bash
kubectl -n rook get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
```

Port forward the service locally:

```bash
kubectl -n rook port-forward svc/rook-ceph-mgr-dashboard 8443:8443
```

Now open [https://localhost:8443](https://localhost:8443) and log in with the username `admin` and the password obtained in the first step.

## Enable and access toolbox

Ceph is a complex software system, and not everything that goes on inside the Ceph cluster is visible at the Rook layer of abstraction, so a command-line interface to the Ceph cluster is useful. The Ceph toolbox lets you access the cluster using the `ceph` CLI utility, with which you can configure Ceph cluster settings and debug the cluster if needed.

To deploy the toolbox, set the flag `enable_toolbox` to `true` in the `rook-ceph` component config:

```tf
component "rook-ceph" {
  enable_toolbox = true

  ...
}
```

Verify that the toolbox deployment is ready:

```console
$ kubectl -n rook get deploy rook-ceph-tools
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
rook-ceph-tools   1/1     1            1           39s
```

Access the toolbox pod using the following command:

```bash
kubectl -n rook exec -it $(kubectl -n rook get pods -l app=rook-ceph-tools -o name) bash
```

Once inside the pod you can run the usual `ceph` commands:

```console
[root@rook-ceph-tools-674c8dd89d-pp4x9 /]# ceph status
  cluster:
    id:     1a73e7b2-c825-414e-81f0-0f70b19478d3
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 50m)
    mgr: a(active, since 51s)
    osd: 9 osds: 9 up (since 48m), 9 in (since 48m)

  data:
    pools:   1 pools, 32 pgs
    objects: 0 objects, 0 B
    usage:   9.0 GiB used, 3.0 TiB / 3.1 TiB avail
    pgs:     32 active+clean
```

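Beyond `ceph status`, a few other read-only commands are handy for a first look at the cluster (all standard `ceph` CLI subcommands; a sketch):

```shell
# Per-pool and total capacity usage.
ceph df

# OSD layout: which OSDs live on which host, and their weights.
ceph osd tree

# Detailed explanation when health is not HEALTH_OK.
ceph health detail
```
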
## Enable monitoring

Monitor the `rook` and `rook-ceph` components using the `prometheus-operator` component. To enable monitoring, set the flag `enable_monitoring` to `true` in the `rook` component config. Deploy the `prometheus-operator` component beforehand.

```tf
component "rook" {
  enable_monitoring = true

  ...
}
```

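With monitoring enabled, the Ceph manager exposes Prometheus metrics for the `prometheus-operator` component to scrape. One way to spot-check that the scrape targets were registered (the `ServiceMonitor` resource kind comes from the Prometheus Operator; exact object names in your cluster may differ):

```shell
# ServiceMonitor objects tell Prometheus what to scrape.
kubectl -n rook get servicemonitors
```
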
## Make default storage class

It is recommended to make the storage class the default if Rook Ceph is the only storage provider in the cluster. This setting helps satisfy any PVC request made in the cluster. The `rook-ceph` component config should look like the following:

```tf
component "rook-ceph" {
  ...

  storage_class {
    enable  = true
    default = true
  }
}
```

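After applying, the class should carry a `(default)` marker, and a PVC that omits `storageClassName` will bind to it. A minimal check (the claim name `test-claim` is just an example):

```shell
# The default class is flagged "(default)" in this output.
kubectl get storageclass

# A PVC with no storageClassName falls back to the default class.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF

# The claim should reach the Bound phase once Ceph provisions a volume.
kubectl get pvc test-claim
```
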
## Additional resources

Rook docs:

- [Ceph toolbox](https://rook.io/docs/rook/master/ceph-toolbox.html)
- [Ceph dashboard](https://rook.io/docs/rook/master/ceph-dashboard.html)
- [Ceph direct tools](https://rook.io/docs/rook/master/direct-tools.html)
- [Ceph advanced configuration](https://rook.io/docs/rook/master/ceph-advanced-configuration.html)
- [Disaster recovery](https://rook.io/docs/rook/master/ceph-disaster-recovery.html)