
Commit 1aa2842

Merge pull request #208 from acep-uaf/nicole/update-readme:
Update readme to include scada script and chef/systemd tools

2 parents: 5b75c79 + 9d5d0d9

File tree: 1 file changed (+69 −53 lines)

README.md (69 additions, 53 deletions)
# SEL-735 Meter and SCADA Data Pipeline

This repository contains a set of Bash scripts designed to automate the retrieval and organization of event data from SEL-735 meters, the synchronization of SCADA data between directories, and the archival of data to a dedicated remote server.

## Pipeline Overview

Each of the following scripts is executed separately and has its own config file.

1. **`data_pipeline.sh`**

   Handles fetching and organizing raw event data from SEL-735 meters via FTP:
   - Connects to the meter
   - Downloads new event data
   - Organizes the directory structure and creates metadata
   - Adds checksums
   - Compresses raw data into `.zip`
   - Generates a `.message` file to be ingested by [data-streams-das-mqtt-pub](https://github.com/acep-uaf/data-streams-das-mqtt-pub)
2. **`sync-scada-data.sh`**

   Synchronizes SCADA data from a source directory to a destination directory:
   - Supports syncing data over a configurable number of past months
   - **TO DO**: Exclude the current day's data to avoid syncing partially written files.
3. **`archive_pipeline.sh`**

   Transfers downloaded and processed meter data to a dedicated server:
   - Uses `rsync` to transfer data to the remote server
   - Automatically triggers a cleanup script if enabled via config
## Installation

1. Ensure you have the following before running the pipeline:
   - A Unix-like environment (Linux, macOS, or a Unix-like Windows terminal)
   - FTP credentials for the meter
   - Meter configuration
   - The following tools installed: `lftp`, `yq`, `zip`, `rsync`, `jq`
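You can verify the required tools are present with a small preflight check. This is a convenience sketch, not part of the repository:

```shell
#!/usr/bin/env bash
# Convenience sketch: report any prerequisite tools that are missing.

check_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      status=1
    fi
  done
  return "$status"
}

# Example: check_tools lftp yq zip rsync jq
```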
2. Clone the repository:

   ```bash
   git clone git@github.com:acep-uaf/camio-meter-streams.git
   cd camio-meter-streams/cli_meter
   ```
## Configuration

Each script uses its own YAML configuration file located in the `config/` directory.

1. **Navigate to the config directory and copy the example configuration files:**

   ```bash
   cd config
   cp config.yml.example config.yml
   cp archive_config.yml.example archive_config.yml
   cp scada_config.yml.example scada_config.yml
   ```
2. **Update each configuration file:**
   - `config.yml` — used by `data_pipeline.sh`
   - `archive_config.yml` — used by `archive_pipeline.sh`
   - `scada_config.yml` — used by `sync-scada-data.sh`
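As a purely hypothetical illustration of the shape such a file might take (the key names below are invented; the `.example` files shipped in `config/` are the authoritative reference), an `archive_config.yml` could look like:

```yaml
# Hypothetical sketch only — consult archive_config.yml.example for the real keys.
source_dir: /data/meter-events                 # local data produced by data_pipeline.sh
destination: archive-user@archive-host:/srv/meter-archive
enable_cleanup: true                           # run cleanup.sh automatically on success
```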
3. **Secure the configuration files** so that only the owner can read and write them:

   ```bash
   chmod 600 config.yml archive_config.yml scada_config.yml
   ```
## Usage

This pipeline can be used in two ways:

1. **Manually**, by executing the scripts directly from the command line
2. **Automatically**, by running them as scheduled systemd services managed through Chef

### Automated Execution via systemd and Chef

In production environments, each pipeline script is run automatically by a dedicated `systemd` **service** and **timer** pair, configured through custom default attributes defined in the Chef cookbook.
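For reference, a service/timer pair for one script might look roughly like this. The unit names, paths, and schedule are illustrative assumptions; the real units are generated from Chef attributes.

```ini
# data-pipeline.service (illustrative)
[Unit]
Description=SEL-735 meter event data pipeline

[Service]
Type=oneshot
WorkingDirectory=/opt/camio-meter-streams/cli_meter
ExecStart=/opt/camio-meter-streams/cli_meter/data_pipeline.sh -c config/config.yml

# data-pipeline.timer (illustrative)
[Unit]
Description=Run the meter data pipeline periodically

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target
```

A timer like this would be enabled with `systemctl enable --now data-pipeline.timer`.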
Each configuration file has a corresponding Chef data bag that defines its values. All configuration data is centrally managed through Chef data bags and vaults, so to make changes, update the appropriate Chef-managed data bags and cookbooks.

**Cookbooks**:
- [acep-camio-streams](https://github.com/acep-devops/acep-camio-streams/tree/main) - installs and configures the server
- [acep-devops-chef](https://github.com/acep-devops/acep-devops-chef/tree/main)

### Manual Execution

To run the data pipeline and then transfer data to the target server:

1. **Data Pipeline (Event Data)**

   ```sh
   ./data_pipeline.sh -c config/config.yml
   ```

2. **Sync SCADA Data**

   ```sh
   ./sync-scada-data.sh -c config/scada_config.yml
   ```

3. **Archive Pipeline**

   ```sh
   ./archive_pipeline.sh -c config/archive_config.yml
   ```

   **Note:** `rsync` uses the `--exclude` flag to skip the `working/` directory so that only complete files are transferred.
4. **Run the Cleanup Process (Conditional)**

   The cleanup script removes outdated event files based on the retention period specified in the configuration file.

   If `enable_cleanup` is set to `true` in `archive_config.yml`, `cleanup.sh` runs automatically after `archive_pipeline.sh`. Otherwise, you can run it manually:

   ```bash
   ./cleanup.sh -c config/archive_config.yml
   ```

   **Note:** Ensure `archive_config.yml` specifies retention periods for each directory.

## How to Stop the Pipeline
