Skip to content
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 38 additions & 0 deletions workflows/raster/README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
# Contents:

- [archive](#archive)
- [Standardising](#Standardising)
- [copy](#copy)
- [publish-odr](#Publish-odr)
Expand All @@ -8,6 +9,43 @@
- [Hillshade](#hillshade)
- [tests](#Tests)

# Archive

## Workflow Description

Archive files from one of the S3 buckets (for example `linz-*-upload`) to a long term archive bucket (S3 Glacier Deep Archive). This workflow is intended to be used after files in a source folder have been processed and not needed for any future update.

1. Create a manifest of file to be archived from the source location (example: `s3://linz-topographic-upload/[PROVIDER]/[SURVEY]/`)
2. Compress each file listed in the manifest and check if the compression is worthwhile (very small files may have a bigger size after compression)
3. Copy the compressed (or original if smaller) file to the archive bucket (example: `s3://topographic-archive/[PROVIDER]/[SURVEY]/`)
4. Ensure a new version is stored if a file already exists in the archive (this is managed by AWS S3 versioning)
5. Delete the original files that have been archived. The original files will be still available for retrieval for the next 90 days after deletion.

##

```mermaid
graph TD;
create-manifest-->copy;
```

This is a workflow that uses the [argo-tasks](https://github.com/linz/argo-tasks#create-manifest) container `create-manifest` (list of source and target file paths) and `copy` (the actual file copy) commands with `--compress` and `--delete-source` options.

Access permissions are controlled by the [Bucket Sharing Config](https://github.com/linz/topo-aws-infrastructure/blob/master/src/stacks/bucket.sharing.ts) which gives Argo Workflows access to the S3 buckets we use.

## Workflow Input Parameters

| Parameter | Type | Default | Description |
| -------------------- | ----- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| user_group | enum | none | Group of users running the workflow |
| ticket | str | | Ticket ID e.g. 'AIP-55' |
| source | str | s3://linz-imagery-staging/test/sample/ | The URIs (paths) to the s3 source location. Separate multiple source paths with `;` |
| include | regex | \\.tiff?\$\|\\.json\$\|\\.tfw\$\|/capture-area\\.geojson\$\|/capture-area\\.geojson\$ | A regular expression to match object path(s) or name(s) from within the source path to include in the copy. |
| exclude | regex | | A regular expression to match object path(s) or name(s) from within the source path to exclude from the copy. |
| group | int | 1000 | The maximum number of files for each pod to copy (will use the value of `group` or `group_size` that is reached first). |
| group_size | str | 100Gi | The maximum group size of files for each pod to copy (will use the value of `group` or `group_size` that is reached first). |
| transform | str | `f` | String to be transformed from source to target to renamed filenames, e.g. `f.replace("text to replace", "new_text_to_use")`. Leave as `f` for no transformation. |
| aws_role_config_path | str | `s3://linz-bucket-config/config-write.elevation.json,s3://linz-bucket-config/config-write.imagery.json,s3://linz-bucket-config/config-write.topographic.json` | s3 URL or comma-separated list of s3 URLs allowing the workflow to write to a target(s). |

# Standardising

This workflow processes supplied Aerial Imagery and Elevation TIFF files into consistent Cloud Optimised GeoTIFFs with STAC metadata. Standardisation using gdal_translate, non-visual QA, STAC creation, and STAC validation are all steps included within this workflow.
Expand Down