aws-samples/minimum-viable-dataspace-for-catenax

Minimum Viable Dataspace on AWS

🚀 Deploy complete data space environments on AWS with a single command.

Secure, sovereign sharing of information is emerging as a requirement across industries globally. This project accelerates data space experimentation and use case testing by making mature reference implementations available as single-command deployment blueprints on AWS. The mvd blueprint, for the Eclipse Minimum Viable Dataspace, comes with a web application through which data discovery, contract negotiation and data transfers can be explored in a visual manner.

Quick Start

This project requires additional dependencies to be installed:

  • corretto@17 and docker to build Eclipse Dataspace Components modules and container images
  • terraform, aws-cli and kubectl to deploy AWS infrastructure and Kubernetes resources
  • git to integrate with relevant GitHub projects (Eclipse MVD, EDC Data Dashboard, Tractus-X Umbrella)
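Before starting a deployment, it can help to confirm that these tools are on your PATH. The following is a minimal sketch, not part of this project: the function name missing_tools is hypothetical, and it assumes that corretto@17 provides the java binary and aws-cli provides the aws binary.

```shell
# Print the name of every tool from the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of deploy.sh.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example call (prints nothing when everything is installed):
#   missing_tools java docker terraform aws kubectl git
```

The check only tests whether each command can be found; it does not verify versions, so a Java runtime other than version 17 would still pass.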

Minimum Viable Dataspace on AWS comes with support for two data space blueprints: Eclipse MVD and Tractus-X MXD (deprecated!). By default, the mvd blueprint should be used.

Note

If you're accessing this repository to follow along with the 2024 blog post Rapidly experimenting with Catena-X data space technology on AWS, make sure to review the warning in Tractus-X MXD Data Exchange Walk-Through.

~ ./deploy.sh up mvd

Creating Minimum Viable Dataspace on AWS...
Please enter an alphanumeric string to protect access to your connector APIs.
EDC authentication key:

Enter a secret key that you would like to configure for access to EDC APIs. Deployment takes 15-20 minutes. ☕
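Since the prompt asks for an alphanumeric string, one way to prepare a key before running deploy.sh is to generate it from the system's random source. This is a sketch under assumptions: generate_api_key is a hypothetical helper, and it relies on /dev/urandom being available.

```shell
# Print a random alphanumeric string (default length: 32 characters).
# generate_api_key is a hypothetical helper, not part of deploy.sh.
generate_api_key() {
  length="${1:-32}"
  # 512 random bytes yield roughly 120 alphanumeric characters after
  # filtering, which is ample for a 32-character key.
  head -c 512 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-${length}"
}
```

Calling generate_api_key prints a key that you can paste at the EDC authentication key prompt. Keep a copy, because the same key is needed later when calling the connector APIs.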

~ kubectl get pod

NAME                                     READY   STATUS    RESTARTS   AGE
consumer-controlplane-75fcb7bb6d-pj65b   1/1     Running   0          2m13s
consumer-dataplane-78444656f7-msjcn      1/1     Running   0          107s
consumer-identityhub-576f45f8f-qt757     1/1     Running   0          2m49s
consumer-vault-0                         1/1     Running   0          2m51s
data-dashboard-59f55d784f-kwmmx          1/1     Running   0          66s
...

~ kubectl get ing

NAME                           CLASS   HOSTS   ADDRESS                                       PORTS   AGE
consumer-did-ingress           nginx   *        <lb-domain>.elb.eu-central-1.amazonaws.com   80      2m16s
consumer-identityhub-ingress   nginx   *        <lb-domain>.elb.eu-central-1.amazonaws.com   80      2m16s
consumer-ingress               nginx   *        <lb-domain>.elb.eu-central-1.amazonaws.com   80      85s
data-dashboard                 nginx   *        <lb-domain>.elb.eu-central-1.amazonaws.com   80      69s
...

🛠️ The included EDC Data Dashboard will be accessible under https://<lb-domain>.elb.eu-central-1.amazonaws.com/dashboard/home. To get started with experimenting, refer to the Eclipse MVD's Postman Collections and the EDC Samples repository.
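Instead of copying the load balancer address from the kubectl get ing output by hand, the dashboard URL can be derived from the data-dashboard ingress. The sketch below assumes the ingress lives in the current namespace, as in the commands above; dashboard_url is a hypothetical helper name.

```shell
# Print the dashboard URL for the load balancer behind the data-dashboard ingress.
# dashboard_url is a hypothetical helper, not part of this project.
dashboard_url() {
  lb_host="$(kubectl get ingress data-dashboard \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')"
  if [ -z "$lb_host" ]; then
    echo "load balancer hostname not available yet" >&2
    return 1
  fi
  echo "https://${lb_host}/dashboard/home"
}
```

The function returns a non-zero status while the ingress has no address assigned, which can happen shortly after deployment.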

Architecture

architecture diagram

Learn More

Configuration

The MVD's EDC connectors and Data Dashboard are exposed over HTTPS through a Network Load Balancer (NLB) that is provisioned by the Kubernetes Nginx ingress controller. Initially, this project creates a self-signed X.509 certificate that is valid for 30 days and makes it available to the NLB through AWS Certificate Manager (ACM). You can replace this initial certificate by requesting or importing a new one through ACM and adjusting the NLB's listener configuration accordingly.
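Because the initial certificate is only valid for 30 days, it can be useful to check how close a certificate is to expiry before or after replacing it. The following sketch assumes you have the certificate as a local PEM file and that openssl is installed; cert_expires_within_days is a hypothetical helper name.

```shell
# Return 0 if the PEM certificate in $1 expires within $2 days, 1 otherwise.
# cert_expires_within_days is a hypothetical helper, not part of this project.
cert_expires_within_days() {
  cert_file="$1"
  days="$2"
  # "openssl x509 -checkend N" exits 0 when the certificate is still valid
  # N seconds from now, and non-zero when it will have expired by then.
  if openssl x509 -in "$cert_file" -noout -checkend "$((days * 86400))" >/dev/null; then
    return 1
  fi
  return 0
}
```

For example, cert_expires_within_days cert.pem 7 succeeds when the certificate in cert.pem (an illustrative file name) has fewer than seven days left. A replacement certificate could then be imported with the AWS CLI's acm import-certificate command.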

The MVD's EDC connectors come with API key-based authentication that is configured as part of this project's deploy.sh dialog. This API key then needs to be provided through an HTTP header x-api-key with each invocation of one of the EDCs' APIs.
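A small wrapper around curl can add the x-api-key header to every call. This is a sketch under assumptions: the variable names EDC_BASE_URL and EDC_API_KEY and the function edc_request are illustrative, and the API path is left to the caller because it depends on the connector endpoint you want to reach.

```shell
# Assumed to be set beforehand; both variable names are illustrative.
#   EDC_BASE_URL  e.g. https://<lb-domain>.elb.eu-central-1.amazonaws.com
#   EDC_API_KEY   the key entered in the deploy.sh dialog

# Send a request to an EDC API path, adding the x-api-key header.
# Arguments after the path (method, body, ...) are passed through to curl.
edc_request() {
  api_path="$1"
  shift
  curl --silent --show-error --fail \
    --header "x-api-key: ${EDC_API_KEY}" \
    "$@" \
    "${EDC_BASE_URL}${api_path}"
}
```

With the initial self-signed certificate, curl rejects the connection unless it is told to trust that certificate. For experimentation only, --insecure can be passed as an extra argument, for example edc_request <api-path> --insecure.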

By default, this project creates a new AWS VPC with all required networking components. However, you can optionally deploy into an existing VPC by providing its VPC ID to the deploy.sh script.

Using an Existing VPC

To deploy into an existing VPC, set the VPC_ID variable:

# deploy.sh
VPC_ID="vpc-1234567890abcdef0"

Requirements for Existing VPCs

  • Private Subnets: At least 2 private subnets tagged with kubernetes.io/role/internal-elb=1
  • Public Subnets: At least 2 public subnets tagged with kubernetes.io/role/elb=1
  • Availability Zones: Subnets must be distributed across multiple AZs for high availability
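The subnet tagging requirements above can be checked with the AWS CLI before deploying. The sketch below assumes VPC_ID is set to your VPC and that your AWS credentials and region are configured. The function names are hypothetical, and the check does not cover the distribution across Availability Zones.

```shell
VPC_ID="${VPC_ID:-vpc-1234567890abcdef0}"

# Print the number of subnets in the VPC that carry the given tag with value 1.
count_tagged_subnets() {
  key="$1"
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=${VPC_ID}" "Name=tag:${key},Values=1" \
    --query 'Subnets[].SubnetId' --output text | wc -w | tr -d ' '
}

# Report whether at least 2 private and 2 public subnets are tagged as required.
check_vpc_subnets() {
  status=0
  for tag_key in kubernetes.io/role/internal-elb kubernetes.io/role/elb; do
    count="$(count_tagged_subnets "$tag_key")"
    if [ "$count" -lt 2 ]; then
      echo "FAIL: only ${count} subnet(s) tagged ${tag_key}=1"
      status=1
    else
      echo "OK: ${count} subnets tagged ${tag_key}=1"
    fi
  done
  return "$status"
}
```

Running check_vpc_subnets prints one line per tag and returns a non-zero status if either requirement is not met.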

Considerations for Production

The default configuration of this project is not intended for use in a production scenario. It is a starting point for rapid data space and Catena-X experimentation and prototyping, and needs adaptation depending on how it is used. For design principles and best practices on implementing production-ready workloads on AWS, please refer to the AWS Well-Architected Framework.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.