Merged
10 changes: 5 additions & 5 deletions aks/examples/cnpack/README.md
@@ -21,8 +21,8 @@
1. Update the `terraform.tfvars` file to not be prompted for variable input
- Add `cluster_name`
- Add the IDs of the members or groups who should have cluster access to the variable `admin_group_object_ids`. The GUID input can be retrieved in the Azure portal by searching for the desired user or group
- Add `fluentbit-workspace-name`. This will create Azure Log Analytics Workspace with the specified name.
- Add `prometheus-name`. This will create Azure Monitor Workspace with the specified name.
- Add `fluentbit_workspace_name`. This will create Azure Log Analytics Workspace with the specified name.
- Add `prometheus_name`. This will create Azure Monitor Workspace with the specified name.

2. Run `terraform plan` and validate that the output is correct
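
A filled-in `terraform.tfvars` for step 1 might look like the following sketch. The cluster, workspace, and Prometheus names and the location reuse sample values that appear elsewhere in this example's tfvars file; the GUID is a made-up placeholder, not a real object ID.

```hcl
# terraform.tfvars (illustrative sketch; replace every value with your own)

cluster_name = "cnpack"
location     = "West US 2"

# Object IDs (GUIDs) of the Azure AD users or groups that should have
# cluster access. The GUID below is a hypothetical placeholder.
admin_group_object_ids = ["00000000-0000-0000-0000-000000000000"]

# Creates an Azure Log Analytics Workspace with this name.
fluentbit_workspace_name = "fluentbit-test"

# Creates an Azure Monitor Workspace with this name.
prometheus_name = "cnpack-prometheus"
```

With the required variables set this way, `terraform plan` should run without prompting for input, assuming the module declares no other required variables.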

@@ -76,14 +76,14 @@ If it is needed to get the value of `log_analytics_workspace_primary_shared_key`
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_admin_group_object_ids"></a> [admin\_group\_object\_ids](#input\_admin\_group\_object\_ids) | (Required) A list of Object IDs (GUIDs) of Azure Active Directory Groups which should have Owner Role on the Cluster. <br> This is not the email address of the group, the GUID can be found in the Azure panel by searching for the AD Group<br> NOTE: You will need Azure "Owner" role (not "Contributor") to attach an AD role to the Kubernetes cluster. | `list(any)` | n/a | yes |
| <a name="input_az_monitor-user-managed-id"></a> [az\_monitor-user-managed-id](#input\_az\_monitor-user-managed-id) | The user managed identity to *create* for use with the Azure monitor-- at this time this does not accept existing user or system managed identity | `string` | `"tf-holoscan-identity"` | no |
| <a name="input_az_monitor_user_managed_id"></a> [az\_monitor-user-managed-id](#input\_az\_monitor-user-managed-id) | The user managed identity to *create* for use with the Azure monitor-- at this time this does not accept existing user or system managed identity | `string` | `"tf-holoscan-identity"` | no |
Contributor

The link text needs updating as well; it still reads `az_monitor-user-managed-id`.

Collaborator Author

Thanks for catching that. Fixed in #11.

| <a name="input_azure_log_analytics_retention_in_days"></a> [azure\_log\_analytics\_retention\_in\_days](#input\_azure\_log\_analytics\_retention\_in\_days) | The workspace data retention in days. Possible values are either 7 (Free Tier only) or range between 30 and 730 | `number` | `30` | no |
| <a name="input_azure_log_analytics_sku"></a> [azure\_log\_analytics\_sku](#input\_azure\_log\_analytics\_sku) | Specifies the SKU of the Log Analytics Workspace. Possible values are Free, PerNode, Premium, Standard, Standalone, Unlimited, CapacityReservation, and PerGB2018. Defaults to PerGB2018 | `string` | `"PerGB2018"` | no |
| <a name="input_cluster_name"></a> [cluster\_name](#input\_cluster\_name) | Name of the cluster | `string` | n/a | yes |
| <a name="input_fluentbit-workspace-name"></a> [fluentbit-workspace-name](#input\_fluentbit-workspace-name) | Name of the Azure Log Workspace for Fluentbit to be created | `string` | n/a | yes |
| <a name="input_fluentbit_workspace_name"></a> [fluentbit_workspace_name](#input\_fluentbit_workspace_name) | Name of the Azure Log Workspace for Fluentbit to be created | `string` | n/a | yes |
| <a name="input_fluentbit_enabled"></a> [fluentbit\_enabled](#input\_fluentbit\_enabled) | Set to true to enable, false to disable | `bool` | `true` | no |
| <a name="input_location"></a> [location](#input\_location) | The region to create resources in. This can be filled out in the terraform.tfvars file in this directory | `any` | n/a | yes |
| <a name="input_prometheus-name"></a> [prometheus-name](#input\_prometheus-name) | The name of the Azure Monitor Workspace for Prometheus | `string` | n/a | yes |
| <a name="input_prometheus_name"></a> [prometheus_name](#input\_prometheus_name) | The name of the Azure Monitor Workspace for Prometheus | `string` | n/a | yes |
| <a name="input_prometheus_resource_group_name"></a> [prometheus\_resource\_group\_name](#input\_prometheus\_resource\_group\_name) | Name of the Prometheus resource group | `string` | `"prometheus-rg"` | no |

## Outputs
4 changes: 2 additions & 2 deletions aks/examples/cnpack/azure-fluentbit.tf
@@ -6,7 +6,7 @@ Fluentbit Config

// Create an Azure Log Analytics Workspace to send Fluentbit Logs to
resource "azurerm_log_analytics_workspace" "cnpack-fluentbit-workspace" {
name = var.fluentbit-workspace-name
name = var.fluentbit_workspace_name
location = module.holoscan-ready-aks.location
resource_group_name = module.holoscan-ready-aks.resource_group_name
sku = var.azure_log_analytics_sku
@@ -15,7 +15,7 @@ resource "azurerm_log_analytics_workspace" "cnpack-fluentbit-workspace" {

data "azurerm_log_analytics_workspace" "fluent" {
depends_on = [azurerm_log_analytics_workspace.cnpack-fluentbit-workspace]
name = var.fluentbit-workspace-name
name = var.fluentbit_workspace_name
resource_group_name = module.holoscan-ready-aks.resource_group_name
}

4 changes: 2 additions & 2 deletions aks/examples/cnpack/azure-prometheus.tf
@@ -32,7 +32,7 @@ data "azurerm_resource_group" "prometheus" {
resource "azapi_resource" "prometheus-cnpack" {
depends_on = [module.holoscan-ready-aks]
type = "microsoft.monitor/accounts@2023-04-03"
name = var.prometheus-name
name = var.prometheus_name
schema_validation_enabled = false
parent_id = data.azurerm_resource_group.prometheus.id
location = data.azurerm_resource_group.prometheus.location
@@ -43,7 +43,7 @@ resource "azapi_resource" "prometheus-cnpack" {
// Get Data on Azure Monitor Workspace for Prometheus
data "azapi_resource" "prometheus-cnpack" {
depends_on = [module.holoscan-ready-aks, azapi_resource.prometheus-cnpack]
name = var.prometheus-name
name = var.prometheus_name
type = "microsoft.monitor/accounts@2023-04-03"
parent_id = data.azurerm_resource_group.prometheus.id
response_export_values = ["*"]
21 changes: 12 additions & 9 deletions aks/examples/cnpack/terraform.tfvars
@@ -1,13 +1,16 @@
# SPDX-FileCopyrightText: Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

// Cluster Variables
# cluster_name = "cnpack"
# location = "West US 2"
# admin_group_object_ids = []
# Sample tfvars file. Uncomment out values to use
# Do not commit this file to Git with sensitive values

// Fluentbit/Azure Log Configuration Variables
# fluentbit-workspace-name = "fluentbit-test"

// Prometheus/Azure Monitor Configuration Variables
# prometheus-name = "cnpack-prometheus"
# admin_group_object_ids = ""
# az_monitor_user_managed_id = "tf-holoscan-identity"
# azure_log_analytics_retention_in_days = 30
# azure_log_analytics_sku = "PerGB2018"
# cluster_name = ""
# fluentbit_workspace_name = ""
# fluentbit_enabled = true
# location = ""
# prometheus_name = ""
# prometheus_resource_group_name = "prometheus-rg"
6 changes: 3 additions & 3 deletions aks/examples/cnpack/variables.tf
@@ -9,7 +9,7 @@ variable "location" {
description = "The region to create resources in. This can be filled out in the terraform.tfvars file in this directory"
}

variable "az_monitor-user-managed-id" {
variable "az_monitor_user_managed_id" {
type = string
default = "tf-holoscan-identity"
description = "The user managed identity to *create* for use with the Azure monitor-- at this time this does not accept existing user or system managed identity"
@@ -43,7 +43,7 @@ variable "fluentbit_enabled" {
description = "Set to true to enable, false to disable"
}

variable "fluentbit-workspace-name" {
variable "fluentbit_workspace_name" {
description = "Name of the Azure Log Workspace for Fluentbit to be created"
type = string
}
@@ -65,7 +65,7 @@ variable "prometheus_resource_group_name" {
default = "prometheus-rg"
description = "Name of the Prometheus resource group"
}
variable "prometheus-name" {
variable "prometheus_name" {
type = string
description = "The name of the Azure Monitor Workspace for Prometheus"
}
24 changes: 21 additions & 3 deletions aks/terraform.tfvars
@@ -1,6 +1,24 @@
# SPDX-FileCopyrightText: Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# cluster_name = "nvidia-aks-cluster"
# admin_group_object_ids = []
# location = "West US 2"
# Sample tfvars file. Uncomment out values to use
# Do not commit this file to Git with sensitive values

# admin_group_object_ids = ""
# cluster_name = "holoscan-cluster"
# cpu_machine_type = "Standard_D16_v5"
# cpu_node_pool_count = 1
# cpu_node_pool_disk_size = 100
# cpu_node_pool_max_count = 5
# cpu_node_pool_min_count = 1
# cpu_os_sku = "Ubuntu"
# existing_resource_group_name = ""
# gpu_machine_type = "Standard_NC6s_v3"
# gpu_node_pool_count = 2
# gpu_node_pool_disk_size = 100
# gpu_node_pool_max_count = 5
# gpu_node_pool_min_count = 2
# gpu_operator_version = "v23.3.2"
# gpu_os_sku = "Ubuntu"
# kubernetes_version = "1.26.3"
# location = ""
11 changes: 10 additions & 1 deletion eks/examples/cnpack/terraform.tfvars
@@ -1,4 +1,13 @@
# SPDX-FileCopyrightText: Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# cluster_name = "cnpack-cluster"
# Sample tfvars file. Uncomment out values to use
# Do not commit this file to Git with sensitive values

# amp_enabled = true
# cluster_name = ""
# common_name = "cluster.local"
# fluentbit_enabled = true
# metrics_server_enabled = true
# pca_enabled = true
# prom_adapter_enabled = true
47 changes: 46 additions & 1 deletion eks/terraform.tfvars
@@ -2,8 +2,53 @@
# SPDX-License-Identifier: Apache-2.0

# Sample tfvars file. Uncomment out values to use
# Do not commit this file to Git with sensitive values

# cluster_name = "holoscan-cluster"
# region = "us-west-2"

# Optional: If deploying into an existing VPC, use the following variable
Contributor

This comment should probably be moved to just above the `existing_vpc_details` variable.

Collaborator Author

Fixed in #11
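
For illustration, the placement suggested in this thread, with the comment directly above the variable it describes, could look like the sketch below. The shape of the value follows the sample line in the diff; the VPC and subnet IDs are hypothetical placeholders, and the actual change made in #11 is not shown here.

```hcl
# eks/terraform.tfvars (sketch; the IDs below are hypothetical placeholders)

# Optional: If deploying into an existing VPC, use the following variable
existing_vpc_details = {
  vpc_id     = "vpc-0123456789abcdef0"
  subnet_ids = ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"]
}
```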

# existing_vpc_details = {vpc_id = "", subnet_ids = ["", ""]}
# additional_node_security_groups_rules = {}
# additional_security_group_ids = []
# additional_user_data = ""
# aws_profile = "development"
# cidr_block = "10.0.0.0/16"
# cluster_name = ""
Contributor

`cluster_name` is already defined above.

Collaborator Author

Caught this after the merge. Fixed in #11
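
A short sketch of why the duplicate matters: both sample lines are commented out, so nothing breaks as committed, but a user who uncomments both would define the same argument twice in one file. To my understanding Terraform rejects that with an "Attribute redefined" error, so a single entry is safer. The value shown is the sample one from this file.

```hcl
# eks/terraform.tfvars (sketch)

# Keep exactly one cluster_name entry.
cluster_name = "holoscan-cluster"

# Uncommenting a second definition like the one below would, as far as I know,
# cause Terraform to fail because each argument may be set only once per file.
# cluster_name = ""
```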

# cluster_version = "1.26"
# cpu_instance_type = "t2.xlarge"
# cpu_node_pool_additional_user_data = ""
# cpu_node_pool_delete_on_termination = true
# cpu_node_pool_root_disk_size_gb = 512
# cpu_node_pool_root_volume_type = "gp2"
# desired_count_cpu_nodes = "1"
# desired_count_gpu_nodes = "2"
# enable_dns_hostnames = true
# enable_dns_support = true
# enable_nat_gateway = true
# existing_vpc_details = ""
# gpu_ami_id = ""
# gpu_instance_type = "p3.2xlarge"
# gpu_node_pool_additional_user_data = ""
# gpu_node_pool_delete_on_termination = true
# gpu_node_pool_root_disk_size_gb = 512
# gpu_node_pool_root_volume_type = "gp2"
# gpu_operator_driver_version = "535.54.03"
# gpu_operator_namespace = "gpu-operator"
# gpu_operator_version = "v23.3.2"
# max_cpu_nodes = "2"
# max_gpu_nodes = "5"
# min_cpu_nodes = "0"
# min_gpu_nodes = "2"
# private_subnets = [
# "10.0.1.0/24",
# "10.0.2.0/24",
# "10.0.3.0/24"
# ]
# public_subnets = [
# "10.0.4.0/24",
# "10.0.5.0/24",
# "10.0.6.0/24"
# ]
# region = "us-west-2"
# single_nat_gateway = false
# ssh_key = ""
13 changes: 8 additions & 5 deletions gke/examples/cnpack/terraform.tfvars
@@ -1,8 +1,11 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

cluster_name = ""
gke_managed_prometheus_enabled = true
node_zones = ["us-west1-b"]
project_id = ""
region = "us-west1"
# Sample tfvars file. Uncomment out values to use
# Do not commit this file to Git with sensitive values

# cluster_name = ""
# gke_managed_prometheus_enabled = true
# node_zones = ["us-west1-b"]
# project_id = ""
# region = "us-west1"
28 changes: 23 additions & 5 deletions gke/terraform.tfvars
@@ -1,8 +1,26 @@
# SPDX-FileCopyrightText: Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

# Do not commit this file to Git if you add sensitive values
# project_id = ""
# cluster_name = ""
# region = "us-west1"
# node_zones = ["us-west1-b"]
# Sample tfvars file. Uncomment out values to use
# Do not commit this file to Git with sensitive values

# cluster_name = ""
Contributor

To maintain consistency with the other tfvars files, can you set `cluster_name` to `holoscan-cluster`?

Collaborator Author

These are auto-generated. The values that appear here are the defaults from `variables.tf`. The tfvars files make it simple for people to choose the values themselves.
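
To illustrate the relationship between `variables.tf` defaults and the sample tfvars lines, here is a sketch using `prometheus_resource_group_name` from the AKS cnpack example, whose default and description appear in this diff. The block combines excerpts from two files for comparison; how the generator itself works is not shown in this PR, and the override value at the end is hypothetical.

```hcl
# variables.tf (excerpt, limited to the attributes visible in this diff)
variable "prometheus_resource_group_name" {
  default     = "prometheus-rg"
  description = "Name of the Prometheus resource group"
}

# terraform.tfvars (generated sample line; while commented out, the default applies)
# prometheus_resource_group_name = "prometheus-rg"

# Uncommenting and editing the line overrides the default, for example with a
# hypothetical resource group name:
# prometheus_resource_group_name = "my-monitoring-rg"
```

A variable with no default, such as `cluster_name`, shows up in the sample with an empty value, which is why it does not match the `holoscan-cluster` value used elsewhere.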

# cpu_instance_type = "n1-standard-4"
# cpu_max_node_count = "5"
# cpu_min_node_count = "1"
# gpu_count = "1"
# gpu_instance_type = "n1-standard-4"
# gpu_max_node_count = "5"
# gpu_min_node_count = "2"
# gpu_operator_driver_version = "535.54.03"
# gpu_operator_namespace = "gpu-operator"
# gpu_operator_version = "v23.3.2"
# gpu_type = "nvidia-tesla-v100"
# node_zones = ""
# num_cpu_nodes = 1
# num_gpu_nodes = 2
# project_id = ""
# region = ""
# release_channel = "REGULAR"
# use_cpu_spot_instances = false
# use_gpu_spot_instances = false