Open
Labels
enhancement (New feature or request)
Description
Overview
Add a command that verifies the rack-aware configuration is valid, ensuring that masters and slaves are properly distributed across racks for high availability.
Background
Redis Enterprise supports rack-aware deployments where nodes are assigned to logical racks (typically corresponding to availability zones, data centers, or physical racks). For HA, masters and their corresponding slaves should be on different racks.
Use Case
After configuring rack awareness, operators need to verify:
- Masters and their slaves are on different racks
- No single rack failure can take down a database
- Rack configuration meets HA requirements
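The second check above (no single rack failure can take down a database) can be sketched as a small function. This is only an illustration under stated assumptions: the `Placement` type, the slot-range keys, and the function name are hypothetical and not part of redisctl. A database is at risk when every copy of some slot range lives on one rack.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

// Hypothetical input: for each (database, slot range), the racks that hold
// a copy (master or slave) of that data.
type Placement = HashMap<(String, String), Vec<String>>;

/// Returns sorted (database, rack) pairs where losing `rack` would remove
/// every copy of at least one slot range of `database`.
fn single_rack_risks(placement: &Placement) -> Vec<(String, String)> {
    let mut risks = BTreeSet::new();
    for ((db, _slots), racks) in placement {
        let distinct: HashSet<&String> = racks.iter().collect();
        // Exactly one distinct rack means all copies share a failure domain.
        if distinct.len() == 1 {
            risks.insert((db.clone(), racks[0].clone()));
        }
    }
    risks.into_iter().collect()
}

// Sample data loosely mirroring the example table in this issue.
fn sample_placement() -> Placement {
    let mut p = Placement::new();
    p.insert(
        ("prod-db".into(), "0-8191".into()),
        vec!["rack-1".into(), "rack-1".into()],
    );
    p.insert(
        ("prod-db".into(), "8192-16383".into()),
        vec!["rack-2".into(), "rack-3".into()],
    );
    p.insert(
        ("default-db".into(), "0-16383".into()),
        vec!["rack-1".into(), "rack-2".into()],
    );
    p
}

fn main() {
    for (db, rack) in single_rack_risks(&sample_placement()) {
        println!("{db} would lose data availability if {rack} fails");
    }
}
```

Unlike a pairwise master/slave comparison, this formulation also covers databases with more than one slave per master, since it only asks whether all copies share a rack.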
Desired Behavior
# Verify rack-aware configuration
redisctl enterprise cluster verify-rack-aware
# Example output (table format)
DATABASE    SHARD    MASTER RACK  SLAVE RACK  STATUS
default-db  redis:1  rack-1       rack-2      ✓ OK
default-db  redis:2  rack-1       rack-2      ✓ OK
cache-db    redis:3  rack-2       rack-3      ✓ OK
prod-db     redis:4  rack-1       rack-1      ✗ VIOLATION
prod-db     redis:5  rack-2       rack-3      ✓ OK
Rack-Aware Status: VIOLATED (1 issue found)
Issue: Database 'prod-db' shard redis:4 has master and slave on same rack (rack-1)
Recommendation: Migrate slave shard redis:4:slave to a different rack
# JSON output
redisctl enterprise cluster verify-rack-aware -o json
{
  "status": "violated",
  "violations": [
    {
      "database": "prod-db",
      "shard_id": "redis:4",
      "master_rack": "rack-1",
      "slave_rack": "rack-1",
      "issue": "master and slave on same rack"
    }
  ],
  "compliant_shards": 4,
  "total_shards": 5,
  "compliance_rate": 0.80
}
Implementation Approach
Data Collection
// 1. Get all nodes with rack info
let nodes = client.nodes().list().await?;
let node_to_rack: HashMap<i32, String> = nodes
    .iter()
    .map(|n| (n.node_id, n.rack_id.clone()))
    .collect();
// 2. Get all shards
let shards = client.shards().list().await?;
// 3. Group shards by database and check rack distribution
// (grouped_by_database, masters, and slaves_for are helpers still to be written)
let mut violations = Vec::new();
for (db_id, db_shards) in shards.grouped_by_database() {
    for master_shard in db_shards.masters() {
        // Borrow the rack name: indexing a HashMap cannot move the String out.
        // Note that indexing panics if the node is unknown.
        let master_rack = &node_to_rack[&master_shard.node_id];
        // Find corresponding slave(s)
        let slaves = db_shards.slaves_for(master_shard.shard_id);
        for slave in slaves {
            let slave_rack = &node_to_rack[&slave.node_id];
            if master_rack == slave_rack {
                violations.push(RackViolation {
                    database: db_id,
                    shard: master_shard.shard_id,
                    master_rack: master_rack.clone(),
                    slave_rack: slave_rack.clone(),
                });
            }
        }
    }
}
Validation Rules
- Basic Rule: Master and slave(s) of the same shard must be on different racks
- Replication Rule: If database has replication enabled, verify slaves exist
- Rack Count: Warn if fewer than 3 racks configured (limited HA)
- Even Distribution: Check if racks have roughly equal node counts
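The first three rules above can be sketched as a self-contained function that runs without a cluster. Everything here is an assumption for illustration: the `Shard` struct, the `Finding` enum, and the idea of pairing a master with its slaves by matching slot range are hypothetical and not the real redisctl types. The Even Distribution rule is left out to keep the sketch short.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Role { Master, Slave }

// Hypothetical, simplified shard record; not the real redisctl type.
#[derive(Debug, Clone)]
struct Shard {
    uid: &'static str,
    db: &'static str,
    node_id: u32,
    role: Role,
    // Assumption: a master and its slaves report the same slot range.
    slots: &'static str,
}

#[derive(Debug, PartialEq)]
enum Finding {
    SameRack { db: String, shard: String, rack: String },
    MissingSlave { db: String, shard: String },
    FewRacks { count: usize },
}

fn validate(
    shards: &[Shard],
    node_to_rack: &HashMap<u32, String>,
    replicated_dbs: &HashSet<&str>,
) -> Vec<Finding> {
    let mut findings = Vec::new();
    // Rack Count: fewer than 3 distinct racks limits HA.
    let racks: HashSet<&String> = node_to_rack.values().collect();
    if racks.len() < 3 {
        findings.push(Finding::FewRacks { count: racks.len() });
    }
    for master in shards.iter().filter(|s| s.role == Role::Master) {
        let slaves: Vec<&Shard> = shards
            .iter()
            .filter(|s| s.role == Role::Slave && s.db == master.db && s.slots == master.slots)
            .collect();
        // Replication Rule: a replicated database needs a slave per master.
        if slaves.is_empty() && replicated_dbs.contains(master.db) {
            findings.push(Finding::MissingSlave {
                db: master.db.to_string(),
                shard: master.uid.to_string(),
            });
        }
        // Basic Rule: master and slave must sit on different racks.
        // Nodes without a known rack are skipped here rather than flagged.
        let Some(master_rack) = node_to_rack.get(&master.node_id) else {
            continue;
        };
        for slave in slaves {
            if node_to_rack.get(&slave.node_id) == Some(master_rack) {
                findings.push(Finding::SameRack {
                    db: master.db.to_string(),
                    shard: master.uid.to_string(),
                    rack: master_rack.clone(),
                });
            }
        }
    }
    findings
}

// Sample data: redis:4 and its slave both land on rack-1 (nodes 1 and 4).
fn sample() -> (Vec<Shard>, HashMap<u32, String>) {
    let racks = HashMap::from([
        (1, "rack-1".to_string()),
        (2, "rack-2".to_string()),
        (3, "rack-3".to_string()),
        (4, "rack-1".to_string()),
    ]);
    let shards = vec![
        Shard { uid: "redis:4", db: "prod-db", node_id: 1, role: Role::Master, slots: "0-8191" },
        Shard { uid: "redis:6", db: "prod-db", node_id: 4, role: Role::Slave, slots: "0-8191" },
        Shard { uid: "redis:5", db: "prod-db", node_id: 2, role: Role::Master, slots: "8192-16383" },
        Shard { uid: "redis:7", db: "prod-db", node_id: 3, role: Role::Slave, slots: "8192-16383" },
    ];
    (shards, racks)
}

fn main() {
    let (shards, racks) = sample();
    let replicated = HashSet::from(["prod-db"]);
    for finding in validate(&shards, &racks, &replicated) {
        println!("{finding:?}");
    }
}
```

Returning a flat list of findings, rather than failing on the first one, lets the command report every issue in a single run and leaves severity mapping to a later step.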
Status Levels
- ✅ Compliant: All shards follow rack-aware rules
- ⚠️ Warning: Minor issues (e.g., uneven distribution)
- ❌ Violated: Critical issues (master/slave on same rack)
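One possible way to derive the overall status is to order the levels by severity and take the worst one found. The sketch below assumes hypothetical `Status` and `Issue` enums; the mapping of fewer than 3 racks to a warning follows the Rack Count rule above, and nothing here reflects an existing redisctl type.

```rust
// Variants are declared from least to most severe, so the derived `Ord`
// lets `max()` pick the worst status.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Status {
    Compliant,
    Warning,
    Violated,
}

// Hypothetical issue kinds, named after the validation rules in this issue.
#[derive(Debug, Clone, Copy)]
enum Issue {
    UnevenDistribution,
    FewRacks,
    SameRack,
}

fn severity(issue: Issue) -> Status {
    match issue {
        Issue::UnevenDistribution | Issue::FewRacks => Status::Warning,
        Issue::SameRack => Status::Violated,
    }
}

/// The overall status is the most severe status among all issues,
/// or `Compliant` when there are none.
fn overall(issues: &[Issue]) -> Status {
    issues
        .iter()
        .map(|issue| severity(*issue))
        .max()
        .unwrap_or(Status::Compliant)
}

fn main() {
    let issues = [Issue::FewRacks, Issue::SameRack];
    println!("Rack-Aware Status: {:?}", overall(&issues));
}
```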
rladmin Equivalent
# rladmin command
rladmin verify rack_aware
# Example output (text)
Checking rack aware configuration...
Database: db:1
Shard 1: Master on rack-1, Slave on rack-2 [OK]
Shard 2: Master on rack-2, Slave on rack-1 [OK]
Database: db:2
Shard 3: Master on rack-1, Slave on rack-1 [VIOLATION]
Rack aware status: VIOLATED
Benefits
- HA validation - Ensure high availability configuration is correct
- Proactive monitoring - Catch misconfigurations before failures
- Automation - JSON output for CI/CD checks
- Remote access - No SSH required (unlike rladmin)
- Actionable recommendations - Tells you what to fix
Future Enhancements
- Add --fix flag to auto-migrate violating shards
- Support for Active-Active (CRDB) rack awareness
- Integration with cluster verify-balance for combined health check
- Alert webhooks for violations
Related
- Issue Enhancement: Consider adding rladmin-inspired features #416 - Parent issue for rladmin-inspired features
- Issue feat: Add cluster balance verification command #418 - Cluster balance verification (complementary)
- examples/presentation/RLADMIN_COMPARISON.md - Feature comparison
Priority
Medium - Important for HA deployments, but not all clusters use rack awareness.