In this workshop we will see the benefits of latency-aware load balancing. This
directory contains a docker-compose.yml file which defines:
- 10 simple backend servers
- Slow Cooker, a load generator which is configured to send requests to the servers
- Prometheus, to collect metrics from the servers and from Slow Cooker
- Grafana, to display the metrics in Prometheus
Slow Cooker uses a naive Round Robin algorithm to send an equal number of requests to each backend server.
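Round robin, as described above, hands each incoming request to the next backend in a fixed rotation, so every server receives the same share of traffic regardless of how quickly it responds. A minimal sketch in Python (the backend addresses are placeholders, not the actual container names):

```python
import itertools

# Hypothetical backend addresses standing in for the ten servers defined
# in docker-compose.yml; the real service names may differ.
backends = [f"server-{i}:8501" for i in range(10)]

def round_robin(backends):
    """Yield backends in a fixed rotation, one per request."""
    return itertools.cycle(backends)

picker = round_robin(backends)
first_twenty = [next(picker) for _ in range(20)]
# Each of the 10 backends appears exactly twice in first_twenty:
# round robin splits traffic evenly, no matter how slow a server is.
```

If one of the ten servers is slow, round robin keeps sending it the same share of requests, which drags up tail latency for the whole fleet.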
Start the above containers by running:

```bash
docker-compose build && docker-compose up -d
```

View the Grafana dashboard:

```bash
open http://localhost:3000 # or docker ip address
```

Note down the following values:
- p50 latency: ____
- p95 latency: ____
- p99 latency: ____
- success rate: ____
Notice the distribution of request volume per instance. Do some servers seem to be serving more requests than others, or are they all roughly the same?
Now let's add a Linkerd service to the mix. Paste this section into the bottom
of docker-compose.yml:

```yaml
  linkerd:
    image: buoyantio/linkerd:1.3.5
    ports:
    - 4140:4140
    - 9990:9990
    volumes:
    - ./linkerd.yml:/io/buoyant/linkerd/config.yml:ro
    - ./disco:/disco
    command:
    - "/io/buoyant/linkerd/config.yml"
```

Now let's point the load generator at Linkerd, rather than directly at the
application. In docker-compose.yml, in the slow_cooker service section,
replace http://server:8501 with http://linkerd:4140:
```yaml
    command: >
      -c 'sleep 15 && slow_cooker -noreuse -metric-addr :8505 -qps 10 -concurrency 50 -interval 5s -totalRequests 10000000 http://linkerd:4140'
```

Linkerd reads its configuration from linkerd.yml. Edit linkerd.yml to use
ewma as the load balancer instead of p2c:
```yaml
loadBalancer:
  # The p2c load balancer is a good general-purpose load balancing algorithm
  # that attempts to send requests to the destination with the fewest
  # currently pending requests. The ewma load balancer (Exponentially
  # Weighted Moving Average) is a latency-aware load balancing algorithm
  # that performs better when latency is a good indicator of load.
  kind: ewma
```

Redeploy the containers and look at the Grafana dashboard again:

```bash
docker-compose up -d
open http://localhost:3000 # or docker ip address
```

Now note the following values:
- p50 latency: ____
- p95 latency: ____
- p99 latency: ____
- success rate: ____
Notice the distribution of request volume per instance. Do some servers seem to be serving more requests than others, or are they all roughly the same?
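To see what latency awareness buys you, here is a rough sketch of an EWMA-style balancer in Python. This is illustrative only: the alpha constant, the score formula, and the node names are assumptions, and Linkerd's actual implementation ("peak EWMA", inherited from Finagle) also decays the average over time and combines it with power-of-two-choices sampling.

```python
class EwmaNode:
    """Tracks an exponentially weighted moving average of response latency."""

    def __init__(self, name, alpha=0.3):
        self.name = name
        self.alpha = alpha   # illustrative weight given to the newest sample
        self.ewma = None     # average latency in seconds; None until first sample
        self.pending = 0     # in-flight requests to this node

    def observe(self, latency_s):
        """Fold a completed request's latency into the moving average."""
        if self.ewma is None:
            self.ewma = latency_s
        else:
            self.ewma = self.alpha * latency_s + (1 - self.alpha) * self.ewma

    def score(self):
        # Nodes with high recent latency or many in-flight requests score
        # higher; cold nodes get a neutral score of zero.
        base = self.ewma if self.ewma is not None else 0.0
        return base * (self.pending + 1)

def pick(nodes):
    # Latency-aware choice: the lowest (latency x load) score wins, unlike
    # round robin, which ignores latency entirely.
    return min(nodes, key=lambda n: n.score())

fast = EwmaNode("server-1")
slow = EwmaNode("server-2")
for _ in range(5):
    fast.observe(0.010)   # 10ms responses
    slow.observe(0.250)   # 250ms responses
chosen = pick([fast, slow])
# chosen.name == "server-1": traffic shifts toward the faster backend
```

Because the score also counts pending requests, a fast backend that becomes saturated stops winning, so traffic rebalances instead of piling onto one instance.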
Stop and remove all running containers:

```bash
docker-compose down
```