Kuantifier's Prometheus queries currently assume that each pilot runs in a distinctly named pod (such as those scheduled via a Deployment or CronJob). UNL is contributing capacity via JupyterHub, which repeatedly schedules pods of the same name in series, so two distinct instances of a pod named e.g. `jupyter-mwestphall-0` might be scheduled on two different days. Kuantifier's queries don't handle this well: some take the value from a single instance of the pod name, while others aggregate data across all instances of that name. Queries that come from `kube-state-metrics` also support a `uid` field that can be used to differentiate between distinct instances of a shared pod name, but queries directly to the Kubernetes control plane typically don't. We'll likely need a fairly substantial set of code changes to support reporting on this use case.
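To illustrate the distinction, here is a minimal sketch (not Kuantifier's actual code) of keying results on the pod name plus `uid` instead of the pod name alone. The sample data, the `uid` values, and the `group_by_pod_instance` helper are all hypothetical; the dictionaries are only shaped like the `result` entries of a Prometheus instant-query response.

```python
from collections import defaultdict

# Hypothetical samples: a label set plus a [timestamp, value] pair.
# The first two share a pod name but are distinct pod instances.
samples = [
    {"metric": {"pod": "jupyter-mwestphall-0", "uid": "aaaa-1111"},
     "value": [1700000000, "3600"]},
    {"metric": {"pod": "jupyter-mwestphall-0", "uid": "bbbb-2222"},
     "value": [1700086400, "1800"]},
    {"metric": {"pod": "other-pod-0", "uid": "cccc-3333"},
     "value": [1700086400, "600"]},
]


def group_by_pod_instance(samples):
    """Sum sample values per (pod, uid) so reused pod names stay separate."""
    totals = defaultdict(float)
    for sample in samples:
        labels = sample["metric"]
        # Metrics without a uid label fall back to an empty string, which
        # collapses all instances of that pod name into one record.
        key = (labels["pod"], labels.get("uid", ""))
        totals[key] += float(sample["value"][1])
    return dict(totals)


per_instance = group_by_pod_instance(samples)
```

Grouping on the pod name alone would merge the two `jupyter-mwestphall-0` instances into a single record. For metrics that carry no `uid` label, this sketch still merges them, so those queries would need some other way to tell instances apart.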