Skip to content

feat: Add flag config to store stat data into Redis or no #5051

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

fregataa
Copy link
Member

@fregataa fregataa commented Jul 9, 2025

resolves #5027 (BA-1802)

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

@fregataa fregataa self-assigned this Jul 9, 2025
@github-actions github-actions bot added size:L 100~500 LoC comp:manager Related to Manager component comp:agent Related to Agent component labels Jul 9, 2025
@github-actions github-actions bot added the area:docs Documentations label Jul 9, 2025
@@ -606,50 +626,40 @@
async def batch_load_live_stat(
cls, ctx: GraphQueryContext, agent_ids: Sequence[str]
) -> Sequence[Any]:
async def _pipe_builder(r: Redis):
pipe = r.pipeline()
# TODO: make it work

Check notice

Code scanning / devskim

A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note

Suspicious comment
@property
def _metric_query_endpoint(self) -> yarl.URL:
metric_query_addr = self._config_provider.config.metric.address.to_legacy()
return yarl.URL(f"http://{metric_query_addr}/api/v1")

Check warning

Code scanning / devskim

An HTTP-based URL without TLS was detected. Warning

Insecure URL
metric_name: str,
label: DeviceMetricOptionalLabel,
) -> UtilizationMetricType:
# TODO: Define device metadata for each metric and use it rather than hardcoding

Check notice

Code scanning / devskim

A "TODO" or similar was left in source code, possibly indicating incomplete functionality Note

Suspicious comment
Comment on lines +647 to +660
async def _pipe_builder(r: Redis):
pipe = r.pipeline()
for agent_id in agent_ids:
await pipe.get(agent_id)
return pipe

return ret
ret = []
for stat in await redis_helper.execute(ctx.redis_stat, _pipe_builder):
if stat is not None:
ret.append(msgpack.unpackb(stat, ext_hook_mapping=msgpack.uuid_to_str))
else:
ret.append(None)

@classmethod
async def batch_load_cpu_cur_pct(
cls, ctx: GraphQueryContext, agent_ids: Sequence[str]
) -> Sequence[Any]:
ret = []
for stat in await cls.batch_load_live_stat(ctx, agent_ids):
if stat is not None:
try:
ret.append(float(stat["node"]["cpu_util"]["pct"]))
except (KeyError, TypeError, ValueError):
ret.append(0.0)
else:
ret.append(0.0)
return ret
return ret
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part may need to be added to the valkey client implementation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:docs Documentations comp:agent Related to Agent component comp:manager Related to Manager component size:L 100~500 LoC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add flag config to migrate utilization data from Redis to Prometheus
2 participants