Question about the evaluation setting

https://github.com/scaleapi/mcp-atlas/blob/867003a0d259f4c706e2a653d4d465a002fe9835/services/mcp_eval/mcp_completion/llm.py#L49-L56

Hi, I noticed that every model is called without a reasoning_effort argument. Were all model results on the leaderboard produced with reasoning_effort disabled? Does MCP-Atlas specifically test tool-calling ability without any reasoning before each tool call?

	response = await litellm.acompletion(
		model=model,
		messages=litellm_messages,
		tools=litellm_tools,
		api_key=config.LLM_API_KEY,
		api_base=config.LLM_BASE_URL,
		timeout=config.DEFAULT_TIMEOUT,
	)
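
To make the question concrete, here is a minimal sketch of how an optional reasoning setting could be forwarded to this call. It assumes litellm's `reasoning_effort` keyword is what would be used; `build_completion_kwargs` is a hypothetical helper for illustration and is not part of mcp-atlas. Whether omitting the argument disables reasoning or falls back to the provider default depends on the model, so that part is an open assumption too.

```python
from typing import Any, Optional


def build_completion_kwargs(
    model: str,
    messages: list[dict[str, Any]],
    tools: list[dict[str, Any]],
    reasoning_effort: Optional[str] = None,
) -> dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for litellm.acompletion."""
    kwargs: dict[str, Any] = {
        "model": model,
        "messages": messages,
        "tools": tools,
    }
    # Forward reasoning_effort only when it is explicitly set, so that the
    # default path produces the same arguments as the current call.
    if reasoning_effort is not None:
        kwargs["reasoning_effort"] = reasoning_effort
    return kwargs


# Intended use (not executed here; requires litellm and credentials):
#   response = await litellm.acompletion(
#       **build_completion_kwargs(model, litellm_messages, litellm_tools, "high"),
#       api_key=config.LLM_API_KEY,
#       api_base=config.LLM_BASE_URL,
#       timeout=config.DEFAULT_TIMEOUT,
#   )
```

With a switch like this, the leaderboard settings could state per model whether a reasoning effort was passed or left unset.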

Question about the evaluation Setting #13

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Question about the evaluation Setting #13

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions