ulab-uiuc/RouteProfile

RouteProfile: Elucidating the Design Space of LLM Profiles for Routing


🧩 Overview

RouteProfile is a general framework for designing LLM profiles for routing. It formulates LLM profiling as a structured information integration problem over heterogeneous interaction histories, enabling more principled and effective routing across queries, domains, and models.

Highlights:

  • General profile design space: Define LLM profiles along four dimensions: organizational form, representation type, aggregation depth, and learning configuration.
  • Comprehensive evaluation: Evaluate LLM profiles across three representative routers under both standard routing and new-LLM routing settings.


πŸš€ Get Started

Pipeline Overview


Step 1: Data Collection   β†’  profile_data/                        (manual / provided)
Step 2: Build Data Graph  β†’  results/result_data_graph/{mode}/
Step 3: Build Profile     β†’  results/model_profile_result/{mode}/
Step 4: Route & Evaluate  β†’  results/routing_result/{mode}/

Two routing settings:

| Mode | Description |
|------|-------------|
| `standard` | Standard routing with a known set of candidate LLMs |
| `newllm` | Generalisation to newly introduced, unseen LLMs |

Installation


pip install routeprofile

For Text-GNN profiles (requires vLLM):

pip install "routeprofile[text-gnn]"

Install from source (editable):

git clone https://github.com/ulab-uiuc/RouteProfile.git
cd RouteProfile
pip install -e .

Profiling Methods


| Method | File | Org. form | Repr. type | Agg. depth | Learning |
|--------|------|-----------|------------|------------|----------|
| `flat` | `flat.npz` | Flat | Text | 0 | Training-free |
| `index` | `index.npz` | Flat | Embedding | 0 | Training-free |
| `emb_gnn` | `emb_gnn.npz` | Structured | Embedding | Multi-hop | Training-free |
| `text_gnn` | `text_gnn.npz` | Structured | Text | Multi-hop | Training-free |
| `trainable` | `trainable_gnn.npz` | Structured | Embedding | Multi-hop | Trainable |
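Each method writes its profile to a NumPy `.npz` archive under `results/model_profile_result/{mode}/`. A minimal sketch of inspecting such an archive follows; note that the array names used here (`model_names`, `embeddings`) are illustrative assumptions, not the package's documented schema:

```python
import numpy as np

# Build a toy archive in the same .npz container format; the array
# names ("model_names", "embeddings") are illustrative assumptions.
np.savez("toy_profile.npz",
         model_names=np.array(["model-a", "model-b"]),
         embeddings=np.zeros((2, 128), dtype=np.float32))

# NpzFile supports the context-manager protocol and lazy array access.
with np.load("toy_profile.npz") as data:
    print(data.files)                 # names of the stored arrays
    print(data["embeddings"].shape)   # (2, 128)
```

Inspecting the `.files` attribute first is a safe way to discover the actual keys of any profile the pipeline produces.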

πŸ§ͺ Python Usage

All functions are importable directly from routeprofile:

import routeprofile
print(routeprofile.__version__)  # "0.1.0"

Step 2 β€” Build Data Graphs


from routeprofile import (
    build_task_graph,
    build_query_graph,
    build_query_task_graph,
    build_task_domain_graph,
    build_query_task_domain_graph,
)

# Uses default profile_data/ inputs; outputs to results/result_data_graph/standard/
build_task_graph(mode="standard")

# Override any input/output path
build_query_task_domain_graph(
    mode="standard",
    json="profile_data/model_feature_standard.json",
    arch="profile_data/model_family_feature.json",
    dataset="profile_data/task_feature.json",
    query="profile_data/task_queries_standard.json",
    domain_map="profile_data/domain_task_map.json",
    domain_feat="profile_data/domain_feature.json",
    save="results/result_data_graph/standard/query_task_domain_graph_full.pt",
)

Step 3a β€” Training-Free Profiles


from routeprofile import (
    build_flat_profile,
    build_emb_gnn_profile,
    build_index_profile,
    build_text_gnn_profile,
)

# Flat: Longformer encoding of model text + sampled neighbours
build_flat_profile(mode="standard")
# β†’ results/model_profile_result/standard/flat.npz

# Index: random vector baseline (no text or graph)
build_index_profile(mode="standard")
# β†’ results/model_profile_result/standard/index.npz

# Emb-GNN: K-hop neighbourhood propagation (training-free)
build_emb_gnn_profile(
    mode="standard",
    graph="results/result_data_graph/standard/task_graph_full.pt",
    K=2,
    norm="sym",   # "sym" | "rw" | "none"
    save="results/model_profile_result/standard/emb_gnn.npz",
)

# Text-GNN: LLM-based text aggregation per hop (requires vLLM)
build_text_gnn_profile(
    mode="standard",
    graph="results/result_data_graph/standard/query_task_domain_graph_full.pt",
    K=1,
    model="Qwen/Qwen2.5-7B-Instruct",
    tp=1,                        # tensor parallel size (number of GPUs)
    gpu_memory_utilization=0.6,  # fraction of GPU memory for vLLM
    keep=[],                     # [] = save all models; None = TARGET_MODELS only
    emb_save="results/model_profile_result/standard/text_gnn.npz",
)

Step 3b β€” Trainable GNN Profile (HANConv)


from routeprofile import build_trainable_gnn_profile

build_trainable_gnn_profile(
    mode="standard",
    graph="results/result_data_graph/standard/task_graph_full.pt",
    hidden_dim=256,
    out_dim=128,
    epochs=100,
    save_emb="results/model_profile_result/standard/trainable_gnn.npz",
    save_ckpt="results/trained_trainable_gnn/standard/pretrain_ckpt.pt",
)

Step 4 β€” Routing Evaluation


from routeprofile import call_simrouter, call_mlprouter, call_graphrouter

# SimRouter: training-free cosine similarity routing
call_simrouter(
    model_profile_path="results/model_profile_result/standard/flat.npz",
    routing_data_path="route_data/routing_test_data.json",
    output_path="results/routing_result/standard/SimRouter_results.json",
)

# MLPRouter: pairwise-ranking MLP
call_mlprouter(
    model_profile_path="results/model_profile_result/standard/emb_gnn.npz",
    training_data_path="route_data/pairwise_training_data_standard.json",
    testing_data_path="route_data/routing_test_data.json",
    output_path="results/routing_result/standard/MLPRouter_results.json",
    save_ckpt="results/trained_MLPRouter/standard/mlp_router_ckpt.pt",
    epochs=50,
)

# GraphRouter: bipartite GAT
call_graphrouter(
    model_profile_path="results/model_profile_result/standard/trainable_gnn.npz",
    training_data_path="route_data/pairwise_training_data_standard.json",
    testing_data_path="route_data/routing_test_data.json",
    output_path="results/routing_result/standard/GraphRouter_results.json",
    save_ckpt="results/trained_GraphRouter/standard/graphrouter_ckpt.pt",
    epochs=50,
)

You can also import the router classes directly:

from routeprofile import SimRouter, MLPRouter, GraphRouter

🧭 CLI Usage

After installation, every step of the pipeline is available as a command-line tool:

# Step 2: Build graphs (outputs to results/result_data_graph/{mode}/)
routeprofile-build-task-graph               --mode standard
routeprofile-build-query-graph              --mode standard
routeprofile-build-query-task-graph         --mode standard
routeprofile-build-task-domain-graph        --mode standard
routeprofile-build-query-task-domain-graph  --mode standard

# Step 3a: Training-free profiles (outputs to results/model_profile_result/{mode}/)
routeprofile-flat-profile      --mode standard
routeprofile-index-profile     --mode standard
routeprofile-emb-gnn-profile   --mode standard --K 2
routeprofile-trainable-gnn-profile --mode standard --epochs 100

# Step 4: Routing (outputs to results/routing_result/{mode}/)
routeprofile-sim-router \
    --model-profile-path results/model_profile_result/standard/flat.npz \
    --routing-data-path  route_data/routing_test_data.json

routeprofile-mlp-router \
    --model-profile-path  results/model_profile_result/standard/emb_gnn.npz \
    --training-data-path  route_data/pairwise_training_data_standard.json \
    --testing-data-path   route_data/routing_test_data.json \
    --save-ckpt           results/trained_MLPRouter/standard/mlp_router_ckpt.pt

routeprofile-graph-router \
    --model-profile-path  results/model_profile_result/standard/trainable_gnn.npz \
    --training-data-path  route_data/pairwise_training_data_standard.json \
    --testing-data-path   route_data/routing_test_data.json \
    --save-ckpt           results/trained_GraphRouter/standard/graphrouter_ckpt.pt

All commands accept --help for full usage.

πŸ”§ Shell Scripts Usage

# Build all graphs (standard mode)
bash routeprofile/scripts/step2_build_data_graph.sh standard

# All training-free profiles
bash routeprofile/scripts/step3a_training_free_profile.sh standard all

# Text-GNN (requires vLLM + GPU)
bash routeprofile/scripts/step3a_training_free_profile.sh standard text_gnn

# Trainable GNN
bash routeprofile/scripts/step3b_trainable_profile.sh standard

# Routing evaluation
bash routeprofile/scripts/step4_routing_evaluation.sh standard sim flat.npz
bash routeprofile/scripts/step4_routing_evaluation.sh standard all flat.npz

πŸ“Š Extra Information

Directory Structure


RouteProfile/
β”œβ”€β”€ profile_data/                        # Input data (read-only)
β”‚   β”œβ”€β”€ model_feature_standard.json          # Model metadata (standard routing)
β”‚   β”œβ”€β”€ model_feature_newllm.json            # Model metadata (newllm routing)
β”‚   β”œβ”€β”€ model_family_feature.json            # Architecture family descriptions
β”‚   β”œβ”€β”€ task_queries_standard.json           # Queries per benchmark (standard)
β”‚   β”œβ”€β”€ task_queries_newllm.json             # Queries per benchmark (newllm)
β”‚   β”œβ”€β”€ task_feature.json                    # Benchmark task descriptions
β”‚   β”œβ”€β”€ domain_feature.json                  # Task domain descriptions
β”‚   β”œβ”€β”€ domain_task_map.json                 # Domain β†’ benchmark mapping
β”‚   └── candidate_models.json               # Candidate LLM metadata
β”‚
β”œβ”€β”€ route_data/                          # Pre-computed routing data
β”‚   β”œβ”€β”€ routing_test_data.json               # Test queries with model responses
β”‚   β”œβ”€β”€ pairwise_training_data_standard.json # Pairwise training data (standard)
β”‚   └── pairwise_training_data_newllm.json   # Pairwise training data (newllm)
β”‚
β”œβ”€β”€ routeprofile/                        # Library source
β”‚   β”œβ”€β”€ build_data_graph/                    # Step 2: graph construction
β”‚   β”œβ”€β”€ get_model_profile/
β”‚   β”‚   β”œβ”€β”€ training_free/                   # flat, index, emb_gnn, text_gnn
β”‚   β”‚   └── trainable/                       # HANConv self-supervised
β”‚   β”œβ”€β”€ routing_evaluation/                  # SimRouter, MLPRouter, GraphRouter
β”‚   └── scripts/                             # Shell scripts for batch runs
β”‚
β”œβ”€β”€ results/                             # All generated outputs (git ignored)
β”‚   β”œβ”€β”€ result_data_graph/{standard,newllm}/     # Built graphs (.pt)
β”‚   β”œβ”€β”€ model_profile_result/{standard,newllm}/  # Model profiles (.npz)
β”‚   β”œβ”€β”€ routing_result/{standard,newllm}/        # Routing evaluation results (.json)
β”‚   β”œβ”€β”€ trained_trainable_gnn/{standard,newllm}/ # HANConv checkpoints
β”‚   β”œβ”€β”€ trained_MLPRouter/{standard,newllm}/     # MLP router checkpoints
β”‚   └── trained_GraphRouter/{standard,newllm}/   # Graph router checkpoints
β”‚
β”œβ”€β”€ tests/                               # pytest test suite
└── pyproject.toml

Data Formats


profile_data/model_feature_{standard|newllm}.json

Main model metadata. Primary input to all graph builders.

{
  "model-name": {
    "size": "7B",
    "feature": "Natural language description of the model...",
    "architecture": "Qwen2ForCausalLM",
    "detailed_scores": {
      "ifeval": 75.85, "bbh": 53.94, "math": 50.0,
      "gpqa": 29.11, "musr": 40.2, "mmlu_pro": 42.87
    },
    "parameters": 7.616,
    "input_price": 0.2,
    "output_price": 0.2,
    "model": "qwen/qwen2.5-7b-instruct",
    "service": "NVIDIA",
    "api_endpoint": "https://integrate.api.nvidia.com/v1",
    "average_score": 35.2
  }
}
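Since this is plain JSON, the metadata can be loaded with the standard `json` module. The snippet below parses a trimmed-down record and averages its benchmark scores; the averaging itself is only for demonstration:

```python
import json

# A trimmed-down model_feature record (same shape as the file above).
record = json.loads("""
{
  "qwen2.5-7b-instruct": {
    "size": "7B",
    "architecture": "Qwen2ForCausalLM",
    "detailed_scores": {"ifeval": 75.85, "bbh": 53.94, "math": 50.0},
    "average_score": 35.2
  }
}
""")

for name, meta in record.items():
    scores = meta["detailed_scores"]
    mean = sum(scores.values()) / len(scores)
    print(f"{name}: {len(scores)} benchmarks, mean {mean:.2f}")
```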

profile_data/model_family_feature.json

Architecture family descriptions used as architecture node features.

{
  "Qwen2ForCausalLM": "A family of decoder-only Transformer-based large language models developed by Alibaba Cloud...",
  "LlamaForCausalLM": "A family of autoregressive large language models developed by Meta AI..."
}

profile_data/task_feature.json

Natural language description of each benchmark task.

{
  "ifeval": "IFEval (Instruction-Following Evaluation) is a benchmark designed to evaluate the ability of large language models to follow explicit natural language instructions...",
  "bbh":    "BBH (BIG-Bench Hard) is a challenging subset of the BIG-Bench benchmark..."
}

profile_data/domain_task_map.json

Maps broad task domains to specific benchmarks.

{
  "knowledge": ["mmlu", "mmlu_pro", "C-Eval", "AGIEval English", "SQuAD", "gpqa"],
  "reasoning": ["bbh", "TheoremQA", "WinoGrande"],
  "math":      ["math", "gsm8k", "TheoremQA"],
  "coding":    ["human_eval", "mbpp"]
}
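Because a benchmark can belong to several domains (e.g. `TheoremQA` appears under both `reasoning` and `math` above), the reverse mapping is one-to-many. A small sketch of inverting it:

```python
# Subset of domain_task_map.json for illustration.
domain_task_map = {
    "knowledge": ["mmlu", "gpqa"],
    "reasoning": ["bbh", "TheoremQA"],
    "math":      ["math", "gsm8k", "TheoremQA"],
}

# Invert domain -> benchmarks into benchmark -> domains (one-to-many).
task_domain_map = {}
for domain, tasks in domain_task_map.items():
    for task in tasks:
        task_domain_map.setdefault(task, []).append(domain)

print(task_domain_map["TheoremQA"])  # → ['reasoning', 'math']
```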

profile_data/domain_feature.json

Natural language description of each task domain.

{
  "knowledge": "Knowledge tasks test factual recall and information retrieval...",
  "reasoning": "Reasoning tasks require multi-step logical inference...",
  "math":      "Math tasks evaluate quantitative and symbolic problem solving..."
}

profile_data/candidate_models.json

Candidate model metadata including API endpoints and aggregate scores.

{
  "qwen2.5-7b-instruct": {
    "size": "7B",
    "feature": "Qwen2.5-7B-Instruct represents an upgraded version...",
    "input_price": 0.2,
    "output_price": 0.2,
    "model": "qwen/qwen2.5-7b-instruct",
    "service": "NVIDIA",
    "api_endpoint": "https://integrate.api.nvidia.com/v1",
    "average_score": 35.2,
    "detailed_scores": { "ifeval": 75.85, "bbh": 53.94 },
    "parameters": 7.616,
    "architecture": "Qwen2ForCausalLM"
  }
}

profile_data/task_queries_{standard|newllm}.json

Per-benchmark query lists used to build query nodes.

{
  "ifeval": ["Instruction 1...", "Instruction 2...", ...],
  "bbh":    ["Question 1...",   "Question 2...",   ...]
}

route_data/routing_test_data.json

Pre-computed model responses for test queries.

[
  {
    "task_name": "ifeval",
    "query": "Follow these instructions...",
    "ground_truth": "A",
    "metric": "em_mc",
    "choices": "{'text': ['A', 'B', 'C', 'D'], 'labels': ['A', 'B', 'C', 'D']}",
    "model_performance": {
      "qwen2.5-7b-instruct": { "response": "A", "task_performance": 1.0, "success": true }
    }
  }
]
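Since every entry stores a `task_performance` for each candidate, per-model accuracy over the test set can be computed directly from this structure. A sketch over a toy record (field names taken from the format above):

```python
from collections import defaultdict

# Toy test data in the routing_test_data.json shape.
test_data = [
    {
        "task_name": "ifeval",
        "query": "Follow these instructions...",
        "model_performance": {
            "model-a": {"response": "A", "task_performance": 1.0, "success": True},
            "model-b": {"response": "B", "task_performance": 0.0, "success": True},
        },
    },
]

# Average task_performance per model across all test queries.
totals = defaultdict(float)
counts = defaultdict(int)
for entry in test_data:
    for model, perf in entry["model_performance"].items():
        totals[model] += perf["task_performance"]
        counts[model] += 1

for model in sorted(totals):
    print(f"{model}: {totals[model] / counts[model]:.2f}")
```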

route_data/pairwise_training_data_{standard|newllm}.json

Pairwise training data for MLPRouter and GraphRouter. Each entry records which model outperforms which on a given query.

{
  "task_data_count": {
    "agentverse-logicgrid": 1352,
    "gsm8k": 741
  },
  "pairwise_data": [
    {
      "task_name": "agentverse-logicgrid",
      "query": "Q: There are 4 houses...",
      "ground_truth": "B",
      "metric": "em_mc",
      "choices": "{'text': ['1', '2', '3', '4'], 'labels': ['A', 'B', 'C', 'D']}",
      "task_id": null,
      "better_model": "mistral-small-24b-instruct-2501-bf16",
      "worse_model":  "mixtral-8x22b-instruct-v0.1"
    }
  ]
}

Note: Use pairwise_training_data_{mode}.json as training_data_path for MLPRouter and GraphRouter, and routing_test_data.json as testing_data_path.
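A quick sanity check on pairwise data is to count wins and losses per model from the `better_model` / `worse_model` fields. A sketch over a toy list:

```python
from collections import Counter

# Toy pairwise_data entries (only the comparison fields shown).
pairwise_data = [
    {"better_model": "model-a", "worse_model": "model-b"},
    {"better_model": "model-a", "worse_model": "model-c"},
    {"better_model": "model-b", "worse_model": "model-c"},
]

wins = Counter(p["better_model"] for p in pairwise_data)
losses = Counter(p["worse_model"] for p in pairwise_data)
for model in sorted(wins | losses):   # union of all models seen
    print(f"{model}: {wins[model]} wins, {losses[model]} losses")
```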

Candidate Models


The default set of 8 candidate models:

| Model | Size | Architecture |
|-------|------|--------------|
| qwen2.5-7b-instruct | 7B | Qwen2ForCausalLM |
| gemma-2-9b-it | 9B | Gemma2ForCausalLM |
| llama-3.1-8b-instruct | 8B | LlamaForCausalLM |
| mixtral-8x7b-instruct-v0.1 | 46.7B | MixtralForCausalLM |
| mixtral-8x22b-instruct-v0.1 | 141B | MixtralForCausalLM |
| llama-3.2-3b-instruct | 3B | LlamaForCausalLM |
| mistral-small-24b-instruct-2501-bf16 | 24B | MistralForCausalLM |
| llama-3.3-70b-instruct | 70B | LlamaForCausalLM |

Router Methods


| Router | Type | Description |
|--------|------|-------------|
| SimRouter | Training-free | Cosine similarity between query and model embeddings |
| MLPRouter | Trainable | Pairwise ranking loss; query + model encoders |
| GraphRouter | Trainable | Bipartite GAT with edge prediction (BCE loss) |
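SimRouter's training-free rule reduces to picking the model whose profile embedding has the highest cosine similarity with the query embedding. A conceptual sketch with toy 3-d embeddings (the actual router operates on the `.npz` profiles, not hand-written vectors):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def sim_route(query_emb, model_profiles):
    """Route to the model whose profile is most similar to the query."""
    return max(model_profiles, key=lambda m: cosine(query_emb, model_profiles[m]))

# Toy profile embeddings (illustrative only).
profiles = {
    "model-a": [1.0, 0.0, 0.0],
    "model-b": [0.0, 1.0, 0.0],
}
print(sim_route([0.9, 0.1, 0.0], profiles))  # → model-a
```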

πŸ“š Citation

If you use RouteProfile in your research, please cite:

@article{routeprofile2025,
  title={RouteProfile: Elucidating the Design Space of LLM Profiles for Routing},
  year={2025}
}
