Commit fc71480

Merge pull request #14 from kyusonglee/develop/v0.2.5 (Develop/v0.2.5)
2 parents c323939 + 74a9eab

File tree: 470 files changed (+32877, -1000 lines)


.github/workflows/workflow.yml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v3
         with:
-          python-version: '3.10'
+          python-version: '3.11'
       - name: Install Poetry
         uses: snok/install-poetry@v1
       - name: Install dependencies

.gitignore

Lines changed: 6 additions & 1 deletion
@@ -154,4 +154,9 @@ video_cache/
 *.db

 # vscode
-.vscode
+.vscode
+
+# JSON files
+*.json
+!mcp.json
+import os

omagent-core/src/omagent_core/advanced_components/workflow/self_consist_cot/agent/cot_conclude/sys_prompt.prompt renamed to 1.txt

File renamed without changes.

README.md

Lines changed: 8 additions & 17 deletions
@@ -27,6 +27,8 @@ OmAgent is python library for building multimodal language agents with ease. We
 - A flexible agent architecture that provides graph-based workflow orchestration engine and various memory type enabling contextual reasoning.
 - Native multimodal interaction support include VLM models, real-time API, computer vision models, mobile connection and etc.
 - A suite of state-of-the-art unimodal and multimodal agent algorithms that goes beyond simple LLM reasoning, e.g. ReAct, CoT, SC-Cot etc.
+- Supports local deployment of models. You can deploy your own models locally by using Ollama[Ollama](./docs/concepts/models/Ollama.md) or [LocalAI](./examples/video_understanding/docs/local-ai.md).
+- Fully distributed architecture, supports custom scaling. Also supports Lite mode, eliminating the need for middleware deployment.


 ## 🛠️ How To Install
@@ -40,11 +42,6 @@ OmAgent is python library for building multimodal language agents with ease. We
 ```bash
 pip install -e omagent-core
 ```
-- Set Up Conductor Server (Docker-Compose) Docker-compose includes conductor-server, Elasticsearch, and Redis.
-```bash
-cd docker
-docker-compose up -d
-```

 ## 🚀 Quick Start
 ### Configuration
@@ -56,9 +53,7 @@ The container.yaml file is a configuration file that manages dependencies and se
 cd examples/step1_simpleVQA
 python compile_container.py
 ```
-This will create a container.yaml file with default settings under `examples/step1_simpleVQA`.
-
-
+This will create a container.yaml file with default settings under `examples/step1_simpleVQA`. For more information about the container.yaml configuration, please refer to the [container module](./docs/concepts/container.md)

 2. Configure your LLM settings in `configs/llms/gpt.yml`:

@@ -69,14 +64,6 @@ The container.yaml file is a configuration file that manages dependencies and se
 ```
 You can use a locally deployed Ollama to call your own language model. The tutorial is [here](docs/concepts/models/Ollama.md).

-3. Update settings in the generated `container.yaml`:
-- Configure Redis connection settings, including host, port, credentials, and both `redis_stream_client` and `redis_stm_client` sections.
-- Update the Conductor server URL under conductor_config section
-- Adjust any other component settings as needed
-
-
-For more information about the container.yaml configuration, please refer to the [container module](./docs/concepts/container.md)
-
 ### Run the demo

 1. Run the simple VQA demo with webpage GUI:
@@ -91,7 +78,11 @@ For more information about the container.yaml configuration, please refer to the

 ## 🤖 Example Projects
 ### 1. Video QA Agents
-Build a system that can answer any questions about uploaded videos with video understanding agents. See Details [here](examples/video_understanding/README.md).
+Build a system that can answer any questions about uploaded videos with video understanding agents. we provide a gradio based application, see details [here](examples/video_understanding/README.md).
+<p >
+<img src="docs/images/video_understanding_gradio.png" width="500"/>
+</p>
+
 More about the video understanding agent can be found in [paper](https://arxiv.org/abs/2406.16620).
 <p >
 <img src="docs/images/OmAgent.png" width="500"/>
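Editor's note on the Quick Start changes above: a minimal sketch (not part of this commit) of setting the two environment variables the README references before launching the simple VQA demo. The variable names `custom_openai_key` and `custom_openai_endpoint` come from the README; the placeholder values, the assumed default endpoint, and the `subprocess` launch are illustrative only.

```python
# Sketch only: export the endpoint settings that configs/llms/gpt.yml reads,
# then start the webpage demo. Placeholder values must be replaced.
import os
import subprocess

os.environ.setdefault("custom_openai_key", "your_openai_api_key")             # name from the README
os.environ.setdefault("custom_openai_endpoint", "https://api.openai.com/v1")  # assumed default endpoint

# Equivalent to running `python run_webpage.py` inside examples/step1_simpleVQA
subprocess.run(["python", "run_webpage.py"], cwd="examples/step1_simpleVQA", check=True)
```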

docs/concepts/tool_system/mcp.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+# Model Control Protocol (MCP)
+
+OmAgent's Model Control Protocol (MCP) system enables seamless integration with external AI models and services through a standardized interface. This protocol allows OmAgent to dynamically discover, register, and execute tools from multiple external servers, extending its capabilities without modifying the core codebase.
+
+## MCP Configuration File
+
+MCP servers are configured in a JSON file, typically named `mcp.json`. This file defines the servers that OmAgent can connect to. Each server has a unique name, command to execute, arguments, and environment variables.
+
+Here's an example of a basic `mcp.json` file that configures multiple MCP servers:
+
+```json
+{
+  "mcpServers": {
+    "desktop-commander": {
+      "command": "npx",
+      "args": [
+        "-y",
+        "@smithery/cli@latest",
+        "run",
+        "@wonderwhy-er/desktop-commander",
+        "--key",
+        "your-api-key-here"
+      ]
+    },
+    .....
+}
+```
+
+By default, OmAgent looks for this file in the following locations (in order):
+1. Inside the tool_system directory `omagent-cor/src/omagnet_core/tool_system/mcp.json`
+it will be automatically loaded.
+
+## Executing MCP Tools
+
+MCP tools can be executed just like any other tool using the ToolManager:
+
+```python
+# Let the ToolManager choose the appropriate tool
+x = tool_manager.execute_task("command ls -l for the current directory")
+print (x)
+```
+
+For more details on creating MCP servers, refer to the [MCP specification](https://github.com/modelcontextprotocol/python-sdk).
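As a side note on the `mcp.json` format documented above: a minimal sketch (not part of this commit) that loads such a file and prints the configured servers. It assumes only the `mcpServers` layout shown in the example and uses no OmAgent internals; the function name is illustrative.

```python
# Sketch: list the MCP servers declared in an mcp.json before handing it to OmAgent.
import json
from pathlib import Path

def list_mcp_servers(path: str = "mcp.json") -> dict:
    """Return the mcpServers mapping and print one line per configured server."""
    config = json.loads(Path(path).read_text())
    servers = config.get("mcpServers", {})
    for name, spec in servers.items():
        command = spec.get("command", "")
        args = " ".join(spec.get("args", []))
        print(f"{name}: {command} {args}")
    return servers

if __name__ == "__main__":
    list_mcp_servers()
```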

docs/images/reflexion.png

Lines changed: 3 additions & 0 deletions

docs/tutorials/run_agent_full.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+# Run the full version of OmAgent
+OmAgent now supports free switching between Full and Lite versions, the differences between the two versions are as follows:
+- The Full version has better concurrency performance, can view workflows as well as run logs with the help of the orchestration system GUI, and supports more device types (e.g. smartphone apps). Note that running the Full version requires a Docker deployment middleware dependencies.
+- The Lite version is suitable for developers who want to get started faster. It eliminates the steps of installing and deploying Docker, and is suitable for rapid prototyping and debugging.
+
+## Instruction of how to use Full version
+### 🛠️ How To Install
+- python >= 3.10
+- Install omagent_core
+Use pip to install omagent_core latest release.
+```bash
+pip install omagent-core
+```
+Or install the latest version from the source code like below.
+```bash
+pip install -e omagent-core
+```
+- Set Up Conductor Server (Docker-Compose) Docker-compose includes conductor-server, Elasticsearch, and Redis.
+```bash
+cd docker
+docker-compose up -d
+```
+
+### 🚀 Quick Start
+#### Configuration
+
+The container.yaml file is a configuration file that manages dependencies and settings for different components of the system. To set up your configuration:
+
+1. Generate the container.yaml file:
+```bash
+cd examples/step1_simpleVQA
+python compile_container.py
+```
+This will create a container.yaml file with default settings under `examples/step1_simpleVQA`.
+
+
+
+2. Configure your LLM settings in `configs/llms/gpt.yml`:
+
+- Set your OpenAI API key or compatible endpoint through environment variable or by directly modifying the yml file
+```bash
+export custom_openai_key="your_openai_api_key"
+export custom_openai_endpoint="your_openai_endpoint"
+```
+You can use a locally deployed Ollama to call your own language model. The tutorial is [here](docs/concepts/models/Ollama.md).
+
+3. Update settings in the generated `container.yaml`:
+- Configure Redis connection settings, including host, port, credentials, and both `redis_stream_client` and `redis_stm_client` sections.
+- Update the Conductor server URL under conductor_config section
+- Adjust any other component settings as needed
+
+
+For more information about the container.yaml configuration, please refer to the [container module](./docs/concepts/container.md)
+
+#### Run the demo
+
+1. Set the OmAgent to Full version by setting environment variable `OMAGENT_MODE`
+```bash
+export OMAGENT_MODE=full
+```
+or
+```pyhton
+os.environ["OMAGENT_MODE"] = "full"
+```
+2. Run the simple VQA demo with webpage GUI:
+
+For WebpageClient usage: Input and output are in the webpage
+```bash
+cd examples/step1_simpleVQA
+python run_webpage.py
+```
+Open the webpage at `http://127.0.0.1:7860`, you will see the following interface:
+<img src="docs/images/simpleVQA_webpage.png" width="400"/>
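To complement the tutorial above: a minimal sketch (not part of this commit) that picks Full or Lite mode from a command-line flag. Only the `OMAGENT_MODE` variable and its `full`/`lite` values come from the tutorial; the flag name is illustrative, and the variable must be set before any `omagent_core` imports.

```python
# Sketch: choose the OmAgent mode at launch time. Set OMAGENT_MODE before
# importing omagent_core modules, as the tutorial and example scripts do.
import argparse
import os

parser = argparse.ArgumentParser(description="Run an OmAgent demo in Full or Lite mode")
parser.add_argument("--mode", choices=["full", "lite"], default="lite",
                    help="'full' requires the Docker middleware (Conductor, Elasticsearch, Redis)")
args = parser.parse_args()

os.environ["OMAGENT_MODE"] = args.mode
print(f"OMAGENT_MODE set to {os.environ['OMAGENT_MODE']}")
```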

examples/PoT/eval_aqua_zeroshot.py

Lines changed: 7 additions & 3 deletions
@@ -1,10 +1,13 @@
 # Import required modules and components
+import os
+os.environ["OMAGENT_MODE"] = "lite"
+
 from omagent_core.utils.container import container
 from omagent_core.engine.workflow.conductor_workflow import ConductorWorkflow
 from omagent_core.advanced_components.workflow.pot.workflow import PoTWorkflow
 from pathlib import Path
 from omagent_core.utils.registry import registry
-from omagent_core.clients.devices.programmatic.client import ProgrammaticClient
+from omagent_core.clients.devices.programmatic import ProgrammaticClient
 from omagent_core.utils.logger import logging
 import argparse
 import json
@@ -50,6 +53,7 @@ def main():
     # Setup logging and paths
     logging.init_logger("omagent", "omagent", level="INFO")
     CURRENT_PATH = Path(__file__).parents[0]
+    container.register_stm("SharedMemSTM")

     # Initialize agent modules and configuration
     registry.import_module(project_path=CURRENT_PATH.joinpath('agent'))
@@ -84,7 +88,7 @@ def main():
     for r, w in zip(res, workflow_input_list):
         output_json.append({
             "id": w['id'],
-            "question": w['query'],
+            "question": w['query']+'\nOptions: '+str(question['options']),
             "last_output": r['last_output'],
             "prompt_tokens": r['prompt_tokens'],
             "completion_tokens": r['completion_tokens']
@@ -101,7 +105,7 @@ def main():
     # Save results to output file
     if not os.path.exists(args.output_path):
         os.makedirs(args.output_path)
-    with open(f'{args.output_path}/{dataset_name}_{model_id}_POT_output.json', 'w') as f:
+    with open(f'{args.output_path}/{dataset_name}_{model_id.replace("/","-")}_POT_output.json', 'w') as f:
         json.dump(final_output, f, indent=4)

     # Cleanup
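A brief illustration (not part of this commit) of why the `model_id.replace("/","-")` change above matters: model identifiers that contain a slash would otherwise be treated as a directory component in the output path. The identifier and dataset name below are hypothetical.

```python
# Hypothetical values showing the effect of sanitizing the model id in the filename.
model_id = "openai/gpt-4o"   # hypothetical model identifier containing a slash
dataset_name = "aqua"        # hypothetical dataset name

unsafe = f"outputs/{dataset_name}_{model_id}_POT_output.json"
safe = f"outputs/{dataset_name}_{model_id.replace('/', '-')}_POT_output.json"

print(unsafe)  # outputs/aqua_openai/gpt-4o_POT_output.json -> unintended subdirectory
print(safe)    # outputs/aqua_openai-gpt-4o_POT_output.json -> single flat filename
```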

examples/PoT/eval_gsm8k_fewshot.py

Lines changed: 6 additions & 2 deletions
@@ -1,10 +1,13 @@
+import os
+os.environ["OMAGENT_MODE"] = "lite"
+
 # Import required modules and components
 from omagent_core.utils.container import container
 from omagent_core.engine.workflow.conductor_workflow import ConductorWorkflow
 from omagent_core.advanced_components.workflow.pot.workflow import PoTWorkflow
 from pathlib import Path
 from omagent_core.utils.registry import registry
-from omagent_core.clients.devices.programmatic.client import ProgrammaticClient
+from omagent_core.clients.devices.programmatic import ProgrammaticClient
 from omagent_core.utils.logger import logging
 import argparse
 import json
@@ -114,6 +117,7 @@ def main():
     # Setup logging and paths
     logging.init_logger("omagent", "omagent", level="INFO")
     CURRENT_PATH = Path(__file__).parents[0]
+    container.register_stm("SharedMemSTM")

     # Initialize agent modules and configuration
     registry.import_module(project_path=CURRENT_PATH.joinpath('agent'))
@@ -164,7 +168,7 @@ def main():
     # Save results to output file
     if not os.path.exists(args.output_path):
         os.makedirs(args.output_path)
-    with open(f'{args.output_path}/{dataset_name}_{model_id}_POT_output.json', 'w') as f:
+    with open(f'{args.output_path}/{dataset_name}_{model_id.replace("/","-")}_POT_output.json', 'w') as f:
         json.dump(final_output, f, indent=4)

     # Cleanup
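Both PoT evaluation scripts gain the same Lite-mode setup: set `OMAGENT_MODE` before any `omagent_core` imports, import `ProgrammaticClient` from its new path, and register the `SharedMemSTM` short-term memory. Below is a condensed sketch of that shared boilerplate; the import paths and names are taken from the diffs above, while the rest of a real script (workflow construction, inputs, execution) is omitted.

```python
# Condensed Lite-mode setup shared by the PoT evaluation scripts.
import os
os.environ["OMAGENT_MODE"] = "lite"   # must be set before omagent_core imports

from pathlib import Path
from omagent_core.utils.container import container
from omagent_core.utils.registry import registry
from omagent_core.clients.devices.programmatic import ProgrammaticClient  # new import path in this commit
from omagent_core.utils.logger import logging

logging.init_logger("omagent", "omagent", level="INFO")
CURRENT_PATH = Path(__file__).parents[0]
container.register_stm("SharedMemSTM")   # short-term memory backend registered for Lite mode
registry.import_module(project_path=CURRENT_PATH.joinpath("agent"))
# ... build the PoT workflow and run it with ProgrammaticClient (omitted here) ...
```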
