Simulate sandbox execution of bash code or python code. #2301

Description

@cangcang-zcr

I'm trying to build an agent to execute some shell and python code locally as follows

from langchain import OpenAI, LLMBashChain
from langchain.agents import load_tools, initialize_agent

llm = OpenAI(temperature=0)
llm_bash_chain = LLMBashChain(llm=llm, verbose=True)
print(llm_bash_chain.prompt)

tools = load_tools(["python_repl", "terminal"])
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Delete a file in the local path.")

During the process, I found that the agent can still generate erroneous code, and running that erroneous code directly on the local machine could pose risks and vulnerabilities. Therefore, I thought of setting up a local Docker-based sandbox environment to execute the agent's code, so as to avoid damage to local files or the system.

I tried to set up a web service in Docker that executes Python code and returns the result. The following is a simple demo of the process.

Before starting, I installed Docker and pulled the Python 3.10 image.
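
For reference, pulling the base image only takes one command (the tag matches the FROM line in the Dockerfile below):

docker pull python:3.10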

Create Dockerfile

FROM python:3.10

# upgrade pip and switch to the Tsinghua PyPI mirror for faster installs
RUN pip install -i https://pypi.tuna.tsinghua.edu.cn/simple pip -U \
    && pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

# web framework and ASGI server for the code-execution service
RUN pip install fastapi uvicorn

COPY main.py /app/
WORKDIR /app

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

and I have set up a service in my project folder that accepts and executes code.

main.py

import io
import sys
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Any, Dict
import subprocess

app = FastAPI()


class CodeData(BaseModel):
    code: str
    code_type: str


@app.post("/execute", response_model=Dict[str, Any])
async def execute_code(code_data: CodeData):
    if code_data.code_type == "python":
        buffer = io.StringIO()
        try:
            # capture anything the executed code prints
            sys.stdout = buffer
            exec(code_data.code)
        except Exception as e:
            raise HTTPException(status_code=400, detail=str(e))
        finally:
            # always restore stdout, even if exec() fails
            sys.stdout = sys.__stdout__

        exec_result = buffer.getvalue()
        return {"output": exec_result} if exec_result else {"message": "OK"}
    elif code_data.code_type == "shell":
        try:
            output = subprocess.check_output(code_data.code, stderr=subprocess.STDOUT, shell=True, text=True)
            return {"output": output.strip()} if output.strip() else {"message": "OK"}
        except subprocess.CalledProcessError as e:
            raise HTTPException(status_code=400, detail=str(e.output))

    else:
        raise HTTPException(status_code=400, detail="Invalid code_type")


if __name__ == "__main__":
    import uvicorn

    uvicorn.run("main:app", host="localhost", port=8000)

Then I built the image and started it with Docker on a local port, so the LangChain agent can execute code in the sandbox and get the results back without risking damage to the local environment.
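
As a rough sketch of that step (the image name code-sandbox and the detached-mode flag are just examples), building and starting the container and then smoke-testing the /execute endpoint looks like this:

docker build -t code-sandbox .
docker run -d -p 8000:8000 code-sandbox

# quick check of the /execute endpoint before wiring up the agent
curl -X POST http://localhost:8000/execute \
  -H "Content-Type: application/json" \
  -d '{"code_type": "python", "code": "print(1+2)"}'

Given the service code above, this request should return something like {"output": "3\n"}.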

The agent is as follows:


from langchain.llms import OpenAI
from langchain.agents import initialize_agent
from langchain.tools.base import BaseTool
import ast
import requests


class SandboxTool(BaseTool):
    name = "SandboxTool"
    description = '''Useful for when you need to execute Python code or install a library with pip.
    The input to this tool should be a list of length two:
    the first value is the code_type (type: String), the second value is the code (type: String) to execute.
    For example:
    ["python", "print(1+2)"], ["shell", "pip install langchain"], ["shell", "ls"] ... '''

    def _run(self, query: str) -> str:
        return self.remote_request(query)

    async def _arun(self, tool_input: str) -> str:
        raise NotImplementedError("SandboxTool does not support async")

    def remote_request(self, query: str) -> str:
        # the agent passes a string such as '["python", "print(1+2)"]'
        code_type, code = ast.literal_eval(query)
        url = "http://localhost:8000/execute"
        headers = {
            "Content-Type": "application/json",
        }
        json_data = {
            "code_type": code_type,
            "code": code,
        }
        response = requests.post(url, headers=headers, json=json_data)

        if response.status_code == 200:
            data = response.json()
            return str(data)
        else:
            return f"Request failed, status code: {response.status_code}"


llm = OpenAI(temperature=0)
tool = SandboxTool()
tools = [tool]
sandboxagent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
sandboxagent.run("print result from 5 + 5")

Could this be a feasible sandbox solution?
