TypeError: StopIteration interacts badly with generators and cannot be raised into a Future #4165

@guchunxiao

System Info

CUDA Version: 13.0
Driver Version: 580.95.05
Operating system: Ubuntu
Model: DeepSeek-R1-Distill-Llama-70B-GGUF

Running Xinference with Docker?

  • docker
  • pip install
  • installation from source

Version info

v1.9.0

The command used to start Xinference

docker run -d \
  --name xinference-server \
  -v /data/xinference/.xinference:/root/.xinference \
  -v /data/xinference/.cache/huggingface:/root/.cache/huggingface \
  -v /data/xinference/.cache/modelscope:/root/.cache/modelscope \
  -p 9997:9997 \
  --gpus all \
  xprobe/xinference:v1.7.0 \
  xinference-local -H 0.0.0.0

Reproduction

1. Use the quantized model files that were already downloaded locally:
DeepSeek-R1-Distill-Llama-70B-Q6_K-00001-of-00002.gguf
DeepSeek-R1-Distill-Llama-70B-Q6_K-00002-of-00002.gguf

2. Launch parameters (set in the Web UI):

Model Path: /root/.xinference/models/deepseek-70B-Q8/DeepSeek-R1-Distill-Llama-70B-Q8_0-00001-of-00002.gguf

Model Engine: vLLM
Model Format: ggufv2
Model Size: 70
Quantization: Q8_0
GPU Count per worker: 2
N GPU Layers: -1
Replica: 1

gpu_memory_utilization: 0.8
max_model_len: 8000
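
Steps 1 and 2 name different quantizations: the files listed in step 1 are Q6_K shards, while the Model Path and Quantization fields in step 2 say Q8_0. Whether that mismatch is related to the error is not established here. As a small sanity check, the quantization tag can be read back from a shard filename and compared with the configured value. This is only a sketch: `quant_tag` and `SHARD_RE` are hypothetical helper names, and the pattern assumes the `<name>-<quant>-NNNNN-of-NNNNN.gguf` naming used by the files above.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# Assumed shard naming: <name>-<quant>-00001-of-00002.gguf
SHARD_RE = re.compile(
    r"-(?P<quant>[A-Za-z0-9_]+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.gguf$"
)


def quant_tag(path: str) -> Optional[str]:
    """Return the quantization tag embedded in a GGUF shard filename, or None."""
    match = SHARD_RE.search(PurePosixPath(path).name)
    return match.group("quant") if match else None


configured = "Q8_0"  # value of the Quantization field in step 2
downloaded = "DeepSeek-R1-Distill-Llama-70B-Q6_K-00001-of-00002.gguf"  # from step 1

if quant_tag(downloaded) != configured:
    print(f"quantization mismatch: file is {quant_tag(downloaded)}, configured {configured}")
```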

3. Launching the model fails with the following log:
INFO 10-22 18:07:38 [__init__.py:239] Automatically detected platform cuda.
2025-10-22 18:07:41,960 xinference.core.model 496 INFO Start requests handler.
/opt/inference/xinference/model/llm/core.py:143: UserWarning: enable_thinking cannot be disabled for non hybrid model, will be ignored
  warnings.warn(
ERROR:asyncio:Exception in callback _chain_future.<locals>._set_state(<Future pendi...ask_wakeup()]>, <Future at 0x...StopIteration>) at /usr/lib/python3.10/asyncio/futures.py:379
handle: <Handle _chain_future.<locals>._set_state(<Future pendi...ask_wakeup()]>, <Future at 0x...StopIteration>) at /usr/lib/python3.10/asyncio/futures.py:379>
Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/lib/python3.10/asyncio/futures.py", line 381, in _set_state
    _copy_future_state(other, future)
  File "/usr/lib/python3.10/asyncio/futures.py", line 357, in _copy_future_state
    dest.set_exception(_convert_future_exc(exception))
TypeError: StopIteration interacts badly with generators and cannot be raised into a Future
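
The last frames show asyncio copying the result of one future into another and refusing to store a `StopIteration`. The mechanism can be reproduced outside Xinference with the standard library alone. The sketch below is not taken from Xinference or vLLM; `stop_iteration_into_future` is a hypothetical name. On the Python 3.10 shown in the traceback, `set_exception` raises the `TypeError`; newer interpreters may instead store a `RuntimeError` carrying the same message, so the sketch handles both.

```python
import asyncio


def stop_iteration_into_future() -> str:
    """Try to store a StopIteration in an asyncio Future and report what happens."""
    loop = asyncio.new_event_loop()
    try:
        fut = loop.create_future()
        try:
            fut.set_exception(StopIteration())
        except TypeError as exc:
            # Behaviour seen in the traceback above: the call itself is rejected.
            return f"{type(exc).__name__}: {exc}"
        # Otherwise the interpreter stored a replacement exception on the future.
        stored = fut.exception()
        return f"{type(stored).__name__}: {stored}"
    finally:
        loop.close()


print(stop_iteration_into_future())
```

The log does not establish where the `StopIteration` originates. When the source is a bare `next(iterator)` running in a worker thread, a common way to keep it out of the future is to call `next(iterator, sentinel)` and check for the sentinel instead.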

Expected behavior

The model launches and runs successfully with the vLLM engine.
