[Feature] support seed parameter #3161

Merged: 19 commits merged into PaddlePaddle:develop on Aug 6, 2025
Conversation

lizexu123 (Collaborator)

Support a user-supplied seed parameter.

Example usage:

1. Server usage:

1.1 Random output on every request

import openai

ip = "0.0.0.0"
service_http_port = "13188"  # port configured for the service

client = openai.Client(base_url=f"http://{ip}:{service_http_port}/v1", api_key="EMPTY_API_KEY")

response = client.chat.completions.create(
    model="default",
    messages=[
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=1,
    stream=False,
    seed=None,  # this line can also be omitted
)

print(response.choices[0].message.content)
print("\n")

Alternatively, via curl:

curl -X POST "http://10.54.104.207:13188/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "北京天安门在哪里?"}
  ]
}'

1.2 Fixed (deterministic) output

import openai

ip = "0.0.0.0"
service_http_port = "13188"  # port configured for the service

client = openai.Client(base_url=f"http://{ip}:{service_http_port}/v1", api_key="EMPTY_API_KEY")

response = client.chat.completions.create(
    model="default",
    messages=[
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=1,
    stream=False,
    seed=1,
)

print(response.choices[0].message.content)
print("\n")

Alternatively, via curl:

curl -X POST "http://10.54.104.207:13188/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "北京天安门在哪里?"}
  ],
  "seed":1
}'
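
To sanity-check determinism, the same request can be sent twice with the same seed and the outputs compared. A minimal sketch, reusing the client defined above (the ask helper is hypothetical):

def ask(seed):
    # Hypothetical helper: one chat completion with the given seed.
    resp = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "北京天安门在哪里?"}],
        temperature=1,
        stream=False,
        seed=seed,
    )
    return resp.choices[0].message.content

assert ask(1) == ask(1)  # identical seed -> identical completion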

2. Offline usage

2.1 Random output

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# Sampling hyperparameters
sampling_params = SamplingParams(temperature=0.1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)

2.2 Fixed (deterministic) output

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# Sampling hyperparameters
sampling_params = SamplingParams(temperature=0.1, seed=1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)
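
As a quick check, running the same prompt twice with the same seed should produce matching generations. A minimal sketch reusing the llm instance above:

out_a = llm.chat([messages], SamplingParams(temperature=0.1, seed=1))
out_b = llm.chat([messages], SamplingParams(temperature=0.1, seed=1))
print(out_a)
print(out_b)  # expected to match out_a since the seed is fixed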


paddle-bot bot commented Aug 3, 2025

Thanks for your contribution!

@@ -69,6 +69,7 @@ def init_device(self):
else:
raise RuntimeError(f"Not support device type: {self.device_config.device}")

set_random_seed(self.fd_config.model_config.seed)
Collaborator

Is it reasonable that xpu_worker wasn't changed here for XPU? I'd suggest running an XPU test.

Collaborator Author

Added.
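
For reference, a helper like set_random_seed typically seeds every RNG the worker touches. A sketch of such a utility (an illustration under assumptions, not the repository's implementation):

import random
from typing import Optional

import numpy as np
import paddle

def set_random_seed(seed: Optional[int]) -> None:
    # Seed Python, NumPy, and Paddle RNGs so device init is reproducible.
    if seed is not None:
        random.seed(seed)
        np.random.seed(seed)
        paddle.seed(seed)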

@@ -316,6 +316,11 @@ class EngineArgs:
Must be explicitly enabled via the `--enable-logprob` startup parameter to output logprob values.
"""

seed: Optional[int] = None
Collaborator

The docstring here says the default is randomly generated, but config.py above defaults it to 0.

Collaborator Author

Fixed. vLLM defaults to 0, which makes the output fixed on every run; updated accordingly.
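
The semantics settled on can be summarized as: an explicit integer seed gives reproducible sampling, while leaving it unset yields a fresh random stream each run. A hypothetical sketch of that resolution logic (resolve_seed is not a function in the PR):

import random
from typing import Optional

def resolve_seed(seed: Optional[int]) -> int:
    # None -> fresh random seed (output varies run to run);
    # explicit int -> reproducible sampling.
    return random.randint(0, 2**31 - 1) if seed is None else seed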

@@ -484,6 +489,12 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
default=EngineArgs.enable_logprob,
help="Enable output of token-level log probabilities.",
)
model_group.add_argument(
Collaborator

A default of 0 is given here as well.
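
The truncated diff above registers a --seed flag; a sketch of what such a registration looks like (the default and help text here are assumptions):

model_group.add_argument(
    "--seed",
    type=int,
    default=EngineArgs.seed,
    help="Random seed for reproducible sampling; omit for random output.",
)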

@@ -43,6 +43,7 @@ class SamplingMetadata:
top_p: paddle.Tensor
top_k: Optional[paddle.Tensor] = None
min_p: Optional[paddle.Tensor] = None
seed: Optional[paddle.Tensor] = None
Collaborator

The seed value should also be validated.

Collaborator Author

@lizexu123 lizexu123 Aug 4, 2025

Fixed; a validity check has been added.
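
A sketch of the kind of range check discussed (the exact bound is an assumption; samplers commonly cap seeds near the int64 maximum):

def _verify_seed(seed):
    # Reject negative or out-of-range seeds before they reach the sampler.
    if seed is not None and not 0 <= seed <= 2**63 - 1:
        raise ValueError(f"seed must be in [0, 2**63 - 1], got {seed}.")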

seed_value = -1
else:
seed_value = int(sampling_metadata.seed[0, 0])
_, next_tokens = top_k_top_p_sampling(probs, sampling_metadata.top_p, sampling_metadata.top_k, seed=seed_value)
Collaborator

Do we need the extra check on sampling_metadata here? Wouldn't referencing it directly be enough, with better performance?

Collaborator Author

Fixed; it now passes sampling_metadata.seed[0, 0] directly.
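
That is, per the review, the branch and int() conversion are dropped and the tensor element is passed straight through (a sketch of the revised call):

_, next_tokens = top_k_top_p_sampling(
    probs,
    sampling_metadata.top_p,
    sampling_metadata.top_k,
    seed=sampling_metadata.seed[0, 0],
)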

@@ -1112,8 +1108,6 @@ def _dummy_run(
self.proposer.run(share_inputs=self.share_inputs)

# 7. Updata 'infer_seed' and step_cuda()
Collaborator

@Jiang-Jia-Jun Jiang-Jia-Jun Aug 4, 2025

Is this deletion a problem? As I recall, this internal increment was originally added to fix a sampling issue; please confirm with lizhenyu.

Collaborator Author

Restored.
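
The restored step advances each request's inference seed after every decode step so successive samples do not reuse the same random stream. A hypothetical sketch of that update (tensor shape and the wrap-around bound are assumptions):

import paddle

MAX_INFER_SEED = 2**63 - 3  # assumed wrap-around bound
infer_seed = paddle.zeros([4, 1], dtype="int64")  # one seed per request slot
infer_seed = (infer_seed + 1) % MAX_INFER_SEED  # bump and wrap each step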

@iosmers
Collaborator

iosmers commented Aug 6, 2025

Has this been tested on XPU?

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit afff4d3 into PaddlePaddle:develop Aug 6, 2025
10 of 13 checks passed
gzy19990617 pushed a commit to gzy19990617/FastDeploy that referenced this pull request Aug 7, 2025
* support seed

* fix

* add SamplingMetadata seed test

* The next_tokens values are inconsistent!

* add air and rejection seed test

* fix

* add SamplingParams seed test

* fix seed=0

* Default to defualt

* fix

* fix args_utils

* fix review

* fix review

* fix

* fix

* add xpu,gcu,iluvatar support seed

* fix