[Feature] support seed parameter #3161

Merged: 19 commits merged into PaddlePaddle:develop on Aug 6, 2025
Conversation

lizexu123 (Collaborator)

Support a user-supplied seed parameter.

Example usage:

1. Server usage:

1.1 Random output on every request

import openai

ip = "0.0.0.0"
service_http_port = "13188"  # port configured for the service

client = openai.Client(base_url=f"http://{ip}:{service_http_port}/v1", api_key="EMPTY_API_KEY")

response = client.chat.completions.create(
    model="default",
    messages=[
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=1,
    stream=False,
    seed=None,  # this line can also be omitted
)

print(response.choices[0].message.content)
print("\n")

Alternatively, via curl:

curl -X POST "http://10.54.104.207:13188/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "北京天安门在哪里?"}
  ]
}'

1.2 Fixed (deterministic) output

import openai

ip = "0.0.0.0"
service_http_port = "13188"  # port configured for the service

client = openai.Client(base_url=f"http://{ip}:{service_http_port}/v1", api_key="EMPTY_API_KEY")

response = client.chat.completions.create(
    model="default",
    messages=[
        {"role": "user", "content": "北京天安门在哪里?"},
    ],
    temperature=1,
    stream=False,
    seed=1,
)

print(response.choices[0].message.content)
print("\n")

Alternatively, via curl:

curl -X POST "http://10.54.104.207:13188/v1/chat/completions" -H "Content-Type: application/json" -d '{
  "messages": [
    {"role": "user", "content": "北京天安门在哪里?"}
  ],
  "seed":1
}'
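
To sanity-check determinism, the same request can be sent twice with the same seed and the outputs compared. A minimal sketch, reusing the client defined above (the ask helper is hypothetical):

def ask(seed):
    # Hypothetical helper: one chat completion with the given seed.
    resp = client.chat.completions.create(
        model="default",
        messages=[{"role": "user", "content": "北京天安门在哪里?"}],
        temperature=1,
        stream=False,
        seed=seed,
    )
    return resp.choices[0].message.content

assert ask(1) == ask(1)  # identical seed -> identical completion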

2. Offline usage

2.1 Random output

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# Sampling hyperparameters
sampling_params = SamplingParams(temperature=0.1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)

2.2 Fixed (deterministic) output

from fastdeploy.engine.sampling_params import SamplingParams
from fastdeploy.entrypoints.llm import LLM

model_name_or_path = "Qwen/Qwen3-0.6B"

# Sampling hyperparameters
sampling_params = SamplingParams(temperature=0.1, seed=1)
llm = LLM(model=model_name_or_path, tensor_parallel_size=1, reasoning_parser="qwen3")
prompt = "北京天安门在哪里?"
messages = [{"role": "user", "content": prompt}]
output = llm.chat([messages], sampling_params)

print(output)
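
As a quick check, running the same prompt twice with the same seed should produce matching generations. A minimal sketch reusing the llm instance above:

out_a = llm.chat([messages], SamplingParams(temperature=0.1, seed=1))
out_b = llm.chat([messages], SamplingParams(temperature=0.1, seed=1))
print(out_a)
print(out_b)  # expected to match out_a since the seed is fixed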


paddle-bot bot commented Aug 3, 2025

Thanks for your contribution!

@@ -69,6 +69,7 @@ def init_device(self):
else:
raise RuntimeError(f"Not support device type: {self.device_config.device}")

set_random_seed(self.fd_config.model_config.seed)
Collaborator

Is it reasonable that xpu_worker wasn't changed here for XPU? I'd suggest running an XPU test.

Collaborator Author

Added.
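
For reference, a helper like set_random_seed typically seeds every RNG the worker touches. A sketch of such a utility (an illustration under assumptions, not the repository's implementation):

import random
from typing import Optional

import numpy as np
import paddle

def set_random_seed(seed: Optional[int]) -> None:
    # Seed Python, NumPy, and Paddle RNGs so device init is reproducible.
    if seed is not None:
        random.seed(seed)
        np.random.seed(seed)
        paddle.seed(seed)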

@@ -316,6 +316,11 @@ class EngineArgs:
Must be explicitly enabled via the `--enable-logprob` startup parameter to output logprob values.
"""

seed: Optional[int] = None
Collaborator

The docstring here says the default is randomly generated, but config.py above defaults it to 0.

Collaborator Author

Fixed. vLLM defaults to 0, which makes the output fixed on every run; updated accordingly.
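
The semantics settled on can be summarized as: an explicit integer seed gives reproducible sampling, while leaving it unset yields a fresh random stream each run. A hypothetical sketch of that resolution logic (resolve_seed is not a function in the PR):

import random
from typing import Optional

def resolve_seed(seed: Optional[int]) -> int:
    # None -> fresh random seed (output varies run to run);
    # explicit int -> reproducible sampling.
    return random.randint(0, 2**31 - 1) if seed is None else seed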

@@ -484,6 +489,12 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
default=EngineArgs.enable_logprob,
help="Enable output of token-level log probabilities.",
)
model_group.add_argument(
Collaborator

A default of 0 is given here as well.
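
The truncated diff above registers a --seed flag; a sketch of what such a registration looks like (the default and help text here are assumptions):

model_group.add_argument(
    "--seed",
    type=int,
    default=EngineArgs.seed,
    help="Random seed for reproducible sampling; omit for random output.",
)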

@@ -43,6 +43,7 @@ class SamplingMetadata:
top_p: paddle.Tensor
top_k: Optional[paddle.Tensor] = None
min_p: Optional[paddle.Tensor] = None
seed: Optional[paddle.Tensor] = None
Collaborator

The seed value should also be validated.

Collaborator Author

@lizexu123 lizexu123 Aug 4, 2025

Fixed; a validity check has been added.
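
A sketch of the kind of range check discussed (the exact bound is an assumption; samplers commonly cap seeds near the int64 maximum):

def _verify_seed(seed):
    # Reject negative or out-of-range seeds before they reach the sampler.
    if seed is not None and not 0 <= seed <= 2**63 - 1:
        raise ValueError(f"seed must be in [0, 2**63 - 1], got {seed}.")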

seed_value = -1
else:
seed_value = int(sampling_metadata.seed[0, 0])
_, next_tokens = top_k_top_p_sampling(probs, sampling_metadata.top_p, sampling_metadata.top_k, seed=seed_value)
Collaborator

Do we need the extra check on sampling_metadata here? Wouldn't referencing it directly be enough, with better performance?

Collaborator Author

Fixed; it now passes sampling_metadata.seed[0, 0] directly.
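
That is, per the review, the branch and int() conversion are dropped and the tensor element is passed straight through (a sketch of the revised call):

_, next_tokens = top_k_top_p_sampling(
    probs,
    sampling_metadata.top_p,
    sampling_metadata.top_k,
    seed=sampling_metadata.seed[0, 0],
)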

@@ -1112,8 +1108,6 @@ def _dummy_run(
self.proposer.run(share_inputs=self.share_inputs)

# 7. Updata 'infer_seed' and step_cuda()
Collaborator

@Jiang-Jia-Jun Jiang-Jia-Jun Aug 4, 2025

Is this deletion a problem? As I recall, this internal increment was originally added to fix a sampling issue; please confirm with lizhenyu.

Collaborator Author

Restored.
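
The restored step advances each request's inference seed after every decode step so successive samples do not reuse the same random stream. A hypothetical sketch of that update (tensor shape and the wrap-around bound are assumptions):

import paddle

MAX_INFER_SEED = 2**63 - 3  # assumed wrap-around bound
infer_seed = paddle.zeros([4, 1], dtype="int64")  # one seed per request slot
infer_seed = (infer_seed + 1) % MAX_INFER_SEED  # bump and wrap each step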

@iosmers
Collaborator

iosmers commented Aug 6, 2025

Has this been tested on XPU?

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit afff4d3 into PaddlePaddle:develop Aug 6, 2025
10 of 13 checks passed
gzy19990617 pushed a commit to gzy19990617/FastDeploy that referenced this pull request Aug 7, 2025
* support seed

* fix

* add SamplingMetadata seed test

* The next_tokens values are inconsistent!

* add air and rejection seed test

* fix

* add SamplingParams seed test

* fix seed=0

* Default to defualt

* fix

* fix args_utils

* fix review

* fix review

* fix

* fix

* add xpu,gcu,iluvatar support seed

* fix