vLLM not working as expected with ChatGLM2 #55

Open
kerthcet opened this issue Dec 27, 2023 · 0 comments
Labels
bug Categorizes issue or PR as related to a bug.

Comments

@kerthcet
Member

When chatting with ChatGLM2 via vLLM, each response comes back with only a few tokens, e.g.:

>>> result = chat.completion(
...     messages=[
...         [
...             ChatMessage(role="user", content="中国共有多少人口?"),
...         ],
...         [
...             ChatMessage(role="user", content="中国首富是谁"),
...         ],
...         [
...             ChatMessage(role="user", content="如何在三年内成为中国首富"),
...         ],
...     ],
...     temperature=0.7,  # You can also override the configuration for each conversation.
...     max_tokens=2048,
... )
Processed prompts: 100%|██████████| 3/3 [00:00<00:00, 17.31it/s]
>>> print(result)
[' 根据2021年中国国家统计局发布的数据,截至2020', ' 中国的首富目前的个人财富来自房地产和互联网行业。根据202', ' 成为首富是一个非常具有挑战性和难以预测的因素,而且这个目标并不是每个人']

Each response is cut off mid-sentence after roughly 20 tokens, so the max_tokens=2048 setting does not appear to be taking effect.
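For what it's worth, vLLM's SamplingParams defaults to max_tokens=16 when no value is passed, which would be consistent with the very short outputs above, so the chat wrapper may be dropping the parameter before it reaches vLLM. A minimal sketch to check this by calling vLLM directly (the model path THUDM/chatglm2-6b and the prompt are only for illustration):

# Sketch: call vLLM directly with an explicit SamplingParams to see whether
# max_tokens is honored outside the chat wrapper. The model path below is an
# assumption for illustration.
from vllm import LLM, SamplingParams

llm = LLM(model="THUDM/chatglm2-6b", trust_remote_code=True)

# If the output is still ~16-20 tokens with this set, the truncation happens
# inside vLLM; if it is long here, the wrapper is likely not forwarding max_tokens.
params = SamplingParams(temperature=0.7, max_tokens=2048)

outputs = llm.generate(["中国共有多少人口?"], params)
print(outputs[0].outputs[0].text)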

/kind bug

@kerthcet added the bug label Dec 27, 2023