docs: Added Deploying LLMs into production + a new ecosystem #4047
Conversation
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
One possible extension to this PR:
Ah, this is a good idea. Conventionally, inference services are hosted by LLM providers. I had been thinking of LangChain as part of the product logic, but after reading your PR I realized we are a valuable inference host too, especially after so much customization through multi-API composition, (prefix) prompt engineering, and complex agent features. Thank you for sharing your great ideas!
… things
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
hwchase17 left a comment
This is more of a Ray integration than a comprehensive guide to deploying LLMs in production. It seems more suitable for the ecosystem section.
Hey @hwchase17, thanks for your input. I'm considering revising the section on "Deployment of LLMs in production." Instead of the current story about how Ray Serve can assist with deployment, I'm thinking of creating a more general tutorial on the key concepts to consider when deploying LLMs (autoscaling, spot-instance serving, defining endpoints, etc.). Then I can link it to the Ray integration page for follow-ups. What do you think of this approach? Do you have any other suggestions or ideas to improve this section?
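For concreteness, here is a minimal sketch (not code from this PR) of the kind of deployment such a guide would discuss: a LangChain chain served behind an autoscaling HTTP endpoint with Ray Serve. The chain, route, and replica counts are hypothetical; it assumes `ray[serve]`, `fastapi`, and `langchain` are installed and `OPENAI_API_KEY` is set.

```python
from fastapi import FastAPI
from ray import serve
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

app = FastAPI()

@serve.deployment(
    # Ray Serve scales replicas up and down based on ongoing request load;
    # the bounds here are illustrative.
    autoscaling_config={"min_replicas": 1, "max_replicas": 4},
)
@serve.ingress(app)
class ChainDeployment:
    def __init__(self):
        prompt = PromptTemplate(
            input_variables=["topic"],
            template="Write a one-sentence summary about {topic}.",
        )
        # Requires OPENAI_API_KEY in the environment.
        self.chain = LLMChain(llm=OpenAI(), prompt=prompt)

    @app.post("/generate")
    def generate(self, topic: str) -> str:
        # Each HTTP request runs the chain on one replica.
        return self.chain.run(topic)

# Deploy locally; in production this would run on a Ray cluster.
serve.run(ChainDeployment.bind())
```

Spot-instance serving would then be a matter of the cluster's node configuration rather than the deployment code, which is part of why the concepts deserve their own general write-up.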
Hey there - also from the Ray team here -- your feedback makes sense, @hwchase17! It's probably better for this PR to actually start the "comprehensive guide for deploying LLMs in production". We could use both the Ray and the BentoML (https://github.com/ssheng/BentoChain) examples as starting points, so that it doesn't just look like a basic Ray integration. The sections of the guide would include the parts that @kouroshHakha mentioned. Thoughts?
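To show the other starting point, here is a minimal sketch in the spirit of the BentoChain example linked above (not its actual code). It assumes BentoML 1.x and `langchain` are installed; the service name and API are hypothetical.

```python
import bentoml
from bentoml.io import Text
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write a one-sentence summary about {topic}.",
)
chain = LLMChain(llm=OpenAI(), prompt=prompt)

svc = bentoml.Service("langchain-demo")

@svc.api(input=Text(), output=Text())
def generate(topic: str) -> str:
    # `bentoml serve service.py:svc` exposes this function as an HTTP endpoint.
    return chain.run(topic)
```

Having both frameworks express the same chain as an endpoint would make the guide read as framework-agnostic concepts plus interchangeable solution starters.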
I think it makes sense. We can keep the comprehensive guide for deploying LLMs in production focused more on conceptual understanding and provide solution starters. From there we can link to more detailed examples.
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Hey @hwchase17, I updated the main section to be more general and more along the lines of a comprehensive guide for deploying LLMs in production. I'm thinking of linking different serving ecosystems (Ray Serve, BentoChain, etc.) to this main doc, and it should live somewhere highly visible in the main doc pages (maybe under LLMs?). I can polish the text/figures a bit more if the outline sounds reasonable to you. Thanks.
…edbacks
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
hwchase17 left a comment
The Ray Serve notebook looks good!
The Deploying LLMs notebook is also good in content, but I think it's in the wrong place. "Modules" is very specific to code in the library, while this is (very good) general-purpose documentation.
I would suggest we move this to the additional resources section (and maybe make it a markdown file; there's no reason for it to be an ipynb).
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Hi @hwchase17, thanks for the suggestions. @kouroshHakha implemented the requested changes. Please have a look.
Co-authored-by: Kamil Kaczmarek <[email protected]>
@hwchase17 I think this PR is ready to be merged. Can someone from your team do a final pass? Thanks.
hwchase17 left a comment
lgtm - thanks! sorry for the delay
…in-ai#4047)
Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Co-authored-by: Kamil Kaczmarek <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>