
Conversation

@kouroshHakha
Contributor

No description provided.

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
@kouroshHakha
Contributor Author

kouroshHakha commented May 3, 2023

One possible extension to this PR:

  • Add an example of self-hosted chains (e.g. gpt2 from hf) to serve.

@skcoirz
Contributor

skcoirz commented May 3, 2023

Ah, this is a good idea. Conventionally, inference services are hosted by LLM providers. I had been thinking of LangChain as part of the product logic, but after reading your PR I realized we are a valuable inference host too, especially given all the customization that comes from multi-API combination, (prefix) prompt engineering, and complex agent features.

Thank you for sharing with us your great ideas!

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Contributor

@hwchase17 left a comment


This is more of a Ray integration than a comprehensive guide on deploying LLMs in production.

This seems more suitable to go in the ecosystem section.

@kouroshHakha
Contributor Author

Hey @hwchase17, thanks for your input. I'm considering revising the "Deployment of LLMs in production" section. Instead of the current story about how Ray Serve can assist with deployment, I'm thinking of creating a more general tutorial on the key concepts to consider when deploying LLMs (autoscaling, spot-instance serving, defining endpoints, etc.). Then I can link it to the Ray integration page for follow-ups. What do you think of this approach? Do you have any other suggestions or ideas to improve this section?

@richardliaw

Hey there, also from the Ray team here. Your feedback makes sense, @hwchase17!

It would probably be better for this PR to actually start the "comprehensive guide for deploying LLMs in production". We could use both the Ray and the BentoML (https://github.com/ssheng/BentoChain) examples as starting points, so that it doesn't just look like a basic Ray integration.

And the sections of the guide would include the parts that @kouroshHakha mentioned.

Thoughts?

@kamil-kaczmarek
Contributor

I think it makes sense. We can keep the comprehensive guide for deploying LLMs in production focused more on conceptual understanding and provide solution starters. From there we can link to more detailed examples.

@kouroshHakha
Contributor Author

Hey @hwchase17, I updated the main section to be more general, more along the lines of a comprehensive guide for deploying LLMs in production. I'm thinking of linking the different serving ecosystems (Ray Serve, BentoChain, etc.) to this main doc, and it should live somewhere highly visible in the main doc pages (maybe under LLMs?). I can polish the text and figures a bit more if the outline sounds reasonable to you. Thanks.

@kouroshHakha changed the title from "docs: Added Deploying LLMs into production" to "docs: Added Deploying LLMs into production + Ray serve ecosystem" on May 10, 2023
@kouroshHakha changed the title from "docs: Added Deploying LLMs into production + Ray serve ecosystem" to "docs: Added Deploying LLMs into production + a new ecosystem" on May 10, 2023
@kouroshHakha requested a review from hwchase17 on May 10, 2023 19:24
Contributor

@hwchase17 left a comment


Ray Serve notebook looks good!

The Deploying LLMs notebook is also good in content, but I think it's in the wrong place. Modules is very specific to code in the library; this is more of a (very good) piece of general-purpose documentation.

I would suggest we move this to the additional resources section (and maybe make it a markdown file; there's no reason for it to be ipynb).

@kamil-kaczmarek
Contributor

Hi @hwchase17 thanks for the suggestions.

@kouroshHakha implemented the requested changes. Please have a look.

@kouroshHakha
Contributor Author

@hwchase17 I think this PR is ready to be merged. Can someone from your team do a final pass? Thanks.

Contributor

@hwchase17 left a comment


lgtm - thanks! Sorry for the delay.

@hwchase17 added the lgtm label Jun 3, 2023
@hwchase17 merged commit 625717d into langchain-ai:master Jun 5, 2023
Undertone0809 pushed a commit to Undertone0809/langchain that referenced this pull request Jun 19, 2023
…in-ai#4047)

Signed-off-by: Kourosh Hakhamaneshi <[email protected]>
Co-authored-by: Kamil Kaczmarek <[email protected]>
Co-authored-by: Harrison Chase <[email protected]>
This was referenced Jun 25, 2023