I am trying to configure LiteLLM to work with Google's Vertex AI models in a restrictive corporate environment. My goal is to have a LiteLLM proxy that can route requests to a custom Vertex AI REST endpoint using a dynamically generated authentication token. This would allow me to leverage frameworks like Google ADK, LangGraph, etc., which integrate smoothly with LiteLLM.
Working Direct-SDK Solution
My environment has several constraints:
- No `gcloud` CLI access.
- No Application Default Credentials (ADC).
- No service account `.json` key files.
- Authentication is handled by a custom Python function, `get_api_key()`, which returns a short-lived bearer token as a string.

I have a fully functional setup using the `google-cloud-aiplatform` SDK directly, which proves that my token and endpoint are valid.
Here is the code that works correctly:
```python
import os

import vertexai
from google.oauth2.credentials import Credentials
from vertexai.generative_models import GenerativeModel


def get_api_key():
    """
    Custom internal function to retrieve a temporary auth token.
    (Implementation is internal to my organization)
    """
    # This function returns a raw token string, e.g., "ey..."
    return "your-dynamic-token-string"


# 1. Create a credentials object from the raw token
credentials = Credentials(token=get_api_key())

# 2. Initialize the Vertex AI client with custom parameters
vertexai.init(
    project="my-gcp-project-id",
    api_transport="rest",
    api_endpoint="https://customendpoint/vertex",  # My organization's custom endpoint
    credentials=credentials,
    location="us-central1",
    request_metadata=[("x-user", os.getenv("USERNAME"))],  # Required metadata
)

# 3. Successfully generate content
# This part works perfectly.
model = GenerativeModel(model_name="gemini-1.0-pro")
response = model.generate_content("Hello, this is a test.")
print(response.text)
```
Challenge: Integrating with LiteLLM
I am now trying to replicate this successful authentication flow within LiteLLM. I have attempted to use LiteLLM's `CustomLLM` feature, but I am encountering an error where my custom provider is not being recognized.
What I've Tried
I created a `config.yaml` file and a corresponding Python file, `vertex_org_llm.py`, for my custom logic.
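The relevant part of my `config.yaml` looks roughly like this (the model alias and provider string are the ones that appear in the error further down; project, location, and endpoint are passed through `litellm_params` so the handler can pick them up from `kwargs`):

```yaml
model_list:
  - model_name: vertex-org-model
    litellm_params:
      # Points at the handler class defined in vertex_org_llm.py
      model: vertex_org_llm.VertexOrgLLM/gemini-1.0-pro
      project: my-gcp-project-id
      location: us-central1
      api_endpoint: https://customendpoint/vertex
```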
And `vertex_org_llm.py` (placed in the same directory):

```python
import os

import vertexai
from google.oauth2.credentials import Credentials
from vertexai.generative_models import GenerativeModel

from litellm.llms.custom_llm import CustomLLM


def get_api_key():
    """
    Custom internal function to retrieve a temporary auth token.
    """
    return "your-dynamic-token-string"


class VertexOrgLLM(CustomLLM):
    """
    Custom LiteLLM provider to handle our specific Vertex AI authentication.
    """

    def __init__(self, model: str, **kwargs):
        super().__init__()
        token = get_api_key()
        if not token:
            raise ValueError("Failed to retrieve API token via get_api_key()")
        creds = Credentials(token=token)

        # Initialize vertexai within the class instance
        vertexai.init(
            project=kwargs.get("project"),
            location=kwargs.get("location"),
            api_transport="rest",
            credentials=creds,
            api_endpoint=kwargs.get("api_endpoint"),
            request_metadata=[("x-user", os.getenv("USERNAME"))],
        )
        self.model = GenerativeModel(model)

    def completion(self, messages, **kwargs):
        """
        Handles non-streaming completion requests.
        """
        prompt = "\n".join([m["content"] for m in messages if m["role"] == "user"])
        response = self.model.generate_content([prompt])
        return {
            "choices": [
                {"message": {"content": response.text}}
            ]
        }

    def streaming(self, messages, **kwargs):
        """
        Handles streaming requests.
        """
        # Placeholder for future implementation
        raise NotImplementedError("Streaming not yet implemented for this custom provider.")
```
The Error
When I run the LiteLLM proxy using `litellm --config config.yaml`, the server fails to load my custom model with the following error:
```
LiteLLM: Proxy initialized with Config, Set models:
    vertex-org-model
11:16:05 - LiteLLM Router:ERROR: router.py:4954 - Error creating deployment: Unsupported provider - vertex_org_llm.VertexOrgLLM, ignoring and continuing with other deployments.
Traceback (most recent call last):
  File "xx\site-packages\litellm\router.py", line 4946, in _create_deployment
    deployment = self._add_deployment(deployment=deployment)
  File "xx\site-packages\litellm\router.py", line 5121, in _add_deployment
    raise Exception(f"Unsupported provider - {custom_llm_provider}")
Exception: Unsupported provider - vertex_org_llm.VertexOrgLLM
```
This `Unsupported provider` error suggests that the LiteLLM router cannot register my `VertexOrgLLM` class from the `vertex_org_llm.py` file.
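For what it's worth, the LiteLLM docs on custom handlers appear to expect a module-level instance of the handler plus a `custom_provider_map` entry under `litellm_settings`, roughly like the sketch below. I have not confirmed that this is the piece I am missing, and I am not sure how `project` / `location` / `api_endpoint` would reach my `__init__` with this pattern, so the `vertex-org` provider name and `vertex_org_instance` variable are just illustrative:

```python
# vertex_org_llm.py -- sketch of the module-level instance the docs seem to expect.
# "vertex_org_instance" is an illustrative name, not something in my current code.
vertex_org_instance = VertexOrgLLM(
    model="gemini-1.0-pro",
    project="my-gcp-project-id",
    location="us-central1",
    api_endpoint="https://customendpoint/vertex",
)
```

```yaml
# config.yaml -- sketch following the custom_provider_map pattern from the LiteLLM docs.
model_list:
  - model_name: vertex-org-model
    litellm_params:
      model: vertex-org/gemini-1.0-pro          # "<provider>/<model>"

litellm_settings:
  custom_provider_map:
    - {"provider": "vertex-org", "custom_handler": vertex_org_llm.vertex_org_instance}
```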
My Questions for the Community
1. How can I correctly register a `CustomLLM` provider?
2. Is this the right approach? Is creating a `CustomLLM` class the intended method for this scenario, or is there a more direct way to pass a dynamic bearer token and custom endpoint to LiteLLM's native Vertex AI provider?
3. Is there an alternative configuration? Given my specific constraints, what is the recommended way to configure LiteLLM? For example, can I somehow pass the `credentials` object created from my token directly to LiteLLM? (The sketch after this list shows the kind of thing I am hoping for.)
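To make question 3 concrete, this is the sort of configuration I am hoping is possible with the native `vertex_ai` provider. I have not verified that the provider accepts a raw bearer token or a token callback at all; the token-related lines are purely hypothetical:

```yaml
# Wishful sketch -- the token-related comment below is hypothetical, not a real LiteLLM option.
model_list:
  - model_name: vertex-org-model
    litellm_params:
      model: vertex_ai/gemini-1.0-pro
      vertex_project: my-gcp-project-id
      vertex_location: us-central1
      api_base: https://customendpoint/vertex   # custom REST endpoint
      # Hypothetical: some way to supply Credentials(token=get_api_key()),
      # e.g. a token string, a callable, or per-request Authorization headers.
```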
Any guidance on the correct procedure to solve this would be extremely helpful. Thank you!