
opto.utils.llm

AbstractModel

AbstractModel(
    factory: Callable, reset_freq: Union[int, None] = None
)

A minimal abstraction of a model API that refreshes the model every reset_freq seconds (useful for long-running models that may require refreshing certificates or managing memory).

Args:
- factory: A function that takes no arguments and returns a model that is callable.
- reset_freq: The number of seconds after which the model should be refreshed. If None, the model is never refreshed.
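A minimal sketch of how factory and reset_freq fit together; the echo model below is purely illustrative and not part of the API:

def factory():
    # takes no arguments and returns a callable "model";
    # a real backend would construct an API client here
    return lambda prompt: {"choices": [{"message": {"content": f"echo: {prompt}"}}]}

llm = AbstractModel(factory, reset_freq=3600)  # rebuild the model once an hour
model = llm.model                              # the property refreshes the model when stale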

factory instance-attribute

factory = factory

reset_freq instance-attribute

reset_freq = reset_freq

model property

model

When the model returned by self.model is called, text responses should always be available at response['choices'][0]['message']['content'].
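Continuing the illustrative sketch above, the text can be extracted as follows (the prompt is arbitrary):

response = llm.model("What is 2 + 2?")
text = response["choices"][0]["message"]["content"]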

AutoGenLLM

AutoGenLLM(
    config_list: List = None,
    filter_dict: Dict = None,
    reset_freq: Union[int, None] = None,
)

Bases: AbstractModel

This is the main class Trace uses to interact with the model. It is a wrapper around autogen's OpenAIWrapper. To use models not supported by autogen, subclass AutoGenLLM and override the _factory and create methods. Users can pass instances of this class to an optimizer's llm argument.
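A hedged sketch of the typical wiring; OptoPrime and parameters mirror the optimizer examples further below, and leaving config_list as None is assumed to fall back to environment-based configuration (see auto_construct_oai_config_list_from_env):

llm = AutoGenLLM()                          # or AutoGenLLM(config_list=my_configs)
optimizer = OptoPrime(parameters, llm=llm)  # pass the instance via the llm argument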

model property

model

create

create(**config: Any)

Make a completion for a given config using the available clients. Besides the kwargs allowed by openai's (or another provider's) client, the following additional kwargs are allowed. The config stored in each client is overridden by the config passed to this call.

Args:
- context (Dict | None): The context to instantiate the prompt or messages. Defaults to None. It needs to contain keys that are used by the prompt template or the filter function. E.g., prompt="Complete the following sentence: {prefix}" with context={"prefix": "Today I feel"} yields the actual prompt "Complete the following sentence: Today I feel". More examples can be found at templating.
- cache (AbstractCache | None): A Cache object to use for response caching. Defaults to None. Note that the cache argument overrides the legacy cache_seed argument: if this argument is provided, the cache_seed argument is ignored. If this argument is not provided or None, then the cache_seed argument is used.
- agent (AbstractAgent | None): The object responsible for creating a completion if it is an agent.
- (Legacy) cache_seed (int | None): For using the DiskCache. Defaults to 41. An integer cache_seed is useful when implementing "controlled randomness" for the completion. None means no caching. Note: this is a legacy argument; it is only used when the cache argument is not provided.
- filter_func (Callable | None): A function that takes in the context and the response and returns a boolean indicating whether the response is valid. See the Example below.
- allow_format_str_template (bool | None): Whether to allow format string templates in the config. Defaults to False.
- api_version (str | None): The API version. Defaults to None. E.g., "2024-02-01".

Example:
>>> # filter_func example:
>>> def yes_or_no_filter(context, response):
>>>     return context.get("yes_or_no_choice", False) is False or any(
>>>         text in ["Yes.", "No."]
>>>         for text in client.extract_text_or_completion_object(response)
>>>     )

Raises:
- RuntimeError: If all declared custom model clients are not registered.
- APIError: If any model client's create call raises an APIError.
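A hedged sketch of a create call that uses the context templating described above; the message and configuration values are illustrative, not prescribed:

llm = AutoGenLLM()
# context fills the {prefix} placeholder in the message template
response = llm.create(
    messages=[{"role": "user", "content": "Complete the following sentence: {prefix}"}],
    context={"prefix": "Today I feel"},
    allow_format_str_template=True,
)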

LiteLLM

LiteLLM(
    model: Union[str, None] = None,
    reset_freq: Union[int, None] = None,
    cache=True,
)

Bases: AbstractModel

This is an LLM backend supported by the LiteLLM library.

https://docs.litellm.ai/docs/completion/input

To use this, set the credentials through environment variables as instructed in the LiteLLM documentation. For convenience, you can set the default model name through the environment variable TRACE_LITELLM_MODEL. When using Azure models via a token provider, you can set the Azure token provider scope through the environment variable AZURE_TOKEN_PROVIDER_SCOPE.
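A minimal sketch of the environment-based setup described above, using an OpenAI model through LiteLLM as an example; the key value is a placeholder:

import os
os.environ["OPENAI_API_KEY"] = "sk-..."            # credential read by LiteLLM
os.environ["TRACE_LITELLM_MODEL"] = "gpt-4o-mini"  # default model name for this backend

llm = LiteLLM()                                      # uses the default model from the environment
llm = LiteLLM(model="gpt-4o-mini", reset_freq=3600)  # or name the model explicitly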

model_name instance-attribute

model_name = model

cache instance-attribute

cache = cache

model property

model

response = litellm.completion(
    model=self.model,
    messages=[{"content": message, "role": "user"}]
)

CustomLLM

CustomLLM(
    model: Union[str, None] = None,
    reset_freq: Union[int, None] = None,
    cache=True,
)

Bases: AbstractModel

This is for custom server API endpoints that are OpenAI-compatible. Such servers include the LiteLLM proxy server.
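A hedged sketch, assuming the target endpoint is an OpenAI-compatible server such as a LiteLLM proxy; how the server address and API key are supplied (typically via environment variables) is not documented here, and the messages payload simply mirrors the OpenAI chat format:

llm = CustomLLM(model="llama-3.1-8b", cache=True)
response = llm.create(messages=[{"role": "user", "content": "Hello"}])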

model_name instance-attribute

model_name = model

cache instance-attribute

cache = cache

model property

model

create

create(**config: Any)

LLMFactory

Factory for creating LLM instances with predefined profiles.

The code comes with these built-in profiles:

llm_default = LLM(profile="default")     # gpt-4o-mini
llm_premium = LLM(profile="premium")     # gpt-4  
llm_cheap = LLM(profile="cheap")         # gpt-4o-mini
llm_fast = LLM(profile="fast")           # gpt-3.5-turbo-mini
llm_reasoning = LLM(profile="reasoning") # o1-mini

You can override those built-in profiles:

LLMFactory.register_profile("default", "LiteLLM", model="gpt-4o", temperature=0.5)
LLMFactory.register_profile("premium", "LiteLLM", model="o1-preview", max_tokens=8000)
LLMFactory.register_profile("cheap", "LiteLLM", model="gpt-3.5-turbo", temperature=0.9)
LLMFactory.register_profile("fast", "LiteLLM", model="gpt-3.5-turbo", max_tokens=500)
LLMFactory.register_profile("reasoning", "LiteLLM", model="o1-preview")

An Example of Using Different Backends

# Register custom profiles for different use cases
LLMFactory.register_profile("advanced_reasoning", "LiteLLM", model="o1-preview", max_tokens=4000)
LLMFactory.register_profile("claude_sonnet", "LiteLLM", model="claude-3-5-sonnet-latest", temperature=0.3)
LLMFactory.register_profile("custom_server", "CustomLLM", model="llama-3.1-8b")

# Use in different contexts
reasoning_llm = LLM(profile="advanced_reasoning")  # For complex reasoning
claude_llm = LLM(profile="claude_sonnet")          # For Claude responses
local_llm = LLM(profile="custom_server")           # For local deployment

# Single LLM optimizer with custom profile
optimizer1 = OptoPrime(parameters, llm=LLM(profile="advanced_reasoning"))

# Multi-LLM optimizer with multiple profiles
optimizer2 = OptoPrimeMulti(parameters, llm_profiles=["cheap", "premium", "claude_sonnet"], generation_technique="multi_llm")

get_llm classmethod

get_llm(profile: str = 'default') -> AbstractModel

Get an LLM instance for the specified profile.

register_profile classmethod

register_profile(name: str, backend: str, **params)

Register a new LLM profile.

list_profiles classmethod

list_profiles()

List all available profiles.

get_profile_info classmethod

get_profile_info(profile: str = None)

Get information about a profile or all profiles.
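Taken together, these classmethods can be used as below; the custom profile name and parameters are illustrative:

LLMFactory.register_profile("draft", "LiteLLM", model="gpt-4o-mini", temperature=1.0)
print(LLMFactory.list_profiles())            # built-in and custom profiles
print(LLMFactory.get_profile_info("draft"))  # backend and parameters for one profile
llm = LLMFactory.get_llm("draft")            # an AbstractModel instance for that profile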

DummyLLM

DummyLLM(callable, reset_freq: Union[int, None] = None)

Bases: AbstractModel

A dummy LLM that does nothing. Used for testing purposes.

callable instance-attribute

callable = callable
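A hedged sketch of using DummyLLM in a test, assuming the supplied callable stands in for the model and follows the response['choices'][0]['message']['content'] convention of AbstractModel:

def fake_model(*args, **kwargs):
    # canned response in the AbstractModel response format
    return {"choices": [{"message": {"content": "stubbed answer"}}]}

llm = DummyLLM(fake_model)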

LLM

A unified entry point for all supported LLM backends.

Usage:

# pick by env var (default: LiteLLM)
llm = LLM()
# or override explicitly
llm = LLM(backend="AutoGen", config_list=my_configs)
# or use predefined profiles
llm = LLM(profile="premium")    # Use premium model
llm = LLM(profile="cheap")      # Use cheaper model
llm = LLM(profile="reasoning")  # Use reasoning/thinking model

auto_construct_oai_config_list_from_env

auto_construct_oai_config_list_from_env() -> List

Collect various API keys saved in the environment and return a format like: [{"model": "gpt-4", "api_key": xxx}, {"model": "claude-3.5-sonnet", "api_key": xxx}]

Note this is a lazy function that defaults to gpt-4o and claude-3.5-sonnet. If you want to specify your own model, please provide an OAI_CONFIG_LIST in the environment or as a file.
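A hedged usage sketch; the environment variable names below (OPENAI_API_KEY, ANTHROPIC_API_KEY) are the conventional ones and assumed to be what this helper looks for, and the key values are placeholders:

import os
os.environ["OPENAI_API_KEY"] = "sk-..."
os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."

config_list = auto_construct_oai_config_list_from_env()
# e.g., [{"model": "gpt-4", "api_key": "sk-..."}, {"model": "claude-3.5-sonnet", "api_key": "sk-ant-..."}]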