opto.trainer.guide ¶
Guide ¶
Base class for all guides that provide feedback on content.
A Guide evaluates generated content and provides feedback to help improve it. Different implementations may use different evaluation strategies, such as LLM-based comparison, keyword matching, or custom verification.
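A custom guide can be written by subclassing Guide. The sketch below is a minimal, hypothetical keyword-matching guide; it assumes that subclasses implement the same get_feedback(query, response, reference, **kwargs) -> Tuple[float, str] interface documented for LLMJudge below, which may differ from the actual base-class contract.

from typing import Optional, Tuple

from opto.trainer.guide import Guide

class KeywordGuide(Guide):
    # Hypothetical guide: checks that the reference keywords appear in the response.

    def get_feedback(
        self,
        query: str,
        response: str,
        reference: Optional[str] = None,
        **kwargs,
    ) -> Tuple[float, str]:
        # Treat the whitespace-separated tokens of the reference as required keywords.
        keywords = (reference or "").split()
        missing = [kw for kw in keywords if kw.lower() not in response.lower()]
        if not missing:
            return 1.0, "All expected keywords are present."
        return 0.0, "Missing expected keywords: " + ", ".join(missing)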
LLMJudge ¶
LLMJudge(
model: Optional[str] = None,
llm: Optional[AbstractModel] = None,
prompt_template: Optional[str] = None,
system_prompt: Optional[str] = None,
correctness_template: Optional[str] = None,
use_formatted_response: bool = True,
)
Bases: Guide
This is a combined metric + feedback guide that asks an LLM to provide a binary judgment (True/False) and, if the judgment is False, to provide feedback.
This is an implementation of LLM-as-a-judge.
Initialize the LLMJudge with an LLM and prompt templates.
Args:
    model: The name of the LLM model to use (if llm is not provided)
    llm: An instance of AbstractModel to use for generating feedback
    prompt_template: Custom prompt template with {response} and {reference} placeholders
    system_prompt: Custom system prompt for the LLM
    correctness_template: Template to use when the response is deemed correct by the metric
    use_formatted_response: Whether to format the response with additional context; if False, the raw LLM response is returned
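As a usage sketch (the model identifier below is a placeholder, not something this page documents), an LLMJudge can be built from a model name, or from an existing AbstractModel instance via llm=..., with any of the templates overridden:

from opto.trainer.guide import LLMJudge

# Judge built from a model name; "gpt-4o-mini" is only a placeholder.
judge = LLMJudge(model="gpt-4o-mini")

# Overriding the system prompt and returning the raw LLM response as feedback.
strict_judge = LLMJudge(
    model="gpt-4o-mini",
    system_prompt="You are a strict grader. Judge only factual correctness.",
    use_formatted_response=False,
)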
DEFAULT_CORRECTNESS_TEMPLATE class-attribute instance-attribute ¶
DEFAULT_INCORRECTNESS_TEMPLATE class-attribute instance-attribute ¶
DEFAULT_PROMPT_TEMPLATE class-attribute instance-attribute ¶
DEFAULT_PROMPT_TEMPLATE = "The query is: {query}.\n\n\nThe student answered: {response}.\n\n\nThe correct answer is: {reference}.\n\n\nReason whether the student answer is correct. If the student answer is correct, please say {correctness_template}. Otherwise, if the student answer is incorrect, say {incorrectness_template} and provide feedback to the student. The feedback should be specific and actionable."
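The constructor docstring mentions only the {response} and {reference} placeholders for a custom prompt_template, while the default template also uses {query}, {correctness_template}, and {incorrectness_template}; the sketch below follows the default's placeholder set, which is an assumption.

from opto.trainer.guide import LLMJudge

judge = LLMJudge(
    model="gpt-4o-mini",  # placeholder model name
    prompt_template=(
        "Question: {query}\n"
        "Student answer: {response}\n"
        "Reference answer: {reference}\n"
        "If the answer is correct, reply {correctness_template}; otherwise reply "
        "{incorrectness_template} followed by concrete, actionable feedback."
    ),
)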
DEFAULT_SYSTEM_PROMPT class-attribute instance-attribute ¶
correctness_template instance-attribute ¶
get_feedback ¶
get_feedback(
query: str,
response: str,
reference: Optional[str] = None,
**kwargs
) -> Tuple[float, str]
Get LLM-generated feedback by comparing the response with the reference information.
Args:
    query: The query to analyze (e.g., user query, task, etc.)
    response: The response generated by the LLM (e.g., student answer, code, etc.)
    reference: The expected information or correct answer
    **kwargs: Additional parameters (unused in this implementation)
Returns:
    score: A float provided by the metric function
    feedback: A string containing the LLM-generated feedback
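A call might look like the following sketch; the query, response, and reference strings are invented for illustration, and the exact score values depend on the underlying metric:

from opto.trainer.guide import LLMJudge

judge = LLMJudge(model="gpt-4o-mini")  # placeholder model name

score, feedback = judge.get_feedback(
    query="What is the capital of France?",
    response="The capital of France is Lyon.",
    reference="Paris",
)

# score is the metric value (a float); feedback is the LLM-generated
# explanation that can be passed back to the component being trained.
print(score, feedback)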