opto.trainer.guide

Guide

Base class for all guides that provide feedback on content.

A Guide evaluates generated content and provides feedback to help improve it. Different implementations may use different evaluation strategies, such as LLM-based comparison, keyword matching, or custom verification.
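As a sketch of one such strategy, a hypothetical keyword-matching guide might override get_feedback (signature documented below). The class name and scoring rule here are illustrative, not part of the library:

from typing import Optional, Tuple

from opto.trainer.guide import Guide


class KeywordGuide(Guide):
    """Hypothetical guide: scores a response by how many required keywords it contains."""

    def __init__(self, keywords):
        super().__init__()  # assumes the base constructor takes no required arguments
        self.keywords = list(keywords)

    def get_feedback(
        self, query: str, response: str, reference: Optional[str] = None, **kwargs
    ) -> Tuple[float, str]:
        hits = [k for k in self.keywords if k.lower() in response.lower()]
        score = len(hits) / len(self.keywords) if self.keywords else 1.0
        missing = sorted(set(self.keywords) - set(hits))
        feedback = "All required keywords present." if not missing else f"Missing keywords: {missing}"
        return score, feedback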

forward

forward(
    task: str, response: str, info: Any, **kwargs
) -> Tuple[float, str]

get_feedback

get_feedback(
    query: str,
    response: str,
    reference: Optional[str] = None,
    **kwargs
) -> Tuple[float, str]

metric

metric(
    query: str,
    response: str,
    reference: Optional[str] = None,
    **kwargs
) -> float

Exact match metric
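The default metric's behavior can be pictured as a plain exact-string comparison (a simplified sketch; the library's actual normalization and return type may differ):

def exact_match(response: str, reference: str) -> float:
    # 1.0 when the strings match exactly after trimming whitespace, else 0.0
    return float(response.strip() == reference.strip())

exact_match("Paris", "Paris")   # 1.0
exact_match("Lyon", "Paris")    # 0.0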

copy

copy()

Create a copy of the guide instance.

Returns: A new instance of the same guide class with the same parameters.

save

save(path: str)

Save the guide to a file.

load

load(path: str)

Load the guide from a file.
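Typical persistence usage, reusing the hypothetical KeywordGuide sketched above (the file path is illustrative and the on-disk format is not specified here):

guide = KeywordGuide(["bubble sort", "O(n^2)"])
backup = guide.copy()               # independent instance with the same parameters
guide.save("keyword_guide.pkl")     # persist to disk
guide.load("keyword_guide.pkl")     # assumed to restore state into this instance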

LLMJudge

LLMJudge(
    model: Optional[str] = None,
    llm: Optional[AbstractModel] = None,
    prompt_template: Optional[str] = None,
    system_prompt: Optional[str] = None,
    correctness_template: Optional[str] = None,
    use_formatted_response: bool = True,
)

Bases: Guide

This is a combined metric + feedback guide that asks an LLM for a binary judgment (True/False) and, if False, for feedback.

This is an implementation of LLM-as-a-judge.

Initialize the LLMJudge with an LLM and prompt templates.

Args:
    model: The name of the LLM model to use (if llm is not provided)
    llm: An instance of AbstractModel to use for generating feedback
    prompt_template: Custom prompt template with {response} and {reference} placeholders
    system_prompt: Custom system prompt for the LLM
    correctness_template: Template to use when the response is deemed correct by the metric
    use_formatted_response: Whether to format the response with additional context; if False, the raw LLM response is returned
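For example (the model name is illustrative; provider credentials are assumed to be configured for the underlying LLM class):

from opto.trainer.guide import LLMJudge

# Minimal judge: relies on the default prompt and system prompt.
judge = LLMJudge(model="gpt-4o-mini")

# Stricter judge with a custom system prompt that returns the raw LLM output.
strict_judge = LLMJudge(
    model="gpt-4o-mini",
    system_prompt="You are a strict grader. Be terse and precise.",
    use_formatted_response=False,
)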

DEFAULT_CORRECTNESS_TEMPLATE class-attribute instance-attribute

DEFAULT_CORRECTNESS_TEMPLATE = 'Correct [TERMINATE]'

DEFAULT_INCORRECTNESS_TEMPLATE class-attribute instance-attribute

DEFAULT_INCORRECTNESS_TEMPLATE = 'Incorrect'

DEFAULT_PROMPT_TEMPLATE class-attribute instance-attribute

DEFAULT_PROMPT_TEMPLATE = "The query is: {query}.\n\n\nThe student answered: {response}.\n\n\nThe correct answer is: {reference}.\n\n\nReason whether the student answer is correct. If the student answer is correct, please say {correctness_template}. Otherwise, if the student answer is incorrect, say {incorrectness_template} and provide feedback to the student. The feedback should be specific and actionable."
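A custom prompt_template can rearrange the wording as long as it keeps the placeholders that get_feedback fills in (a sketch; this assumes the same placeholder set as the default template):

CUSTOM_TEMPLATE = (
    "Question: {query}\n"
    "Submitted answer: {response}\n"
    "Reference answer: {reference}\n"
    "If the submitted answer is correct, reply {correctness_template}. "
    "Otherwise reply {incorrectness_template} and give specific, actionable feedback."
)

judge = LLMJudge(prompt_template=CUSTOM_TEMPLATE, model="gpt-4o-mini")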

DEFAULT_SYSTEM_PROMPT class-attribute instance-attribute

DEFAULT_SYSTEM_PROMPT = "You're a helpful teacher who provides clear and constructive feedback."

model instance-attribute

model = model

llm instance-attribute

llm = llm or LLM(model=model)

prompt_template instance-attribute

prompt_template = prompt_template or DEFAULT_PROMPT_TEMPLATE

system_prompt instance-attribute

system_prompt = system_prompt or DEFAULT_SYSTEM_PROMPT

correctness_template instance-attribute

correctness_template = (
    correctness_template or DEFAULT_CORRECTNESS_TEMPLATE
)

use_formatted_response instance-attribute

use_formatted_response = use_formatted_response

get_feedback

get_feedback(
    query: str,
    response: str,
    reference: Optional[str] = None,
    **kwargs
) -> Tuple[float, str]

Get LLM-generated feedback by comparing response with reference information.

Args:
    query: The query to analyze (e.g., user query, task, etc.)
    response: The response generated by the LLM (e.g., student answer, code, etc.)
    reference: The expected information or correct answer
    **kwargs: Additional parameters (unused in this implementation)

Returns:
    score: A float provided by the metric function
    feedback: A string containing the LLM-generated feedback
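Hypothetical call (requires a configured LLM backend; the values in the comments are what one would typically expect, not guaranteed outputs):

score, feedback = judge.get_feedback(
    query="What is the capital of France?",
    response="Lyon",
    reference="Paris",
)
# score: numeric judgment from the metric function (e.g., 0.0 for a wrong answer)
# feedback: LLM-generated explanation of what to fix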

forward

forward(
    task: str, response: str, info: Any, **kwargs
) -> Tuple[float, str]

exact_match_metric

exact_match_metric(question, student_answer, info)

Exact match metric
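Illustrative calls, assuming info carries the expected answer:

exact_match_metric("What is 2 + 2?", "4", "4")   # expected to indicate a match
exact_match_metric("What is 2 + 2?", "5", "4")   # expected to indicate a mismatch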