Action
- llm_response (str): The response generated by the LLM.
- metadata (Dict[str, Any]): Additional metadata such as model parameters, prompt details, or response confidence scores.
- timestamp (datetime): The time at which the action was generated, in UTC.
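
The Action fields above could be modeled roughly as follows. This is a minimal sketch, not the actual class definition: the use of a dataclass, the default values, and the example metadata keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass
class Action:
    """Sketch of the Action fields listed above (defaults are assumed)."""

    llm_response: str
    # Assumed default: an empty dict when no metadata is supplied.
    metadata: Dict[str, Any] = field(default_factory=dict)
    # Assumed default: a timezone-aware UTC timestamp taken at creation time.
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


action = Action(
    llm_response="Paris is the capital of France.",
    # Hypothetical metadata keys, chosen only for illustration.
    metadata={"model": "example-model", "temperature": 0.2},
)
```

Using a timezone-aware datetime keeps the UTC convention explicit, so consumers do not have to guess the zone of a naive value.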
Observation
- question: The question posed to the LLM.
- context: Additional context for the question.
- metadata: Optional metadata about the observation.
StepResult
- observation: The next Observation, returned after the step.
- reward: Dictionary mapping each evaluated aspect to its reward score.
- done: Whether the episode is complete.
- info: Additional information about the step.
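
As a rough illustration of how Observation and StepResult fit together, the sketch below models both as dataclasses. The field types (str, Dict[str, float], bool, and so on), the defaults, the aspect names in the reward dictionary, and the total_reward helper are not given in the field lists above; they are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Observation:
    """Sketch of the Observation fields; types are assumed."""

    question: str
    context: str = ""  # assumed to be plain text
    metadata: Optional[Dict[str, Any]] = None  # optional, per the field list


@dataclass
class StepResult:
    """Sketch of the StepResult fields; types are assumed."""

    observation: Observation
    reward: Dict[str, float]  # assumed: aspect name -> score
    done: bool
    info: Dict[str, Any] = field(default_factory=dict)


def total_reward(result: StepResult) -> float:
    """Hypothetical helper: sum the per-aspect reward scores."""
    return sum(result.reward.values())


result = StepResult(
    observation=Observation(question="What is 2 + 2?", context="Basic arithmetic."),
    reward={"accuracy": 1.0, "format": 0.5},  # hypothetical aspect names
    done=True,
)
```

Keeping reward as a dictionary lets a caller inspect individual aspects or collapse them into a single scalar, as total_reward does here.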