Interface: Evaluator<Input, Output, Expected, Metadata>
Type parameters
| Name | Type |
|---|---|
| Input | Input |
| Output | Output |
| Expected | Expected |
| Metadata | extends BaseMetadata = DefaultMetadataType |
Properties
data
• data: EvalData<Input, Expected, Metadata>
A function that returns a list of inputs, expected outputs, and metadata.
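As a rough illustration, a data function might look like the sketch below. The `ExampleCase` and `ExampleData` types are simplified stand-ins written for this example, and the field names `input`, `expected`, and `metadata` are assumed from the description above; they are not the library's actual `EvalData` definition.

```typescript
// Simplified stand-in types, assumed for illustration only.
interface ExampleCase<Input, Expected, Metadata> {
  input: Input;
  expected?: Expected;
  metadata?: Metadata;
}

type ExampleData<Input, Expected, Metadata> = () =>
  | ExampleCase<Input, Expected, Metadata>[]
  | Promise<ExampleCase<Input, Expected, Metadata>[]>;

// A data function returning two hard-coded cases.
const exampleData: ExampleData<string, string, { difficulty: string }> = () => [
  { input: "2 + 2", expected: "4", metadata: { difficulty: "easy" } },
  { input: "12 * 12", expected: "144", metadata: { difficulty: "medium" } },
];
```

In practice the cases would more likely be loaded from a file or a dataset than written inline.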
experimentName
• Optional experimentName: string
An optional name for the experiment.
isPublic
• Optional isPublic: boolean
Whether the experiment should be public. Defaults to false.
metadata
• Optional metadata: Record<string, unknown>
Optional additional metadata for the experiment.
scores
• scores: EvalScorer<Input, Output, Expected, Metadata>[]
A set of functions that take an input, output, and expected value and return a score.
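The sketch below shows what two scoring functions of this shape could look like. The `ExampleScorer` type, the argument object, and the `{ name, score }` return shape are assumptions made for this example rather than the library's `EvalScorer` definition.

```typescript
// Simplified stand-in types, assumed for illustration only.
interface ExampleScorerArgs<Input, Output, Expected> {
  input: Input;
  output: Output;
  expected?: Expected;
}

interface ExampleScore {
  name: string;
  score: number; // assumed here to lie between 0 and 1
}

type ExampleScorer<Input, Output, Expected> = (
  args: ExampleScorerArgs<Input, Output, Expected>,
) => ExampleScore;

// Scores 1 when the output equals the expected value exactly, otherwise 0.
const exactMatch: ExampleScorer<string, string, string> = ({ output, expected }) => ({
  name: "exact_match",
  score: output === expected ? 1 : 0,
});

// Scores 1 when the output is at most 10 characters long, otherwise 0.
const withinLength: ExampleScorer<string, string, string> = ({ output }) => ({
  name: "within_length",
  score: output.length <= 10 ? 1 : 0,
});

const exampleScorers = [exactMatch, withinLength];
```

Keeping each scorer small and single-purpose makes it easier to see which aspect of the output caused a low score.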
task
• task: EvalTask<Input, Output>
A function that takes an input and returns an output.
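A minimal sketch of a task follows. `ExampleTask` is a stand-in type assumed for this example, and the lookup table is a hypothetical placeholder for whatever model or application call a real task would make.

```typescript
// Simplified stand-in type, assumed for illustration only.
type ExampleTask<Input, Output> = (input: Input) => Output | Promise<Output>;

// Hypothetical canned answers standing in for a real model or application call.
const cannedAnswers: Record<string, string> = {
  "2 + 2": "4",
  "12 * 12": "144",
};

// Returns the canned answer for a known input, or "unknown" otherwise.
const answerTask: ExampleTask<string, string> = (input) =>
  cannedAnswers[input] ?? "unknown";
```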
trialCount
• Optional trialCount: number
The number of times to run the evaluator per input. This is useful for evaluating applications that have non-deterministic behavior: repeated trials give you both a stronger aggregate measure and a sense of the variance in the results.
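To make the idea of an aggregate measure and its variance concrete, the sketch below runs one input through a task several times and summarizes the scores. The helpers `scoreTrials` and `summarize` and the `flakyTask` stand-in are hypothetical names written for this example; they do not describe how the library itself aggregates trials.

```typescript
// Hypothetical helper: run `task` on one input `trialCount` times and score each output.
function scoreTrials<Input, Output>(
  task: (input: Input) => Output,
  scorer: (output: Output) => number,
  input: Input,
  trialCount: number,
): number[] {
  const results: number[] = [];
  for (let i = 0; i < trialCount; i++) {
    results.push(scorer(task(input)));
  }
  return results;
}

// Hypothetical helper: mean and population variance of a non-empty list of scores.
function summarize(values: number[]): { mean: number; variance: number } {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / values.length;
  return { mean, variance };
}

// Stand-in for a non-deterministic task: it upper-cases the input only on every other call.
let callCount = 0;
const flakyTask = (input: string): string =>
  callCount++ % 2 === 0 ? input.toUpperCase() : input;

// Four trials of the same input, scored 1 when the output is "HI".
const trialScores = scoreTrials(flakyTask, (out) => (out === "HI" ? 1 : 0), "hi", 4);
const trialSummary = summarize(trialScores);
```

With a single trial the same task would score either 0 or 1 depending on chance; several trials expose both the average score and how much it fluctuates.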