Class: Experiment
An experiment is a collection of logged events, such as model inputs and outputs, which represent a snapshot of your application at a particular point in time. An experiment is meant to capture more than just the model you use: it also includes the data you use to test, pre- and post-processing code, comparison metrics (scores), and any other metadata you want to include.
Experiments are associated with a project, and two experiments are meant to be easily comparable via their inputs. You can change the attributes of the experiments in a project (e.g. scoring functions) over time, simply by changing what you log.
You should not create Experiment objects directly. Instead, use the braintrust.init() method.
Hierarchy
- ObjectFetcher<ExperimentEvent>
  ↳ Experiment
Accessors
id
• get id(): Promise<string>
Returns
Promise<string>
Overrides
ObjectFetcher.id
name
• get name(): Promise<string>
Returns
Promise<string>
project
• get project(): Promise<ObjectMetadata>
Returns
Promise<ObjectMetadata>
Constructors
constructor
• new Experiment(lazyMetadata, dataset?): Experiment
Parameters
Name | Type |
---|---|
lazyMetadata | LazyValue<ProjectExperimentMetadata> |
dataset? | AnyDataset |
Returns
Experiment
Overrides
ObjectFetcher<ExperimentEvent>.constructor
Methods
[asyncIterator]
▸ [asyncIterator](): AsyncIterator<WithTransactionId<ExperimentEvent>, any, undefined>
Returns
AsyncIterator<WithTransactionId<ExperimentEvent>, any, undefined>
Inherited from
ObjectFetcher.[asyncIterator]
clearCache
▸ clearCache(): void
Returns
void
Inherited from
ObjectFetcher.clearCache
close
▸ close(): Promise<string>
This function is deprecated. You can simply remove it from your code.
Returns
Promise<string>
fetch
▸ fetch(): AsyncGenerator<WithTransactionId<ExperimentEvent>, any, unknown>
Returns
AsyncGenerator<WithTransactionId<ExperimentEvent>, any, unknown>
Inherited from
ObjectFetcher.fetch
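Because fetch() returns an AsyncGenerator, its records are consumed with for await. The sketch below shows that pattern in isolation. SketchEvent, fetchSketch, and collect are hypothetical names invented for this illustration and are not part of the SDK. The real generator yields WithTransactionId<ExperimentEvent> records, whose shape is not described in this section.

```typescript
// Hypothetical record shape, used only for this illustration.
type SketchEvent = { id: string; input: unknown; output: unknown };

// Stand-in for fetch(): an async generator that yields records one at a time.
async function* fetchSketch(
  records: SketchEvent[],
): AsyncGenerator<SketchEvent, void, unknown> {
  for (const record of records) {
    yield record;
  }
}

// Drain any async iterable into an array.
async function collect<T>(source: AsyncIterable<T>): Promise<T[]> {
  const out: T[] = [];
  for await (const item of source) {
    out.push(item);
  }
  return out;
}
```

The same for await loop applies to an object that implements [asyncIterator], since for await only requires an async iterable.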
fetchBaseExperiment
▸ fetchBaseExperiment(): Promise<null | { id: any; name: any }>
Returns
Promise<null | { id: any; name: any }>
fetchedData
▸ fetchedData(): Promise<WithTransactionId<ExperimentEvent>[]>
Returns
Promise<WithTransactionId<ExperimentEvent>[]>
Inherited from
ObjectFetcher.fetchedData
flush
▸ flush(): Promise<void>
Flush any pending rows to the server.
Returns
Promise<void>
getState
▸ getState(): Promise<BraintrustState>
Returns
Promise<BraintrustState>
Overrides
ObjectFetcher.getState
log
▸ log(event, options?): string
Log a single event to the experiment. The event will be batched and uploaded behind the scenes.
Parameters
Name | Type | Description |
---|---|---|
event | Readonly<ExperimentLogFullArgs> | The event to log. |
options? | Object | Additional logging options. |
options.allowLogConcurrentWithActiveSpan? | boolean | In rare cases where you need to log at the top level separately from an active span on the experiment, set this to true. |
Returns
string
The id of the logged event.
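The signatures above imply a split: log() returns an id synchronously, while flush() is asynchronous and sends whatever is still pending. The sketch below shows one way such a split can work. BatchingLogger and LogEvent are hypothetical stand-ins written for this illustration. This is not the SDK's implementation, and the real event type is ExperimentLogFullArgs.

```typescript
// Hypothetical event shape; the real type is ExperimentLogFullArgs.
type LogEvent = {
  input: unknown;
  output: unknown;
  scores?: Record<string, number>;
};

type StoredEvent = LogEvent & { id: string };

// Stand-in logger that queues events in memory (assumption: illustration only).
class BatchingLogger {
  private pending: StoredEvent[] = [];
  private uploaded: StoredEvent[] = [];
  private nextId = 0;

  // Synchronous, like log(): queue the event and return its id right away.
  log(event: Readonly<LogEvent>): string {
    const id = `event-${this.nextId++}`;
    this.pending.push({ ...event, id });
    return id;
  }

  // Asynchronous, like flush(): hand off everything queued so far.
  async flush(): Promise<void> {
    const batch = this.pending;
    this.pending = [];
    // A real client would send the batch over the network here.
    this.uploaded.push(...batch);
  }

  get pendingCount(): number {
    return this.pending.length;
  }

  get uploadedCount(): number {
    return this.uploaded.length;
  }
}
```

In this sketch, events stay in the pending queue until flush() is awaited, which is why a short-lived script would typically flush before exiting.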
logFeedback
▸ logFeedback(event): void
Log feedback to an event in the experiment. Feedback is used to save feedback scores, set an expected value, or add a comment.
Parameters
Name | Type |
---|---|
event | LogFeedbackFullArgs |
Returns
void
startSpan
▸ startSpan(args?): Span
Lower-level alternative to traced. This allows you to start a span yourself, and can be useful in situations where you cannot use callbacks. However, spans started with startSpan will not be marked as the "current span", so currentSpan() and traced() will be no-ops. If you want to mark a span as current, use traced instead.
See traced for full details.
Parameters
Name | Type |
---|---|
args? | StartSpanArgs |
Returns
Span
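The difference between the callback style and the manual style can be sketched with a small synchronous model of a "current span". Everything below (SketchSpan, makeSpan, currentSpanName, and these simplified traced and startSpan functions) is a hypothetical stand-in that only borrows the names from the text. It is not the SDK's implementation, and it ignores asynchronous callbacks, which a real implementation would have to track.

```typescript
// Hypothetical span type for this illustration only.
interface SketchSpan {
  name: string;
  ended: boolean;
  end(): void;
}

// Module-level "current span" slot (assumption: synchronous code only).
let current: SketchSpan | undefined;

function makeSpan(name: string): SketchSpan {
  const span: SketchSpan = {
    name,
    ended: false,
    end: () => {
      span.ended = true;
    },
  };
  return span;
}

// Callback style: the span is current while the callback runs,
// then it is ended and the previous current span is restored.
function traced<R>(callback: (span: SketchSpan) => R, name = "root"): R {
  const previous = current;
  const span = makeSpan(name);
  current = span;
  try {
    return callback(span);
  } finally {
    span.end();
    current = previous;
  }
}

// Manual style: the caller owns the span and must end it.
// The span is never registered as current.
function startSpan(name = "root"): SketchSpan {
  return makeSpan(name);
}

function currentSpanName(): string | undefined {
  return current?.name;
}
```

In this model, code running inside a traced callback can discover its span through the current slot, while a span created with startSpan has to be passed around explicitly and ended by the caller.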
summarize
▸ summarize(options?): Promise<ExperimentSummary>
Summarize the experiment, including the scores (compared to the closest reference experiment) and metadata.
Parameters
Name | Type | Description |
---|---|---|
options | Object | Options for summarizing the experiment. |
options.comparisonExperimentId? | string | The experiment to compare against. If omitted, the most recent experiment on the origin's main branch will be used. |
options.summarizeScores? | boolean | Whether to summarize the scores. If false, only the metadata will be returned. |
Returns
Promise<ExperimentSummary>
A summary of the experiment, including the scores (compared to the closest reference experiment) and metadata.
traced
▸ traced<R>(callback, args?): R
Create a new top-level span underneath the experiment. The name defaults to "root".
See Span.traced for full details.
Type parameters
Name |
---|
R |
Parameters
Name | Type |
---|---|
callback | (span: Span) => R |
args? | StartSpanArgs & SetCurrentArg |
Returns
R
version
▸ version(): Promise<undefined | string | bigint>
Returns
Promise<undefined | string | bigint>
Inherited from
ObjectFetcher.version
Properties
dataset
• Optional Readonly dataset: AnyDataset
kind
• kind: "experiment"