Commit ba3da34

Add gen_ai.agent.invocation.duration and gen_ai.tool.execution.duration metrics
Adds two new GenAI semantic convention metrics for agent and tool latency, modeled on the recently-added gen_ai.workflow.duration metric:

* gen_ai.agent.invocation.duration (histogram, seconds): end-to-end duration of a single agent invocation, aligned with the existing gen_ai.invoke_agent.{client,internal} spans.
* gen_ai.tool.execution.duration (histogram, seconds): duration of a single tool execution, aligned with the existing gen_ai.execute_tool.internal span.

Also adds the gen_ai.tool.version attribute, used as a dimension on gen_ai.tool.execution.duration (mirrors the existing gen_ai.agent.version).

NOTE: docs/registry/ and schema-snapshot/ regeneration via 'make generate-all' has NOT been run in this commit (no Docker available in the authoring environment). Run it locally before pushing for review.
1 parent 61d4fa3 commit ba3da34

6 files changed

Lines changed: 525 additions & 16 deletions

CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -24,6 +24,11 @@
 ([#126](https://github.com/open-telemetry/semantic-conventions-genai/pull/126))
 - Add `moonshot_ai` to `gen_ai.provider.name` well-known values.
 ([#99](https://github.com/open-telemetry/semantic-conventions-genai/pull/99))
+- Add `gen_ai.agent.invocation.duration` metric to track the end-to-end duration
+of a single agent invocation, and `gen_ai.tool.execution.duration` metric to
+track the duration of a single tool execution. Add the `gen_ai.tool.version`
+attribute used as a dimension on the tool execution metric.
+([#201](https://github.com/open-telemetry/semantic-conventions-genai/pull/201))

 ### 🧰 Bug fixes 🧰

docs/gen-ai/gen-ai-metrics.md

Lines changed: 135 additions & 0 deletions
@@ -19,6 +19,10 @@ linkTitle: Metrics
 - [Metric: `gen_ai.server.time_to_first_token`](#metric-gen_aiservertime_to_first_token)
 - [Generative AI workflow metrics](#generative-ai-workflow-metrics)
 - [Metric: `gen_ai.workflow.duration`](#metric-gen_aiworkflowduration)
+- [Generative AI agent metrics](#generative-ai-agent-metrics)
+- [Metric: `gen_ai.agent.invocation.duration`](#metric-gen_aiagentinvocationduration)
+- [Generative AI tool metrics](#generative-ai-tool-metrics)
+- [Metric: `gen_ai.tool.execution.duration`](#metric-gen_aitoolexecutionduration)

 <!-- tocstop -->

@@ -936,6 +940,137 @@ If there is no low-cardinality workflow name available for a given framework, th
 <!-- END AUTOGENERATED TEXT -->
 <!-- endweaver -->

+## Generative AI agent metrics
+
+Individual systems may include additional system-specific attributes.
+It is recommended to check system-specific documentation, if available.
+
+`gen_ai.agent.invocation.duration` represents the end-to-end duration of a
+single agent invocation, measured from the point where the agent is invoked
+to the point where it produces its final response (or terminates with an
+error). It is intended for instrumentations of agent frameworks (for example,
+ADK, LangChain agents, CrewAI agents) that can reliably bound a single agent
+invocation.
+
+If instrumentation can only measure a single provider-facing client operation
+(for example, one model API call), `gen_ai.client.operation.duration` SHOULD
+be used instead. If instrumentation can reliably bound a higher-level workflow
+that coordinates multiple agents, `gen_ai.workflow.duration` SHOULD be used
+for that workflow. Instrumentation MAY emit several of these metrics for the
+same request path when more than one boundary is available.
+
+### Metric: `gen_ai.agent.invocation.duration`
+
+This metric is [required][MetricRequired] when the instrumented component
+implements agent invocation operations.
+
+When this metric is reported alongside a `gen_ai.invoke_agent` span, the
+metric value SHOULD be the same as the span duration.
+
+This metric SHOULD be specified with [ExplicitBucketBoundaries] of
+[0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92].
+
+<!-- weaver .registry.metrics[] | select(.name == "gen_ai.agent.invocation.duration") -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `gen_ai.agent.invocation.duration` | Histogram | `s` | GenAI agent invocation duration. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |
+
+**[1]:** This metric measures the end-to-end duration of a single agent invocation, from the moment the agent is invoked to the moment it produces its final response (or terminates with an error).
+When this metric is reported alongside a `gen_ai.invoke_agent` span, the metric value SHOULD be the same as the span duration.
+
+**Attributes:**
+
+| Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
+| --- | --- | --- | --- | --- | --- |
+| [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+| [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+
+**[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+the canonical name of exception that occurred, or another low-cardinality error identifier.
+Instrumentations SHOULD document the list of errors they report.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endweaver -->
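Review note: the bucket boundaries proposed above start at 0.01 s and double at each step. The sketch below uses only the Python standard library (it is not the OpenTelemetry SDK) to show which bucket a measured duration would land in. It assumes upper-inclusive bucket bounds, and `bucket_index` is a hypothetical helper introduced here for illustration only.

```python
from bisect import bisect_left

# Boundaries copied from the metric text in this diff (seconds).
BOUNDARIES = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28,
              2.56, 5.12, 10.24, 20.48, 40.96, 81.92]


def bucket_index(duration_s: float) -> int:
    """Return the index of the bucket a duration falls into.

    Assumption: bucket i covers (BOUNDARIES[i-1], BOUNDARIES[i]], with an
    open-ended first bucket (up to 0.01 s) and an overflow bucket above
    81.92 s, so there are len(BOUNDARIES) + 1 buckets in total.
    """
    return bisect_left(BOUNDARIES, duration_s)


def bucket_range(duration_s: float):
    """Return the (lower, upper) bounds of that bucket; None means unbounded."""
    i = bucket_index(duration_s)
    lower = BOUNDARIES[i - 1] if i > 0 else None
    upper = BOUNDARIES[i] if i < len(BOUNDARIES) else None
    return lower, upper


if __name__ == "__main__":
    for sample in (0.005, 0.3, 3.0, 120.0):
        print(sample, bucket_index(sample), bucket_range(sample))
```

With these boundaries, an agent invocation that takes about 3 s falls between 2.56 s and 5.12 s, and anything slower than 81.92 s falls into the single overflow bucket.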
+
+## Generative AI tool metrics
+
+Individual systems may include additional system-specific attributes.
+It is recommended to check system-specific documentation, if available.
+
+`gen_ai.tool.execution.duration` represents the duration of a single tool
+execution performed by or on behalf of a GenAI agent. It is intended for
+instrumentations of agent frameworks (or of application code that executes
+tools on behalf of an agent) that can reliably bound a single tool call.
+
+### Metric: `gen_ai.tool.execution.duration`
+
+This metric is [recommended][MetricRecommended] for instrumentations that can
+observe tool executions performed by or on behalf of a GenAI agent.
+
+When this metric is reported alongside a `gen_ai.execute_tool` span, the
+metric value SHOULD be the same as the span duration.
+
+This metric SHOULD be specified with [ExplicitBucketBoundaries] of
+[0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92].
+
+<!-- weaver .registry.metrics[] | select(.name == "gen_ai.tool.execution.duration") -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `gen_ai.tool.execution.duration` | Histogram | `s` | GenAI tool execution duration. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |
+
+**[1]:** This metric measures the duration of a single tool execution performed by or on behalf of a GenAI agent.
+When this metric is reported alongside a `gen_ai.execute_tool` span, the metric value SHOULD be the same as the span duration.
+
+**Attributes:**
+
+| Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
+| --- | --- | --- | --- | --- | --- |
+| [`gen_ai.tool.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | Name of the tool utilized by the agent. | `Flights` |
+| [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+| [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+| [`gen_ai.tool.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | The version of the tool utilized by the agent. [2] | `1.0.0`; `2025-05-01` |
+
+**[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+the canonical name of exception that occurred, or another low-cardinality error identifier.
+Instrumentations SHOULD document the list of errors they report.
+
+**[2] `gen_ai.tool.version`:** The tool version is usually provided by the application that defines the
+tool. It is typically a static value (for example, a release tag of the
+tool's package) and is expected to have low cardinality.
+
+`gen_ai.tool.version` MUST have low cardinality.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endweaver -->
+
 [DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status
 [MetricRequired]: https://github.com/open-telemetry/semantic-conventions/blob/v1.40.0/docs/general/metric-requirement-level.md#required
 [MetricRecommended]: https://github.com/open-telemetry/semantic-conventions/blob/v1.40.0/docs/general/metric-requirement-level.md#recommended
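Review note: the attribute rules for `gen_ai.tool.execution.duration` (tool name always present, agent and tool identity only when available, `error.type` only on failure) can be sketched as a small timing wrapper. This is a minimal illustration in standard-library Python, not an OpenTelemetry API: `timed_tool_execution` and the `record` callback are hypothetical names, and using the exception class name for `error.type` is just one possible low-cardinality choice.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Optional

# Hypothetical sink: (metric name, value in seconds, attributes) -> None.
# A real instrumentation would record into a histogram instrument instead.
Record = Callable[[str, float, Dict[str, str]], None]


@contextmanager
def timed_tool_execution(
    record: Record,
    tool_name: str,
    tool_version: Optional[str] = None,
    agent_name: Optional[str] = None,
    agent_version: Optional[str] = None,
):
    # gen_ai.tool.name is Required, so it is always set.
    attrs: Dict[str, str] = {"gen_ai.tool.name": tool_name}
    # The remaining identity attributes are set only when available.
    optional = {
        "gen_ai.tool.version": tool_version,
        "gen_ai.agent.name": agent_name,
        "gen_ai.agent.version": agent_version,
    }
    attrs.update({k: v for k, v in optional.items() if v is not None})

    start = time.perf_counter()
    try:
        yield attrs
    except Exception as exc:
        # error.type is added only if the execution ended in an error.
        attrs["error.type"] = type(exc).__qualname__
        raise
    finally:
        # One measurement per tool execution, in seconds.
        record("gen_ai.tool.execution.duration",
               time.perf_counter() - start, attrs)


if __name__ == "__main__":
    def print_sink(name: str, value: float, attrs: Dict[str, str]) -> None:
        print(name, round(value, 4), attrs)

    with timed_tool_execution(print_sink, "Flights", tool_version="1.0.0"):
        time.sleep(0.05)  # stand-in for the actual tool call
```

If the same boundaries are used to start and end a `gen_ai.execute_tool` span, the recorded value and the span duration stay aligned, which is what the text above asks for. The same pattern would apply to `gen_ai.agent.invocation.duration`, minus the tool attributes.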

docs/registry/attributes/gen-ai.md

Lines changed: 23 additions & 16 deletions
Some generated files are not rendered by default.
