diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 5 additions & 0 deletions
diff --git a/docs/gen-ai/gen-ai-metrics.md b/docs/gen-ai/gen-ai-metrics.md
Lines changed: 135 additions & 0 deletions
diff --git a/docs/registry/attributes/gen-ai.md b/docs/registry/attributes/gen-ai.md
Lines changed: 23 additions & 16 deletions
@@ -24,6 +24,11 @@
   ([#126](https://github.com/open-telemetry/semantic-conventions-genai/pull/126))
 - Add `moonshot_ai` to `gen_ai.provider.name` well-known values.
   ([#99](https://github.com/open-telemetry/semantic-conventions-genai/pull/99))
+- Add `gen_ai.agent.invocation.duration` metric to track the end-to-end duration
+  of a single agent invocation, and `gen_ai.tool.execution.duration` metric to
+  track the duration of a single tool execution. Add the `gen_ai.tool.version`
+  attribute used as a dimension on the tool execution metric.
+  ([#201](https://github.com/open-telemetry/semantic-conventions-genai/pull/201))
 
 ### 🧰 Bug fixes 🧰
 
 
@@ -19,6 +19,10 @@ linkTitle: Metrics
   - [Metric: `gen_ai.server.time_to_first_token`](#metric-gen_aiservertime_to_first_token)
 - [Generative AI workflow metrics](#generative-ai-workflow-metrics)
   - [Metric: `gen_ai.workflow.duration`](#metric-gen_aiworkflowduration)
+- [Generative AI agent metrics](#generative-ai-agent-metrics)
+  - [Metric: `gen_ai.agent.invocation.duration`](#metric-gen_aiagentinvocationduration)
+- [Generative AI tool metrics](#generative-ai-tool-metrics)
+  - [Metric: `gen_ai.tool.execution.duration`](#metric-gen_aitoolexecutionduration)
 
 <!-- tocstop -->
 
@@ -936,6 +940,137 @@ If there is no low-cardinality workflow name available for a given framework, th
 <!-- END AUTOGENERATED TEXT -->
 <!-- endweaver -->
 
+## Generative AI agent metrics
+
+Individual systems may include additional system-specific attributes.
+It is recommended to check system-specific documentation, if available.
+
+`gen_ai.agent.invocation.duration` represents the end-to-end duration of a
+single agent invocation, measured from the point where the agent is invoked
+to the point where it produces its final response (or terminates with an
+error). It is intended for instrumentations of agent frameworks (for example,
+ADK, LangChain agents, CrewAI agents) that can reliably bound a single agent
+invocation.
+
+If instrumentation can only measure a single provider-facing client operation
+(for example, one model API call), `gen_ai.client.operation.duration` SHOULD
+be used instead. If instrumentation can reliably bound a higher-level workflow
+that coordinates multiple agents, `gen_ai.workflow.duration` SHOULD be used
+for that workflow. Instrumentation MAY emit several of these metrics for the
+same request path when more than one boundary is available.
+
+### Metric: `gen_ai.agent.invocation.duration`
+
+This metric is [required][MetricRequired] when the instrumented component
+implements agent invocation operations.
+
+When this metric is reported alongside a `gen_ai.invoke_agent` span, the
+metric value SHOULD be the same as the span duration.
+
+This metric SHOULD be specified with [ExplicitBucketBoundaries] of
+[0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92].
+
+<!-- weaver .registry.metrics[] | select(.name == "gen_ai.agent.invocation.duration") -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `gen_ai.agent.invocation.duration` | Histogram | `s` | GenAI agent invocation duration. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |
+
+**[1]:** This metric measures the end-to-end duration of a single agent invocation, from the moment the agent is invoked to the moment it produces its final response (or terminates with an error).
+When this metric is reported alongside a `gen_ai.invoke_agent` span, the metric value SHOULD be the same as the span duration.
+
+**Attributes:**
+
+| Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
+| --- | --- | --- | --- | --- | --- |
+| [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+| [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+
+**[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+the canonical name of exception that occurred, or another low-cardinality error identifier.
+Instrumentations SHOULD document the list of errors they report.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endweaver -->
+
+## Generative AI tool metrics
+
+Individual systems may include additional system-specific attributes.
+It is recommended to check system-specific documentation, if available.
+
+`gen_ai.tool.execution.duration` represents the duration of a single tool
+execution performed by or on behalf of a GenAI agent. It is intended for
+instrumentations of agent frameworks (or of application code that executes
+tools on behalf of an agent) that can reliably bound a single tool call.
+
+### Metric: `gen_ai.tool.execution.duration`
+
+This metric is [recommended][MetricRecommended] for instrumentations that can
+observe tool executions performed by or on behalf of a GenAI agent.
+
+When this metric is reported alongside a `gen_ai.execute_tool` span, the
+metric value SHOULD be the same as the span duration.
+
+This metric SHOULD be specified with [ExplicitBucketBoundaries] of
+[0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92].
+
+<!-- weaver .registry.metrics[] | select(.name == "gen_ai.tool.execution.duration") -->
+<!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
+<!-- see templates/registry/markdown/snippet.md.j2 -->
+<!-- prettier-ignore-start -->
+
+| Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
+| -------- | --------------- | ----------- | -------------- | --------- | ------ |
+| `gen_ai.tool.execution.duration` | Histogram | `s` | GenAI tool execution duration. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |
+
+**[1]:** This metric measures the duration of a single tool execution performed by or on behalf of a GenAI agent.
+When this metric is reported alongside a `gen_ai.execute_tool` span, the metric value SHOULD be the same as the span duration.
+
+**Attributes:**
+
+| Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
+| --- | --- | --- | --- | --- | --- |
+| [`gen_ai.tool.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | Name of the tool utilized by the agent. | `Flights` |
+| [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` if the operation ended in an error | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+| [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+| [`gen_ai.tool.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | The version of the tool utilized by the agent. [2] | `1.0.0`; `2025-05-01` |
+
+**[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+the canonical name of exception that occurred, or another low-cardinality error identifier.
+Instrumentations SHOULD document the list of errors they report.
+
+**[2] `gen_ai.tool.version`:** The tool version is usually provided by the application that defines the
+tool. It is typically a static value (for example, a release tag of the
+tool's package) and is expected to have low cardinality.
+
+`gen_ai.tool.version` MUST have low cardinality.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) |
+
+<!-- prettier-ignore-end -->
+<!-- END AUTOGENERATED TEXT -->
+<!-- endweaver -->
+
 [DocumentStatus]: https://opentelemetry.io/docs/specs/otel/document-status
 [MetricRequired]: https://github.com/open-telemetry/semantic-conventions/blob/v1.40.0/docs/general/metric-requirement-level.md#required
 [MetricRecommended]: https://github.com/open-telemetry/semantic-conventions/blob/v1.40.0/docs/general/metric-requirement-level.md#recommended
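
The sections added in this diff specify explicit bucket boundaries and a conditionally required `error.type` attribute for both duration metrics. The following is a minimal Python sketch of how an instrumentation might time one tool execution and place the result in those buckets. It uses only the standard library. `DurationHistogram` and `timed` are hypothetical names invented for this illustration and are not OpenTelemetry APIs. The sketch assumes upper-inclusive buckets, as in the OpenTelemetry metrics data model, and uses the exception class name as `error.type`, which is only one possible low-cardinality choice.

```python
import bisect
import time
from contextlib import contextmanager

# Boundaries in seconds, copied from the text in this diff.
# Each boundary is double the previous one, starting at 0.01.
BOUNDARIES = [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64,
              1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92]


class DurationHistogram:
    """Toy stand-in for a histogram instrument (hypothetical, not an SDK class)."""

    def __init__(self, name, boundaries):
        self.name = name
        self.boundaries = list(boundaries)
        # Maps a sorted attribute tuple to its list of bucket counts.
        self.points = {}

    def record(self, seconds, attributes):
        key = tuple(sorted(attributes.items()))
        # N boundaries produce N + 1 buckets; the last one is the overflow bucket.
        counts = self.points.setdefault(key, [0] * (len(self.boundaries) + 1))
        # bisect_left puts a value equal to a boundary into that boundary's
        # bucket, which gives upper-inclusive buckets.
        counts[bisect.bisect_left(self.boundaries, seconds)] += 1


@contextmanager
def timed(histogram, attributes, clock=time.perf_counter):
    """Record the duration of the wrapped block, adding error.type on failure."""
    attrs = dict(attributes)
    start = clock()
    try:
        yield
    except Exception as exc:
        # Assumption: the exception class name serves as a low-cardinality error.type.
        attrs["error.type"] = type(exc).__qualname__
        raise
    finally:
        histogram.record(clock() - start, attrs)


tool_duration = DurationHistogram("gen_ai.tool.execution.duration", BOUNDARIES)
with timed(tool_duration, {"gen_ai.tool.name": "Flights"}):
    pass  # the tool call would run here
```

The same helper could wrap an agent invocation by constructing the histogram with the name `gen_ai.agent.invocation.duration` and passing `gen_ai.agent.name` in the attributes. Taking the clock as a parameter keeps the timing logic testable without real delays.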