Commit f6a2978

Address review comments on PR #201 (round 2)
Following lmolkova's review and trask's towncrier reminder:

* Rename per #249 (lmolkova confirmed on PR #201 SIG call discussion):
  - gen_ai.agent.invocation.duration -> gen_ai.invoke_agent.duration
  - gen_ai.tool.execution.duration -> gen_ai.execute_tool.duration
  Metric names now align with the operation name on spans (gen_ai.invoke_agent, gen_ai.execute_tool).
* Move CHANGELOG entry to a Towncrier fragment per trask's reminder about #275: changelog.d/201.enhancement.md.
* Bump gen_ai.agent.name from 'recommended' back to 'conditionally_required: When available.' (lmolkova: keep consistent with the internal invoke_agent span; entity work in #270 will reshape this later anyway).
* Capitalize 'When available' / 'If available' per #245 sentence-case convention on every requirement_level note.
* Apply lmolkova's suggested rewrites on metric briefs/notes:
  - Agent: more concise brief about the invocation start/end.
  - Tool: drop 'performed by or on behalf of a GenAI agent' since generic apps (not just agents) can execute tools.
  - Tool note: simplify the requirement statement (drops the explicit 'required vs recommended' framing; semconv is moving away from those labels for metrics per open-telemetry/semantic-conventions#3278).
* Add a few more low-cardinality attributes on invoke_agent.duration per lmolkova: gen_ai.agent.id, gen_ai.agent.version, gen_ai.request.model (all conditionally_required When/If available). They mirror what the invoke_agent span carries and will be reshaped once #270 introduces agent entities.
1 parent 69f5afc commit f6a2978
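
For consumers that already reference the old metric names (for example in dashboards or alert rules), the rename described in the commit message can be sketched as a simple lookup. `RENAMED_METRICS` and `migrate_metric_name` are hypothetical names introduced only for this illustration; they are not part of the semantic conventions or their tooling.

```python
# Old -> new metric names, taken from the commit message.
# The table and helper below are illustrative assumptions, not semconv tooling.
RENAMED_METRICS = {
    "gen_ai.agent.invocation.duration": "gen_ai.invoke_agent.duration",
    "gen_ai.tool.execution.duration": "gen_ai.execute_tool.duration",
}


def migrate_metric_name(name: str) -> str:
    """Return the post-rename metric name; unrelated names pass through unchanged."""
    return RENAMED_METRICS.get(name, name)
```

Names that were not renamed, such as `gen_ai.workflow.duration`, are returned as-is.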

4 files changed

Lines changed: 412 additions & 363 deletions


changelog.d/201.enhancement.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+ Add `gen_ai.invoke_agent.duration` metric to track the end-to-end duration of a single agent invocation, and `gen_ai.execute_tool.duration` metric to track the duration of a single tool execution.

docs/gen-ai/gen-ai-metrics.md

Lines changed: 19 additions & 22 deletions
@@ -20,9 +20,9 @@ linkTitle: Metrics
  - [Generative AI workflow metrics](#generative-ai-workflow-metrics)
  - [Metric: `gen_ai.workflow.duration`](#metric-gen_aiworkflowduration)
  - [Generative AI agent metrics](#generative-ai-agent-metrics)
- - [Metric: `gen_ai.agent.invocation.duration`](#metric-gen_aiagentinvocationduration)
+ - [Metric: `gen_ai.invoke_agent.duration`](#metric-gen_aiinvoke_agentduration)
  - [Generative AI tool metrics](#generative-ai-tool-metrics)
- - [Metric: `gen_ai.tool.execution.duration`](#metric-gen_aitoolexecutionduration)
+ - [Metric: `gen_ai.execute_tool.duration`](#metric-gen_aiexecute_toolduration)

  <!-- tocstop -->

@@ -940,22 +940,22 @@ If there is no low-cardinality workflow name available for a given framework, th
  Individual systems may include additional system-specific attributes.
  It is recommended to check system-specific documentation, if available.

- ### Metric: `gen_ai.agent.invocation.duration`
+ ### Metric: `gen_ai.invoke_agent.duration`

  This metric is [required][MetricRequired] when the instrumented component
  implements agent invocation operations.

  This metric SHOULD be specified with [ExplicitBucketBoundaries] of
  [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6, 51.2, 102.4, 204.8, 409.6].

- <!-- weaver .registry.metrics[] | select(.name == "gen_ai.agent.invocation.duration") -->
+ <!-- weaver .registry.metrics[] | select(.name == "gen_ai.invoke_agent.duration") -->
  <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
  <!-- see templates/registry/markdown/snippet.md.j2 -->
  <!-- prettier-ignore-start -->

  | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
  | -------- | --------------- | ----------- | -------------- | --------- | ------ |
- | `gen_ai.agent.invocation.duration` | Histogram | `s` | The end-to-end duration of a single agent invocation, from the moment the agent is invoked to the moment it produces its final response (or terminates with an error). [1] | ![Development](https://img.shields.io/badge/-development-blue) | |
+ | `gen_ai.invoke_agent.duration` | Histogram | `s` | The end-to-end duration of a single agent invocation, from the moment the invocation starts until the agent emits the last chunk of its final response or terminates with an error. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |

  **[1]:** Intended for instrumentations of agent frameworks (for example, ADK,
  LangChain agents, CrewAI agents) that can reliably bound a single
@@ -978,12 +978,18 @@ the metric value SHOULD be the same as the span duration.
  | Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
  | --- | --- | --- | --- | --- | --- |
  | [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` If the operation ended in an error. | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
- | [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+ | [`gen_ai.agent.id`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` When available. | string | The unique and stable identifier of the GenAI hosted agent resource. [2] | `asst_5j66UpCpwteGg4YSxUnt7lPY`; `arn:aws:bedrock:us-east-1:123:agent/42`; `urn:agent:projects-123:projects:123:locations:us-east1:aiplatform:reasoningEngines:456` |
+ | [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` When available. | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+ | [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` When available. | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+ | [`gen_ai.request.model`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` If available. | string | The name of the GenAI model a request is being made to. | `gpt-4` |

  **[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
  the canonical name of exception that occurred, or another low-cardinality error identifier.
  Instrumentations SHOULD document the list of errors they report.

+ **[2] `gen_ai.agent.id`:** For hosted agents, this SHOULD be the provider-assigned stable identifier of the agent resource such as [AWS Bedrock agent ARN](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_agent_Agent.html) or [GCP Agent Registry identifier](https://docs.cloud.google.com/agent-registry/concepts#agent-identifier).
+ It's NOT RECOMMENDED to record in-memory agent instance ids on this attribute due to their transient nature.
+
  ---

  `error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
@@ -1001,34 +1007,25 @@ Instrumentations SHOULD document the list of errors they report.
  Individual systems may include additional system-specific attributes.
  It is recommended to check system-specific documentation, if available.

- ### Metric: `gen_ai.tool.execution.duration`
+ ### Metric: `gen_ai.execute_tool.duration`

  This metric is [recommended][MetricRecommended] for instrumentations that can
- observe tool executions performed by or on behalf of a GenAI agent.
+ observe tool executions.

  This metric SHOULD be specified with [ExplicitBucketBoundaries] of
  [0.01, 0.02, 0.04, 0.08, 0.16, 0.32, 0.64, 1.28, 2.56, 5.12, 10.24, 20.48, 40.96, 81.92].

- <!-- weaver .registry.metrics[] | select(.name == "gen_ai.tool.execution.duration") -->
+ <!-- weaver .registry.metrics[] | select(.name == "gen_ai.execute_tool.duration") -->
  <!-- NOTE: THIS TEXT IS AUTOGENERATED. DO NOT EDIT BY HAND. -->
  <!-- see templates/registry/markdown/snippet.md.j2 -->
  <!-- prettier-ignore-start -->

  | Name | Instrument Type | Unit (UCUM) | Description | Stability | Entity Associations |
  | -------- | --------------- | ----------- | -------------- | --------- | ------ |
- | `gen_ai.tool.execution.duration` | Histogram | `s` | The duration of a single tool execution performed by or on behalf of a GenAI agent. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |
-
- **[1]:** Intended for instrumentations of agent frameworks (or of application
- code that executes tools on behalf of an agent) that can reliably
- bound a single tool call.
+ | `gen_ai.execute_tool.duration` | Histogram | `s` | The duration of a single tool execution. [1] | ![Development](https://img.shields.io/badge/-development-blue) | |

- Unlike `gen_ai.agent.invocation.duration` (which is required), this
- metric is only recommended because tools may be executed through
- paths that the agent framework does not observe — for example,
- external MCP servers or application-managed dispatch.
- Instrumentations SHOULD record this metric for every tool execution
- they observe but are not required to capture all tool calls across
- the agentic system.
+ **[1]:** Instrumentation that can reliably bound a single tool call SHOULD
+ record this metric for every tool execution they can observe.

  When this metric is reported alongside a `gen_ai.execute_tool` span,
  the metric value SHOULD be the same as the span duration.
@@ -1039,7 +1036,7 @@ the metric value SHOULD be the same as the span duration.
  | --- | --- | --- | --- | --- | --- |
  | [`gen_ai.tool.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Required` | string | Name of the tool utilized by the agent. | `Flights` |
  | [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | `Conditionally Required` If the operation ended in an error. | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
- | [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+ | [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Conditionally Required` When available. | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
  | [`gen_ai.tool.type`](/docs/registry/attributes/gen-ai.md) | ![Development](https://img.shields.io/badge/-development-blue) | `Recommended` | string | Type of the tool utilized by the agent [2] | `function`; `extension`; `datastore` |

  **[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
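
As a rough illustration of the `gen_ai.execute_tool.duration` rules in this file (duration in seconds, `gen_ai.tool.name` always set, `gen_ai.agent.name` only when available, `error.type` only when the execution fails), a minimal standard-library sketch might look like the following. `timed_tool_execution` and the `record` callback are hypothetical stand-ins for whatever histogram API an instrumentation actually uses, and using the exception class name for `error.type` is one possible choice, not something this change mandates.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, Optional

# Hypothetical sink: receives (duration in seconds, attributes).
Recorder = Callable[[float, Dict[str, str]], None]


@contextmanager
def timed_tool_execution(
    record: Recorder, tool_name: str, agent_name: Optional[str] = None
) -> Iterator[None]:
    """Time one tool execution and report it once, whether it succeeds or fails."""
    attributes = {"gen_ai.tool.name": tool_name}  # Required
    if agent_name is not None:
        attributes["gen_ai.agent.name"] = agent_name  # Only when available
    start = time.perf_counter()
    try:
        yield
    except Exception as exc:
        # Assumption: the exception class name serves as a low-cardinality error identifier.
        attributes["error.type"] = type(exc).__qualname__
        raise
    finally:
        record(time.perf_counter() - start, attributes)
```

Wrapping a tool call in `with timed_tool_execution(record, "Flights"):` yields exactly one measurement per execution, which matches the "every tool execution they can observe" wording in the note.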

model/gen-ai/metrics.yaml

Lines changed: 18 additions & 19 deletions
@@ -116,14 +116,14 @@ metrics:
  - ref: gen_ai.workflow.name
  requirement_level:
  conditionally_required: If available.
- - name: gen_ai.agent.invocation.duration
+ - name: gen_ai.invoke_agent.duration
  annotations:
  code_generation:
  metric_value_type: double
  brief: >
  The end-to-end duration of a single agent invocation,
- from the moment the agent is invoked to the moment it produces its final
- response (or terminates with an error).
+ from the moment the invocation starts until the agent emits
+ the last chunk of its final response or terminates with an error.
  note: |
  Intended for instrumentations of agent frameworks (for example, ADK,
  LangChain agents, CrewAI agents) that can reliably bound a single
@@ -147,26 +147,25 @@ metrics:
  - ref_group: attributes.gen_ai.error
  - ref: gen_ai.agent.name
  requirement_level:
- conditionally_required: when available
- - name: gen_ai.tool.execution.duration
+ conditionally_required: When available.
+ - ref: gen_ai.agent.id
+ requirement_level:
+ conditionally_required: When available.
+ - ref: gen_ai.agent.version
+ requirement_level:
+ conditionally_required: When available.
+ - ref: gen_ai.request.model
+ requirement_level:
+ conditionally_required: If available.
+ - name: gen_ai.execute_tool.duration
  annotations:
  code_generation:
  metric_value_type: double
  brief: >
- The duration of a single tool execution performed by or on behalf of a
- GenAI agent.
+ The duration of a single tool execution.
  note: |
- Intended for instrumentations of agent frameworks (or of application
- code that executes tools on behalf of an agent) that can reliably
- bound a single tool call.
-
- Unlike `gen_ai.agent.invocation.duration` (which is required), this
- metric is only recommended because tools may be executed through
- paths that the agent framework does not observe — for example,
- external MCP servers or application-managed dispatch.
- Instrumentations SHOULD record this metric for every tool execution
- they observe but are not required to capture all tool calls across
- the agentic system.
+ Instrumentation that can reliably bound a single tool call SHOULD
+ record this metric for every tool execution they can observe.

  When this metric is reported alongside a `gen_ai.execute_tool` span,
  the metric value SHOULD be the same as the span duration.
@@ -181,7 +180,7 @@ metrics:
  requirement_level: recommended
  - ref: gen_ai.agent.name
  requirement_level:
- conditionally_required: when available
+ conditionally_required: When available.

  metric_refinements:
  - id: openai.client.token.usage
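
To make the `gen_ai.invoke_agent.duration` definition more concrete, the sketch below builds an attribute set that includes each conditionally required attribute only when a value is available, and maps a duration onto the advised bucket boundaries. `invoke_agent_attributes` and `bucket_index` are hypothetical helpers written for this illustration; the upper-inclusive bucket behaviour is an assumption about how explicit-bucket histograms are typically aggregated, not something stated in this change.

```python
from bisect import bisect_left
from typing import Dict, List, Optional

# Boundaries (seconds) advised for gen_ai.invoke_agent.duration in the docs diff.
INVOKE_AGENT_BUCKETS: List[float] = [
    0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6, 51.2, 102.4, 204.8, 409.6,
]


def invoke_agent_attributes(
    agent_id: Optional[str] = None,
    agent_name: Optional[str] = None,
    agent_version: Optional[str] = None,
    request_model: Optional[str] = None,
    error_type: Optional[str] = None,
) -> Dict[str, str]:
    """Keep only the attributes whose values are actually available."""
    candidates = {
        "gen_ai.agent.id": agent_id,
        "gen_ai.agent.name": agent_name,
        "gen_ai.agent.version": agent_version,
        "gen_ai.request.model": request_model,
        "error.type": error_type,  # Only set when the operation ended in an error
    }
    return {key: value for key, value in candidates.items() if value is not None}


def bucket_index(duration_s: float, boundaries: List[float] = INVOKE_AGENT_BUCKETS) -> int:
    """Index of the bucket a duration falls into, assuming upper-inclusive bounds."""
    return bisect_left(boundaries, duration_s)
```

With 13 boundaries there are 14 buckets, so index 13 is the overflow bucket for invocations longer than 409.6 seconds.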
