Address review comments on PR open-telemetry#202 (lmolkova, Mike, trask)
* Rename metrics: gen_ai.agent.request.size -> gen_ai.agent.input.content.size
and gen_ai.agent.response.size -> gen_ai.agent.output.content.size. The new
names don't imply a physical HTTP/gRPC request and are explicit that the
metric is about content bytes (lmolkova's feedback).
* Drop the per-invocation-increment framing. Metric semantics now: byte size
of content the agent receives/produces at its entrypoint, whatever the
framework sees natively. Addresses lmolkova's point that 'what's new' is
framework-dependent and ambiguous, and trask's question about defining in
terms of gen_ai.input.messages (which would force frameworks to serialize
full chat history).
* Spell out the byte-counting algorithm concretely: UTF-8 byte length for
text parts, raw byte length for binary parts, framing bytes (JSON keys,
role/metadata) not counted. Matches what the ADK reference implementation
does. Addresses both Mike's and lmolkova's precision requests.
* Bump gen_ai.agent.name from 'conditionally_required: when available' to
'recommended'. Same compromise as PR open-telemetry#201 - stronger than current but
doesn't break unnamed-agent frameworks.
* Add error.type via attributes.gen_ai.error ref_group (Mike's suggestion);
held off on metric_attributes.gen_ai since address/port/provider/model
don't add much for an in-process content-size metric.
* Drop gen_ai.agent.version from attribute lists (same reasoning as PR open-telemetry#201
- service.version covers it).
* Remove cross-reference to gen_ai.agent.invocation.duration since open-telemetry#201 has
not landed yet. Will re-add later.
* Restructure the docs/gen-ai/gen-ai-metrics.md section to follow the thin
MD wrapper + rich YAML note pattern (same as PR open-telemetry#201 revision).
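The byte-counting rule described in the bullets above (UTF-8 byte length for text parts, raw byte length for binary parts, framing not counted) could be sketched roughly as follows. This is only an illustration: `TextPart`, `BinaryPart`, and `content_size_bytes` are hypothetical names invented here, not ADK or OpenTelemetry types, and a real framework would expose its own part representation.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass
class TextPart:
    """Hypothetical text part; not an ADK or OpenTelemetry type."""
    text: str


@dataclass
class BinaryPart:
    """Hypothetical inline binary part (for example, an image)."""
    data: bytes
    mime_type: str = "application/octet-stream"  # metadata, not counted


Part = Union[TextPart, BinaryPart]


def content_size_bytes(parts: Iterable[Part]) -> int:
    """Sum the content bytes of all parts, ignoring framing and metadata."""
    total = 0
    for part in parts:
        if isinstance(part, TextPart):
            # Text is measured as its UTF-8 encoded length, not its character count.
            total += len(part.text.encode("utf-8"))
        elif isinstance(part, BinaryPart):
            # Binary content is measured as its raw byte length.
            total += len(part.data)
        else:
            raise TypeError(f"unsupported part type: {type(part).__name__}")
    return total
```

For example, a text part containing `héllo` contributes 6 bytes (the `é` takes two bytes in UTF-8), so combining it with a 3-byte binary part gives 9 bytes in total; role names, JSON keys, and the MIME type do not contribute.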
-**[1]:** This metric measures the size, in bytes, of the input payload provided to
-a GenAI agent at invocation time (for example, the user message that
-triggered the agent).
-
-Instrumentations SHOULD compute the size as the byte length of the
-serialized request content as the agent receives it. For multi-part
-content (for example, text plus inline binary data), the size SHOULD be
-the sum of the byte lengths of each part.
-
-This metric is intended for instrumentations of agent frameworks that
-can reliably observe an agent's input payload (for example, ADK,
-LangChain agents, CrewAI agents).
+| `gen_ai.agent.input.content.size` | Histogram | `By` | The byte size of the content the GenAI agent receives at the agent boundary for a single invocation. [1] | | |
+
+**[1]:** Intended for instrumentations of agent frameworks (for example, ADK,
+LangChain agents, CrewAI agents) that can observe the content passed
+to the agent at its entrypoint. Useful for capacity planning,
+anomaly detection (for example, a user pasting a very large prompt),
+and sizing downstream services (token budgets, vector DB inputs,
+storage).
+
+Instrumentations SHOULD record the byte size of the content the
+agent receives, as observed at the framework's entrypoint. The exact
+encoding is framework-defined (for example, a framework that exposes
+content as typed parts MAY sum the UTF-8 byte length of text parts
+and the raw byte length of binary parts; a framework that handles
+content as a serialized payload MAY use the byte length of that
+serialization). Instrumentations SHOULD document what they count so
+operators can interpret the values correctly within a given
+framework.

 **Attributes:**

 | Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
 | --- | --- | --- | --- | --- | --- |
-| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
-| [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | | `Conditionally Required` when available | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+| [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | | `Conditionally Required` If the operation ended in an error. | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | | `Recommended` | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+
+**[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+the canonical name of exception that occurred, or another low-cardinality error identifier.
+Instrumentations SHOULD document the list of errors they report.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | |

 <!-- prettier-ignore-end -->
 <!-- END AUTOGENERATED TEXT -->
 <!-- endweaver -->

-### Metric: `gen_ai.agent.response.size`
+### Metric: `gen_ai.agent.output.content.size`

 This metric is [recommended][MetricRecommended] for instrumentations that
-can observe the final response produced by an agent for a single invocation.
-
-Instrumentations SHOULD record the size as the byte length of the serialized
-response content as it leaves the agent. For multi-part content (for example,
-text plus inline binary data), the size SHOULD be the sum of the byte lengths
-of each part.
+can observe the final response produced by a GenAI agent.

 This metric SHOULD be specified with [ExplicitBucketBoundaries] of
-**[1]:** This metric measures the size, in bytes, of the final response payload
-produced by a GenAI agent for a single invocation.
-
-Instrumentations SHOULD compute the size as the byte length of the
-serialized response content as it leaves the agent. For multi-part
-content (for example, text plus inline binary data), the size SHOULD be
-the sum of the byte lengths of each part.
-
-This metric is intended for instrumentations of agent frameworks that
-can reliably observe an agent's final response (for example, ADK,
-LangChain agents, CrewAI agents).
+| `gen_ai.agent.output.content.size` | Histogram | `By` | The byte size of the content the GenAI agent produces at the agent boundary for a single invocation. [1] | | |
+
+**[1]:** Intended for instrumentations of agent frameworks (for example, ADK,
+LangChain agents, CrewAI agents) that can observe the agent's final
+output. Useful for capacity planning, spotting unusually large
+responses, and correlating size with latency or error rate.
+
+Includes only the agent's final response content. Intermediate
+content produced inside the invocation (tool calls, tool results,
+reasoning steps) SHOULD NOT be counted.
+
+Instrumentations SHOULD record the byte size of the content the
+agent produces, as observed at the framework's exit point. The exact
+encoding is framework-defined (for example, a framework that exposes
+content as typed parts MAY sum the UTF-8 byte length of text parts
+and the raw byte length of binary parts; a framework that handles
+content as a serialized payload MAY use the byte length of that
+serialization). Instrumentations SHOULD document what they count so
+operators can interpret the values correctly within a given
+framework.

 **Attributes:**

 | Key | Stability | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Value Type | Description | Example Values |
 | --- | --- | --- | --- | --- | --- |
-| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | | `Conditionally Required` when available | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
-| [`gen_ai.agent.version`](/docs/registry/attributes/gen-ai.md) | | `Conditionally Required` when available | string | The version of the GenAI agent. | `1.0.0`; `2025-05-01` |
+| [`error.type`](https://github.com/open-telemetry/semantic-conventions/blob/v1.41.1/docs/registry/attributes/error.md) | | `Conditionally Required` If the operation ended in an error. | string | Describes a class of error the operation ended with. [1] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` |
+| [`gen_ai.agent.name`](/docs/registry/attributes/gen-ai.md) | | `Recommended` | string | Human-readable name of the GenAI agent provided by the application. | `Math Tutor`; `Fiction Writer` |
+
+**[1] `error.type`:** The `error.type` SHOULD match the error code returned by the Generative AI provider or the client library,
+the canonical name of exception that occurred, or another low-cardinality error identifier.
+Instrumentations SHOULD document the list of errors they report.
+
+---
+
+`error.type` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used.
+
+| Value | Description | Stability |
+| --- | --- | --- |
+| `_OTHER` | A fallback error value to be used when the instrumentation doesn't define a custom value. | |
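
The attribute tables in this change (`gen_ai.agent.name` as recommended, `error.type` only when the operation ended in an error) might translate into instrumentation code along these lines. This is a rough sketch under stated assumptions: `size_metric_attributes` is a hypothetical helper invented for illustration, and using the exception class's qualified name as `error.type` is one of the options the note allows, not the only one.

```python
from typing import Dict, Optional


def size_metric_attributes(
    agent_name: Optional[str] = None,
    error: Optional[BaseException] = None,
) -> Dict[str, str]:
    """Build the attribute set recorded with a content-size measurement.

    Hypothetical helper: the attribute keys come from the tables in this
    change, the function itself does not exist in any library.
    """
    attributes: Dict[str, str] = {}
    if agent_name:
        # Recommended: set only when the framework knows the agent's name,
        # so unnamed-agent frameworks simply omit it.
        attributes["gen_ai.agent.name"] = agent_name
    if error is not None:
        # Conditionally required when the operation ended in an error.
        # Assumption: the exception class name serves as the low-cardinality
        # identifier for this instrumentation.
        attributes["error.type"] = type(error).__qualname__
    return attributes
```

A successful invocation of a named agent would carry only `gen_ai.agent.name`, while a failed one would also carry `error.type`. The resulting dictionary could then be passed as the attributes of a histogram measurement in whichever metrics API the instrumentation uses.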