Summary
Java route harvesting currently uses dynamic attach to run jcmd VM.symboltable -verbose against the target JVM:
pkg/internal/transform/route/harvest/java.go
For JVMs with very large symbol tables, this can trigger a long stop-the-world safepoint in the target application. In #2301, the reporter observed application freezes lasting 20-100 seconds while jcmd <PID> VM.symboltable -verbose printed hundreds of thousands of lines.
PR #2303 disables Java route harvesting by default as an immediate mitigation, but we should track the underlying production-safety issue separately.
Impact
This affects the instrumented Java application, not just the OBI agent.
A route discovery attempt can make the target JVM unresponsive for tens of seconds. The current route_harvester_timeout only limits how long OBI waits for the extraction result. It does not prevent or cancel JVM-side work once the diagnostic command has triggered a safepoint.
Related Work
#2034 is related but narrower. It tracks that Java route extraction is not truly cancellable from OBI and can retain blocked workers after timeout. This issue tracks the target-JVM safety problem: the diagnostic command itself can pause the application.
Evidence
The reported safepoint log includes DumpHashtable and a long time at safepoint while dumping the symbol table. The Java route harvester currently extracts routes from the output of:
jcmd <PID> VM.symboltable -verbose
That command can scale poorly with symbol table size and can pause the JVM during collection.
Suggested Direction
Short term:
Keep Java route harvesting disabled by default.
Document that enabling Java route harvesting can pause the target JVM on large applications.
Make opt-in behavior explicit in both v1 and v2 configuration docs and schemas.
Long term:
Investigate a route discovery approach that does not require dumping the full JVM symbol table.
If dynamic attach remains necessary, add stronger safety controls before enabling by default again.
Acceptance Criteria