feat(sdk): [Enterprise Integration]: Add provider-agnostic tracing #145
namrataghadi-galileo wants to merge 3 commits into main
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Why not emit these from the server?
@lan17 Keeping emission in the SDK gives us one merged, ordered batch and a single integration model for both logger and non-logger flows.
What about SDKs in other languages, like TypeScript, Java, etc.? If we can delegate this to the agent control server, the SDK becomes simpler. Could we just send additional metadata to the agent control server when we send events, so it can integrate with Galileo spans/traces?
Server-side emission is attractive for thinner SDKs, but it defeats the main purpose of the logger-based design: reusing Galileo Logger’s in-process trace buffering and flush. For logger integrations, the SDK must have the merged control events locally before flush. Other languages can still be supported through the thinner OTEL non-logger path until they need full logger integration.
Should this be done outside of the core OSS SDK, then? This pattern conflicts with the one @abhinav-galileo implemented, where we emit events to the server (also via a buffer in the SDK), so it may be confusing to have both systems at the same time.
The current PR only adds provider-agnostic hooks to OSS. It does not move logger context into OSS or change the default buffered event-to-server flow. Logger context, span conversion, trace attachment, and flush integration still belong to the external Galileo integration layer, so OSS is not taking on a second concrete observability system. I agree it would be confusing if both were active default systems. It is much less confusing if the existing buffered SDK-to-server flow remains the only default OSS behavior and the new sink/provider APIs are treated purely as optional extension points for external integrations.
I’m also waiting to hear back from both Davids on the RFC to get their perspective on the proposed logger-based and non-logger-based approaches. That said, the hook additions themselves are independent of any Galileo-specific observability or tracing design. They are generic enough to support other third-party observability systems as well. For example, if LangSmith is the external observability framework, these hooks could still be used to publish trace_id and span_id.
Summary
Added a new provider-agnostic telemetry package to the AgentControl Python SDK for external trace context resolution and merged control event emission.
Updated tracing to consult a registered external trace context provider before falling back to OTEL context.
Exported the new telemetry APIs from the top-level agent_control package.
Added focused tests to ensure provider/sink failures do not affect existing behavior.
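The resolution order described above (registered provider first, OTEL fallback second, provider failures ignored) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `TraceContext`, `set_trace_context_provider`, `resolve_trace_context`, and `otel_fallback` are hypothetical names, and the fallback is a stand-in rather than a real OTEL lookup.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TraceContext:
    """Hypothetical container for the IDs an external system publishes."""
    trace_id: str
    span_id: str


# Hypothetical provider type: returns a context, or None when it has none.
TraceContextProvider = Callable[[], Optional[TraceContext]]

_provider: Optional[TraceContextProvider] = None


def set_trace_context_provider(provider: Optional[TraceContextProvider]) -> None:
    """Register (or clear, with None) the external provider."""
    global _provider
    _provider = provider


def otel_fallback() -> Optional[TraceContext]:
    # Stand-in for reading the active OTEL span context (assumption).
    return TraceContext(trace_id="otel-trace", span_id="otel-span")


def resolve_trace_context() -> Optional[TraceContext]:
    """Prefer the registered provider; fall back to OTEL otherwise."""
    provider = _provider
    if provider is not None:
        try:
            ctx = provider()
            if ctx is not None:
                return ctx
        except Exception:
            # A failing provider must not break existing tracing behavior.
            pass
    return otel_fallback()
```

With no provider registered, `resolve_trace_context()` behaves exactly like the fallback, which is the property that keeps the change additive.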
Scope
User-facing/API changes:
New SDK APIs:
Internal changes:
Added:
Risk and Rollout
Risk level: low
Rollback plan:
Revert the new telemetry package and the small tracing/export changes in the SDK.
Since the change is additive and inactive unless a provider or sink is explicitly registered, rollback is straightforward.
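To illustrate why the change is inactive unless something is registered, here is a rough sketch of an optional sink hook. The names `register_control_event_sink`, `clear_control_event_sink`, and `emit_control_events` are hypothetical and are not taken from this PR; the point is only that emission is a no-op without a sink and that sink errors are swallowed.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical shapes: an event is a plain dict, a sink receives a batch.
ControlEvent = Dict[str, Any]
ControlEventSink = Callable[[List[ControlEvent]], None]

_sink: Optional[ControlEventSink] = None


def register_control_event_sink(sink: ControlEventSink) -> None:
    """Opt in to receiving merged control event batches."""
    global _sink
    _sink = sink


def clear_control_event_sink() -> None:
    """Return to the default state, where emission does nothing."""
    global _sink
    _sink = None


def emit_control_events(events: List[ControlEvent]) -> bool:
    """Hand a batch to the sink; return True only if the sink accepted it."""
    sink = _sink
    if sink is None or not events:
        # Default path: nothing registered, so nothing happens.
        return False
    try:
        # Pass a copy so the sink cannot mutate the caller's list.
        sink(list(events))
    except Exception:
        # Sink failures are isolated from the caller.
        return False
    return True
```

Under this sketch, rolling back amounts to removing the hook and its call sites, since the default buffered SDK-to-server flow never depends on it.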
Testing
Checklist