Hi team, I am an MLflow maintainer and found an issue in the instrumentation mechanism with the new LlamaIndex workflow. The issue affects not only MLflow but also other observability providers such as LlamaTrace.
MLflow Tracing implements the new instrumentation mechanism (span + event handlers), and it worked perfectly for tracing LlamaIndex workflows until this change was introduced in 0.11.10. Since the async context was introduced, the span dispatcher invokes the prepare_to_exit_span handler with a pending WorkflowHandler. This is problematic because:
- Users cannot see the actual result as the span/trace output, only a pending future object.
- Users cannot measure the latency of the actual operation; the span only records the time taken to create the Future.
- The parent span (run) exits before the child task is scheduled, so the self.open_spans attribute of the span handler is already empty when the child span starts; as a result, the span handler cannot determine the parent-child relationship.
The dispatcher should await the result before calling prepare_to_exit_span. I'm not very familiar with the dispatching logic for workflows, but apparently the issue is that the dispatcher.span decorator invokes the span handler immediately with the Future object (see llama_index/llama-index-core/llama_index/core/workflow/workflow.py, line 280 at commit 75c2d10).
I also validated that the same issue happens in LlamaTrace:
(The parent-child relationship is maintained there, probably because of some difference in implementation detail, but you can see the output and span duration are off for the root span.)
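To make the failure mode concrete, here is a minimal, dependency-free sketch (the span decorator and all names here are mine, not the actual llama-index dispatcher) of what happens when a span is closed synchronously around a function that merely schedules an async task:

```python
import asyncio
import time

exit_states = []  # records whether the result was done when the "span" closed

def span(fn):
    # Stand-in for the dispatcher.span decorator: it closes the span as soon
    # as fn returns, even if fn only returned a pending task.
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        result = fn(*args, **kwargs)
        elapsed = time.monotonic() - start
        exit_states.append(result.done())  # mirrors prepare_to_exit_span
        print(f"span exited after {elapsed:.3f}s, result done: {result.done()}")
        return result
    return wrapper

@span
def run():
    async def work():
        await asyncio.sleep(0.2)  # the actual operation
        return "Hi, world!"
    return asyncio.create_task(work())

async def main():
    handler = run()        # the span has already exited here, still pending
    return await handler   # the real result only arrives later

final = asyncio.run(main())
print(final)
```

The span exits in well under a millisecond with a pending result, while the real operation takes 0.2 s — exactly the wrong output and wrong latency described above.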
Version
0.11.10 (or later)
Steps to Reproduce
Reproduce with dummy span handler (no dependency)
Define a dummy handler
import inspect
from typing import Any, Dict, Optional

from llama_index.core.bridge.pydantic import Field
from llama_index.core.instrumentation.span.base import BaseSpan
from llama_index.core.instrumentation.span_handlers import BaseSpanHandler
from llama_index.core.workflow.handler import WorkflowHandler


class TestSpanHandler(BaseSpanHandler[BaseSpan]):
    def new_span(
        self,
        id_: str,
        bound_args: inspect.BoundArguments,
        instance: Optional[Any] = None,
        parent_span_id: Optional[str] = None,
        tags: Optional[Dict[str, Any]] = None,
        **kwargs: Any,
    ) -> Optional[BaseSpan]:
        """Create a span."""
        pass

    def prepare_to_exit_span(
        self,
        id_: str,
        bound_args: inspect.BoundArguments,
        instance: Optional[Any] = None,
        result: Optional[Any] = None,
        **kwargs: Any,
    ) -> Any:
        """Logic for preparing to exit a span."""
        if isinstance(result, WorkflowHandler) and not result.is_done():
            print("prepare_to_exit_span called with pending WorkflowHandler")

    def prepare_to_drop_span(
        self,
        id_: str,
        bound_args: inspect.BoundArguments,
        instance: Optional[Any] = None,
        err: Optional[BaseException] = None,
        **kwargs: Any,
    ) -> Any:
        """Logic for preparing to drop a span."""
        pass
Define a simple workflow
import time

from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step


class MyWorkflow(Workflow):
    @step
    async def my_step(self, ev: StartEvent) -> StopEvent:
        time.sleep(3)  # simulate a slow operation
        return StopEvent(result="Hi, world!")


w = MyWorkflow(timeout=10, verbose=False)
Run the workflow with the span handler
import llama_index.core.instrumentation as instrument
dispatcher = instrument.get_dispatcher()
dispatcher.add_span_handler(TestSpanHandler())
result = await w.run()
Output:
prepare_to_exit_span called with pending WorkflowHandler
Reproduce with MLflow
Install the dev branch: pip install git+https://github.com/B-Step62/mlflow.git@llama-workflow-trace (the dev branch is needed only to apply a fix for recent async handling; the stable release works with LlamaIndex workflows earlier than 0.11.10).
Enable MLflow tracing for LlamaIndex
import mlflow
mlflow.llama_index.autolog()
Run the same workflow as above.
Relevant Logs/Tracebacks
No response
To address the issue where the span dispatcher in LlamaIndex version 0.11.10 invokes the prepare_to_exit_span handler with a pending WorkflowHandler, you need to ensure that the prepare_to_exit_span method correctly handles the state of the span before it is exited. This can be done by adding a check to see if the span is in a valid state to be exited.
Here is a possible solution:
Modify the prepare_to_exit_span method to include a check for the span's state.
Ensure that the span is only removed from open_spans if it is in a valid state to be exited.
Here is the updated code:
def span_exit(
    self,
    id_: str,
    bound_args: inspect.BoundArguments,
    instance: Optional[Any] = None,
    result: Optional[Any] = None,
    **kwargs: Any,
) -> None:
    """Logic for exiting a span."""
    span = self.prepare_to_exit_span(
        id_=id_, bound_args=bound_args, instance=instance, result=result
    )
    if span:
        with self.lock:
            if id_ in self.open_spans:
                del self.open_spans[id_]
            else:
                # Handle the case where the span is not in open_spans.
                # This could involve logging an error or taking other
                # appropriate action.
                pass
In the prepare_to_exit_span method, you should ensure that the span is in a valid state to be exited. This might involve checking certain attributes of the span or ensuring that it has completed its intended operations.
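One way to realize such a check is sketched below without any llama-index dependency (the handler class, the `closed` dict, and the `_finalize` helper are my own names, and a plain asyncio.Future stands in for WorkflowHandler): when the result is still a pending future, defer closing the span via a done-callback instead of closing it immediately.

```python
import asyncio
import time

class DeferredExitSpanHandler:
    """Sketch: only close a span once its result future has resolved."""

    def __init__(self):
        self.open_spans = {}
        self.closed = {}

    def new_span(self, id_):
        self.open_spans[id_] = {"start": time.monotonic()}

    def prepare_to_exit_span(self, id_, result=None):
        if isinstance(result, asyncio.Future) and not result.done():
            # Result is still pending: close the span later, with the real
            # output and the real end time, once the future resolves.
            result.add_done_callback(
                lambda fut: self._finalize(id_, fut.result())
            )
            return
        self._finalize(id_, result)

    def _finalize(self, id_, result):
        span = self.open_spans.pop(id_)
        span["duration"] = time.monotonic() - span["start"]
        span["output"] = result
        self.closed[id_] = span

async def demo():
    handler = DeferredExitSpanHandler()
    handler.new_span("root")
    fut = asyncio.get_running_loop().create_future()
    handler.prepare_to_exit_span("root", result=fut)
    assert "root" in handler.open_spans  # still open: the result was pending
    fut.set_result("Hi, world!")
    await asyncio.sleep(0)               # let the done-callback run
    return handler

handler = asyncio.run(demo())
print(handler.closed["root"]["output"])
```

With this deferral, the recorded output is the finished workflow result and the recorded duration covers the actual operation, addressing the first two problems in the report; whether the real dispatcher can adopt a done-callback here is a design question for the maintainers.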
Here is the relevant part of the prepare_to_exit_span method from the AgentOpsSpanHandler class:
Additionally, you can refer to the test cases provided in the LlamaIndex core tests to ensure that your changes are correctly handling the span lifecycle. For example, the test_dispatcher_span_drop_args and test_dispatcher_async_span_drop_args_with_instance tests show how to mock and assert the behavior of span enter, drop, and exit methods.
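As a dependency-free approximation of what such a test asserts (the real tests live in the llama-index-core test suite; the traced_run function and the handler method names here are mine), one can mock the handler and verify that the exit is reported only with the completed result:

```python
import asyncio
from unittest.mock import MagicMock

handler = MagicMock()

async def traced_run():
    handler.span_enter(id_="root")
    try:
        await asyncio.sleep(0)  # the awaited workflow body
        result = "Hi, world!"
        # Exit is reported only after the awaited body has finished, with
        # the real result rather than a pending future.
        handler.span_exit(id_="root", result=result)
        return result
    except BaseException as err:
        handler.span_drop(id_="root", err=err)
        raise

final = asyncio.run(traced_run())
handler.span_enter.assert_called_once_with(id_="root")
handler.span_exit.assert_called_once_with(id_="root", result="Hi, world!")
handler.span_drop.assert_not_called()
```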
By adding these checks and referring to the provided test cases, you can prevent the prepare_to_exit_span handler from being invoked with a pending WorkflowHandler, thereby avoiding incorrect span/trace outputs and latency measurements [1][2][3][4].