Closed
Labels
bug: Something isn't working
Description
The `EventListener` class has a memory leak: completed and failed tasks are never removed from the `execution_spans` dictionary. Instead of deleting the entry, the handlers set its value to `None`, so the task object used as the key stays referenced indefinitely and cannot be garbage collected.
This causes unbounded memory growth in long-running processes or systems executing many tasks, as the dictionary grows with every completed/failed task but never shrinks.
Steps to Reproduce
- Look at `lib/crewai/src/crewai/events/event_listener.py`
- Check lines 218-228 (`TaskCompletedEvent` handler) and 235-243 (`TaskFailedEvent` handler)
- Notice the pattern:
  - `span = self.execution_spans.get(source)` - retrieves the span
  - Uses the span for telemetry
  - `self.execution_spans[source] = None` - sets to None but doesn't remove the key
- Execute tasks in an async environment
- Monitor memory usage - it continuously grows
- Inspect the `execution_spans` dictionary - it contains all past task objects as keys with `None` values
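The pattern in these steps can be reproduced outside crewAI with a small stand-in. This is a minimal sketch, not crewAI code: `FakeTask` and `LeakyListener` are hypothetical names, and a plain `object()` stands in for the telemetry span. It illustrates that assigning `None` keeps both the key and the task object alive.

```python
import gc
import weakref


class FakeTask:
    """Hypothetical stand-in for a crewAI task (hashable by identity)."""


class LeakyListener:
    """Hypothetical stand-in that mimics the handler pattern from the issue."""

    def __init__(self) -> None:
        self.execution_spans: dict = {}

    def on_task_started(self, source: FakeTask) -> None:
        # A plain object stands in for the telemetry span.
        self.execution_spans[source] = object()

    def on_task_completed(self, source: FakeTask) -> None:
        span = self.execution_spans.get(source)
        if span:
            pass  # telemetry would be reported here
        # Same assignment as in the issue: the key stays in the dict.
        self.execution_spans[source] = None


listener = LeakyListener()
task_refs = []
for _ in range(1000):
    task = FakeTask()
    task_refs.append(weakref.ref(task))  # weak refs do not keep tasks alive
    listener.on_task_started(task)
    listener.on_task_completed(task)
del task
gc.collect()

leaked_entries = len(listener.execution_spans)
alive_tasks = sum(1 for ref in task_refs if ref() is not None)
print(leaked_entries, alive_tasks)  # prints: 1000 1000
```

Every finished task is still a key in the dictionary, so none of the 1000 task objects can be collected even though no other code holds a reference to them.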
Expected behavior
The execution_spans dictionary should properly clean up completed/failed tasks:
- When a task completes or fails, its entry should be removed from the dictionary
- Task objects should be eligible for garbage collection
- Memory usage should remain stable in long-running processes
- The dictionary should contain only active (in-progress) tasks
Screenshots/Code snippets
Current problematic code (`lib/crewai/src/crewai/events/event_listener.py`):

    @crewai_event_bus.on(TaskCompletedEvent)
    def on_task_completed(source: Any, event: TaskCompletedEvent) -> None:
        # Handle telemetry
        span = self.execution_spans.get(source)
        if span:
            self._telemetry.task_ended(span, source, source.agent.crew)
        self.execution_spans[source] = None  # ❌ Sets to None, keeps key in dict!

Same issue in `TaskFailedEvent` handler:

    @crewai_event_bus.on(TaskFailedEvent)
    def on_task_failed(source: Any, event: TaskFailedEvent) -> None:
        span = self.execution_spans.get(source)
        if span:
            if source.agent and source.agent.crew:
                self._telemetry.task_ended(span, source, source.agent.crew)
            self.execution_spans[source] = None  # ❌ Same problem!

Operating System
Ubuntu 24.04
Python Version
3.12
crewAI Version
Latest
crewAI Tools Version
na
Virtual Environment
Venv
Evidence
Clear indicators of memory leak:
- Code pattern: Setting dictionary values to `None` instead of removing keys
- Memory growth: Long-running crews show unbounded memory growth, especially in async execution
- Dictionary bloat: `execution_spans` contains all historical task objects
- Garbage collection prevention: Task objects can't be GC'd while referenced as dict keys
- Same pattern in two handlers: Both `TaskCompletedEvent` and `TaskFailedEvent` have this issue
This pattern indicates:
- `execution_spans` dictionary grows indefinitely
- Each completed/failed task leaves a permanent entry (key=Task, value=None)
- Memory leak compounds over time
- Particularly problematic for:
  - Long-running crews
  - Async task execution with many concurrent tasks
  - Automated systems executing many tasks
  - Production environments with continuous operation
Possible Solution
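One possible fix, sketched under the assumption that nothing else reads the `None` placeholder after a task finishes, is to remove the entry with `dict.pop` instead of assigning `None`. The classes below (`FakeTask`, `CleaningListener`) are hypothetical stand-ins rather than crewAI code, and a counter stands in for the telemetry call.

```python
import gc
import weakref


class FakeTask:
    """Hypothetical stand-in for a crewAI task (hashable by identity)."""


class CleaningListener:
    """Hypothetical listener that removes finished tasks from the dict."""

    def __init__(self) -> None:
        self.execution_spans: dict = {}
        self.ended = 0  # counts how often telemetry would have been sent

    def on_task_started(self, source: FakeTask) -> None:
        # A plain object stands in for the telemetry span.
        self.execution_spans[source] = object()

    def on_task_finished(self, source: FakeTask) -> None:
        # pop() returns the span and removes the key in one step;
        # the None default avoids a KeyError if no span was recorded.
        span = self.execution_spans.pop(source, None)
        if span is not None:
            self.ended += 1  # telemetry would be reported here


listener = CleaningListener()
task_refs = []
for _ in range(1000):
    task = FakeTask()
    task_refs.append(weakref.ref(task))  # weak refs do not keep tasks alive
    listener.on_task_started(task)
    listener.on_task_finished(task)
del task
gc.collect()

# One task that is still in progress should remain tracked.
active = FakeTask()
listener.on_task_started(active)

remaining_entries = len(listener.execution_spans)
alive_tasks = sum(1 for ref in task_refs if ref() is not None)
print(remaining_entries, alive_tasks)  # prints: 1 0
```

In the two handlers quoted above, the equivalent change would be to replace the `get` call and the `= None` assignment with `span = self.execution_spans.pop(source, None)`, leaving the telemetry call unchanged.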
Additional context
na