This document describes the execution tracking and state management system used throughout workflow and API deployment executions. The system maintains execution status at multiple granularity levels (workflow-level, file-level, and tool-level) using a dual-layer architecture combining Redis for real-time tracking and PostgreSQL for persistent records.
For information about the overall workflow execution orchestration, see Workflow Orchestration and Management. For batch processing strategies, see File Processing Pipeline and Batching. For container orchestration, see Tool Sandbox and Container Execution.
The execution tracking system provides:
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:1-766, runner/src/unstract/runner/runner.py:1-574, backend/workflow_manager/endpoint_v2/source.py:558-661
The FileExecutionStatusTracker maintains real-time execution state in Redis with a 24-hour TTL. It enables crash recovery by allowing workers to query execution state and resume from the last completed stage.
Key Data Structure:
Redis Key Pattern: file_execution_status:{execution_id}:{file_execution_id}
Operations:
- set_data(data) - Initialize or overwrite execution data
- get_data(execution_id, file_execution_id) - Retrieve current state
- update_stage_status(execution_id, file_execution_id, stage_status) - Advance stage
- update_tool_container_name(execution_id, file_execution_id, tool_container_name) - Store container name for cleanup
Update Locations:
- UnstractRunner.run_container() sets the TOOL_EXECUTION stage and stores the container name at runner/src/unstract/runner/runner.py:450-463
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:279-360, runner/src/unstract/runner/runner.py:449-463
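The four operations above can be sketched as follows. This is a minimal illustration, not the project's implementation: InMemoryStore is a hypothetical stand-in for a Redis client, the FileExecutionData field names are assumptions, and stage_status is modelled as a (stage, status) tuple purely for the example.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import Optional

FILE_EXECUTION_TTL_SECONDS = 24 * 60 * 60  # 24-hour TTL described above


class InMemoryStore:
    """Hypothetical stand-in for a Redis client; entries expire after a TTL."""

    def __init__(self) -> None:
        self._items: dict[str, tuple[str, float]] = {}

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._items[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key: str) -> Optional[str]:
        item = self._items.get(key)
        if item is None or item[1] <= time.monotonic():
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        return item[0]


@dataclass
class FileExecutionData:
    # Field names are assumptions for this sketch.
    execution_id: str
    file_execution_id: str
    stage: str
    status: str
    tool_container_name: Optional[str] = None


class FileExecutionStatusTrackerSketch:
    def __init__(self, store: InMemoryStore) -> None:
        self._store = store

    @staticmethod
    def _key(execution_id: str, file_execution_id: str) -> str:
        return f"file_execution_status:{execution_id}:{file_execution_id}"

    def set_data(self, data: FileExecutionData) -> None:
        """Initialize or overwrite the record (this sketch resets the TTL on every write)."""
        key = self._key(data.execution_id, data.file_execution_id)
        self._store.setex(key, FILE_EXECUTION_TTL_SECONDS, json.dumps(asdict(data)))

    def get_data(self, execution_id: str, file_execution_id: str) -> Optional[FileExecutionData]:
        raw = self._store.get(self._key(execution_id, file_execution_id))
        return FileExecutionData(**json.loads(raw)) if raw else None

    def _require(self, execution_id: str, file_execution_id: str) -> FileExecutionData:
        data = self.get_data(execution_id, file_execution_id)
        if data is None:
            raise KeyError(self._key(execution_id, file_execution_id))
        return data

    def update_stage_status(self, execution_id: str, file_execution_id: str,
                            stage_status: tuple[str, str]) -> None:
        data = self._require(execution_id, file_execution_id)
        data.stage, data.status = stage_status
        self.set_data(data)

    def update_tool_container_name(self, execution_id: str, file_execution_id: str,
                                   tool_container_name: str) -> None:
        data = self._require(execution_id, file_execution_id)
        data.tool_container_name = tool_container_name
        self.set_data(data)
```

Storing the whole record as one JSON value keeps reads to a single lookup, at the cost of a read-modify-write on each update.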
The ToolExecutionTracker captures tool container exit status with a 1-hour TTL. Tool containers write their final status to Redis before exiting, which is then read by ToolSandboxHelper to determine success/failure.
Key Data Structure:
Redis Key Pattern: tool_execution_status:{execution_id}:{file_execution_id}
Write Flow:
- The tool container writes ToolExecutionStatusData to Redis using ToolExecutionTracker.set_status()
Read Flow:
- ToolSandboxHelper.poll_tool_status() detects the container in a final state
- ToolSandboxHelper._handle_tool_execution_status() reads the status from Redis at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:708-765
- If the status is FAILED, the error is propagated as RunnerContainerRunResponse.error
TTL Management:
- After the status is read, the TTL is reduced to TOOL_EXECUTION_TRACKER_COMPLETED_TTL_IN_SECOND at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:762-765
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:708-765, runner/src/unstract/runner/runner.py:331-378
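A rough sketch of the write, read, and TTL-shortening sequence described above. It uses an in-memory dictionary instead of Redis, and several details are assumptions made for illustration: the field names on ToolExecutionStatusData, the shorten_ttl helper, and the placeholder value of 300 seconds for TOOL_EXECUTION_TRACKER_COMPLETED_TTL_IN_SECOND (its real value is not stated here).

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

TOOL_EXECUTION_TTL_SECONDS = 60 * 60  # 1-hour TTL described above
# Placeholder only: the actual configured value is not given in this document.
TOOL_EXECUTION_TRACKER_COMPLETED_TTL_IN_SECOND = 300


@dataclass
class ToolExecutionStatusData:
    # Field names are assumptions for this sketch.
    execution_id: str
    file_execution_id: str
    status: str
    error: Optional[str] = None


class ToolExecutionTrackerSketch:
    """In-memory stand-in for the Redis-backed tracker."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: dict[str, tuple[ToolExecutionStatusData, float]] = {}

    @staticmethod
    def _key(execution_id: str, file_execution_id: str) -> str:
        return f"tool_execution_status:{execution_id}:{file_execution_id}"

    def set_status(self, data: ToolExecutionStatusData) -> None:
        key = self._key(data.execution_id, data.file_execution_id)
        self._items[key] = (data, self._clock() + TOOL_EXECUTION_TTL_SECONDS)

    def get_status(self, execution_id: str, file_execution_id: str) -> Optional[ToolExecutionStatusData]:
        key = self._key(execution_id, file_execution_id)
        item = self._items.get(key)
        if item is None or item[1] <= self._clock():
            self._items.pop(key, None)
            return None
        return item[0]

    def shorten_ttl(self, execution_id: str, file_execution_id: str, ttl_seconds: int) -> None:
        key = self._key(execution_id, file_execution_id)
        if key in self._items:
            data, _ = self._items[key]
            self._items[key] = (data, self._clock() + ttl_seconds)


def handle_tool_execution_status(tracker: ToolExecutionTrackerSketch,
                                 execution_id: str, file_execution_id: str) -> Optional[str]:
    """Return the tool's error message if it reported FAILED, otherwise None."""
    data = tracker.get_status(execution_id, file_execution_id)
    if data is None:
        return None
    # Once consumed, the record only needs to live a short while longer.
    tracker.shorten_ttl(execution_id, file_execution_id,
                        TOOL_EXECUTION_TRACKER_COMPLETED_TTL_IN_SECOND)
    return data.error if data.status == "FAILED" else None
```

The injectable clock exists only so the expiry behaviour can be exercised without waiting.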
The WorkflowExecution model represents a single workflow execution instance. It is created at execution start and updated on completion or failure.
Key Fields:
- workflow - Foreign key to Workflow
- execution_id - UUID, indexed
- status - Enum: PENDING, EXECUTING, SUCCESS, FAILURE, ERROR
- execution_mode - Enum: ETL, API, INSTANT, SCHEDULED
- result_acknowledged - Boolean flag for API deployments
- error - Captured error message
Status Lifecycle:
Update Signal: WorkflowExecution.update_execution() triggers a post-save signal that releases API rate limit slots at backend/workflow_manager/workflow_v2/models/execution.py
Sources: backend/workflow_manager/workflow_v2/dto.py:30-117, backend/workflow_manager/endpoint_v2/source.py:579-586
The WorkflowFileExecution model represents the execution of a single file within a workflow execution. One WorkflowExecution has many WorkflowFileExecution records.
Key Fields:
- workflow_execution - Foreign key to WorkflowExecution
- file_execution_id - UUID, indexed
- file_path - Source file path
- file_hash - Content hash for deduplication
- provider_file_uuid - Provider-specific file identifier (e.g., S3 ETag, Google Drive file ID)
- status - Enum: PENDING, EXECUTING, SUCCESS, FAILURE, ERROR
- error - Captured error message
Deduplication Keys: Both file_hash and provider_file_uuid are indexed for fast lookups during active execution checks.
Sources: backend/workflow_manager/endpoint_v2/source.py:613-649, backend/workflow_manager/endpoint_v2/dto.py:122-155
FileHistory tracks file processing completion for deduplication across executions. It prevents reprocessing of previously completed files unless the file content changes.
Key Fields:
- workflow - Foreign key to Workflow
- file_path - Source file path
- file_hash - Content hash
- provider_file_uuid - Provider-specific identifier
- status - Completion status
- completed_at - Timestamp of completion
Usage in SourceConnector:
Sources: backend/workflow_manager/endpoint_v2/source.py:805-848
Stage Progression Diagram:
Stage + Status Tracking: Each stage has an associated status. ToolSandboxHelper.call_tool_handler() checks the current stage and resumes appropriately at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:260-360.
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:260-360
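As an illustration of stage-based resumption, the sketch below works out which stages still need to run given the last recorded stage and status. The stage names are those used elsewhere on this page; the ordering values, the stages_to_run helper, and the rule that an unfinished stage is re-run are assumptions, not confirmed behaviour of call_tool_handler().

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    # Ordering values are assumptions for this sketch.
    INITIALIZATION = 1
    SOURCE = 2
    TOOL_EXECUTION = 3
    FINALIZATION = 4


def stages_to_run(last_stage: Optional[Stage], last_status: Optional[str]) -> list[Stage]:
    """Return the stages that remain, resuming after the last completed one."""
    if last_stage is None:
        return list(Stage)  # nothing recorded yet: start from the beginning
    # A stage recorded as SUCCESS is skipped; anything else is re-run (assumption).
    start = last_stage + 1 if last_status == "SUCCESS" else int(last_stage)
    return [stage for stage in Stage if stage >= start]
```

For example, a worker that finds SOURCE recorded as SUCCESS would continue with TOOL_EXECUTION and FINALIZATION.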
ToolSandboxHelper.poll_tool_status() polls the runner service to check container status with configurable intervals:
Configuration:
- MAX_RUNNER_POLLING_WAIT_SECONDS - Maximum polling duration (default: 10,800s / 3 hours)
- RUNNER_POLLING_INTERVAL_SECONDS - Poll interval (default: 2s)
Polling Loop:
Final Statuses: Polling terminates when the container reaches one of these states, defined at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:41-46:
- EXITED - Normal container exit
- DEAD - Container terminated abnormally
- ERROR - Docker error
- NOT_FOUND - Container not found (with grace period)
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:123-258
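A minimal sketch of a polling loop with these settings. The poll_until_final function and the injectable get_status, sleep, and clock callables are hypothetical, "RUNNING" is an assumed name for a non-final state, and the NOT_FOUND grace period is left out to keep the example short.

```python
import time
from typing import Callable

MAX_RUNNER_POLLING_WAIT_SECONDS = 10_800  # default from the configuration above
RUNNER_POLLING_INTERVAL_SECONDS = 2       # default from the configuration above
FINAL_STATUSES = {"EXITED", "DEAD", "ERROR", "NOT_FOUND"}


def poll_until_final(
    get_status: Callable[[], str],
    max_wait: float = MAX_RUNNER_POLLING_WAIT_SECONDS,
    interval: float = RUNNER_POLLING_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status() until it returns a final status or max_wait elapses."""
    deadline = clock() + max_wait
    while True:
        status = get_status()
        if status in FINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"container still '{status}' after {max_wait}s")
        sleep(interval)
```

Checking the status before the deadline means a container that finishes exactly at the limit is still reported as finished rather than timed out.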
To prevent race conditions during container startup, the system implements a grace period for NOT_FOUND status.
Problem: When ToolSandboxHelper starts polling immediately after dispatching container creation to the runner, the container might not exist yet in Docker's registry. Without a grace period, this would be treated as immediate failure.
Solution: A grace period of 40 seconds (configurable via POLL_NOT_FOUND_GRACE_PERIOD), defined at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:39.
Grace Period Logic:
Duplicate Detection: During the grace period, the system checks whether another worker has already completed the file by querying FileExecutionStatusTracker at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:167-193.
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:123-258
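One way to picture the NOT_FOUND handling is as a small decision function. The sketch below is illustrative only: NotFoundAction, handle_not_found, and the completed_elsewhere callback (standing in for the FileExecutionStatusTracker query) are hypothetical names, and the exact ordering of checks in the real helper may differ.

```python
from enum import Enum
from typing import Callable

POLL_NOT_FOUND_GRACE_PERIOD = 40  # seconds, default described above


class NotFoundAction(Enum):
    KEEP_POLLING = "keep_polling"      # container may not have been created yet
    SKIP_DUPLICATE = "skip_duplicate"  # another worker already completed the file
    FAIL = "fail"                      # grace period over: treat NOT_FOUND as final


def handle_not_found(
    elapsed_seconds: float,
    completed_elsewhere: Callable[[], bool],
    grace_period: float = POLL_NOT_FOUND_GRACE_PERIOD,
) -> NotFoundAction:
    """Decide what to do when the runner reports NOT_FOUND."""
    if elapsed_seconds >= grace_period:
        return NotFoundAction.FAIL
    # Within the grace period, check for a duplicate before polling again.
    if completed_elsewhere():
        return NotFoundAction.SKIP_DUPLICATE
    return NotFoundAction.KEEP_POLLING
```

Passing the duplicate check as a callback means the tracker is only queried while the grace period is still open.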
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:123-258, unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:708-765
Errors are captured at multiple levels and propagated upward:
Level 1: Tool Container
- Writes ToolExecutionStatusData with status=FAILED and an error message to Redis via ToolExecutionTracker before exit
Level 2: ToolSandboxHelper
- Propagates the error as RunnerContainerRunResponse.error
- Sets the FileExecutionStatusTracker stage status to FAILED
Level 3: Worker
- Receives the error from ToolSandboxHelper.call_tool_handler()
- Creates a FileExecutionResult with the error at backend/workflow_manager/endpoint_v2/dto.py:122-155
- Sets WorkflowFileExecution.error and status=FAILURE
Level 4: WorkflowExecution
- Sets WorkflowExecution.status to FAILURE if any files failed
- Sets WorkflowExecution.error to a summary message
Error Propagation Flow:
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:708-765, backend/workflow_manager/endpoint_v2/dto.py:122-155
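The last step of this chain, rolling per-file errors up into a workflow-level status, might look roughly like the following. The FileExecutionResult fields, the summarize_execution helper, and the wording of the summary message are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileExecutionResult:
    # Simplified stand-in; the real DTO's fields are not listed on this page.
    file_path: str
    error: Optional[str] = None


def summarize_execution(results: list[FileExecutionResult]) -> tuple[str, Optional[str]]:
    """Return (workflow status, summary error) from per-file results."""
    failed = [result for result in results if result.error]
    if not failed:
        return "SUCCESS", None
    # Any failed file marks the whole workflow execution as FAILURE.
    summary = f"{len(failed)} of {len(results)} files failed"
    return "FAILURE", summary
```

A call such as summarize_execution([FileExecutionResult("a.pdf"), FileExecutionResult("b.pdf", error="tool crashed")]) yields a FAILURE status with a one-line summary, while the detailed message stays on the individual file record.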
ToolsUtils.run_tool_with_retry() implements retry logic with a maximum of 3 attempts (configurable via ToolExecution.MAXIMUM_RETRY) at unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py:185-213.
Retry Strategy:
- Retries on any error other than ToolNotFoundInRegistryError
Non-Retriable Errors:
- ToolNotFoundInRegistryError - Tool image not in registry, retrying won't help at unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py:199-202
Retry Counter: Passed to ToolSandboxHelper.call_tool_handler() and used in container naming at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:637-650.
Sources: unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py:185-213, unstract/workflow-execution/src/unstract/workflow_execution/constants.py:6-11
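A simplified sketch of a bounded retry loop with one non-retriable error type. It is not the code of ToolsUtils.run_tool_with_retry(): the run_tool callable is hypothetical, the exception class is redefined locally so the example is self-contained, and numbering attempts from 1 is an assumption.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

MAXIMUM_RETRY = 3  # maximum number of attempts described above


class ToolNotFoundInRegistryError(Exception):
    """Local stand-in for the project's exception of the same name."""


def run_tool_with_retry(run_tool: Callable[[int], T], max_retries: int = MAXIMUM_RETRY) -> T:
    """Call run_tool(retry_count) until it succeeds or attempts are exhausted."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error: Optional[Exception] = None
    for retry_count in range(1, max_retries + 1):
        try:
            # The counter is handed to the tool runner, e.g. for container naming.
            return run_tool(retry_count)
        except ToolNotFoundInRegistryError:
            raise  # a missing image will not appear on retry
        except Exception as exc:
            last_error = exc
    assert last_error is not None
    raise last_error
```

Re-raising the last error keeps the original failure visible to the caller instead of wrapping it in a generic retry exception.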
SourceConnector._is_new_file() checks FileHistory to prevent reprocessing completed files at backend/workflow_manager/endpoint_v2/source.py:805-848.
Deduplication Logic:
Identifier Priority:
- provider_file_uuid - Preferred (stable across content changes)
- file_hash - Fallback (changes with content)
Sources: backend/workflow_manager/endpoint_v2/source.py:805-848
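The identifier priority can be sketched as a lookup that prefers provider_file_uuid and falls back to file_hash. This is a hedged illustration: select_identifier and is_new_file are hypothetical helpers, the set of completed identifiers stands in for a FileHistory query, and treating a file with no identifier as new is an assumption.

```python
from typing import Optional


def select_identifier(provider_file_uuid: Optional[str],
                      file_hash: Optional[str]) -> Optional[tuple[str, str]]:
    """Pick the deduplication key: provider_file_uuid first, then file_hash."""
    if provider_file_uuid:
        return ("provider_file_uuid", provider_file_uuid)
    if file_hash:
        return ("file_hash", file_hash)
    return None


def is_new_file(completed: set[tuple[str, str]],
                provider_file_uuid: Optional[str],
                file_hash: Optional[str]) -> bool:
    """Return True when no completed history entry matches the file."""
    identifier = select_identifier(provider_file_uuid, file_hash)
    if identifier is None:
        return True  # nothing to match against; treated as new in this sketch
    return identifier not in completed
```

Keeping the field name in the key prevents a hash value from accidentally matching a provider identifier with the same text.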
To prevent multiple workers from processing the same file concurrently, SourceConnector._is_file_being_processed() queries active executions at backend/workflow_manager/endpoint_v2/source.py:558-611.
Check Algorithm:
Skip-Processing Statuses: Defined in ExecutionStatus.get_skip_processing_statuses():
- EXECUTING - Currently being processed
- SUCCESS - Already completed successfully
- FAILURE and ERROR are not skip-processing statuses (allow retry)
The query is at backend/workflow_manager/endpoint_v2/source.py:613-649.
Sources: backend/workflow_manager/endpoint_v2/source.py:558-661
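A rough in-memory version of the active-execution check, using only the two skip-processing statuses listed above. The FileExecutionRecord type, the list of records standing in for a database query, and the choice to ignore records from the current execution are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ExecutionStatus(Enum):
    PENDING = "PENDING"
    EXECUTING = "EXECUTING"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    ERROR = "ERROR"

    @classmethod
    def get_skip_processing_statuses(cls) -> list["ExecutionStatus"]:
        # Only the statuses named on this page; FAILURE and ERROR allow retry.
        return [cls.EXECUTING, cls.SUCCESS]


@dataclass
class FileExecutionRecord:
    # Hypothetical, simplified view of a WorkflowFileExecution row.
    execution_id: str
    status: ExecutionStatus
    provider_file_uuid: Optional[str] = None
    file_hash: Optional[str] = None


def is_file_being_processed(records: list[FileExecutionRecord], current_execution_id: str,
                            provider_file_uuid: Optional[str], file_hash: Optional[str]) -> bool:
    """Return True if another execution already holds this file in a skip status."""
    skip = ExecutionStatus.get_skip_processing_statuses()
    for record in records:
        if record.execution_id == current_execution_id or record.status not in skip:
            continue
        same_uuid = bool(provider_file_uuid) and record.provider_file_uuid == provider_file_uuid
        same_hash = bool(file_hash) and record.file_hash == file_hash
        if same_uuid or same_hash:
            return True
    return False
```

In a database-backed version the same filter would typically be a single indexed query on file_hash and provider_file_uuid rather than a Python loop.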
Sources: backend/workflow_manager/endpoint_v2/source.py:662-690
Initialization Phase:
1. WorkflowExecution is created (status: PENDING)
2. WorkflowFileExecution is created for each file (status: PENDING)
3. ToolSandboxHelper.call_tool_handler() creates FileExecutionData in Redis (stage: INITIALIZATION)
Source Phase:
4. Worker copies file to execution volume
5. Worker updates stage to SOURCE in Redis
6. Worker creates metadata file
Tool Execution Phase:
7. UnstractRunner.run_container() updates stage to TOOL_EXECUTION at runner/src/unstract/runner/runner.py:450-463
8. UnstractRunner stores tool_container_name in Redis for cleanup
9. Tool container runs
10. Tool container writes ToolExecutionStatusData to Redis on exit
11. ToolSandboxHelper.poll_tool_status() detects container exit
12. ToolSandboxHelper reads tool status from ToolExecutionTracker
13. ToolSandboxHelper updates stage status to SUCCESS/FAILED
Finalization Phase:
14. Worker updates stage to FINALIZATION in Redis
15. Worker routes output to destination
16. Worker updates WorkflowFileExecution (status: SUCCESS/FAILURE)
17. Worker creates/updates FileHistory with completion status
Completion Phase:
18. Callback aggregates all file results
19. Callback updates WorkflowExecution (status: SUCCESS/FAILURE)
20. Post-save signal releases API rate limit slots (if API deployment)
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:260-360, runner/src/unstract/runner/runner.py:404-574, backend/workflow_manager/endpoint_v2/source.py:805-848
Redis TTLs:
- FileExecutionStatusTracker: 24 hours (allows late queries for debugging)
- ToolExecutionTracker: 1 hour initially, reduced after read at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:762-765
Container Cleanup:
- ToolSandboxHelper.cleanup_tool_container() runs after polling completes
- Cleanup also runs in a finally block if exceptions occur at unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:422-431
File Execution Directory Cleanup:
- Handled by ExecutionFileHandler.delete_file_execution_directory() at unstract/workflow-execution/src/unstract/workflow_execution/execution_file_handler.py:267-294
Sources: unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:422-431, unstract/tool-sandbox/src/unstract/tool_sandbox/helper.py:762-765, unstract/workflow-execution/src/unstract/workflow_execution/execution_file_handler.py:267-294
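The container cleanup guarantee amounts to a try/finally around polling. The sketch below shows the pattern only; run_tool_and_cleanup and its two callables are hypothetical and do not mirror the real method signatures.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_tool_and_cleanup(poll: Callable[[], T], cleanup: Callable[[], None]) -> T:
    """Run polling and always clean up the container, even if polling raises."""
    try:
        return poll()
    finally:
        # Executed on success, on timeout, and on any other exception.
        cleanup()
```

Because the cleanup runs in the finally clause, the original exception from polling still propagates to the caller after the container has been removed.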