@ShuhangGe ShuhangGe commented Dec 20, 2025

This commit fixes a concurrency bug in the Qwen VL video processor where
multiple concurrent requests decoding different videos would cause crashes.

Problem

When multiple async requests process different videos simultaneously,
the decord library's VideoReader.get_batch() call fails with:

DECORDError: Check failed: avcodec_send_packet(dec_ctx_.get(), pkt.get()) >= 0 (-11 vs. 0)
Thread worker: Error sending packet.

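Error code -11 corresponds to EAGAIN. Because the failure shows up when
several videos are decoded at once, it points to contention in the
threaded FFmpeg decoder that decord drives, a path that does not appear
to be safe for concurrent multi-file access.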

Root Cause

The decord library uses FFmpeg internally for video decoding. When
multiple VideoReader instances decode different video files at the same
time, FFmpeg's internal thread workers collide, causing the EAGAIN error.
This is a known issue with decord, caused by conflicts in its underlying
C libraries that are difficult to resolve at the library level.

Solution

Add an asyncio.Semaphore(1) to serialize the get_batch() call, ensuring
only one video decoding operation runs at a time.
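
A minimal sketch of this pattern, not the code in this PR: the class name
SerializedDecoder and the use of asyncio.to_thread are assumptions made for
illustration, and decode_fn stands in for a blocking call such as
VideoReader.get_batch().

```python
import asyncio


class SerializedDecoder:
    """Hypothetical helper: runs a blocking decode call one at a time."""

    def __init__(self) -> None:
        # Created lazily so the semaphore is built inside the running loop.
        self._sem = None

    async def run(self, decode_fn, *args):
        if self._sem is None:
            self._sem = asyncio.Semaphore(1)
        # Only one coroutine at a time gets past this point; the others
        # wait here without blocking the event loop.
        async with self._sem:
            # The blocking decode runs in a worker thread while the
            # semaphore is held, so decodes never overlap.
            return await asyncio.to_thread(decode_fn, *args)
```

A caller would share one SerializedDecoder across requests and write
something like `frames = await decoder.run(vr.get_batch, indices)`.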

Note: This is a Quick Fix

This is a minimal fix to quickly resolve the crash. The semaphore
serializes video decoding, so requests that decode different videos
concurrently now wait on one another, which reduces throughput under
that kind of load.

Future Improvement: OpenCV-based Decoding

A more comprehensive solution would be to use OpenCV (cv2.VideoCapture)
as the primary video decoder with decord as a fallback. This approach
is being adopted by QwenLM/Qwen3-VL in PR #1078, which:

  1. Uses OpenCV as the primary decoder (more thread-safe and compatible)
  2. Adds a timeout mechanism for decord to prevent hanging
  3. Falls back to decord only when OpenCV fails
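
As an illustration of the timeout idea in item 2, here is one generic way to
bound a blocking call. This is a sketch under assumptions, not the mechanism
used in PR #1078: call_with_timeout is a hypothetical helper, and fn stands
in for any blocking decode call.

```python
import concurrent.futures


def call_with_timeout(fn, timeout_s, *args):
    """Run fn(*args) in a worker thread and wait at most timeout_s seconds.

    Raises concurrent.futures.TimeoutError if the call does not finish in
    time. The worker thread itself cannot be killed; it keeps running in
    the background until fn returns.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    finally:
        # Do not block on a possibly hung worker when giving up.
        pool.shutdown(wait=False)
```

The caller stops waiting after the timeout, but a truly hung decoder thread
is only abandoned, not terminated. A separate process would be needed to
reclaim it.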

Implementing this in SGLang would require the following changes:

  1. Modify load_video() in sglang/srt/utils/common.py:

    • Keep temporary video files alive (currently deleted immediately)
    • Attach the file path to the VideoReader object for later use
  2. Modify preprocess_video() in qwen_vl.py:

    • Use OpenCV as the primary decoder:
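      import cv2
      import numpy as np

      def _decode_with_opencv(video_path: str, frame_indices: np.ndarray) -> np.ndarray:
          cap = cv2.VideoCapture(video_path)
          frames = []
          try:
              for idx in frame_indices:
                  # Seek to the requested frame, then decode it.
                  cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
                  ret, frame = cap.read()
                  if not ret:
                      # Fail loudly so the caller can fall back to decord.
                      raise RuntimeError(f"OpenCV failed to read frame {idx}")
                  # OpenCV returns BGR; convert to RGB.
                  frames.append(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
          finally:
              cap.release()
          return np.stack(frames)
    • Fall back to decord with semaphore protection if OpenCV fails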
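
The primary-then-fallback flow in the steps above could be sketched roughly
as follows. FallbackDecoder is a hypothetical name, and the two decoders are
passed in as plain callables (for example an OpenCV-based function and a
decord-based one), so nothing here depends on either library's API.

```python
import asyncio
from typing import Any, Callable, Optional, Sequence

# A decoder takes a video path and frame indices and returns the frames.
Decoder = Callable[[str, Sequence[int]], Any]


class FallbackDecoder:
    """Hypothetical helper: try the primary decoder, then a serialized fallback."""

    def __init__(self, primary: Decoder, fallback: Decoder) -> None:
        self._primary = primary
        self._fallback = fallback
        self._fallback_sem: Optional[asyncio.Semaphore] = None

    async def decode(self, video_path: str, frame_indices: Sequence[int]) -> Any:
        try:
            # The primary decoder runs without a lock, so calls can overlap.
            return await asyncio.to_thread(self._primary, video_path, frame_indices)
        except Exception:
            # Any primary failure triggers the fallback, one call at a time.
            if self._fallback_sem is None:
                self._fallback_sem = asyncio.Semaphore(1)
            async with self._fallback_sem:
                return await asyncio.to_thread(
                    self._fallback, video_path, frame_indices
                )
```

Catching every exception from the primary decoder keeps the sketch short. A
real implementation would likely narrow this to decode errors and log the
failure before falling back.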

This would allow full parallel video decoding while maintaining reliability.
Happy to submit a follow-up PR for the OpenCV implementation.
