@cfsmp3 (Contributor) commented Dec 21, 2025

Summary

  • When processing DVB subtitles from live streams or corrupted files, bitmap clipping can fail
  • Previously this caused a fatal crash: "Failed to perform OCR - Failed to get text"
  • Now gracefully skips problematic frames instead of crashing

Root Cause

The code path was:

  1. pixClipRectangle fails → "box outside rectangle" error
  2. pixConvertRGBToGray receives NULL → "pixs not defined" error
  3. cpix_gs is NULL, but code continues to line 415
  4. TessBaseAPIGetUTF8Text returns NULL (no image was set)
  5. fatal() is called → crash

Fix

  • Handle cpix_gs == NULL by cleaning up and returning NULL (skip bitmap)
  • Change TessBaseAPIGetUTF8Text NULL case from fatal error to graceful skip
  • Both cases properly clean up allocated resources before returning
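
The guard-and-clean-up pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual CCExtractor code: `image_t`, `clip_image`, `to_gray`, `recognize`, and `ocr_bitmap_sketch` are hypothetical stand-ins for the Leptonica and Tesseract calls named in the root cause, and the "too small to contain text" rule is an assumption made only so that the second NULL case can be exercised.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an image object: only a size. */
typedef struct { int w, h; } image_t;

/* Stand-in for the clipping step: NULL when the box does not overlap the image. */
static image_t *clip_image(const image_t *src, int x, int y, int w, int h)
{
    if (w <= 0 || h <= 0 || x >= src->w || y >= src->h || x + w <= 0 || y + h <= 0)
        return NULL;
    int x0 = x < 0 ? 0 : x, y0 = y < 0 ? 0 : y;
    int x1 = x + w > src->w ? src->w : x + w;
    int y1 = y + h > src->h ? src->h : y + h;
    image_t *out = malloc(sizeof *out);
    if (out) { out->w = x1 - x0; out->h = y1 - y0; }
    return out;
}

/* Stand-in for the grayscale conversion: a NULL input yields a NULL output. */
static image_t *to_gray(const image_t *src)
{
    if (src == NULL)
        return NULL;
    image_t *out = malloc(sizeof *out);
    if (out) *out = *src;
    return out;
}

/* Stand-in for text recognition: NULL when there is nothing usable to read
 * (assumed here: fewer than 4 pixels). Otherwise a heap-allocated string. */
static char *recognize(const image_t *img)
{
    static const char sample[] = "sample text";
    if (img == NULL || img->w * img->h < 4)
        return NULL;
    char *text = malloc(sizeof sample);
    if (text) memcpy(text, sample, sizeof sample);
    return text;
}

/* Returns recognized text (caller frees), or NULL meaning "skip this bitmap". */
char *ocr_bitmap_sketch(const image_t *src, int x, int y, int w, int h)
{
    assert(src != NULL);
    image_t *clipped = clip_image(src, x, y, w, h);
    image_t *gray = to_gray(clipped); /* NULL if clipping failed */

    /* Guard 1: no usable image, so clean up and skip instead of continuing. */
    if (gray == NULL) {
        fprintf(stderr, "OCR sketch: could not prepare bitmap, skipping\n");
        free(clipped); /* free(NULL) is a no-op */
        return NULL;
    }

    /* Guard 2: recognition returned nothing; log and skip rather than abort. */
    char *text = recognize(gray);
    if (text == NULL)
        fprintf(stderr, "OCR sketch: no text returned, skipping\n");

    /* Both the success path and the skip path release the intermediates. */
    free(gray);
    free(clipped);
    return text;
}
```

In this sketch a NULL return is the only signal the caller needs: every exit path has already released its intermediate allocations, so skipping a bitmap leaks nothing.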

Testing

This fix allows CCExtractor to continue processing even when individual subtitle frames are corrupted, which is critical for live streams where packet loss can occur.
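
On the caller side, skipping a corrupted frame might look like the loop below. This is a hedged sketch under stated assumptions: `frame_t`, `ocr_frame`, and `process_frames` are hypothetical names, and the `corrupt` flag simply simulates a frame whose bitmap could not be prepared. It does not reproduce CCExtractor's real subtitle pipeline.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical frame record; `corrupt` simulates a damaged subtitle bitmap. */
typedef struct { const char *text; int corrupt; } frame_t;

/* Hypothetical OCR entry point: heap-allocated text, or NULL for "skip this frame". */
static char *ocr_frame(const frame_t *f)
{
    if (f->corrupt)
        return NULL;
    size_t n = strlen(f->text) + 1;
    char *copy = malloc(n);
    if (copy) memcpy(copy, f->text, n);
    return copy;
}

/* Writes the text of every readable frame to `out` and returns how many
 * frames were skipped. A NULL result never stops the loop. */
int process_frames(const frame_t *frames, size_t count, FILE *out)
{
    assert(frames != NULL || count == 0);
    int skipped = 0;
    for (size_t i = 0; i < count; i++) {
        char *text = ocr_frame(&frames[i]);
        if (text == NULL) {
            skipped++; /* corrupted frame: note it and move on */
            continue;
        }
        fprintf(out, "%s\n", text);
        free(text);
    }
    return skipped;
}
```

Counting skipped frames instead of aborting keeps the remaining subtitles flowing, which is the behavior the fix aims for on lossy live streams.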

Fixes #1010

🤖 Generated with Claude Code

When processing DVB subtitles from live streams or corrupted files,
the bitmap clipping operation can fail, resulting in a NULL pix object.
Previously, this would cause a fatal crash with "Failed to perform OCR -
Failed to get text" because the code continued to call TessBaseAPIGetUTF8Text
even when no image was set.

Changes:
- Handle cpix_gs == NULL by logging a message and returning NULL
  (skip this bitmap) instead of continuing and crashing
- Change the fatal error when TessBaseAPIGetUTF8Text returns NULL
  to a non-fatal skip, since this can happen with empty/invalid bitmaps
- Both cases now properly clean up allocated resources before returning

This allows CCExtractor to gracefully skip problematic subtitle frames
instead of crashing, which is especially important for live streams
where packet loss or discontinuities can occur.

Fixes #1010

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@ccextractor-bot (Collaborator)

CCExtractor CI platform finished running the test files on Windows. Below is a summary of the test results, compared to the tests for commit e3b0def...:
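
| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 14/14 |
| DVB | 6/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 24/27 |
| Hardsubx | 1/1 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 80/86 |
| Teletext | 21/21 |
| WTV | 13/13 |
| XDS | 34/34 |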

Your PR breaks these cases:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2...
  • ccextractor --autoprogram --out=ttxt --latin1 1974a299f0...
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --out=spupng c83f765c66...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

Not all tests passed. This indicates that the output for some files differs from what was expected (though the new output may be what you intended).

Check the result page for more info.

Development

Successfully merging this pull request may close these issues.

[BUG] Failing to extract DVB subtitles from live stream (Failed to perform OCR)
