Performance Improvements #55
Hey, great job! The unsilence time calculation doesn't factor in certain settings, so a minimal discrepancy is acceptable. As long as the final output file's duration is above or equal to the predicted time, everything should work; a shorter duration would most likely indicate a problem, but that is not the case here.

As far as I remember, the current unsilence implementation does not do any re-encoding, which is good for speed but overall seems to be the cause of many errors (see e.g. #37). Since no re-encoding happens, there is a chance a video segment is corrupted, because it was not re-encoded and therefore not cleaned up beforehand. That's why the current implementation has two methods to deal with these problems: dci (drop corrupted intervals), which removes any interval that cannot be processed (sped up, slowed down, ...), and the default way, which, in case a corrupted segment appears, just keeps that segment (without any editing) and includes it in the final video merge (where all separate segments are merged together). I think this happened to you, since the runtime usually gets longer if segments are not processed. Therefore, your solution is probably the better one, if I'm judging this correctly.

I wanted to move to re-encoding, but I haven't figured out a way to do it without increasing the overall rendering time. If you could include this process in your PR (or, as it seems, you already re-encode some parts or all of the video), that would be awesome and would fix multiple bugs. I'm looking forward to your PR!
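The two failure policies described above can be sketched as follows. This is a minimal illustration, not unsilence's actual code: the names `process_intervals`, `render`, and `fake_render` are hypothetical.

```python
def process_intervals(intervals, render, drop_corrupted=False):
    """Try to render each interval; on failure either drop it (the dci
    behaviour) or keep the raw, unedited segment for the final merge."""
    result = []
    for interval in intervals:
        try:
            result.append(render(interval))
        except RuntimeError:
            if not drop_corrupted:
                result.append(interval)  # keep the segment without editing
    return result

# Demo: pretend interval 2 is corrupted and cannot be processed.
def fake_render(interval):
    if interval == 2:
        raise RuntimeError("corrupted segment")
    return interval * 10

kept = process_intervals([1, 2, 3], fake_render)           # default policy
dropped = process_intervals([1, 2, 3], fake_render, True)  # dci policy
```

With the default policy the corrupted interval survives unedited, which is why the output can end up longer than predicted.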
If I understand correctly, re-encoding always happens when you apply a filter, so each interval (or chunk of 32 intervals in my version) is encoded separately. The final concatenation indeed does not do any re-encoding, but a simple stream copy. I think the corruption issue might be caused by missing keyframes in the part cut out for an interval (https://stackoverflow.com/questions/21420296/how-to-extract-time-accurate-video-segments-with-ffmpeg). I did not change that, so my modifications probably will not solve these issues, but I could take a look at including this in my PR.

Now, we could try to encode each interval with a very simple, very fast codec (or maybe even some uncompressed format, although we should take care we don't replace a CPU bottleneck with an IO bottleneck). We could then do the final, proper encode of the entire file during the concatenation. I would expect this doesn't increase render times too much.
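A rough sketch of that two-stage idea, as command builders. The exact flags, file names, and codec/preset choices here are illustrative assumptions, not what unsilence currently passes:

```python
def interval_cmd(src, start, end, out, speed=1.0):
    """ffmpeg call for one interval, using a fast intermediate encode."""
    return [
        "ffmpeg", "-y",
        "-ss", str(start), "-to", str(end), "-i", src,
        "-filter:v", f"setpts={1 / speed}*PTS",   # speed up video
        "-filter:a", f"atempo={speed}",            # keep audio in sync
        "-c:v", "libx264", "-preset", "ultrafast", "-crf", "18",
        out,
    ]

def concat_cmd(list_file, out):
    """Final merge: one proper encode instead of a plain stream copy."""
    return [
        "ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", list_file,
        "-c:v", "libx264", "-preset", "fast",
        out,
    ]
```

The intermediate files are large but quick to produce; the single final encode then pays the quality/size cost only once.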
If you could include a fix for that problem in your PR, that would be awesome! I also like your idea of individual interval encoding, so you can try implementing it and then we can look at the performance difference. Once you are ready/satisfied with your code, open a PR and I'll look over and test the improvements.
Hmm, after looking at the ffmpeg documentation some more I found this:
So missing keyframes are not the problem (we are transcoding). I tried running my version on the file linked in #37; I got a 54-hour file which actually played. It seems like the file is correctly sped up, but something goes wrong during concatenation: on playback, the timestamp will sometimes jump ~20 minutes forward, which probably explains the 54 hours.
Describe the solution you'd like
I've been tinkering with the MediaRenderer code and noticed some room for performance improvements.
First off, I combined the rendering of multiple intervals in a single ffmpeg call, reducing process creation overhead.
The larger the filtergraph gets, though, the slower ffmpeg becomes. The sweet spot seems to be around 32 intervals per ffmpeg process. With this, ffmpeg itself already does a bit of multithreading, but not enough to fully utilize my CPU.
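The batching above can be sketched like this. The real filtergraph my version builds differs in detail; this only shows the shape, video side only (audio would mirror it with atrim/asetpts/atempo), and the chunk size of 32 is an empirical assumption:

```python
CHUNK_SIZE = 32  # empirical sweet spot on my machine; worth making configurable

def chunked(intervals, size=CHUNK_SIZE):
    """Split the interval list into groups, one ffmpeg process per group."""
    return [intervals[i:i + size] for i in range(0, len(intervals), size)]

def filtergraph(intervals):
    """Build one trim+concat filtergraph covering a whole chunk."""
    parts, labels = [], []
    for n, (start, end) in enumerate(intervals):
        parts.append(f"[0:v]trim=start={start}:end={end},setpts=PTS-STARTPTS[v{n}]")
        labels.append(f"[v{n}]")
    parts.append(f"{''.join(labels)}concat=n={len(intervals)}:v=1:a=0[out]")
    return ";".join(parts)
```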
The Popen interface already runs the subprocess in a non-blocking fashion. To run multiple ffmpeg instances simultaneously, we thus don't need multiple threads in the Python script.
On my (8 core, 16 threads) machine, the optimal setting seems to run 4 ffmpeg processes with access to 4 cores each. We should probably make this configurable.
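A minimal sketch of that process pool, assuming each command is an ffmpeg invocation already limited with something like `-threads 4`. The function name and polling approach are my own, not part of unsilence:

```python
import subprocess
import sys
import time

def run_parallel(commands, max_procs=4):
    """Run commands with at most max_procs alive at once.

    Popen is non-blocking, so a simple poll loop suffices; no Python
    threads are needed.  Returns exit codes in the original order.
    """
    pending = list(enumerate(commands))
    running = {}  # index -> Popen handle
    codes = [None] * len(commands)
    while pending or running:
        # Top up the pool.
        while pending and len(running) < max_procs:
            idx, cmd = pending.pop(0)
            running[idx] = subprocess.Popen(cmd)
        # Reap any finished process.
        for idx, proc in list(running.items()):
            if proc.poll() is not None:
                codes[idx] = proc.returncode
                del running[idx]
        time.sleep(0.05)  # avoid a hot busy-wait
    return codes

# Demo with harmless commands standing in for the ffmpeg calls.
codes = run_parallel([[sys.executable, "-c", "pass"]] * 6, max_procs=2)
```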
Finally, we can speed up encoding quite a bit by passing
-c:v libx264 -preset fast -profile:v baseline
to ffmpeg.
Tests
Implementing all these things made unsilencing a 1.5 hour, 300 MB video take about 220 seconds rather than 400, almost a 2x speedup. Unsilencing a smaller, 5 minute video sped up ~1.5 times.
The resulting video, however, is smaller and shorter than when using the original unsilence tool. Smaller can be explained by better compression when processing larger segments, but shorter seems bad. Taking the same 1:32:52 file as before, the CLI tells me the result should be 1:18:01 (both in the original and after my modifications); the original produces a 1:20:25 file and the modified version a 1:18:12 file. There are no obvious faults in either (they start and end the same), but I haven't been able to do a full side-by-side comparison yet.
TL;DR: I made some performance enhancements. The results seem promising, but I'm not sure if the modified version is still correct. I plan to make a PR once I've cleaned it up, so any comments are welcome.