Performance Improvements #55
Hey, great job! The unsilence time calculation doesn't factor in certain settings, so a minimal discrepancy is acceptable. As long as the final output file's duration is above or equal to the predicted time, everything should work; a shorter duration would most likely indicate a problem, but that is not the case here.

As far as I remember, the current unsilence implementation does not do any re-encoding, which is good for speed but overall seems to be the cause of many errors (see e.g. #37). Since no re-encoding happens, there is a chance a video segment is corrupted, because it was not re-encoded and therefore not cleaned up beforehand. That's why the current implementation has two methods to deal with these problems: dci (drop corrupted intervals), which removes any interval that cannot be processed (sped up, slowed down, ...), and the default way, which, in case a corrupted segment appears, just keeps that segment (without any editing) and includes it in the final video merge (where all separate segments are merged together). I think this happened to you, since the runtime usually gets longer if segments are not processed. Therefore, your solution is probably the better one, if I'm judging this correctly.

I wanted to move to re-encoding, but I haven't figured out a way to do it without increasing the overall rendering time. If you could include this process in your PR (or, as it seems, you already re-encode some parts or all of the video), that would be awesome and would fix multiple bugs. I'm looking forward to your PR!
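The two failure policies described above can be sketched as follows. This is a minimal illustration, not unsilence's actual code: the names `process_intervals`, `render`, and `fake_render` are hypothetical.

```python
def process_intervals(intervals, render, drop_corrupted=False):
    """Try to render each interval; on failure either drop it (the dci
    behaviour) or keep the raw, unedited segment for the final merge."""
    result = []
    for interval in intervals:
        try:
            result.append(render(interval))
        except RuntimeError:
            if not drop_corrupted:
                result.append(interval)  # keep the segment without editing
    return result

# Demo: pretend interval 2 is corrupted and cannot be processed.
def fake_render(interval):
    if interval == 2:
        raise RuntimeError("corrupted segment")
    return interval * 10

kept = process_intervals([1, 2, 3], fake_render)           # default policy
dropped = process_intervals([1, 2, 3], fake_render, True)  # dci policy
```

With the default policy the corrupted interval survives unedited, which is why the output can end up longer than predicted.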
If I understand correctly, re-encoding always happens when you apply a filter, so each interval (or chunk of 32 intervals in my version) is encoded separately. The final concatenation indeed does not do any re-encoding, but a simple stream copy. I think the corruption issue might be caused by missing keyframes in the part cut out for an interval (https://stackoverflow.com/questions/21420296/how-to-extract-time-accurate-video-segments-with-ffmpeg). I did not change that, so my modifications probably will not solve these issues, but I could take a look at including this in my PR.

Now, we could try to encode each interval with a very simple, very fast codec (or maybe even some uncompressed format, although we should take care we don't replace a CPU bottleneck with an IO bottleneck). We could then do the final, proper encode of the entire file during the concatenation. I would expect this doesn't increase render times too much.
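A rough sketch of that two-stage idea, as command builders. The exact flags, file names, and codec/preset choices here are illustrative assumptions, not what unsilence currently passes:

```python
def interval_cmd(src, start, end, out, speed=1.0):
    """ffmpeg call for one interval, using a fast intermediate encode."""
    return [
        "ffmpeg", "-y",
        "-ss", str(start), "-to", str(end), "-i", src,
        "-filter:v", f"setpts={1 / speed}*PTS",   # speed up video
        "-filter:a", f"atempo={speed}",            # keep audio in sync
        "-c:v", "libx264", "-preset", "ultrafast", "-crf", "18",
        out,
    ]

def concat_cmd(list_file, out):
    """Final merge: one proper encode instead of a plain stream copy."""
    return [
        "ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", list_file,
        "-c:v", "libx264", "-preset", "fast",
        out,
    ]
```

The intermediate files are large but quick to produce; the single final encode then pays the quality/size cost only once.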
If you could include a fix for that problem in your PR, that would be awesome! I also like your idea of individual interval encoding, so you can try implementing it and then we can look at the performance difference. Once you are ready/satisfied with your code, open a PR and I'll look over and test the improvements.
Hmm, after looking at the ffmpeg documentation some more I found this:
So missing keyframes are not the problem (we are transcoding). I tried running my version on the file linked in #37; I got a 54-hour file which actually played. It seems like the file is correctly sped up, but something goes wrong during concatenation: on playback, the timestamp will sometimes jump ~20 minutes forward, which probably explains the 54 hours.
Describe the solution you'd like
I've been tinkering with the MediaRenderer code and noticed some room for performance improvements.
First off, I combined the rendering of multiple intervals in a single ffmpeg call, reducing process creation overhead.
The larger the filtergraph gets, though, the slower ffmpeg becomes. The sweet spot seems to be around 32 intervals per ffmpeg process. With this, ffmpeg itself already does a bit of multithreading, but not enough to fully utilize my CPU.
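The batching above can be sketched like this. The real filtergraph my version builds differs in detail; this only shows the shape, video side only (audio would mirror it with atrim/asetpts/atempo), and the chunk size of 32 is an empirical assumption:

```python
CHUNK_SIZE = 32  # empirical sweet spot on my machine; worth making configurable

def chunked(intervals, size=CHUNK_SIZE):
    """Split the interval list into groups, one ffmpeg process per group."""
    return [intervals[i:i + size] for i in range(0, len(intervals), size)]

def filtergraph(intervals):
    """Build one trim+concat filtergraph covering a whole chunk."""
    parts, labels = [], []
    for n, (start, end) in enumerate(intervals):
        parts.append(f"[0:v]trim=start={start}:end={end},setpts=PTS-STARTPTS[v{n}]")
        labels.append(f"[v{n}]")
    parts.append(f"{''.join(labels)}concat=n={len(intervals)}:v=1:a=0[out]")
    return ";".join(parts)
```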
The Popen interface already runs the subprocess in a non-blocking fashion. To run multiple ffmpeg instances simultaneously, we thus don't need multiple threads in the Python script.
On my (8 core, 16 threads) machine, the optimal setting seems to run 4 ffmpeg processes with access to 4 cores each. We should probably make this configurable.
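A minimal sketch of that process pool, assuming each command is an ffmpeg invocation already limited with something like `-threads 4`. The function name and polling approach are my own, not part of unsilence:

```python
import subprocess
import sys
import time

def run_parallel(commands, max_procs=4):
    """Run commands with at most max_procs alive at once.

    Popen is non-blocking, so a simple poll loop suffices; no Python
    threads are needed.  Returns exit codes in the original order.
    """
    pending = list(enumerate(commands))
    running = {}  # index -> Popen handle
    codes = [None] * len(commands)
    while pending or running:
        # Top up the pool.
        while pending and len(running) < max_procs:
            idx, cmd = pending.pop(0)
            running[idx] = subprocess.Popen(cmd)
        # Reap any finished process.
        for idx, proc in list(running.items()):
            if proc.poll() is not None:
                codes[idx] = proc.returncode
                del running[idx]
        time.sleep(0.05)  # avoid a hot busy-wait
    return codes

# Demo with harmless commands standing in for the ffmpeg calls.
codes = run_parallel([[sys.executable, "-c", "pass"]] * 6, max_procs=2)
```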
Finally, we can speed up encoding quite a bit by passing
-c:v libx264 -preset fast -profile:v baseline
to ffmpeg.
Tests
Implementing all these things made unsilencing a 1.5 hour, 300 MB video take about 220 seconds rather than 400, almost a 2x speedup. Unsilencing a smaller, 5 minute video sped up ~1.5 times.
The resulting video, however, is smaller and shorter than when using the original unsilence tool. Smaller can be explained by better compression when processing larger segments, but shorter seems bad. Taking the same 1:32:52 file as before, the CLI tells me the result should be 1:18:01 (both in the original and after my modifications); the original produces a 1:20:25 file and the modified version a 1:18:12 file. There are no obvious faults in either (they start and end the same), but I haven't been able to do a full side-by-side comparison yet.
TL;DR: I made some performance enhancements. The results seem promising, but I'm not sure if the modified version is still correct. I plan to make a PR once I've cleaned it up, so any comments are welcome.