Merge pull request #60 from pozitronik/issue_59

pozitronik · web-flow · commit 56c32ee7dff5 · 2023-08-20T14:46:32.000+04:00
Issue 59
diff --git a/README.md b/README.md
@@ -81,8 +81,6 @@ Swap all faces on the `d:\videos\not_a_porn.mp4` video file to the face from `d:
 python sin.py --source="d:\pictures\any_picture.jpg" --target="d:\pictures\pngs_dir" --output="d:\pictures\pngs_dir\enhanced" --frame-processor=FaceEnhancer --many-faces --max-memory=24 --execution-provider=cuda --execution-threads=8
 ```
 Enhance all faces in every PNG file in the `d:\pictures\pngs_dir` directory using the `cuda` provider and 8 simultaneous execution threads, with limit of 24 Gb RAM, and save every enhanced image to the `d:\pictures\pngs_dir\enhanced` directory.<br/>
-**Note 1**: only PNG images are supported at the moment.<br/>
-**Note 2**: even if the selected frame processor does not require a `source`, you should provide one at this time.
 
 ## Configuration file
 
@@ -104,12 +102,24 @@ You also can pass path to the custom configuration file as a command line parame
 python sin.py --ini="d:\path\custom.ini"
 ```
 
+## How to handle output video quality, encoding speed, etc.?
+
+In brief, sinner relies on `ffmpeg` almost every time video processing is required, so the full power of `ffmpeg` is available to you. Use the `--ffmpeg_resulting_parameters` key to control how `ffmpeg` encodes the output video: simply pass the usual `ffmpeg` parameters as the value of this key (remember to enclose the value string in quotes). Here are some examples:
+
+* `--ffmpeg_resulting_parameters="-c:v libx264 -preset medium -crf 20 -pix_fmt yuv420p"`: use the software x264 encoder (`-c:v libx264`) with the `medium` preset and a constant rate factor of 20 (`-preset medium` and `-crf 20`), and the `yuv420p` pixel format. This is the default parameter value.
+* `--ffmpeg_resulting_parameters="-c:v h264_nvenc -preset slow -qp 20 -pix_fmt yuv420p"`: use the NVIDIA GPU-accelerated H.264 encoder (`-c:v h264_nvenc`) with good encoding quality (`-preset slow` and `-qp 20`). This encoder is worth using if your GPU supports it.
+* `--ffmpeg_resulting_parameters="-c:v hevc_nvenc -preset slow -qp 20 -pix_fmt yuv420p"`: the same as above, but with H.265 (HEVC) encoding.
+* `--ffmpeg_resulting_parameters="-c:v h264_amf -b:v 2M -pix_fmt yuv420p"`: use the AMD hardware-accelerated H.264 encoder (`-c:v h264_amf`) with a 2 Mbps output video bitrate (`-b:v 2M`). This should work well on AMD GPUs.
+
+And so on. As you can see, every `ffmpeg` encoder has many presets and options, and you can rely on the [documentation](https://ffmpeg.org/ffmpeg-codecs.html) to achieve the desired results.
+
+If `ffmpeg` is not available on your system, sinner will gracefully fall back to the capabilities of the CV2 library. In that case all video processing features should still work, but only in a very basic way: with the software x264 encoder only, which is slow and inefficient.
+
 ## FAQ
 
 :question: What are the differences between sinner and roop?<br/>
 :exclamation: As said before, sinner has started as a fork of roop. They share similar ideas, but they differ in the ways how those ideas should be implemented.
-sinner uses the same ML libraries to perform its magic, but handles them in its own way. From a developer's perspective, it has a better architecture (OOP instead of functional approach),
- stricter types handling and more comprehensive tests. From the point of view of a user, sinner offers additional features that Roop currently lacks.
+sinner uses the same ML libraries to perform its magic, but handles them in its own way. From a developer's perspective, it has a better architecture (OOP instead of a functional approach), stricter type handling and more comprehensive tests. From a user's point of view, sinner offers additional features that roop currently lacks.
 
 :question: Is there a NSWF filter?<br/>
 :exclamation: Nope. I don't care if you will do nasty things with sinner, it's your responsibility. And sinner is just a neutral tool, like a hammer or a knife, it is the responsibility of the user to decide how they want to use it.
diff --git a/sinner/handlers/frame/FFmpegVideoHandler.py b/sinner/handlers/frame/FFmpegVideoHandler.py
@@ -18,6 +18,7 @@ class FFmpegVideoHandler(BaseFrameHandler):
     emoji: str = '🎥'
 
     output_fps: float
+    ffmpeg_resulting_parameters: str
 
     def rules(self) -> Rules:
         return super().rules() + [
@@ -26,6 +27,11 @@ def rules(self) -> Rules:
                 'default': lambda: self.fps,
                 'help': 'FPS of resulting video'
             },
+            {
+                'parameter': ['ffmpeg_resulting_parameters'],
+                'default': '-c:v libx264 -preset medium -crf 20 -pix_fmt yuv420p',
+                'help': 'ffmpeg command-line part to adjust resulting video parameters'
+            },
             {
                 'module_help': 'The video processing module, based on ffmpeg'
             }
@@ -97,8 +103,9 @@ def result(self, from_dir: str, filename: str, audio_target: str | None = None)
         self.update_status(f"Resulting frames from {from_dir} to {filename} with {self.output_fps} FPS")
         filename_length = len(str(self.fc))  # a way to determine frame names length
         Path(os.path.dirname(filename)).mkdir(parents=True, exist_ok=True)
-        command = ['-r', str(self.output_fps), '-i', os.path.join(from_dir, f'%0{filename_length}d.png'), '-c:v', 'h264_nvenc', '-preset', 'medium', '-qp', '18', '-pix_fmt', 'yuv420p', '-vf',
-                   'colorspace=bt709:iall=bt601-6-625:fast=1', filename]
+        command = ['-framerate', str(self.output_fps), '-i', os.path.join(from_dir, f'%0{filename_length}d.png')]
+        command.extend(self.ffmpeg_resulting_parameters.split(' '))
+        command.extend(['-r', str(self.output_fps), filename])
         if audio_target:
             command.extend(['-i', audio_target, '-shortest'])
         return self.run(command)
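
The argument assembly in `result()` can be sketched in isolation as follows. This is a minimal illustration and not the code from this commit: `build_ffmpeg_args` and its parameters are hypothetical names, `shlex.split` is swapped in for `split(' ')` so that repeated spaces or quoted values in `ffmpeg_resulting_parameters` do not produce empty or broken arguments, and the audio input is placed before the output file, following the `ffmpeg` documentation's advice to specify all inputs first.

```python
import os
import shlex
from typing import List, Optional

# Default taken from the 'ffmpeg_resulting_parameters' rule in the diff above.
DEFAULT_PARAMETERS = '-c:v libx264 -preset medium -crf 20 -pix_fmt yuv420p'


def build_ffmpeg_args(from_dir: str, filename: str, output_fps: float, frame_count: int,
                      resulting_parameters: str = DEFAULT_PARAMETERS,
                      audio_target: Optional[str] = None) -> List[str]:
    """Hypothetical helper: build the ffmpeg argument list for encoding PNG frames."""
    # Frame files are zero-padded to the number of digits in the frame count.
    digits = len(str(frame_count))
    args = ['-framerate', str(output_fps), '-i', os.path.join(from_dir, f'%0{digits}d.png')]
    if audio_target:
        # Assumption: inputs go before the output file; '-shortest' is an output option.
        args.extend(['-i', audio_target, '-shortest'])
    # shlex.split honours quoting and collapses runs of whitespace.
    args.extend(shlex.split(resulting_parameters))
    args.extend(['-r', str(output_fps), filename])
    return args


if __name__ == '__main__':
    print(build_ffmpeg_args('frames', 'out.mp4', 30.0, 1500))
```

With `split(' ')`, a value such as `"-c:v  libx264"` (two spaces) yields an empty-string argument, which `shlex.split` avoids.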