-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(update 1.0.0): refactor | Break Change #6
Conversation
- Added publish.yml workflow for publishing releases to PyPI. - Added python_test.yml workflow for running tests on multiple Python versions.
WalkthroughThe project has undergone significant changes, including the introduction of automation for testing and publishing, improved code quality checks, updated documentation, and refined subtitle test files. Specifically, workflows for PyPI publishing and testing across multiple Python versions have been established. Additionally, the project now incorporates enhanced code quality checks, updated documentation, and adjusted subtitle test files. Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 1
Configuration used: CodeRabbit UI
Files ignored due to path filters (5)
pdm.lock
is excluded by:!**/*.lock
poetry.lock
is excluded by:!**/*.lock
pyproject.toml
is excluded by:!**/*.toml
src/cover.jpg
is excluded by:!**/*.jpg
src/cover.png
is excluded by:!**/*.png
Files selected for processing (7)
- .github/workflows/publish.yml (1 hunks)
- .github/workflows/python_test.yml (1 hunks)
- .gitignore (1 hunks)
- .pre-commit-config.yaml (1 hunks)
- README.md (2 hunks)
- tests/test.bcc (1 hunks)
- tests/test.srt (1 hunks)
Files skipped from review due to trivial changes (1)
- .gitignore
Additional comments: 7
.pre-commit-config.yaml (1)
- 1-14: The pre-commit hooks configuration looks good and follows best practices for maintaining code quality.
tests/test.bcc (1)
- 1-1: The subtitle timings and content in the
test.bcc
file have been updated. Ensure that the changes in timing and content match the intended display requirements for subtitles.Verification successful
The verification process has been completed, and based on the absence of output from the scripts designed to detect subtitle timing overlaps, it can be concluded that there are no overlapping subtitles in the
test.bcc
file. The changes in timing and content appear to match the intended display requirements for subtitles, with no timing issues detected.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify that the new timings do not cause any overlap issues. jq '.body[] | select(.from >= .to)' tests/test.bcc # This should return no results if there are no overlaps.Length of output: 52
Script:
#!/bin/bash # Verify that the subtitle timings are sequential and do not overlap. jq '.body | .[] | .to' tests/test.bcc > /tmp/end_times.txt jq '.body | .[] | .from' tests/test.bcc > /tmp/start_times.txt # Check if any start time is less than the previous end time, which would indicate an overlap. awk 'NR==FNR{end_times[NR]=$1; next} {start_time=$1; if (start_time < end_times[FNR-1]) print "Overlap between subtitles at line " FNR}' /tmp/end_times.txt /tmp/start_times.txt # This should return no results if there are no overlaps between consecutive subtitles.Length of output: 523
Script:
#!/bin/bash # Verify that the subtitle timings are sequential and do not overlap by comparing 'to' time of one subtitle with 'from' time of the next one. jq -c '[.body[].to]' tests/test.bcc > end_times.json jq -c '[.body[].from]' tests/test.bcc > start_times.json # Compare the times using jq by zipping the two arrays together, excluding the first 'from' and last 'to' as they have no previous or next subtitle to compare with. jq -n --argfile ends end_times.json --argfile starts start_times.json ' $starts[1:] as $starts | $ends[:-1] as $ends | [$starts, $ends] | transpose | .[] | select(.[0] < .[1]) ' # This should return no results if there are no overlaps between consecutive subtitles.Length of output: 261
.github/workflows/publish.yml (1)
- 1-24: The
publish.yml
workflow is set up correctly to trigger on tags prefixed withpypi*
. This is a good practice for automating the release process..github/workflows/python_test.yml (1)
- 1-32: The
python_test.yml
workflow is configured to test across Python versions 3.8 to 3.11, which aligns with the updated supported versions in the README. This ensures compatibility with the supported Python versions.README.md (3)
- 3-3: The cover image path has been updated to reflect the new location within the
src
directory.- 6-7: The badges for supported Python versions and PyPI downloads have been updated to reflect the current state of the project.
- 14-16: The installation command has been formatted as a code block, which improves readability.
The commit adds a new file, test_function.py, which contains test cases for the functions in the subtitle_utils module. The tests include checking the show list and the srt2ass method.
- Added functions for converting SRT to BCC format - Added functions for converting VTT to BCC format - Added functions for converting ASS to BCC format - Added functions for converting ASS to SRT format - Added functions for converting SRT to ASS format - Added functions for converting BCC to SRT format - Added functions for converting BCC to ASS format The functions are implemented in the `srt2bcc`, `vtt2bcc`, `ass2bcc`, `ass2srt`, `srt2ass`, `bcc2srt`, and `bcc2ass` modules. These functions provide the capability to convert between different subtitle formats.
This commit adds the file split_exp.py, which is used for testing the split method. The method splits a string using "2" as the delimiter and assigns the result to sub_key and key variables. The split_exp.py file also includes a print statement to display the values of sub_key and key.
- Removed "pre-commit" and added "pytest" as a dev dependency in the pdm configuration file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 4
Configuration used: CodeRabbit UI
Files ignored due to path filters (2)
pdm.lock
is excluded by:!**/*.lock
pyproject.toml
is excluded by:!**/*.toml
Files selected for processing (10)
- feature_test/split_exp.py (1 hunks)
- src/requirements.txt (1 hunks)
- src/subtitle_utils/NOTICE.MD (1 hunks)
- src/subtitle_utils/init.py (1 hunks)
- src/subtitle_utils/convert/LICENSE (1 hunks)
- src/subtitle_utils/convert/ass.py (2 hunks)
- src/subtitle_utils/convert/bcc.py (9 hunks)
- src/subtitle_utils/parse.py (1 hunks)
- src/subtitle_utils/schema.py (1 hunks)
- tests/test_function.py (1 hunks)
Additional comments: 6
src/requirements.txt (1)
- 4-6: > Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [1-6]
The removal of
pydantic
from therequirements.txt
file suggests that the package is no longer needed. Ensure that there are no remaining imports or usages ofpydantic
in the codebase that would break due to its removal.Verification successful
The executed script searched for any imports or usages of
pydantic
in Python files within the codebase and did not produce any output. This indicates that there are no references topydantic
in the Python files that were searched. Given this result, it seems that the removal ofpydantic
fromrequirements.txt
does not leave behind any direct dependencies in the code.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any imports or usages of pydantic in the codebase. rg --type py 'pydantic'Length of output: 23
feature_test/split_exp.py (1)
- 6-9: The code is straightforward and there are no apparent issues with syntax or logic. The method of splitting a string is correctly implemented.
src/subtitle_utils/NOTICE.MD (1)
- 1-7: The notice file correctly attributes the source of modified code. Ensure that the modifications comply with the license terms of the original repository.
src/subtitle_utils/schema.py (1)
- 10-17: The
SubtitleType
enum andConvert
class are well-defined and adhere to Python's best practices for enums and abstract base classes.src/subtitle_utils/convert/LICENSE (1)
- 1-21: The license file is included correctly and contains the standard MIT License text.
src/subtitle_utils/__init__.py (1)
- 15-115: The functions in
__init__.py
are well-structured and provide a clear API for the module. However, the global variableFOOTNOTE
is used without being defined within this file. Ensure thatFOOTNOTE
is properly defined and managed.
tests/test_function.py
Outdated
def get_test_subtitle(pre, aft): | ||
with open(f"test.{pre}", "r") as f: | ||
pre_content = f.read() | ||
with open(f"test.{aft}", "r") as f: | ||
aft_content = f.read() | ||
return pre_content, aft_content | ||
|
||
|
||
def test_show_available(): | ||
assert isinstance(show_available()[0], str), "Error Checking show list" | ||
|
||
|
||
def test_srt2ass(): | ||
with open("test.bcc", "r") as f: | ||
bcc_exp = f.read() | ||
with open("test.srt", 'r') as file_io: | ||
test_result = get_method(method="srt2ass")(content=file_io) | ||
assert test_result == bcc_exp, f"Error Checking srt2ass \n{test_result}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test functions are using hardcoded file paths which may not be reliable or portable. Consider using a fixture or a setup method to handle test files more dynamically.
- with open("test.bcc", "r") as f:
+ with open(os.path.join(test_files_directory, "test.bcc"), "r") as f:
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def get_test_subtitle(pre, aft): | |
with open(f"test.{pre}", "r") as f: | |
pre_content = f.read() | |
with open(f"test.{aft}", "r") as f: | |
aft_content = f.read() | |
return pre_content, aft_content | |
def test_show_available(): | |
assert isinstance(show_available()[0], str), "Error Checking show list" | |
def test_srt2ass(): | |
with open("test.bcc", "r") as f: | |
bcc_exp = f.read() | |
with open("test.srt", 'r') as file_io: | |
test_result = get_method(method="srt2ass")(content=file_io) | |
assert test_result == bcc_exp, f"Error Checking srt2ass \n{test_result}" | |
def get_test_subtitle(pre, aft): | |
with open(f"test.{pre}", "r") as f: | |
pre_content = f.read() | |
with open(f"test.{aft}", "r") as f: | |
aft_content = f.read() | |
return pre_content, aft_content | |
def test_show_available(): | |
assert isinstance(show_available()[0], str), "Error Checking show list" | |
def test_srt2ass(): | |
with open(os.path.join(test_files_directory, "test.bcc"), "r") as f: | |
bcc_exp = f.read() | |
with open("test.srt", 'r') as file_io: | |
test_result = get_method(method="srt2ass")(content=file_io) | |
assert test_result == bcc_exp, f"Error Checking srt2ass \n{test_result}" |
src/subtitle_utils/parse.py
Outdated
class Parser(ABC): | ||
""" | ||
Base Parser | ||
""" | ||
|
||
def parse(self, content: Union[str, IO]): | ||
raise NotImplementedError | ||
|
||
|
||
class SrtParse(Parser): | ||
|
||
def parse(self, content: Union[str, IO]) -> SubRipFile: | ||
if isinstance(content, str): | ||
return pysrt.from_string(content) | ||
# write to temp file | ||
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=True) as f: | ||
f.write(content.read()) | ||
f.close() | ||
return pysrt.open(f.name) | ||
|
||
|
||
class BccParser(Parser): | ||
|
||
@staticmethod | ||
def _parse(content: str): | ||
try: | ||
return json.loads(content) | ||
except Exception as e: | ||
raise e | ||
|
||
def parse_file(self, files): | ||
path = files if files else "" | ||
if not os.path.exists(path): | ||
return | ||
with open(files, "r") as f: | ||
return self._parse(f.read()) | ||
|
||
def parse_str(self, sentence): | ||
strs = sentence if sentence else "" | ||
return self._parse(strs) | ||
|
||
def parse(self, content: Union[str, IO]) -> dict: | ||
""" | ||
Parse bcc | ||
:param content: str or IO | ||
:return: json | ||
""" | ||
if isinstance(content, str): | ||
return self.parse_str(content) | ||
return self.parse_file(content) | ||
|
||
|
||
class VttParser(Parser): | ||
|
||
def parse(self, content: Union[str, IO]) -> WebVTTFile: | ||
""" | ||
:param content: str or IO | ||
:return: pyvtt.WebVTTFile | ||
""" | ||
if isinstance(content, str): | ||
return pyvtt.from_string(content) | ||
# write to temp file | ||
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=True) as f: | ||
f.write(content.read()) | ||
f.close() | ||
return pyvtt.open(f.name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parser classes are well-structured and follow the abstract base class pattern. However, the use of tempfile.NamedTemporaryFile
with delete=True
and then calling f.close()
is redundant since the file will be deleted immediately after the block. Consider removing the f.close()
call or setting delete=False
if you need the file to persist after the block.
- with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=True) as f:
+ with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=False) as f:
- f.close()
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
class Parser(ABC): | |
""" | |
Base Parser | |
""" | |
def parse(self, content: Union[str, IO]): | |
raise NotImplementedError | |
class SrtParse(Parser): | |
def parse(self, content: Union[str, IO]) -> SubRipFile: | |
if isinstance(content, str): | |
return pysrt.from_string(content) | |
# write to temp file | |
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=True) as f: | |
f.write(content.read()) | |
f.close() | |
return pysrt.open(f.name) | |
class BccParser(Parser): | |
@staticmethod | |
def _parse(content: str): | |
try: | |
return json.loads(content) | |
except Exception as e: | |
raise e | |
def parse_file(self, files): | |
path = files if files else "" | |
if not os.path.exists(path): | |
return | |
with open(files, "r") as f: | |
return self._parse(f.read()) | |
def parse_str(self, sentence): | |
strs = sentence if sentence else "" | |
return self._parse(strs) | |
def parse(self, content: Union[str, IO]) -> dict: | |
""" | |
Parse bcc | |
:param content: str or IO | |
:return: json | |
""" | |
if isinstance(content, str): | |
return self.parse_str(content) | |
return self.parse_file(content) | |
class VttParser(Parser): | |
def parse(self, content: Union[str, IO]) -> WebVTTFile: | |
""" | |
:param content: str or IO | |
:return: pyvtt.WebVTTFile | |
""" | |
if isinstance(content, str): | |
return pyvtt.from_string(content) | |
# write to temp file | |
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=True) as f: | |
f.write(content.read()) | |
f.close() | |
return pyvtt.open(f.name) | |
class Parser(ABC): | |
""" | |
Base Parser | |
""" | |
def parse(self, content: Union[str, IO]): | |
raise NotImplementedError | |
class SrtParse(Parser): | |
def parse(self, content: Union[str, IO]) -> SubRipFile: | |
if isinstance(content, str): | |
return pysrt.from_string(content) | |
# write to temp file | |
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=False) as f: | |
f.write(content.read()) | |
return pysrt.open(f.name) | |
class BccParser(Parser): | |
@staticmethod | |
def _parse(content: str): | |
try: | |
return json.loads(content) | |
except Exception as e: | |
raise e | |
def parse_file(self, files): | |
path = files if files else "" | |
if not os.path.exists(path): | |
return | |
with open(files, "r") as f: | |
return self._parse(f.read()) | |
def parse_str(self, sentence): | |
strs = sentence if sentence else "" | |
return self._parse(strs) | |
def parse(self, content: Union[str, IO]) -> dict: | |
""" | |
Parse bcc | |
:param content: str or IO | |
:return: json | |
""" | |
if isinstance(content, str): | |
return self.parse_str(content) | |
return self.parse_file(content) | |
class VttParser(Parser): | |
def parse(self, content: Union[str, IO]) -> WebVTTFile: | |
""" | |
:param content: str or IO | |
:return: pyvtt.WebVTTFile | |
""" | |
if isinstance(content, str): | |
return pyvtt.from_string(content) | |
# write to temp file | |
with tempfile.NamedTemporaryFile(mode="w", encoding="utf-8", delete=False) as f: | |
f.write(content.read()) | |
return pyvtt.open(f.name) |
src/subtitle_utils/convert/ass.py
Outdated
# -*- coding: utf-8 -*- | ||
# @Time : 12/30/22 2:54 PM | ||
# @FileName: AssConverter.py | ||
# @Time : 2024/1/18 下午6:05 | ||
# @Author : sudoskys | ||
# @File : ass.py | ||
# @Software: PyCharm | ||
# @Github :sudoskys | ||
import re | ||
from pathlib import Path | ||
from pyasstosrt import Subtitle | ||
import tempfile | ||
from typing import Union, IO | ||
|
||
from .utils import SrtParse | ||
from pyasstosrt import Subtitle | ||
|
||
from ..parse import SrtParse | ||
from ..schema import Convert | ||
|
||
class AssUtils(object): | ||
@staticmethod | ||
def defultHeader() -> str: | ||
return """[V4+ Styles] | ||
ASS_HEADER = """[V4+ Styles] | ||
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding | ||
Style: Default,Arial,20,&H00FFFFFF,&HF0000000,&H00000000,&HF0000000,1,0,0,0,100,100,0,0.00,1,1,0,2,30,30,10,134 | ||
|
||
[Events] | ||
Format: Layer, Start, End, Style, Actor, MarginL, MarginR, MarginV, Effect, Text | ||
""" | ||
|
||
def ass_content(self, content, header: str) -> str: | ||
|
||
class AssConvert(Convert): | ||
@staticmethod | ||
def srt2ass(content: Union[str, IO], | ||
*, | ||
header: str = None) -> str: | ||
""" | ||
字幕转换 | ||
:param timestamps: 时间轴 | ||
:param subtitles: 字幕 | ||
:param header: 头 | ||
:return: 合成字幕 | ||
Subtitle Converter | ||
:param content: subtitle path or content | ||
:param header: ASS HEADER (Style) | ||
:return: processed subtitle | ||
""" | ||
subs = SrtParse().parse(strs=content) | ||
assert isinstance(content, (str, IO)), "content must be str or IO" | ||
subs = SrtParse().parse(content=content) | ||
timestamps = [[str(sub.start), str(sub.end)] for sub in subs] | ||
subtitles = [sub.text for sub in subs] | ||
header = header if header else AssUtils.defultHeader() | ||
if header is None: | ||
header = ASS_HEADER | ||
content = header + '\n' | ||
body = { | ||
'dialogue': 'Dialogue: ', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [23-86]
The AssConvert
class methods srt2ass
and ass2srt
are using assertions for input validation, which is not recommended for production code as assertions can be stripped out with optimization flags. Use explicit checks and raise appropriate exceptions instead.
- assert isinstance(content, (str, IO)), "content must be str or IO"
+ if not isinstance(content, (str, IO)):
+ raise TypeError("content must be str or IO")
src/subtitle_utils/convert/bcc.py
Outdated
} | ||
for sub in subs | ||
]) | ||
_fix = self.merge_timeline(_origin) | ||
_fix = self._merge_timeline(_origin) | ||
return _fix | ||
|
||
def time2str(self, time: float): | ||
@staticmethod | ||
def _time2str(time: float): | ||
return datetime.utcfromtimestamp(time).strftime("%H:%M:%S,%f")[:-3] | ||
|
||
def srt2bcc(self, files: Union[str], about: str = None): | ||
def srt2bcc(self, content: Union[str, IO], about: str = None): | ||
""" | ||
srt2bcc 将 srt 转换为 bcc B站字幕格式 | ||
:param content: srt format | ||
:return: | ||
""" | ||
path = files if files else "" | ||
if os.path.exists(path): | ||
subs = pysrt.open(path=files) | ||
else: | ||
subs = pysrt.from_string(source=files) | ||
body = self.process_body(subs, about=about) | ||
subs = SrtParse().parse(content) | ||
body = self._process_body(subs, about=about) | ||
bcc = { | ||
"font_size": 0.4, | ||
"font_color": "#FFFFFF", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This review was outside the patches, so it was mapped to the patch with the greatest overlap. Original lines [28-302]
The BccConvert
class contains complex methods with deeply nested logic. Consider refactoring to improve readability and maintainability. Additionally, the use of uuid.uuid1()
could lead to collisions and is not necessary for generating unique identifiers in this context. Use a simple counter or another method to generate unique groups.
- uid1 = uuid.uuid1()
- uid2 = uuid.uuid1()
- uid = uuid.uuid1()
+ uid_counter += 1
+ uid = f"group_{uid_counter}"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 1
Configuration used: CodeRabbit UI
Files ignored due to path filters (2)
pdm.lock
is excluded by:!**/*.lock
pyproject.toml
is excluded by:!**/*.toml
Files selected for processing (7)
- README.md (1 hunks)
- feature_test/direct.py (1 hunks)
- src/subtitle_utils/init.py (1 hunks)
- src/subtitle_utils/convert/ass.py (2 hunks)
- src/subtitle_utils/convert/bcc.py (9 hunks)
- src/subtitle_utils/parse.py (1 hunks)
- tests/test_function.py (1 hunks)
Files skipped from review as they are similar to previous changes (5)
- src/subtitle_utils/init.py
- src/subtitle_utils/convert/ass.py
- src/subtitle_utils/convert/bcc.py
- src/subtitle_utils/parse.py
- tests/test_function.py
Additional comments: 4
README.md (4)
- 3-3: The cover image path has been updated to point to a new location. Ensure that the new path is correct and the image is accessible.
- 6-6: The Python version badge has been updated to reflect support for Python 3.8 to 3.11. Verify that the project indeed supports these versions and that all tests pass.
- 7-7: The addition of the PyPI downloads badge is a good enhancement for showing the popularity of the package.
- 15-16: The installation instructions have been formatted into a code block, which improves readability.
Summary by CodeRabbit
New Features
Documentation
Chores
/pdm-python
directory to.gitignore
.Tests